AI safety & alignment
Ensure increasingly capable AI systems remain corrigible, honest, and aligned with human values.
The summary lives here. The full whitepaper walks through the four-axis ranking, existing alternatives, proposed direction, cost & scale, and suggested investors — in the spirit of Hyperloop Alpha.
Severity · WTP / wealth
share of affected person’s wealth they would pay for a solution
Current solutions
quality of existing solutions — low score = high opportunity
Market size · TAM
USD / year the world is already paying
Time · OOM to impact
order-of-magnitude horizon to civilizational-scale impact
Capital · OOM to solve
cumulative R&D + deployment + supply chain across the arc
Three-lens scoring
As AI systems approach and exceed human-level capability across domains, the open problem is whether their goals and behaviors remain under human correction. Unaligned AI is the one x-risk that is accelerating rather than slowing. A Deutschian framing: safety is not a brake on progress; it is an engineering achievement of progress. The work is both technical (interpretability, evals, corrigibility) and institutional (governance, deployment protocols).
The success vision · 15-year horizon
If we solve this, here is the world we get.
Before · today
Frontier AI training proceeds with limited interpretability of model internals, no proven scalable alignment method, and minimal regulatory verification capacity.
After · 15 years
Aligned, corrigible frontier AI is the default deployment pattern. Interpretability tools verify that models share human-relevant values before deployment. Capability gains do not increase x-risk.
Voices on this quest
4 thinkers
“All evils are caused by insufficient knowledge. Problems are inevitable. Problems are soluble.”
AI safety is a knowledge problem, not a limit problem. Aligned AGI is achievable through better explanations, not through halting development.
Co-founded OpenAI originally because of concerns about unaligned AI. Has continued to treat AI alignment as an existential priority.
OpenAI founding announcement (2015); subsequent public statements
Emergent Ventures has funded AI-safety projects and unconventional alignment researchers under the "fast grants" model.
AGI is named on the good-quest list. Hard-tech builders, not only researchers, need to be at the center of the safety conversation.
Companies on this quest
6 mapped
Originally nonprofit research lab, now capped-profit. Safety and superalignment teams alongside capabilities work.
AI safety company building Claude. Constitutional AI, mechanistic interpretability, and frontier-scale alignment research.
Alignment-focused AI lab. Runs Conjecture Institute for critical rationalist research on AI safety.
Applied alignment research, adversarial evaluation, AI control, and mechanistic interpretability.
Third-party evaluation of frontier AI models for dangerous capabilities. Pre-deployment testing protocols.
Mechanistic interpretability as a product: tools for editing model internals rather than just observing them.
Capital funding this quest
4 allocators
Emergent Ventures
Grant
Fast grants. High-variance, unconventional, talent-first.
best for: Unproven people with a weird, specific idea and no credential path.
Thiel Fellowship
Fellowship
$100k to stop out of school and build something important.
best for: Under-22 builders with spiky talent and a real project.
Founders Fund
Venture Capital
Contrarian hard tech that rebuilds the industrial base.
best for: Scaling massive hard-tech quests: space, defense, biotech, fusion.
Lux Capital
Venture Capital
Counter-conventional science at the edges of physics and biology.
best for: Technical founders at the frontier who need contrarian conviction capital.