x-risk frontier · AI & x-risk · humans affected: high

AI safety & alignment

Ensure increasingly capable AI systems remain corrigible, honest, and aligned with human values.

Whitepaper · v0.1 · open to refutation

The summary lives here. The full whitepaper walks through the four-axis ranking, existing alternatives, proposed direction, cost & scale, and suggested investors — in the spirit of Hyperloop Alpha.

Quantity · humans affected

8.1B humans

source: 80,000 Hours AI problem profile

Severity · WTP / wealth

100% · low

share of affected person’s wealth they would pay for a solution

Current solutions

1.5 / 10 · low

quality of existing solutions — low score = high opportunity

Market size · TAM

$20.0B · low

USD / year the world is already paying

Time · OOM to impact

15y · low

order-of-magnitude horizon to civilizational-scale impact

Capital · OOM to solve

$200.0B · low

cumulative R&D + deployment + supply chain across the arc

Three-lens scoring

welfare · Copenhagen BCR: n/a
x-risk · 80,000 Hours ITN: 9.5 / 10 · high
utility delta · state-of-art vs physics: 90% · low

As AI systems approach and exceed human-level capability across domains, the open problem is whether their goals and behaviors remain under human correction. Unaligned AI is the one x-risk that is accelerating rather than slowing. A Deutschian framing: safety is not a brake on progress, it is an engineering achievement of progress. The work is technical (interpretability, evals, corrigibility) and institutional (governance, deployment protocols).

The success vision · 15-year horizon

If we solve this, here is the world we get.

low

Before · today

Frontier AI training proceeds with limited interpretability of model internals, no proven scalable alignment method, and minimal regulatory verification capacity.

After · 15 years

Aligned, corrigible frontier AI is the default deployment pattern. Interpretability tools verify that models share human-relevant values before deployment. Capability gains no longer increase x-risk.

Voices on this quest

4 thinkers
David Deutsch · Physicist & Philosopher · Oxford
All evils are caused by insufficient knowledge. Problems are inevitable. Problems are soluble.

AI safety is a knowledge problem, not a limit problem. Aligned AGI is achievable through better explanations, not through halting development.

The Beginning of Infinity, chapter 1

Elon Musk · Engineer & Founder · SpaceX, Tesla, Neuralink, xAI

Co-founded OpenAI originally because of concerns about unaligned AI. Has continued to treat AI alignment as an existential priority.

OpenAI founding announcement (2015); subsequent public statements

Tyler Cowen · Economist & Writer · George Mason, Emergent Ventures

Emergent Ventures has funded AI-safety projects and unconventional alignment researchers under the "fast grants" model.

Emergent Ventures grant cohorts

Trae Stephens · Partner & Co-founder · Founders Fund, Anduril

AGI is named on the good-quest list. Hard-tech builders, not only researchers, need to be at the center of the safety conversation.

Choose Good Quests

Companies on this quest

6 mapped
OpenAI · private

Originally a nonprofit research lab, now capped-profit. Safety and superalignment teams sit alongside capabilities work.

$157.0B · med
Anthropic · private

AI safety company building Claude. Constitutional AI, mechanistic interpretability, and frontier-scale alignment research.

$61.5B · high
Conjecture · private

Alignment-focused AI lab. Runs Conjecture Institute for critical rationalist research on AI safety.

private · no disclosed cap

Applied alignment research, adversarial evaluation, AI control, and mechanistic interpretability.

private · no disclosed cap

Third-party evaluation of frontier AI models for dangerous capabilities. Pre-deployment testing protocols.

private · no disclosed cap
Goodfire · private

Mechanistic interpretability as a product: tools for editing model internals rather than just observing them.

private · no disclosed cap

Capital funding this quest

4 allocators

Sources