AI Safety Orgs
Briefs and theory-of-change analyses for 120 organizations across the AI safety ecosystem — frontier labs, funders, research orgs, policy shops, advocacy groups, and field-building programs.
Briefs are factual; analyses are opinionated and sit behind a toggle on each org’s page. Both are written by Claude from public sources.
Funders (12)
THE funder. $480M+. Shapes the entire ecosystem.
Advocacy + funding hybrid. Musk-founded.
Giving intermediary.
Meta-charity evaluation. Context for EA funding philosophy.
Was #2 funder. Collapsed Nov 2022. Ecosystem fragility.
#2 active funder. Jaan Tallinn. Different philosophy from CG.
Non-EA, nanotech origins. Different intellectual tradition.
Donor advisory. Routes major gifts.
EA Fund. High-volume smaller grants.
Non-EA funder. Different source.
Regranting model. Decentralized funding.
EA Fund. Meta/community infrastructure.
Frontier Labs (8)
Origin story, mission drift, safety exodus.
Safety-first lab. RSP. Constitutional AI. The benchmark.
LeCun's safety skepticism. Open weights.
Musk. Move-fast approach.
Hassabis/Legg. Frontier Safety Framework.
European challenger. EU AI Act tension.
Ilya's safety-first ASI bet.
Chinese frontier. Efficiency innovations.
Technical Alignment Research (14)
Foundational. Agent foundations. The original alignment org.
Adversarial robustness. Residency program.
Stuart Russell. CIRL. Academic alignment.
Adversarial training → AI control agenda.
Hadfield-Menell. CIRL. Academic.
S-risks, suffering focus. Different from x-risk.
Cognitive emulation. Controversial.
Paul Christiano. Evals + alignment theory.
CG-funded safety research.
Newer safety startup. LLM steering.
Values alignment through meaning. Philosophical.
Sam Bowman. Scalable oversight.
David Krueger. Goal misgeneralization.
Agent foundations, formal verification.
Interpretability (3)
Evals & Red-Teaming (7)
Government evaluator. Pre-deployment testing.
Government evaluator. Political uncertainty.
Scheming detection. Deception evals.
ARA evals. Beth Barnes. Gold standard.
Red-teaming, jailbreaking research.
Automated red-teaming. YC-backed.
Commercial safety evals.
Governance & Policy (22)
Defense think tank. Compute governance.
UK gov R&D agency. Safeguarded AI.
EU AI Act implementation. Binding law.
Georgetown. US-China AI. Helen Toner.
AI treaties. International governance.
National security framing. $14M CG.
Industry consortium.
Legal scholarship for x-risk.
UK AI ethics. Nuffield-funded.
US AI governance. Practical policy.
UK policy. Bletchley Declaration.
Oxford. Talent pipeline. Compute governance.
AI + cybersecurity. Berkeley.
Compute licensing proposals.
Chinese-international bridge. Newsletter.
US federal advocacy. Direct lobbying.
International/multilateral governance.
AI security + safety intersection.
International governance scorecards.
Legal frameworks for AI risk.
National security AI risk.
Chinese government approach.
Advocacy (9)
US policy lobbying. SB 1047.
Mainstream advocacy. Netflix doc.
European AI safety advocacy.
European x-risk public communication.
Public opinion → legislation. Polling.
Youth-led. Non-EA framing.
Temporary pause advocacy. Grassroots.
Full halt position. Most aggressive.
Lab accountability. OpenAI Files.
Field-Building (19)
EA umbrella (80K, CEA, GWWC).
Career gateway. Talent bottleneck thesis.
Premier alignment training program.
LessWrong. Alignment Forum.
AI safety courses. Scale.
Privacy-preserving AI. $17M CG.
Berkeley research community hub.
Operational support for researchers.
Incubates charities.
10% Pledge. EA giving.
Youth talent. $8.8M CG.
Rationality → safety pipeline. Historical.
Academic research support.
Researcher wellbeing.
Community hackathons. Low-barrier.
Research sprints. London.
Interdisciplinary. Biology + AI safety.
European volunteer research. Low-barrier.
Alignment engineer bootcamp. Open-source.
Forecasting & Epistemic Infrastructure (6)
Tetlock. Superforecasting for AI risk.
Community forecasting platform.
Compute trends. AI timelines.
Prediction markets.
Katja Grace. Expert surveys.
Squiggle. Uncertainty quantification.
Research Institutions & Cross-Cutting (17)
Bengio. Academic → safety advocate.
Allen Institute. Open-source academic AI.
Commercial studio funding alignment.
Cause prioritization research.
Human-centered AI framing.
Open-source AI research community.
Hendrycks. Statement on AI Risk. Research.
CLOSED. Bostrom. Spawned the field.
Cambridge. Interdisciplinary futures.
Hendrycks safety startup. CAIS cross-ref.
Research → product pivot (Elicit).
Present harms. AI ethics. Different framing.
Multi-agent cooperation.
Cambridge x-risk. Interdisciplinary.
Stuart Armstrong. FHI spinoff.
AI honesty. TruthfulQA benchmark.
Cross-risk research.