Theory of Change
MIRI's theory of change has undergone a fundamental shift. Until 2022, their theory was: "Solve the technical alignment problem before AGI arrives." That failed by their own admission. Nate Soares wrote in Dec 2020 that their research push "has, at this point, largely failed."
Their current theory (since 2023) is: "Convince governments to impose an international moratorium on frontier AI development until alignment is solved." From their 2024 comms strategy: "Our objective is to convince major powers to shut down the development of frontier AI systems worldwide before it is too late. We believe that nothing less than this will prevent future misaligned smarter-than-human AI systems from destroying humanity."
The mechanism is: (1) MIRI produces compelling communications (book, media appearances, online resources) that reach policymakers and the public, (2) this shifts public opinion to take AI extinction risk seriously, (3) policymakers enact international compute governance and eventually a moratorium, (4) the moratorium buys time for alignment research (possibly over decades or generations).
MIRI's research leadership estimates extinction probability at "upward of 90%" absent aggressive policy intervention.
What They Do
MIRI currently operates two main teams:
Communications (~7 FTEs including Nate Soares and Eliezer Yudkowsky): Published "If Anyone Builds It, Everyone Dies" (Sep 2025, NYT #7 bestseller). Major media appearances including Ezra Klein, Sam Harris, NPR, ABC News, Hank Green. Produced "The Problem" explainer for MIRI's website. Supporting "The AI Doc" documentary (Mar 2026). Plans to grow the team and shift from content production to third-party support.
Technical Governance Team (4+ researchers): Published an AI governance research agenda on arXiv (May 2025). Drafted an international agreement to prevent ASI development. Participated in EU AI Act Code of Practice working groups. Provided legislative testimony (Senate AI Insight Forum Dec 2023, Canadian House of Commons 2025). Aaron Scher spoke on a UN AI verification panel.
Residual research: A "small amount of in-house technical alignment research" continues. Sam Eisenstat and Benya Fallenstein remain as researchers. Three separate research budgets exist for Eliezer, Nate, and Malo.
Historical research output (now largely abandoned): Agent Foundations technical agenda (2014), Logical Induction (2016), Corrigibility (2015), Embedded Agency (2018), Late 2021 MIRI Conversations with external researchers.
Key People
Eliezer Yudkowsky -- Co-founder and Board Chair. Self-educated. Founded the org at 20. Author of the Sequences (LessWrong), Harry Potter and the Methods of Rationality, and co-author of the MIRI book. Widely recognized as the founder of AI alignment as a field. Extremely high-profile, extremely controversial. His personal credibility is inseparable from MIRI's institutional credibility -- and is under sustained attack (see What Others Say).
Nate Soares -- President. Former Executive Director (~2015-2023). More measured than Yudkowsky in policy contexts. Co-author of the book. Acknowledged the research failure in 2020. In the Carnegie Endowment podcast, he articulates the case for international coordination in language accessible to foreign policy audiences.
Malo Bourgon -- CEO since June 2023. Longest-standing team member after Yudkowsky (since 2012). Engineering background. Formerly COO. Now represents MIRI at Senate hearings and international conferences. The CEO transition formalized his existing operational role.
Notable departures: Tsvi Benson-Tilsen (7 years at MIRI, now co-founder of Berkeley Genomics Project) says alignment is "0% solved." Luke Muehlhauser (former ED, now at Open Philanthropy). Lisa Thiergart (founded MIRI's governance team Feb 2024, left for IST as Senior Director after ~1 year, publicly dissented from the research de-prioritization).
~25 staff total. Board: Yudkowsky (Chair), Soares, Edwin Evans, Anna Salamon (CFAR president), Blake Borgeson (Recursion co-founder). No independent outside directors.
Money and Incentives
Revenue and runway: ~$16M reserves at end of 2024. Spending ~$7.1M/year. Approximately 2 years of runway. 2025 fundraiser raised $3.2M ($1.6M donations + $1.6M SFF match) against a $6M target.
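The runway and shortfall figures can be checked with a few lines of arithmetic. This is a minimal sketch that assumes flat spending and no new income, which is a simplification rather than MIRI's own forecasting method; the variable and function names are illustrative, and the inputs are the figures quoted above.

```python
# Figures as quoted in the text, in millions of USD.
reserves_end_2024 = 16.0
annual_spending = 7.1
fundraiser_donations = 1.6
fundraiser_sff_match = 1.6
fundraiser_target = 6.0


def runway_years(reserves: float, annual_burn: float) -> float:
    """Years until reserves reach zero, assuming flat spending and no income."""
    return reserves / annual_burn


runway = runway_years(reserves_end_2024, annual_spending)
fundraiser_total = fundraiser_donations + fundraiser_sff_match
fundraiser_shortfall = fundraiser_target - fundraiser_total

print(f"Runway: {runway:.2f} years")
print(f"Fundraiser total: ${fundraiser_total:.1f}M")
print(f"Shortfall vs. target: ${fundraiser_shortfall:.1f}M")
```

Under these assumptions the runway comes out slightly above two years, and the fundraiser lands $2.8M below its target. Any growth in spending would shorten the runway accordingly.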
Revenue history: 2021 was an anomaly -- $25.6M, driven largely by crypto donations ($15.6M in MKR tokens + $4.4M in ETH from Vitalik Buterin, which together account for about $20M of the total). Revenue in 2022 dropped to $1.9M against $5.3M expenses. MIRI has been burning reserves since 2022.
Funding sources: Individual donors and Survival and Flourishing Fund (SFF, backed by Jaan Tallinn). No Open Philanthropy funding since Feb 2020 ($14.76M total across 5 grants, 2016-2020). Open Phil funded them despite "strong reservations" about their research approach; the later grants were routed through a general EA support committee rather than awarded on technical merits. The book is not a net income source ("costs to produce and promote the book have far exceeded any income").
2026 projected budget: $8.1M. Operations: $2.6M. Outreach/Comms: $3.2M (largest). Research: $2.3M (smallest).
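To see how the 2026 budget splits proportionally, the line items can be summed and converted to shares. This is an illustrative sketch using only the figures quoted above; the dictionary keys and variable names are assumptions made for readability, not MIRI's own accounting categories.

```python
# 2026 projected budget line items as quoted in the text, in millions of USD.
budget_2026 = {
    "Operations": 2.6,
    "Outreach/Comms": 3.2,
    "Research": 2.3,
}
stated_total = 8.1

# Confirm the line items add up to the stated total (allowing for float rounding).
line_item_sum = sum(budget_2026.values())
items_match_total = abs(line_item_sum - stated_total) < 1e-6

# Share of the budget for each line item, largest first.
shares = {name: amount / line_item_sum for name, amount in budget_2026.items()}
ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)

print(f"Line items sum to ${line_item_sum:.1f}M (matches stated total: {items_match_total})")
for name, share in ranked:
    print(f"{name}: {share:.0%}")
```

The three line items sum to the stated $8.1M. Outreach/Comms is roughly 40% of the budget, Operations roughly 32%, and Research roughly 28%.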
Business model: Donations. No product revenue, no contracts, no compute credits, no lab partnerships.
Economic independence from labs: MIRI has zero financial ties to AI labs -- no funding, no compute, no corporate partnerships. This is genuine structural independence, rare among AI safety orgs.
Key financial risk: The 2025 fundraiser fell $2.8M short of target, and MIRI is uncertain how many past donors will support the new comms/policy strategy. If funding doesn't materialize, they would need to "drastically change our plans and scale back our ambitions."
Top compensation: $536,744 (2022 990 filing), $448,062 (2021). High for a 25-person nonprofit but not unusual for the Bay Area.
What Others Say
The case against MIRI's theory of change (strongest version):
Bentham's Bulldog documents Yudkowsky's track record: he predicted his team would build superintelligence by 2008-2010, predicted nanotech would kill everyone by 2010, and dismissed connectionism/deep learning. The author adds that "every single time he talks about a topic that I know anything about, what he says is completely unreasonable." The pattern is extreme confidence paired with demonstrated error across domains.
The Asterisk review identifies the deeper problem: MIRI's conclusions haven't changed despite the fundamental shift from hand-coded AI to deep learning. "The fundamental architecture, training methods and requirements for modern AI systems are all completely different from the technology Yudkowsky imagined in 2008, yet nothing about the core MIRI story has changed." This suggests unfalsifiable reasoning.
Scott Alexander (sympathetic to AI risk) identifies fast takeoff as the load-bearing assumption that gets barely two sentences in the book: "Without some radical discontinuity, it's very hard to believe [the arguments] for the absolute certainty of AI doom or the necessary failure of our attempts to prevent it."
Lawfare identifies three unsettled empirical questions MIRI's doom case depends on: How hard is alignment? Would misaligned ASI actually succeed? What happens before ASI? None are resolved.
Paul Christiano: "Eliezer is unreasonably pessimistic about interpretability while being mostly ignorant about the current state of the field."
The case for MIRI:
Scott Alexander: "MIRI answered: moral clarity." In a field where most actors hedge and optimize for palatability, MIRI says what it believes regardless of consequences.
Tsvi Benson-Tilsen (7 years at MIRI): alignment is "0% solved." If this is roughly right, then MIRI's extreme position is the correct one and the incrementalists are dangerously wrong.
The book reached millions. If the goal is to move the policy conversation, NYT bestseller status, congressional testimony, and endorsements from national security figures are real accomplishments.
MIRI's independence from labs means they have no incentive to downplay risks.
Cultural concerns:
The MIRI-CFAR ecosystem has been linked to psychotic episodes, cult dynamics, and the Zizian murder cult (though the connection is indirect). Anna Salamon (MIRI board member, CFAR president): "We didn't know at the time, but in hindsight we were creating conditions for a cult." Bloomberg documented multiple cases. MIRI has not published any institutional statement addressing these harms.
What's Absent
No public explanation from either MIRI or Open Phil for why funding stopped after 2020.
No concrete policy outcomes from the 2-year-old comms/policy pivot -- no legislation, no moratorium proposals gaining traction. Success is measured in "conversations" and "awareness."
No external evaluation of MIRI's 10+ years of research. Work was nondisclosed-by-default and the only external review mentioned is an anonymous ML researcher.
No independent board members. All five directors are insiders or close associates.
No explicit falsification conditions -- what evidence would cause MIRI to update away from extreme pessimism?
No succession plan for an org where two individuals drive credibility, fundraising, and media presence.
No formal retrospective on what $40-50M in research spending (2013-2022) actually produced.
Recommended Reading
- Dwarkesh Patel 4-hour interview with Yudkowsky (Apr 2023) -- The most candid, comprehensive source on Yudkowsky's thinking. Rich enough to form your own view. https://www.dwarkeshpatel.com/p/eliezer-yudkowsky
- Bentham's Bulldog: "Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong" -- The strongest independent critique. Documents the track record you need to evaluate the man's credibility. https://benthams.substack.com/p/eliezer-yudkowsky-is-frequently-confidently
- Scott Alexander's book review (ACX) -- Identifies fast takeoff as the crux, engages seriously with both sides. The most balanced single assessment. https://www.astralcodexten.com/p/book-review-if-anyone-builds-it-everyone
- MIRI 2024 Mission and Strategy Update -- Their most honest self-assessment of the research failure and strategic pivot. https://intelligence.org/2024/01/04/miri-2024-mission-and-strategy-update/
- Tsvi Benson-Tilsen: "Why AI Alignment Is 0% Solved" -- 7-year MIRI insider's assessment. https://lironshapira.substack.com/p/ai-alignment-is-0-solved-tsvi-benson-tilsen