Machine Intelligence Research Institute (MIRI)

Conceptual Research

Foundational. Agent foundations. The original alignment org.

Founded: 2000
HQ: Berkeley, CA
Team: 25
Structure: 501(c)(3) nonprofit
Model: Donations

Theory of Change

MIRI's theory of change has undergone a fundamental shift. Until 2022, their theory was: "Solve the technical alignment problem before AGI arrives." That failed by their own admission. Nate Soares wrote in Dec 2020 that their research push "has, at this point, largely failed."

Their current theory (since 2023) is: "Convince governments to impose an international moratorium on frontier AI development until alignment is solved." From their 2024 comms strategy: "Our objective is to convince major powers to shut down the development of frontier AI systems worldwide before it is too late. We believe that nothing less than this will prevent future misaligned smarter-than-human AI systems from destroying humanity."

The mechanism is: (1) MIRI produces compelling communications (book, media appearances, online resources) that reach policymakers and the public, (2) this shifts public opinion to take AI extinction risk seriously, (3) policymakers enact international compute governance and eventually a moratorium, (4) the moratorium buys time for alignment research (possibly over decades or generations).

MIRI's research leadership estimates extinction probability at "upward of 90%" absent aggressive policy intervention.

What They Do

MIRI currently operates two main teams:

Communications (~7 FTEs including Nate Soares and Eliezer Yudkowsky): Published "If Anyone Builds It, Everyone Dies" (Sep 2025, NYT #7 bestseller). Major media appearances including Ezra Klein, Sam Harris, NPR, ABC News, Hank Green. Produced "The Problem" explainer for MIRI's website. Supporting "The AI Doc" documentary (Mar 2026). Plans to grow the team and shift from content production to third-party support.

Technical Governance Team (4+ researchers): Published an AI governance research agenda on arXiv (May 2025). Drafted an international agreement to prevent ASI development. Participated in EU AI Act Code of Practice working groups. Provided legislative testimony (US Senate AI Insight Forum, Dec 2023; Canadian House of Commons, 2025). Aaron Scher spoke on a UN AI verification panel.

Residual research: A "small amount of in-house technical alignment research" continues. Sam Eisenstat and Benya Fallenstein remain as researchers. Three separate research budgets exist for Eliezer, Nate, and Malo.

Historical research output (now largely abandoned): Agent Foundations technical agenda (2014), Logical Induction (2016), Corrigibility (2015), Embedded Agency (2018), Late 2021 MIRI Conversations with external researchers.

Key People

Eliezer Yudkowsky -- Co-founder and Board Chair. Self-educated. Founded the org at 20. Author of the Sequences (LessWrong), Harry Potter and the Methods of Rationality, and co-author of the MIRI book. Widely recognized as the founder of AI alignment as a field. Extremely high-profile, extremely controversial. His personal credibility is inseparable from MIRI's institutional credibility -- and is under sustained attack (see What Others Say).

Nate Soares -- President. Former Executive Director (~2015-2023). More measured than Yudkowsky in policy contexts. Co-author of the book. Acknowledged the research failure in 2020. In the Carnegie Endowment podcast, he articulates the case for international coordination in language accessible to foreign policy audiences.

Malo Bourgon -- CEO since June 2023. Longest-standing team member after Yudkowsky (since 2012). Engineering background. Formerly COO. Now represents MIRI at Senate hearings and international conferences. The CEO transition formalized his existing operational role.

Notable departures: Tsvi Benson-Tilsen (7 years at MIRI, now co-founder of Berkeley Genomics Project), who says alignment is "0% solved." Luke Muehlhauser (former ED, now at Open Philanthropy). Lisa Thiergart (founded MIRI's governance team in Feb 2024, left for IST as Senior Director after ~1 year, publicly dissented from the research de-prioritization).

~25 staff total. Board: Yudkowsky (Chair), Soares, Edwin Evans, Anna Salamon (CFAR president), Blake Borgeson (Recursion co-founder). No independent outside directors.

Money and Incentives

Revenue and runway: ~$16M reserves at end of 2024. Spending ~$7.1M/year. Approximately 2 years of runway. 2025 fundraiser raised $3.2M ($1.6M donations + $1.6M SFF match) against a $6M target.
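The runway and shortfall figures above follow from simple arithmetic. The sketch below reproduces them from the approximate numbers quoted in this profile. The variable names and the `runway_years` helper are illustrative, and the calculation assumes flat spending and no new income, which is a simplification rather than MIRI's own forecast.

```python
# Approximate figures quoted in this profile, in millions of USD.
reserves_musd = 16.0                 # reserves at end of 2024
annual_spend_musd = 7.1              # approximate yearly spending
fundraiser_raised_musd = 1.6 + 1.6   # donations + SFF match
fundraiser_target_musd = 6.0


def runway_years(reserves, annual_spend, annual_income=0.0):
    """Years until reserves run out, assuming constant spend and income."""
    net_burn = annual_spend - annual_income
    if net_burn <= 0:
        # Income covers spending, so reserves are never drawn down.
        return float("inf")
    return reserves / net_burn


runway = runway_years(reserves_musd, annual_spend_musd)
shortfall = fundraiser_target_musd - fundraiser_raised_musd

print(f"Runway with no new income: {runway:.2f} years")
print(f"Fundraiser shortfall: ${shortfall:.1f}M")
```

With no new income this gives a little over two years of runway. Any recurring donations would extend it, which is why the `annual_income` parameter is included.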

Revenue history: 2021 was an anomaly -- $25.6M in revenue, most of it from crypto donations ($15.6M in MKR tokens, plus $4.4M in ETH from Vitalik Buterin). Revenue in 2022 dropped to $1.9M against $5.3M expenses. MIRI has been burning reserves since 2022.

Funding sources: Individual donors and the Survival and Flourishing Fund (SFF, backed by Jaan Tallinn). No Open Philanthropy funding since Feb 2020 ($14.76M total across 5 grants, 2016-2020). Open Phil funded them despite "strong reservations" about their research approach; the later grants were routed through a general EA support committee rather than awarded on technical merits. The book is not a net income source ("costs to produce and promote the book have far exceeded any income").

2026 projected budget: $8.1M. Operations: $2.6M. Outreach/Comms: $3.2M (largest). Research: $2.3M (smallest).
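As a quick consistency check on the budget lines above, the sketch below sums the three categories and computes each one's share of the total. The dictionary and variable names are illustrative; the amounts are the ones quoted in this profile.

```python
# 2026 projected budget categories as quoted above, in millions of USD.
budget_musd = {
    "Operations": 2.6,
    "Outreach/Comms": 3.2,
    "Research": 2.3,
}

total = sum(budget_musd.values())
shares = {name: amount / total for name, amount in budget_musd.items()}

# Print categories from largest to smallest with their share of the total.
for name, amount in sorted(budget_musd.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<15} ${amount:.1f}M  {shares[name]:.1%}")
print(f"{'Total':<15} ${total:.1f}M")
```

The categories sum to the stated $8.1M, with outreach at roughly 40% of the budget and research at roughly 28%.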

Business model: Donations. No product revenue, no contracts, no compute credits, no lab partnerships.

Economic independence from labs: MIRI has zero financial ties to AI labs -- no funding, no compute, no corporate partnerships. This is genuine structural independence, rare among AI safety orgs.

Key financial risk: The 2025 fundraiser fell $2.8M short of target, and MIRI is uncertain how many past donors will support the new comms/policy strategy. If funding doesn't materialize, they would need to "drastically change our plans and scale back our ambitions."

Top compensation: $536,744 (2022 990 filing), $448,062 (2021). High for a 25-person nonprofit but not unusual for the Bay Area.

What Others Say

The case against MIRI's theory of change (strongest version):

Bentham's Bulldog documents Yudkowsky's track record: he predicted his team would build superintelligence by 2008-2010, predicted nanotech would kill everyone by 2010, and dismissed connectionism/deep learning. The author adds that "every single time he talks about a topic that I know anything about, what he says is completely unreasonable." The pattern is extreme confidence paired with demonstrated error across domains.

The Asterisk review identifies the deeper problem: MIRI's conclusions haven't changed despite the fundamental shift from hand-coded AI to deep learning. "The fundamental architecture, training methods and requirements for modern AI systems are all completely different from the technology Yudkowsky imagined in 2008, yet nothing about the core MIRI story has changed." This suggests unfalsifiable reasoning.

Scott Alexander (sympathetic to AI risk) identifies fast takeoff as the load-bearing assumption that gets barely two sentences in the book: "Without some radical discontinuity, it's very hard to believe [the arguments] for the absolute certainty of AI doom or the necessary failure of our attempts to prevent it."

Lawfare identifies three unsettled empirical questions MIRI's doom case depends on: How hard is alignment? Would misaligned ASI actually succeed? What happens before ASI? None are resolved.

Paul Christiano: "Eliezer is unreasonably pessimistic about interpretability while being mostly ignorant about the current state of the field."

The case for MIRI:

Scott Alexander: "MIRI answered: moral clarity." In a field where most actors hedge and optimize for palatability, MIRI says what it believes regardless of consequences.

Tsvi Benson-Tilsen (7 years at MIRI): alignment is "0% solved." If this is roughly right, then MIRI's extreme position is the correct one and the incrementalists are dangerously wrong.

The book reached millions. If the goal is to move the policy conversation, NYT bestseller status, congressional testimony, and endorsements from national security figures are real accomplishments.

MIRI's independence from labs means they have no incentive to downplay risks.

Cultural concerns:

The MIRI-CFAR ecosystem has been linked to psychotic episodes, cult dynamics, and the Zizian murder cult (though the connection is indirect). Anna Salamon (MIRI board member, CFAR president): "We didn't know at the time, but in hindsight we were creating conditions for a cult." Bloomberg documented multiple cases. MIRI has not published any institutional statement addressing these harms.

What's Absent

No public explanation from either MIRI or Open Phil for why funding stopped after 2020.

No concrete policy outcomes from the 2-year-old comms/policy pivot -- no legislation, no moratorium proposals gaining traction. Success is measured in "conversations" and "awareness."

No external evaluation of MIRI's 10+ years of research. Work was nondisclosed-by-default and the only external review mentioned is an anonymous ML researcher.

No independent board members. All five directors are insiders or close associates.

No explicit falsification conditions -- what evidence would cause MIRI to update away from extreme pessimism?

No succession plan for an org where two individuals drive credibility, fundraising, and media presence.

No formal retrospective on what $40-50M in research spending (2013-2022) actually produced.

Stated Theory of Change

MIRI's stated theory of change has two layers:

The macro thesis: Building artificial superintelligence with current understanding will cause human extinction (>90% probability). The alignment problem is unsolvably hard on the timeline available. Therefore, the only viable path is an international moratorium on frontier AI development, lasting decades or generations, until humanity develops sufficient understanding to proceed safely.

The micro theory (how MIRI contributes): MIRI produces candid, uncompromising communications that reach policymakers and the public. This shifts the Overton window. Simultaneously, MIRI's technical governance team produces concrete policy proposals (treaty drafts, compute monitoring schemes, governance research agendas) that give policymakers actionable options. Together, these create the conditions for an eventual moratorium.

The causal chain: MIRI communications -> public awareness -> political will -> international coordination -> moratorium -> time for alignment research -> safe AI development eventually.

Revealed Theory of Change

Looking at where MIRI actually spends money and attention, several divergences emerge:

The real theory is "keep the alarm sounding until reality catches up." MIRI doesn't have a credible political pathway to a moratorium. Nate Soares acknowledged "my sense of the appetite right now isn't very high." The governance team's work is serious but there's no lobby, no political coalition, no state-level allies championing their proposals. The actual function is: keep the most extreme version of the risk argument in public discourse, so that when (if) AI systems do something alarming, there's an existing framework people can turn to.

The Yudkowsky personality engine. Whatever the org chart says, MIRI's media impact depends almost entirely on Yudkowsky's personal brand and Soares' complementary measured persona. The book, the documentary, the podcast appearances -- these are personality-driven, not institution-driven. The 2024 comms strategy explicitly acknowledges being "bottlenecked on his time, energy, and endurance."

Research as vestige. The $2.3M research budget is the smallest category, and the three-way split between Eliezer/Nate/Malo suggests no coherent research direction. The research function appears to serve institutional identity rather than any expectation of breakthroughs.

The governance team is the most promising element. The TGT's work on compute governance, treaty frameworks, and scenario analysis is the most concretely useful thing MIRI produces. It was founded by Lisa Thiergart (who then left), suggesting both its promise and the difficulty of retaining talent.

Key Assumptions

1. Alignment is unsolvably hard on current timelines.

Evidence for: MIRI's own research failed after 7+ years. Tsvi says "0% solved." The field has no consensus on a viable alignment approach for ASI.
Evidence against: Modern AI systems exhibit much more tractable alignment properties than MIRI predicted (see the Asterisk review). Interpretability is advancing. RLHF works better than expected. Paul Christiano and others think incremental progress is meaningful.
Testable? Partially -- continued AI progress without catastrophe would weaken this claim.
If wrong: MIRI's entire pivot was unnecessary and they abandoned their most valuable work.

2. Fast takeoff / intelligence explosion will occur.

Evidence for: AlphaGo Zero's rapid progress. Theoretical arguments about recursive self-improvement.
Evidence against: AI progress has been continuous and predictable, following scaling laws. AI 2027 and most alignment researchers now expect gradual, continuous improvement. Yudkowsky predicted AI would spend "almost no time" between village idiot and Einstein; current AI has been in that zone for years.
Testable? Yes -- observe whether AI progress is continuous or discontinuous.
If wrong: Most of MIRI's urgency arguments collapse. Gradual progress allows iterative alignment work.

3. International coordination for a moratorium is the most viable path.

Evidence for: Nuclear non-proliferation partially worked. Climate agreements exist. China has signaled some openness to AI governance.
Evidence against: No geopolitical precedent for halting a lucrative technology during peacetime. US administration actively hostile to AI regulation. DeepSeek showed China is competing hard. The 2025 fundraiser falling short of target suggests even MIRI's donors are skeptical.
Testable? Watch for any government to seriously propose a moratorium.
If wrong: MIRI's policy advocacy is tilting at windmills.

4. MIRI's "moral clarity" approach is more effective than incremental policy engagement.

Evidence for: The book became a NYT bestseller. MIRI claims their message reaches audiences other orgs can't.
Evidence against: No policy outcomes. The "Doomers' Dilemma" article documents the political backlash against doomers. France rebranded its edition of the "AI Safety Summit" as the "AI Action Summit." US NIST told scientists to eliminate "AI safety" language. The Overton window may be closing, not opening.
Testable? Track policy outcomes over 2-3 years.
If wrong: MIRI is spending its remaining runway on communications that don't produce policy change.

Strengths

Genuine independence. Zero financial ties to AI labs. This is almost unique among AI safety orgs and means their positions are not compromised by funder incentives.

Intellectual honesty about failure. The 2020 research failure admission and the transparent strategic pivot are rare in any organization. Most would quietly rebrand rather than publicly say "our core research program failed."

Deep bench of accumulated thinking. 25 years of reasoning about alignment gives MIRI a perspective that newer orgs lack. Concepts they pioneered (corrigibility, instrumental convergence, mesa-optimization influences) are now mainstream in alignment.

Reached a mass audience. NYT bestseller, millions of podcast views, congressional testimony. If the goal is awareness, they've achieved more in 2 years of comms than in 20 years of research.

The governance team produces real work. The arXiv research agenda, treaty drafts, and international engagement are concrete contributions that other policy researchers cite as useful.

Weaknesses and Risks

Founder dependency and reputational fragility. MIRI's credibility depends overwhelmingly on Yudkowsky and Soares. Yudkowsky's more extreme statements (accepting nuclear war to prevent ASI, infanticide musings, "sex slave" posts) are well-documented and regularly surfaced by critics. If Yudkowsky's public image tips from "prophet" to "crank," MIRI's policy influence collapses.

No theory of political victory. MIRI advocates the most extreme policy position (global moratorium) without any credible pathway to its adoption. They have no political allies, no lobbying capacity, no grassroots organization. The governance team produces excellent analysis of what a moratorium could look like, but nobody is building the political coalition to enact one.

Financial sustainability crisis. $16M in reserves at $7-8M/year burn = ~2 years. The 2025 fundraiser fell $2.8M short. Open Phil stopped funding 5+ years ago. If the comms pivot doesn't attract new donors, MIRI faces existential financial pressure within 2-3 years.

Unfalsifiable reasoning. The Asterisk review's charge is devastating: MIRI's conclusions haven't changed despite the fundamental shift from hand-coded AI to deep learning. Their arguments were designed for a world of GOFAI and recursive self-improvement but are applied unchanged to a world of LLMs and scaling laws. If no evidence can change the conclusion, the reasoning isn't empirical -- it's ideological.

Alienation of the broader field. By dismissing the rest of AI safety as "alchemists" doing useless work, MIRI has burned bridges with the very people who might implement their ideas. When you need policy coalitions, having told everyone else they're delusional is a strategic liability.

Institutional culture concerns. The MIRI-CFAR ecosystem produced documented harms (psychotic episodes, cult dynamics). MIRI has not publicly addressed these. Anna Salamon, a MIRI board member, acknowledged creating "conditions for a cult." Even if MIRI itself was not directly responsible, the silence is a governance failure.

No independent governance. All five board members are insiders. For an org making sweeping claims about how the world should be governed, having no external accountability is a credibility problem.

Cross-References

Complementary to: PauseAI (shares goals but uses different tactics; MIRI provides the intellectual underpinning that PauseAI's grassroots activism amplifies). Center for AI Safety (CAIS) shares the extinction risk framing but takes a more incremental approach.

In tension with: ARC (Paul Christiano's approach to alignment is exactly what MIRI considers futile). Anthropic (MIRI sees Anthropic as an accelerationist lab with safety branding). Redwood Research (empirical alignment work that MIRI considers too optimistic). OpenAI (MIRI sees OpenAI as the primary threat).

Contrasting model: METR/ARC Evals -- does practical evaluation work that MIRI might view as "making the best of a doomed situation" but that actually influences lab behavior today.

Gap-filling role: No other org says "stop everything" with MIRI's institutional weight. Whether this is valuable or counterproductive depends on whether you think the discourse benefits from having this extreme position represented.

What Would Change This Assessment

Upward revision if:

AI systems produce a genuinely alarming capability jump (rapid self-improvement, escape attempts that succeed) -- this would validate MIRI's fast takeoff concern.
A major AI incident (autonomous agent causing significant harm) creates political demand for MIRI's governance proposals.
MIRI's governance team's treaty framework gets adopted by any government as a starting point for negotiations.
MIRI successfully diversifies funding beyond Tallinn/SFF and individual EA donors.

Downward revision if:

AI progress continues to follow smooth scaling laws for 2+ more years with no discontinuities.
MIRI fails to secure sustainable funding and has to significantly cut operations.
The political backlash against AI doomers intensifies (already underway) and MIRI becomes a liability to the broader safety movement.
Yudkowsky's personal reputation deteriorates further due to controversial statements.

Self-Critique

What sources should I have checked but didn't? The full texts of several key LessWrong posts (Death with Dignity, Paul-MIRI disagreement, Jessica Taylor's "My Experience at MIRI and CFAR") are on blocked domains. The search snippets were informative, but I couldn't read the posts in full. The issarice timeline of MIRI would have been the most detailed chronological source.

Where is this analysis potentially biased? I may be giving too much weight to the "unfalsifiable reasoning" critique. It's possible that MIRI's core thesis is correct AND that their evidence base is weak -- these aren't mutually exclusive. A correct conclusion reached by flawed reasoning is still correct. Additionally, I may be anchoring too heavily on Yudkowsky's controversial statements when evaluating MIRI as an institution; the org is more than one person.

What would a thoughtful person who disagrees say? "You're evaluating MIRI by the standards of normal nonprofits when it's trying to do something unprecedented. Of course they don't have policy outcomes -- they're trying to prevent an extinction event that most people don't take seriously. The fact that the fundraiser fell short doesn't mean the mission is wrong, it means the world hasn't caught up yet. And your critique of their research era ignores that they were working on genuinely pre-paradigmatic problems where failure is expected."

What's my single weakest claim? That the comms/policy pivot is unlikely to produce results. It's only been 2 years. Historical precedents (nuclear non-proliferation, ozone layer, COVID response) show that paradigm-shifting policy can happen quickly once a crisis makes the case self-evident. If AI does produce a "warning shot" incident, MIRI's pre-positioned communications infrastructure could become enormously valuable.

What information would most change my view? A credible, detailed account of what MIRI's secret research actually produced (and why it failed). The nondisclosed-by-default policy means the most important part of MIRI's 10-year research era is invisible. If the work was much more substantial than the public-facing outputs suggest, the "research failure" story might be too harsh.

Connected to (7)

Anthropic · staff from · Martin Lucas

Berkeley Existential Risk Initiative · staff to · Andrew Critch

Institute for AI Policy and Strategy · staff to · Lisa Thiergart

Redwood Research · collaborator

Survival and Flourishing Fund · collaborator · Jaan Tallinn

Open Philanthropy · staff to · Luke Muehlhauser

Center for Applied Rationality · collaborator · Anna Salamon

Sources (61)

Every URL that was read during research.

1. Machine Intelligence Research Institute - Wikipedia (en.wikipedia.org)
2. About the Machine Intelligence Research Institute (intelligence.org)
3. Research - Machine Intelligence Research Institute (intelligence.org)
4. MIRI 2024 Mission and Strategy Update - Machine Intelligence Research Institute (intelligence.org)
5. MIRI 2024 Communications Strategy - Machine Intelligence Research Institute (intelligence.org)
6. Eliezer Yudkowsky - Wikipedia (en.wikipedia.org)
7. Book Review: If Anyone Builds It, Everyone Dies (astralcodexten.com)
8. More Was Possible: A Review of If Anyone Builds It, Everyone Dies (asteriskmag.com)
9. Eliezer Yudkowsky - Machine Intelligence Research Institute (intelligence.org)
10. MIRI Newsletter #121 - Machine Intelligence Research Institute (intelligence.org)
11. July 2024 Newsletter - Machine Intelligence Research Institute (intelligence.org)
12. April 2024 Newsletter - Machine Intelligence Research Institute (intelligence.org)
13. Our all-time largest donation, and major crypto support from Vitalik Buterin - Machine Intelligence Research Institute (intelligence.org)
14. The Open Letter on AI Doesn't Go Far Enough (time.com)
15. 2020 Updates and Strategy - Machine Intelligence Research Institute (intelligence.org)
16. Late 2021 MIRI Conversations - Machine Intelligence Research Institute (intelligence.org)
17. Team - Machine Intelligence Research Institute (intelligence.org)
18. Announcing MIRI’s new CEO and leadership team - Machine Intelligence Research Institute (intelligence.org)
19. MIRI’s 2024 End-of-Year Update - Machine Intelligence Research Institute (intelligence.org)
20. Written statement of MIRI CEO Malo Bourgon to the AI Insight Forum - Machine Intelligence Research Institute (intelligence.org)
21. MIRI Communications Team 2024 Recap - Machine Intelligence Research Institute (intelligence.org)
22. Our Team — MIRI Technical Governance Team (techgov.intelligence.org)
23. MIRI's 2025 Fundraiser - Machine Intelligence Research Institute (intelligence.org)
24. #368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast (lexfridman.com)
25. The Case for AI Doom Rests on Three Unsettled Questions (lawfaremedia.org)
26. Research | MIRI Technical governance team (techgov.intelligence.org)
27. Eliezer Yudkowsky - Why AI Will Kill Us, Aligning LLMs, Nature of Intelligence, SciFi, & Rationality (dwarkesh.com)
28. AGI Ruin: A List of Lethalities - Machine Intelligence Research Institute (intelligence.org)
29. All Publications - Machine Intelligence Research Institute (intelligence.org)
30. Review of Scott Alexander's book review of "If Anyone Builds It, Everyone Dies" (blog.ninapanickssery.com)
31. Concerning MIRI’s Place in the EA Movement (thingofthings.wordpress.com)
32. An International Agreement to Prevent the Premature Creation of Artificial Superintelligence — MIRI Technical Governance Team (techgov.intelligence.org)
33. MIRI Technical Governance Team | MIRI TGT (techgov.intelligence.org)
34. About Me (mindingourway.com)
35. Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong (benthams.substack.com)
36. Eliezer Yudkowsky's Long History of Bad Ideas (realtimetechpocalypse.com)
37. Research Guide - Machine Intelligence Research Institute (intelligence.org)
38. Donate - Machine Intelligence Research Institute (intelligence.org)
39. I Vouch For MIRI (thezvi.substack.com)
40. Blog - Machine Intelligence Research Institute (intelligence.org)
41. Media - Machine Intelligence Research Institute (intelligence.org)
42. Eliezer Yudkowsky on the Dangers of AI - Econlib (econtalk.org)
43. Eliezer Yudkowsky - Human Augmentation as a Safer AGI Pathway (AGI Governance, Episode 6) - Daniel Faggella (danfaggella.com)
44. Why Eliezer Yudkowsky’s Time Op-Ed on How Current AI Systems Will Kill Us All Is Even More Unhinged than You Think (betweendrafts.com)
45. Grant announcement from the Open Philanthropy Project - Machine Intelligence Research Institute (intelligence.org)
46. Summary of and Thoughts on the Hotz/Yudkowsky Debate (thezvi.substack.com)
47. The AI Alignment Problem: Why It's Hard, and Where to Start - Machine Intelligence Research Institute (intelligence.org)
48. Will AI Kill us All? Nate Soares on His Controversial Bestseller (carnegieendowment.org)
49. The co-founder of Skype invested in some of AI’s hottest startups — but he thinks he failed (semafor.com)
50. Summary of “If Anyone Builds It, Everyone Dies” (ai-frontiers.org)
51. AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions - Machine Intelligence Research Institute (intelligence.org)
52. MIRI Newsletter #123 - Machine Intelligence Research Institute (intelligence.org)
53. The Problem - Machine Intelligence Research Institute (intelligence.org)
54. Yudkowsky and MIRI (jefftk.com)
55. Why AI Alignment Is 0% Solved — Ex-MIRI Researcher Tsvi Benson-Tilsen (lironshapira.substack.com)
56. Former MIRI Researcher Solving AI Alignment by Engineering Smarter Human Babies (lironshapira.substack.com)
57. Comments on MIRI’s The Problem (bayesianinvestor.com)
58. ‘The AI Doc’ Is Probably the Scariest Movie You’ll See All Year (kqed.org)
59. Malo BOURGON (scai.gov.sg)
60. THE RATIONALITY TRAP (aipanic.news)
61. The Doomers’ Dilemma (aipanic.news)