Daniel Kokotajlo

Independent Analyst

Insider → public voice. Equity put at risk. Timeline forecasts.

HQ: Berkeley, CA
Team: 1
Structure: individual
Model: Donations

Theory of Change

Kokotajlo's personal theory of change is to make the trajectory toward superintelligence visible and legible to policymakers and the public, creating political conditions for coordination over racing. He pursues this through four channels simultaneously:

Scenario forecasting -- making abstract risk concrete through detailed, date-specific narratives (AI 2027, the forthcoming AI 2030)
Tabletop exercises -- giving decision-makers visceral experience of AGI-era crises (35+ exercises with congressional staffers, lab researchers, journalists)
Whistleblowing and advocacy -- establishing accountability norms for AI labs (Right to Warn, amicus brief against OpenAI's for-profit conversion, SB 1047 support)
Policy development -- advocating specific mechanisms: hardware verification, transparency requirements, international coordination with mutual inspection

He is not trying to solve alignment directly. He is trying to buy time and change the political landscape so that alignment can be solved.

From the 80K Hours podcast (Jan 2026): "In the future, whoever controls all the AIs does not need humans. If you've only got one to five companies and they each have one to three of their smartest AIs in a million copies, then that means there's basically 10 minds that between those 10 minds get to decide almost everything."

What They Do

AI 2027 (April 2025): A 71-page scenario forecast reaching 1M+ readers. VP JD Vance and leaders of top AI companies reportedly read it. Written with Scott Alexander, Eli Lifland, Thomas Larsen, and Romeo Dean. Accompanied by five quantitative forecasts (compute, timelines, takeoff, AI goals, security). Informed by 30+ tabletop exercises. Reviewed by Carl Shulman, Helen Toner, Holden Karnofsky, Yoshua Bengio, and 60+ others.

Right to Warn (June 2024): Co-organized letter by former OpenAI employees calling for transparency. Endorsed by Bengio, Hinton, Russell. Directly influenced the AI Whistleblower Protection Act (bipartisan, introduced May 2025 by Grassley, Coons, Blackburn, Klobuchar). The AIWPA provides anti-retaliation protections, job restoration, double back pay, and prohibits contractual waivers of whistleblower rights.

OpenAI departure (April 2024): Resigned after "losing confidence in responsible AGI development." Refused to sign a non-disparagement clause, risking ~$2M in equity (~85% of his family's net worth). OpenAI reversed the policy; Altman called it "embarrassing."

Amicus brief (April 2025): Filed with 11 other former employees and Harvard professor Lawrence Lessig opposing OpenAI's for-profit conversion.

MATS mentorship (Summer 2026): Running an AI Futures Project stream at MATS, training next-generation researchers in scenario forecasting.

Forecasting track record: His 2021 "What 2026 Looks Like" anticipated chain-of-thought prompting, inference-time scaling, AI chip export controls, and $100M training runs before ChatGPT. His AGI median has shifted from 2027 (late 2022) to Dec 2030 (Jan 2026) in response to METR studies and model corrections.

Key People

Daniel Kokotajlo -- Executive Director, AI Futures Project. BA Philosophy (Notre Dame), MA Philosophy (UNC Chapel Hill). Career: AI Impacts (2019) -> CLR (~2020-21) -> OpenAI governance (2022-2024) -> AIFP (2025-). TIME 100 AI 2024 and 2025.

Eli Lifland -- Research Lead. #1 on RAND Forecasting Initiative all-time leaderboard. Co-founded Sage, worked on Elicit. His own AGI median is Jan 2035 -- significantly longer than Kokotajlo's.

Thomas Larsen -- Research Lead. Founded Center for AI Policy. Former MIRI researcher.

The core team is 5 people. Kokotajlo's position within the broader OpenAI safety exodus is significant: he was among the first to publicly resign, preceding departures by Jan Leike (Superalignment co-lead), Leopold Aschenbrenner, and eventually Mira Murati (CTO) and Ilya Sutskever. He reported that nearly half the ~30-person AGI safety staff had left by August 2024.

Money and Incentives

AI Futures Project: 501(c)(3) nonprofit, EIN 99-4320292 (previously Artificial Intelligence Forecasting Inc). Headquartered in Berkeley, CA. Needs $1.9-4.7M for 2026 operations.

Funding sources: Unknown. No grants found from Coefficient Giving, SFF, or any identified funder. Manifund page inaccessible. Accepts donations through every.org and DAFs. Website built by Lightcone Infrastructure; Scott Alexander volunteered writing; FutureSearch provided independent forecasting -- suggesting significant in-kind contributions from the rationalist/EA ecosystem.

Personal financial situation: Kokotajlo risked ~$2M in OpenAI equity, which he retained after OpenAI reversed its non-disparagement policy. He has skin in the game: he offers monetary bets on his predictions and pays bounties for errors found in his models ($500 to titotal, $500 to Peter Johnson).

Incentive analysis: Kokotajlo's incentives appear well-aligned with his stated mission. The equity sacrifice runs against financial self-interest. His main incentive risk is reputational: credibility depends on forecasting track record, which could bias toward defending past predictions. The recent 3-year timeline lengthening mitigates this concern. He does not receive funding from any AI lab.

What Others Say

Gary Marcus (external skeptic): AI 2027 is "a work of fiction, not science" and "a house of improbable longshots." More compellingly: it is "practically marketing materials" for AI companies and "puts wind in their sails." The China framing "feeds the worst fears of hawks, both in the US and China." Verdict: "Tall tales about the imminence of AGI aren't slowing down the AI race dynamic... they are speeding that very dynamic up."

Arvind Narayanan (Princeton, paradigm counter): "There is a long causal chain between AI capability increases and societal impact. Benefits and risks are realized when AI is deployed, not when it is developed." External bottlenecks (regulation, adoption, organizational change) cannot be overcome simply by improving AI's technical design, even by superintelligence. Acknowledges Kokotajlo's tech predictions were accurate but social impact predictions were "overall not directionally correct."

MIRI (insider critique): Agrees on direction but expects "more crazy, both in the sense of chaotic and in the sense of insane." The Slowdown ending is an "exercise in hope." The Race ending is "the only realistic path presented." Expects Agent-4 level models to "try to self-exfiltrate and set up dead-man's switches."

Vitalik Buterin: Accepts rapid capability gains as possible but argues AI 2027 "underrates defensive tech." In a world that cures cancer by 2029, defensive technologies also advance. Warns against the "one AI hegemon" strategy.

titotal (technical critic): Found specific bugs in timelines model, including a code error shifting the median by 9 months. AIFP acknowledged errors, paid $500 bounty, wrote 15K-word response.

Scott Alexander (collaborator): Notes team members have longer timelines than Daniel. "Skeptical" of very fast automation. Describes Slowdown ending as "exercise in hope."

Saffron Huang / Jessica Dai: AI 2027 "obscures human leverage and uses the authors' credentials to make doom seem inevitable."

What's Absent

AIFP funding sources: The single biggest factual gap. For an org seeking $1.9-4.7M that advocates transparency, the opacity is notable.
AIFP governance: No public board, no governance documents. Too new for 990s.
Current p(doom): The 70% figure (2024) is widely cited but likely outdated. If updated to 20-30%, this would be a significant shift.
OpenAI internal work: Kokotajlo's specific governance research outputs remain opaque.
Evaluation of 2025 predictions: The near-term predictions from AI 2027 should now be checkable, but no systematic public assessment was found.
Lab leader reactions: No specific public statements from Altman, Amodei, or Hassabis about AI 2027 were found.

Stated Theory of Change

Kokotajlo's stated theory of change has two layers:

Personal layer: Make the trajectory toward superintelligence legible to decision-makers who currently do not understand what is coming. The mechanism is scenario forecasting -- translating abstract timelines and capability curves into concrete narratives that policymakers, journalists, and the public can viscerally understand. The tabletop exercises extend this by making the experience interactive rather than passive.

Organizational layer (AI Futures Project): Produce a library of branching scenario forecasts, develop quantitative timelines and takeoff models, run tabletop exercises for policymakers, and eventually publish normative policy recommendations ("the policy playbook"). The long-term vision is a comprehensive scenario infrastructure that covers multiple timelines, political paths, and outcomes.

The causal chain is: scenario forecasting + tabletop exercises (TTX) -> legibility of risk -> political will for regulation -> domestic regulation -> international coordination -> hardware verification + transparency -> slower/safer deployment -> time for alignment research to succeed.

Revealed Theory of Change

The revealed theory of change is somewhat broader than the stated one. Kokotajlo's actual activities suggest he operates a four-pronged strategy:

Information production (AI 2027, timelines models, AI 2030): Creating artifacts that make AI risk concrete
Information forcing (whistleblowing, Right to Warn, AIWPA): Creating legal/institutional channels for safety information to flow from labs to public
Legal/structural opposition (amicus brief, SB 1047 support): Using courts and legislation to constrain AI lab behavior
Community building (MATS mentorship, scenario contests, encouraging others to write scenarios): Scaling the approach beyond himself

The stated and revealed theories of change are closely aligned, which is unusual. The main divergence is that the stated theory emphasizes forecasting, while the revealed theory includes significant legal and political action that goes beyond pure information production.

One notable aspect: Kokotajlo is not trying to do alignment research himself. He is a philosopher and forecaster, not a machine learning researcher. His theory of change is explicitly about the political and informational prerequisites for alignment work to succeed, not the alignment work itself. This is a deliberate and defensible strategic choice.

Key Assumptions

Assumption 1: Timelines are short enough that urgency is warranted (confidence: medium). If AGI is 50+ years away, Kokotajlo's urgency-driven approach wastes resources that could be better deployed on slower institution-building. His own median (Dec 2030) makes this assumption plausible but far from certain -- he assigns meaningful probability (10-20%) to timelines beyond 10 years.

Evidence for: METR horizon-length trend extrapolation, lab CEO statements, Metaculus/Manifold aggregate predictions
Evidence against: Narayanan's diffusion barriers, history of AI winters, GPT-5 as "on trend but not a big deal"
Testable: Yes, through METR trend, near-term AI 2027 predictions, economic productivity data

Assumption 2: Making risk legible changes behavior (confidence: medium-low). This is the deepest assumption. Even if everyone understands the risk, it does not follow that behavior changes. Marcus's counterargument -- that AI 2027 actually accelerates the race by serving as marketing material -- directly challenges this. The track record is mixed: Right to Warn -> AIWPA is a clear success, but the broader policy environment (deregulation, AI race rhetoric) has not shifted toward coordination.

Evidence for: Right to Warn -> AIWPA pipeline, JD Vance reading AI 2027
Evidence against: No meaningful US AI regulation, continued race dynamics, OpenAI proceeding with for-profit conversion
What changes if wrong: Kokotajlo's entire approach loses its foundation. If information production does not change behavior, he needs a different lever.

Assumption 3: International coordination is achievable (confidence: low). Kokotajlo's ideal outcome requires US-China cooperation on AI governance, including mutual hardware verification. The current geopolitical reality offers little support for this. His own tabletop exercises show that most games end in race dynamics, not coordination.

Evidence for: Historical precedent (nuclear arms control), demand for deals in TTX games
Evidence against: Current US-China relations, Taiwan tensions, no existing institutional framework
Testable: Partially, through diplomatic developments over next 2-3 years

Assumption 4: The "software-only singularity" can be slowed by political action (confidence: medium). If the intelligence explosion happens entirely within a few data centers in a few months, political action is too slow to intervene. Kokotajlo's theory of change requires either (a) enough warning time for politics to matter, or (b) sufficient regulation already in place when the explosion begins. The software-only singularity described in AI 2027 leaves very little time for political response.

Strengths

Credibility from sacrifice: The $2M equity risk is among the strongest credibility signals in AI safety. Unlike most commentators, Kokotajlo has paid a personal price for his views.

Forecasting track record: "What 2026 Looks Like" is a documented case of accurate directional prediction on specific technical milestones, written before ChatGPT. This distinguishes him from the many people who make vague predictions about AI.

Intellectual honesty under pressure: His willingness to publicly lengthen timelines (2027 -> 2030), acknowledge model errors, pay bounties for criticism, and express genuine uncertainty (30-40% on current paradigm plateauing) sets him apart from advocates on both sides.

Concrete policy vision: Unlike many AI safety advocates who stop at "we need regulation," Kokotajlo names specific mechanisms (hardware verification, chip tracking, transparency requirements) and specific use cases (mutual inspection for international treaties).

Demonstrated real-world impact: The Right to Warn -> AIWPA pipeline is the clearest causal chain from AI safety advocacy to concrete legislation that I have seen in any org analysis.

Reach beyond the bubble: AI 2027 reached 1M+ readers and was read by a sitting US Vice President. The TTX program has engaged congressional staffers and journalists, not just AI researchers. This is unusually broad for AI safety work.

Weaknesses and Risks

Counterproductivity risk: Marcus's argument that AI 2027 serves as "marketing materials" for AI labs is the most serious challenge. If the net effect of making superintelligence seem imminent is to increase investment in racing rather than in safety, Kokotajlo's work is counterproductive regardless of its accuracy. He acknowledges this concern but cannot resolve it empirically.

Scenario forecasting is fundamentally a single-point estimate: Despite disclaimers about branching futures, the narrative format of AI 2027 creates a psychological sense of inevitability. MIRI's critique about the "planning fallacy" in sequential scenario construction is apt -- if you imagine the most plausible next step at each point, you systematically underestimate delays.
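One way to make the sequential-construction point concrete is a small arithmetic sketch. It assumes, purely for illustration, that each scenario step has a right-skewed (lognormal) delay; the step count, median, and spread below are hypothetical and do not come from AI 2027, MIRI's critique, or any AIFP model. Under that assumption, chaining the single most plausible delay at every step gives a much shorter total than the expected total delay.

```python
import math

# Hypothetical illustration: every number here is an assumption, not a
# figure from AI 2027, MIRI's critique, or any AIFP timelines model.

def lognormal_mode(mu: float, sigma: float) -> float:
    """Single most likely value of a lognormal delay."""
    return math.exp(mu - sigma ** 2)

def lognormal_mean(mu: float, sigma: float) -> float:
    """Expected value of the same lognormal delay."""
    return math.exp(mu + sigma ** 2 / 2)

# Six sequential scenario steps, each with a median delay of 4 months
# and a right-skewed spread (sigma = 0.8). Both values are made up.
N_STEPS = 6
MU = math.log(4.0)
SIGMA = 0.8

# "Most plausible next step at each point": chain the per-step modes.
chained_modes = N_STEPS * lognormal_mode(MU, SIGMA)

# What the full distributions imply on average: sum the per-step means.
expected_total = N_STEPS * lognormal_mean(MU, SIGMA)

print(f"Chained most-plausible steps: {chained_modes:.1f} months")
print(f"Expected total delay:         {expected_total:.1f} months")
print(f"Underestimate factor:         {expected_total / chained_modes:.2f}x")
```

In this toy setup the underestimate factor is exp(1.5 * sigma^2), so it depends only on how skewed each step is, not on the median delay or the number of steps. Whether real scenario steps are skewed this way is an empirical question the sketch does not settle.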

AIFP is fragile: A 5-person team with opaque funding, no visible board, and one public-facing figure creates extreme key-person risk. If Kokotajlo's credibility were damaged (e.g., by a dramatically wrong near-term prediction), AIFP has no organizational resilience.

The gap between reaching policymakers and changing policy: Having JD Vance read AI 2027 is not the same as JD Vance acting on it. The US policy landscape remains dominated by deregulation and AI race rhetoric. There is a risk that Kokotajlo's work creates the appearance of policy engagement without the substance.

Narayanan's deployment/diffusion challenge: Even if AI capabilities advance rapidly, the "long causal chain between capability and societal impact" means that the world might look much more normal than AI 2027 predicts for much longer than expected. Narayanan conceded Kokotajlo's tech predictions were accurate but his social impact predictions were not. If social impacts continue to lag capabilities, the political urgency Kokotajlo needs may never materialize.

Echo chamber risk: AIFP was built by Lightcone, reviewed by EA/rationalist community members, and operates within the AI safety ecosystem. Despite the 60+ reviewers, the core worldview originates from a relatively narrow intellectual community. The Hacker News reaction shows how the broader tech community receives this work.

Cross-References

Complementary to: MIRI (shares worldview, different focus), Anthropic's RSP framework (AIFP advocates for lab commitments), Center for AI Safety (different approach to same goal), PauseAI (more radical version of similar theory of change).

In tension with: Narayanan / AI Snake Oil (fundamentally different causal model), mainstream AI policy establishment (which focuses on near-term harms rather than existential risk), AI lab leadership (which claims to manage risk internally).

Overlaps with: The Midas Project (accountability/transparency advocacy), Center for AI Policy (policy advocacy, Thomas Larsen connection), METR (uses their evaluations as key input data).

Unique positioning: Kokotajlo is a former lab insider with a documented forecasting track record who is now an independent advocate. No one else combines scenario forecasting, whistleblowing, TTX, and policy advocacy in this way.

What Would Change This Assessment

Near-term predictions proving dramatically wrong: If AI 2027's 2025-2026 predictions are clearly inaccurate by mid-2026, his forecasting credibility declines sharply.
A major policy win attributable to his work: If AIWPA passes and is used, or if hardware verification funding increases due to his advocacy, the theory of change strengthens.
AI progress clearly plateauing: If the METR horizon-length trend bends downward, the urgency case weakens substantially.
AIFP funding transparency: Learning who funds AIFP would allow assessment of financial independence.
Lab behavior changing in response to AI 2027: If major labs cite AI 2027 scenarios in their safety cases or modify behavior based on TTX outcomes, the information-production theory of change is validated.
Counterproductivity evidence: If investment in AI racing demonstrably increased because of AI 2027 framing (hard to measure), the Marcus critique would be validated.

Self-Critique

Limitations of this analysis:

I could not access Kokotajlo's own LessWrong/AF posts directly (hundreds of posts/comments that constitute primary sources for his thinking). I relied on secondary accounts and the Grokipedia article.
AIFP's funding is completely opaque. My positive assessment of incentive alignment could be wrong if there are undisclosed funders with agendas.
I may be giving too much weight to the equity sacrifice as a credibility signal. It is genuinely unusual, but it happened two years ago, and the money was ultimately retained.
The counterproductivity question is fundamentally unresolvable with available evidence. I have no way to measure whether AI 2027's net effect on the AI race is positive or negative.
I may be underweighting the "echo chamber" concern. The 60+ reviewers of AI 2027 are mostly drawn from the EA/rationalist community, and the positive reception may reflect in-group validation rather than genuine external credibility.

What a thoughtful disagreer would say: "Kokotajlo is a talented scenario writer who has leveraged a single accurate prediction and a dramatic resignation into outsized influence. AI 2027 is sophisticated science fiction, not science. The real damage is not that his predictions are wrong -- it is that his predictions normalize the idea that superintelligence is imminent, which accelerates the very race he claims to oppose. The policy impact (AIWPA) is real but modest. The TTX program preaches to the converted. And the entire enterprise depends on short timelines being correct -- if they are not, the opportunity cost of the attention and resources directed toward Kokotajlo's urgent framing is enormous."

My weakest claim: That Kokotajlo's work has net positive impact on AI safety outcomes. The counterproductivity argument is strong enough that I cannot rule out net negative impact.

Information that would most change my view: A systematic assessment of AI 2027's near-term predictions against reality, combined with evidence about whether AI investment or AI safety investment increased more in response to the publication.

Connected to (10)

MATS -- advisor at · Daniel Kokotajlo

Astral Codex Ten -- collaborator · Scott Alexander

FutureSearch -- collaborator

Lightcone Infrastructure -- collaborator

The Midas Project -- collaborator

OpenAI -- staff from · Daniel Kokotajlo

Center on Long-Term Risk -- staff from · Daniel Kokotajlo

AI Impacts -- staff from · Daniel Kokotajlo

Center for AI Policy -- collaborator · Thomas Larsen

Machine Intelligence Research Institute -- staff from · Thomas Larsen

Sources (58)

Every URL that was read during research.

1. Daniel Kokotajlo (researcher) - Wikipedia (en.wikipedia.org)
2. Daniel Kokotajlo (time.com)
3. About - AI Futures Project (aifutures.org)
4. AI Futures Project (aifutures.org)
5. About - AI 2027 (ai-2027.com)
6. The “AI 2027” Scenario: How realistic is it? (garymarcus.substack.com)
7. My response to AI 2027 (vitalik.eth.limo)
8. Daniel Kokotajlo on what a hyperspeed robot economy might look like | 80,000 Hours (80000hours.org)
9. Our first project: AI 2027 (blog.ai-futures.org)
10. OpenAI Safety Worker Quit Due to Losing Confidence Company "Would Behave Responsibly Around the Time of AGI" (futurism.com)
11. AI 2027: month-by-month model of intelligence explosion - Scott Alexander & Daniel Kokotajlo (dwarkesh.com)
12. AI 2027: Responses (thezvi.substack.com)
13. Thoughts on AI 2027 - Machine Intelligence Research Institute (intelligence.org)
14. About - AI Futures Project (blog.ai-futures.org)
15. OpenAI Insider Estimates 70 Percent Chance That AI Will Destroy or Catastrophically Harm Humanity (futurism.com)
16. Former Engineer At Sam Altman-Led OpenAI Says He Resigned After Losing Confidence In The Company: 'Silencing Researchers And Making Them Afraid Of Retaliation Is Dangerous…' (benzinga.com)
17. OpenAI’s Sam Altman ‘embarrassed’ over staff exit contracts (euronews.com)
18. OpenAI whistleblower Daniel Kokotajlo on superintelligence and existential risk of AI (gzeromedia.com)
19. Why I'm Skeptical of AGI Timelines (And You Should Be Too) (ignorance.ai)
20. AI Futures Project - Wikipedia (en.wikipedia.org)
21. TIME100 AI 2025: Daniel Kokotajlo (time.com)
22. REPORT: The OpenAI Files Document Broken Promises, Safety Compromises, Conflicts of Interest, and Leadership Concerns - Tech Oversight Project (techoversight.org)
23. The ‘OpenAI Files’ push for oversight in the race to AGI | TechCrunch (techcrunch.com)
24. Daniel Kokotajlo, Author at Center on Long-Term Risk (longtermrisk.org)
25. The OpenAI Files - Employee Testimonies (openaifiles.org)
26. The OpenAI Files - Transparency & Safety (openaifiles.org)
27. OpenAI subpoenas another nonprofit opposed to its restructuring (sfstandard.com)
28. The Midas Project (themidasproject.com)
29. Limitless: An AI Podcast | AI DEBATE: Runaway Superintelligence or Normal Technology? | Daniel Kokotajlo vs Arvind Narayanan (limitless.bankless.com)
30. BREAKING: The AI 2027 doomsday scenario has officially been postponed (garymarcus.substack.com)
31. 🌻 forecasts vs. fiction (ft. daniel kokotajlo) (jasmi.news)
32. Clarifying how our AI timelines forecasts have changed since AI 2027 (blog.ai-futures.org)
33. Exaggerating the risks (Part 18: Introduction to AI 2027) - Reflective altruism (reflectivealtruism.com)
34. Former OpenAI researcher believes company is "fairly close" to AGI and not prepared for it (the-decoder.com)
35. Ex-OpenAI researcher pushes back AGI timeline as progress slows - Capacity (capacityglobal.com)
36. AI Futures Project at MATS: Summer 2026 (matsprogram.org)
37. Scenario Scrutiny for AI Policy (blog.ai-futures.org)
38. The Midas Project - InfluenceWatch (influencewatch.org)
39. OpenAI's leadership shakeup continues as another of Sam Altman’s chiefs follows Ilya Sutskever out the door within hours | Fortune (fortune.com)
40. AI Whistleblower Protection Act (kkc.com)
41. A guide to understanding AI as normal technology (normaltech.ai)
42. Daniel Kokotajlo (researcher) - Grokipedia (grokipedia.com)
43. The Departure of Daniel Kokotajlo from OpenAI | Deep Reason AI (deepreason.ai)
44. Things Seem To Be Going Somewhat Slower Than The 'AI 2027' Scenario: Daniel Kokotajlo (officechai.com)
45. Introducing AI 2027 (astralcodexten.com)
46. My Takeaways From AI 2027 (astralcodexten.com)
47. AI 2027: Media, Reactions, Criticism (blog.ai-futures.org)
48. AI 2027 (news.ycombinator.com)
49. The AI 2027 Scenario (theness.com)
50. Daniel Kokotajlo - MATS Mentor (matsprogram.org)
51. Protecting AI Whistleblowers (lawfaremedia.org)
52. Response to titotal’s critique of our AI 2027 timelines model (aifuturesnotes.substack.com)
53. Ex-OpenAI staffers file amicus brief opposing the company's for-profit transition | TechCrunch (techcrunch.com)
54. 'Disappointed but not surprised': Former employees speak on OpenAI's opposition to SB 1047 | TechCrunch (techcrunch.com)
55. AMA With AI Futures Project Team (astralcodexten.com)
56. OpenAI Employees Forced to Sign NDA Preventing Them From Ever Criticizing Company (futurism.com)
57. Dozens of AI workers buck their employers, sign letter in support of Wiener AI bill (sfstandard.com)
58. OpenAI accused of using legal tactics to silence nonprofits (nbcnews.com)