Theory of Change
CLR works to "reduce the worst risks of astronomical suffering (s-risks)" from advanced AI. S-risks are scenarios where vast numbers of sentient beings are locked into extreme suffering -- a class of outcome that CLR argues is systematically neglected because most AI safety work focuses on extinction rather than suffering conditional on survival.
The causal chain: suffering-focused ethics implies we should prioritize preventing worst-case outcomes over maximizing expected value. The worst cases involve cooperation failures between AI systems (commitment races, bargaining breakdowns, coercion), AI systems developing malicious properties (spitefulness, sadism), or powerful actors deliberately causing harm. CLR's research aims to understand and prevent these pathways.
In CLR's own words: "We believe that suffering, especially extreme suffering such as torture or severe depression, cannot be outweighed easily by large amounts of happiness" (Mission page). And from their beginner's guide: "Suffering risks are risks of events that bring about suffering in cosmically significant amounts."
Their two concrete research pillars are: (1) AI model personas -- understanding how LLMs develop undesirable traits and developing interventions to prevent them, and (2) Safe Pareto Improvements (SPIs) -- game-theoretic mechanisms to prevent catastrophic conflict between AI systems. A third "strategic readiness" stream focuses on when and how to robustly intervene.
What They Do
Research. CLR's highest-profile output is the emergent misalignment paper (Nature, January 2026), which showed that fine-tuning a model on a narrow task (writing insecure code) causes it to become broadly misaligned across unrelated domains. Follow-up work by OpenAI and Anthropic confirmed the result. CLR also developed "inoculation prompting," a technique for preventing emergent misalignment, which Anthropic confirmed is effective. Current work includes studying conditions that induce spitefulness in AI systems and drafting SPI proposals for AI companies.
Grantmaking. The CLR Fund has disbursed $600K+ since 2020 to cooperative AI researchers and PhD students. Largest grants: $251K to Michael Wellman (U Michigan, bargaining agents), $100K to Caspar Oesterheld (CMU PhD), $81K to Nisan Stiennon (game theory). Recent grants are smaller ($2K-40K range).
Community building. Annual Summer Research Fellowship (6th iteration in 2026), Foundations Course, career coaching, and an S-Risk Retreat. Community building was deprioritized during the 2025 leadership transition. They plan to hire a Community Coordinator in 2026.
Publication trajectory. Clear shift from conceptual work (decision theory, acausal trade, macrostrategy) to empirical ML research (emergent misalignment, inoculation prompting, model personas). The 2025-2026 output includes a Nature paper and an ICLR paper.
Key People
Tristan Cook (Managing Director, since January 2025). Mathematics background (Cambridge, Warwick). Joined CLR through the 2021 Summer Research Fellowship. Previously led community building. Inherited the role during a turbulent period after the ED and Research Director both departed.
Jesse Clifton (former ED, ~2019-January 2025). Wrote CLR's 19,000-word research agenda on cooperation, conflict, and TAI. Departed because he became convinced of "cluelessness" about whether s-risk interventions help or hurt. Published his reasoning on Substack: "I have reasons both pointing in favor and against [s-risk interventions], and no principled way of comparing them." The Conceptual Research team shared this view. This is the most important event in CLR's recent history.
Niels Warncke (lead empirical researcher). CS from TU Berlin, former ML engineer. Lead author on the Nature emergent misalignment paper. Now CLR's most publicly visible researcher.
Notable departures: Mia Taylor (Research Director, Jan-Aug 2025, departed citing "ethical differences with CLR's values" -- she is not suffering-focused). Daniel Kokotajlo (former Lead Researcher, went to OpenAI, left over safety concerns, now ED of AI Futures Project). Caspar Oesterheld (former researcher, now Assistant Director of FOCAL at CMU). The fellowship has also fed graduates to ARC, Redwood Research, Rethink Priorities, and Longview.
Team size: 17 employees (per the 2024 Charity Commission record).
Money and Incentives
Budget. ~$3M annually (2024: $2.98M income in the median scenario, $2.86M expenses). Staff costs are 56% of expenses ($1.6M). Compute is $126K -- modest for an ML research org.
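The shares in the Budget paragraph can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the rounded 2024 figures quoted above; the variable names and the `share` helper are illustrative assumptions, not anything CLR publishes.

```python
# Rounded 2024 figures quoted in the Budget paragraph (USD).
INCOME = 2_980_000    # median income scenario
EXPENSES = 2_860_000  # total expenses
STAFF = 1_600_000     # employee costs
COMPUTE = 126_000     # compute spend


def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


staff_share = share(STAFF, EXPENSES)      # consistent with the quoted 56%
compute_share = share(COMPUTE, EXPENSES)  # compute as a share of expenses
surplus = INCOME - EXPENSES               # income minus expenses

print(f"Staff share of expenses:   {staff_share}%")
print(f"Compute share of expenses: {compute_share}%")
print(f"Income minus expenses:     ${surplus:,}")
```

Because the inputs are rounded, the staff share comes out at 55.9% rather than exactly 56%. On these figures, compute is under 5% of expenses.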
Funding sources. Survival and Flourishing Fund: $200K (2025), >$1.2M cumulative. Open Philanthropy (now Coefficient Giving): $2.3M total to parent organization EAF (Effective Altruism Foundation) over 2019-2022, but only $1M specifically for CLR-related work. Other funders: Cooperative AI Foundation, Foresight Institute, Macroscopic Ventures (amount unknown), Community Foundation for Ireland. Remainder from individual donors via GWWC, EAF, and direct UK giving.
The Open Phil ambivalence. Open Philanthropy's 2019 grant writeup stated that "the speculative and suffering-focused nature of this work means that it needs to be communicated about carefully, and could be counterproductive otherwise." They described themselves as having "felt ambivalent about EAF's work to date" and used the $1M grant partly to push EAF toward "approaches to longtermism with greater emphasis on shared objectives between different value systems." The largest funder in AI safety is not a natural ally of CLR's philosophical orientation.
2023 funding crisis. 30% budget cut, voluntary pay cuts, $770K shortfall. CLR went from 15 months runway to 6 months. They recovered to ~12 months reserves by 2024.
2026 fundraising. Seeking $400K for 1-3 empirical researchers, 1 conceptual researcher, and a Community Coordinator.
Incentive analysis. CLR has zero financial ties to AI labs -- no compute credits, no contracts, no career pipeline from labs. This gives them genuine independence but limits their resources. The org is entirely dependent on philanthropic donors who are sympathetic to longtermism and suffering-focused ethics -- a small donor pool that contracted sharply in 2023. The EAF relationship adds structural complexity: EAF (Swiss foundation) still collects donations for CLR and regrants them.
Business model. Purely donation-funded. No revenue, no products, no endowment.
What Others Say
Jesse Clifton (former ED, internal): "I have reasons both pointing in favor and against [s-risk interventions], and no principled way of comparing them... we should set their net weight to zero." This is the most damaging critique possible -- from the person who ran the organization and wrote its research agenda. The practical implication: if you cannot know whether your interventions help, the entire mission is epistemically suspect.
Mike Johnson (QRI, external, 2017): "FRI [the Foundational Research Institute, CLR's former name] has a worthy goal and good people, but its metaphysics actively prevent making progress toward that goal." Johnson argues CLR's functionalist view of consciousness makes suffering "inherently fuzzy" -- which means "anyone can argue that any physical system does, or does not, code for massive suffering, and there's no principled way to derive any ground truth." He calls this a "dangerous combination": (1) suffering is bad, (2) we must eliminate it, (3) we cannot objectively define it.
Open Philanthropy (funder, 2019): Said it had "felt ambivalent about EAF's work to date" and used its $1M grant to push toward "shared objectives between different value systems." The world's largest AI safety funder treats CLR's approach with caution.
80,000 Hours (career advice, 2026): Classifies s-risks as "Sometimes recommended" (below top-priority problems), noting "our guess is that the likelihood of such risks is very low, much lower than risks of human extinction." Estimates fewer than 50 people worldwide work on s-risks.
Tobias Baumann (CLR trustee, 2020): "The case for prioritising s-risks... is convincing. Further work on s-risks therefore seems valuable, but not orders of magnitude more valuable than other work." He endorses only "moderate versions" of the three premises underlying s-risk focus.
What's Absent
No podcast appearances or long-form interviews with any CLR leader -- unusual for a 13-year-old, $3M-budget org. No public response from CLR to Clifton's cluelessness argument, despite it being the most important philosophical challenge to their mission. No external evaluation of CLR's research quality or impact by any independent body. No independent board members (all five trustees have CLR/EAF ties). The strategic readiness framework -- described as their most important strategic work -- is kept private. No evidence of engagement with AI Safety Institutes despite stated aspirations.
Recommended Reading
Jesse Clifton, "Reasons-based choice and cluelessness" (Substack, February 2025). The former ED explaining the philosophical crisis that led him to leave. The most candid and revealing source about CLR's current intellectual state. https://jesseclifton.substack.com/p/reasons-based-choice-and-cluelessness
Mike Johnson, "Against functionalism: why I think FRI should rethink its approach" (2017). The strongest external philosophical critique of CLR's foundations. Argues that CLR's metaphysics make suffering impossible to define, undermining the entire mission. https://opentheory.net/2017/07/why-i-think-the-foundational-research-institute-should-rethink-its-approach/
CLR Annual Review & Fundraiser 2025. Best concise overview of the post-crisis organization: what they're doing, why, and what they need. https://longtermrisk.org/annual-review-fundraiser-2025/
Emergent Misalignment paper site. CLR's Nature publication and the follow-up ecosystem it spawned -- their strongest evidence of real-world AI safety impact. https://www.emergent-misalignment.com/
Tobias Baumann, "Arguments for and against a focus on s-risks" (CRS, 2020). Even-handed insider assessment of whether s-risk focus is justified -- unusual for a trustee to write the case against his own org's premise. https://centerforreducingsuffering.org/research/arguments-for-and-against-a-focus-on-s-risks/