Center for Human-Compatible AI (CHAI)

Conceptual Research

Stuart Russell. CIRL. Academic alignment.

Founded: 2016
HQ: Berkeley, CA
Team: 60
Structure: university-affiliated
Model: Grants

Theory of Change

CHAI's theory of change has two levels. At the technical level, Russell argues that the "standard model" of AI -- giving machines fixed objectives to optimize -- is fundamentally unsafe. His proposed replacement: machines should be uncertain about human preferences and learn them through observation (Cooperative Inverse Reinforcement Learning / CIRL). In his words: "unless we, in some sense, rip out everything we know about AI and start again and do things in this different way, then things are heading in the wrong direction."

At the institutional level, CHAI aims to demonstrate this paradigm shift through peer-reviewed research, train a generation of safety-minded researchers who carry these ideas into industry, and leverage Russell's personal stature to shape global AI policy.

Russell's three principles for beneficial AI:

  1. The machine's only objective is to maximize the realization of human preferences.
  2. The machine is initially uncertain about what those preferences are.
  3. The ultimate source of information about human preferences is human behavior.
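
Principles 2 and 3 can be read together as Bayesian inference over candidate preferences. The sketch below is an illustration only, not CHAI's method or the CIRL formalism: the two preference hypotheses, the coffee/tea actions, and the Boltzmann-rational (noisy-optimal) choice model with its `beta` parameter are all assumptions introduced here.

```python
import math

# Hypothetical preference profiles: the reward each one assigns to each action.
HYPOTHESES = {
    "likes_coffee": {"coffee": 1.0, "tea": 0.0},
    "likes_tea": {"coffee": 0.0, "tea": 1.0},
}


def choice_likelihood(action, rewards, beta=2.0):
    """P(human picks `action`) under an assumed Boltzmann-rational model."""
    weights = {a: math.exp(beta * r) for a, r in rewards.items()}
    return weights[action] / sum(weights.values())


def update_belief(belief, observed_action, beta=2.0):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    unnormalized = {
        h: p * choice_likelihood(observed_action, HYPOTHESES[h], beta)
        for h, p in belief.items()
    }
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


# Principle 2: the machine starts out uncertain.
belief = {"likes_coffee": 0.5, "likes_tea": 0.5}

# Principle 3: observed human behavior is the evidence.
for action in ["tea", "tea", "coffee", "tea"]:
    belief = update_belief(belief, action)

print(belief)
```

Because the assumed human is noisy rather than perfectly rational, a single "coffee" choice lowers the weight on `likes_tea` without eliminating it, and the belief never reaches exactly 0 or 1.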

The causal chain: Develop the CIRL framework and adjacent theory -> Train researchers who internalize the paradigm -> Those researchers carry it into DeepMind, Anthropic, OpenAI, and academia -> Russell personally advocates for regulatory structures -> The field gradually shifts from fixed objectives to provably beneficial AI.

What They Do

Research: 32+ papers per year (2022-2023 period). Key areas: assistance games, adversarial robustness, recommender systems, multi-agent cooperation, social impacts. Major result: demonstrated that simple adversarial strategies can beat superhuman Go AIs (ICML-23), illustrating that deep learning systems don't truly "understand" their domains. Foundational CIRL paper (NeurIPS 2016) introduced the formal framework. "Towards Guaranteed Safe AI" (2024) represents a more implementation-oriented evolution: world model + safety specification + verifier.

Talent pipeline: This may be CHAI's most impactful output. Alumni now hold influential positions across the safety ecosystem: Anca Dragan (Head of AI Safety, Google DeepMind), Rohin Shah (AGI Safety lead, DeepMind), Dylan Hadfield-Menell (MIT faculty), Adam Gleave (FAR.AI founder), Rosie Campbell (ex-OpenAI policy), Scott Emmons (Anthropic), Sam Toyer (OpenAI Model Safety), Lawrence Chan (METR). Four PhD completions in the 2022-23 period.

Policy engagement: Russell personally engaged with an extraordinary breadth of policymakers in 2022-2023: UK 10 Downing Street, US Senate (Schumer, Blumenthal, Heinrich), EU AI Act negotiators (instrumental in classifying recommender systems as "high-risk"), France, Singapore, Argentina, Netherlands, China, WEF, OECD, UNESCO, GPAI. Co-authored the FLI "Pause Giant AI Experiments" open letter (March 2023). Senate testimony (July 2023) calling for an FDA-like AI regulatory agency.

Institution-building: IASEAI (International Association for Safe and Ethical AI) founded 2025, inaugural Paris conference: ~700 in-person, 1400 online, keynote by Hinton. Annual CHAI workshops at Asilomar (8th in 2024, 200+ attendees). NSF PSBAI workshop (2022, first government-funded). "Slaughterbots" anti-autonomous-weapons video (70M+ views).

Open source: Modest. GitHub repos include overcooked_ai, tensor-trust, imitation, ranking-challenge. Jonathan Stray's Prosocial Ranking Challenge is the most notable applied project.

Key People

Stuart Russell -- Faculty Director, Founder. Co-author, with Peter Norvig, of the standard AI textbook (AIMA, used at 1500+ universities). Author of Human Compatible (2019). Professor at UC Berkeley since 1986. Born 1962, British. BBC Reith Lecturer 2021, TIME100 AI 2025, OBE, Fellow of the Royal Society 2025. Also directs the Kavli Center, co-chairs OECD and WEF AI groups, and serves as IASEAI President. Russell IS CHAI -- his personal stature and network are the org's primary asset and its primary vulnerability.

Mark Nitzberg -- Executive Director. Manages day-to-day operations. Computer vision background, Wilson Center fellow. More operationally focused than Russell.

Anca Dragan -- Co-PI, on leave since early 2024 to head AI Safety and Alignment at Google DeepMind. Co-author of the foundational CIRL paper. Her departure is the most significant leadership change in CHAI's history.

Team size: ~9 faculty investigators, 18 affiliate faculty, ~30 graduate/postdoctoral researchers, ~25 PhD students, ~5 staff, ~7 interns/year (Founders Pledge estimate).

Money and Incentives

Total confirmed funding: ~$20.6M from Open Philanthropy (2016-2024), delivered through two channels:

  • $16.9M via UC Berkeley (founding grant $5.56M in 2016 + renewal $11.36M in 2021, both 5-year grants)
  • $3.54M via BERI (ML engineers, compute cluster, internships, operational support, 2017-2024)

Estimated annual budget: ~$3M/year (Founders Pledge). This likely undercounts total resources, since UC Berkeley subsidizes faculty salaries, lab space, IT, and benefits.

Business model: University research center funded primarily by philanthropic grants, with university providing base infrastructure. PhD students funded through standard university mechanisms (TAships, fellowships). External grants supplement rather than replace university support.

Funder concentration: Open Philanthropy is overwhelmingly dominant (~$20.6M). SFF has provided ~$898K. Other listed sponsors (FLI, Leverhulme Trust, CITRIS, NSF) have smaller or unknown amounts. Individual donations accepted via Every.org and Berkeley Big Give. No separate 990 filing -- finances buried in UC Berkeley's consolidated reporting.

BERI as intermediary: BERI (501(c)(3), EIN 81-4820272) serves as fiscal intermediary, providing operational flexibility that UC Berkeley's bureaucracy cannot. BERI hires staff, manages compute, handles logistics. Andrew Critch co-founded BERI in 2017 and works at CHAI part-time.

Incentive analysis: Academic incentives (publications, tenure, prestige) align with producing rigorous theoretical work but may favor publishability over practical safety impact. The talent pipeline to industry is a feature (field-building) but also a constraint (CHAI can't retain top researchers against industry compensation). Russell has no visible financial conflicts with AI labs -- he is a tenured professor without equity stakes, board seats, or consulting ties to frontier labs. This genuine independence is rare among influential AI safety figures.

Key incentive risk: Extreme funder concentration on OP creates dependency. If OP's priorities shifted (e.g., toward more empirical/applied safety work and away from theoretical frameworks), CHAI's funding could be at risk. However, UC Berkeley embedding provides a floor.

What Others Say

The strongest technical critique (MIRI, via Scott Alexander): CIRL's corrigibility depends on the AI being uncertain about human preferences. But once the AI learns enough, uncertainty decreases, and with it the reason to defer to humans. The AI has a "sixth option": refuse shutdown, keep learning, then optimize sovereignly. Russell responds that proper Bayesian priors should prevent absolute certainty. MIRI counters that the issue isn't absolute certainty but the point where expected information value drops below cost -- then the AI acts regardless. Scott Alexander: "their crux seems to be whether the AI could end up with an uncorrectably wrong model of the human utility function."
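
A toy calculation in the style of the off-switch game makes this disagreement concrete. It is a sketch under assumptions that are not in the source: the robot's belief about the utility U of its proposed action is Normal(mu, sigma^2), the human would veto the action exactly when U < 0, and asking the human costs a fixed amount `cost`. Deferring is then worth E[max(U, 0)] - max(mu, 0) more than the robot's best unilateral choice, and that gap shrinks toward zero as sigma falls.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_positive_part(mu, sigma):
    """E[max(U, 0)] for U ~ Normal(mu, sigma^2)."""
    if sigma == 0:
        return max(mu, 0.0)
    z = mu / sigma
    return mu * normal_cdf(z) + sigma * normal_pdf(z)


def value_of_deferring(mu, sigma):
    """Gain from letting the human decide, versus the better of
    acting now (worth mu) and switching off (worth 0)."""
    return expected_positive_part(mu, sigma) - max(mu, 0.0)


def robot_defers(mu, sigma, cost):
    """Hypothetical decision rule: defer only if the information is worth its cost."""
    return value_of_deferring(mu, sigma) > cost


for sigma in (2.0, 0.5, 0.2):
    print(sigma, value_of_deferring(1.0, sigma), robot_defers(1.0, sigma, cost=0.1))
```

With mu = 1 and a deferral cost of 0.1, the robot defers at sigma = 2 but not at sigma = 0.2. This mirrors the MIRI point: the robot never needs to be absolutely certain, only certain enough that asking no longer pays.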

IRL scalability (Kasenberg): Three fundamental limitations: (1) moral norms are temporally complex, but IRL assumes reward depends only on current state; (2) reward functions are domain-specific and don't transfer; (3) IRL outputs are opaque numbers, not interpretable principles.

"Dumb superintelligence" fallacy (Melanie Mitchell): True intelligence inherently involves common sense, adaptability, and context-sensitivity. A superintelligent entity that simultaneously lacks basic understanding is a contradiction. The paperclip maximizer scenario may be incoherent.

"Too early" (Robin Hanson): Russell hasn't made the case for reorganizing all of AI now. We are too far from knowing how future AI systems will be organized. Like warning about nuclear weapons in 1500.

"Blinkered" rationality (David Leslie, Nature review): Russell reduces intelligence to instrumental rationality and falls prey to "techno-solutionism." His framework ignores holistic, contextual understanding of reasoning.

Positive assessments: Open Philanthropy rates CHAI "one of the highest-impact organizations working on AI alignment in the world." Founders Pledge assessment is broadly positive: "CHAI is especially well-placed to produce reliably positive impact."

What's Absent

  • No financial transparency beyond grant amounts. No separate 990, no published budget breakdown.
  • No succession plan. Russell is 63 with no visible heir. Dragan (most plausible successor) left for DeepMind.
  • No published impact evaluation. After 10 years and $20M+, no systematic assessment of whether CIRL has been adopted, whether alumni have changed practices at labs, or whether policy advocacy produced specific outcomes.
  • No evidence of CIRL adoption in production. The framework is influential conceptually but there is no evidence it has been implemented in any AI product or system.
  • No progress report since May 2023. A two-year gap in public reporting during AI safety's most critical period.
  • Limited engagement with LLM alignment. CHAI's core program was designed for a different AI paradigm. RLHF, constitutional AI, and other LLM-specific techniques are not prominently represented in their publications.
  • Minimal independent evaluation. The strongest endorsements come from within the OP/EA funder ecosystem. External academic evaluations of CHAI's approach are largely absent.

Recommended Reading

  1. Stuart Russell on Diary of a CEO (2025) -- Russell at his most emotionally urgent. Reports private conversations with AI CEOs about extinction risk. Explains the gorilla problem, fast takeoff, and why he's "appalled." The most candid window into his current thinking. https://singjupost.com/stuart-russell-on-the-diary-of-a-ceo-podcast-transcript/

  2. Scott Alexander, "CHAI, Assistance Games, And Fully-Updated Deference" -- Both explains CHAI's approach clearly AND presents the strongest technical counterargument. Required reading. https://www.astralcodexten.com/p/chai-assistance-games-and-fully-updated

  3. Stuart Russell on 80K Hours (#80, 2020) -- The comprehensive intellectual foundation: standard model critique, three principles, counterarguments, policy ideas. More detailed than any other single source on Russell's worldview. https://80000hours.org/podcast/episodes/stuart-russell-human-compatible-ai/

  4. Robin Hanson review of Human Compatible -- The skeptic's case: it's too early, the ask is too large, and Russell hasn't engaged with alternatives. https://www.overcomingbias.com/p/russells-human-compatiblehtml

  5. Founders Pledge assessment of CHAI -- The most thorough independent evaluation available, including team size, research output, and field-building assessment. https://www.founderspledge.com/research/center-for-human-compatible-ai

Claude’s Analysis
An opinionated read. Read the brief first to form your own view.

Stated Theory of Change

CHAI's stated theory of change is a paradigm shift argument: the way AI is currently developed (the "standard model" of optimizing fixed objectives) is fundamentally unsafe, and CHAI exists to develop and propagate an alternative (provably beneficial AI via CIRL -- machines uncertain about human preferences that learn them through observation).

The specific mechanism: If AI systems are designed to be genuinely uncertain about what humans want, they will naturally be deferential, accept correction, and allow shutdown. This makes them inherently safer than systems pursuing fixed objectives. Russell calls this "a big step because the standard model doesn't say [machines] should be good for human beings at all."

The causal chain runs: Develop theoretical framework -> Prove key properties -> Train researchers in the paradigm -> Textbook adoption propagates the ideas -> Researchers carry ideas into industry labs -> Russell personally advocates for regulatory adoption -> The field shifts.

Revealed Theory of Change

The gap between CHAI's stated and revealed theory of change is instructive.

What they actually produce and optimize for:

  1. Academic publications (32+ papers/year)
  2. PhD graduates placed at top safety positions (their most impactful output)
  3. Russell's personal policy advocacy (hundreds of meetings, testimony, media)
  4. Annual workshops for community building
  5. Russell's public intellectual profile (books, BBC lectures, TIME100, Davos)

What they don't do much of:

  1. Build tools or products that implement CIRL
  2. Work directly with industry labs on adoption
  3. Engage with current LLM alignment practices (RLHF, constitutional AI)
  4. Produce quantitative impact evaluations
  5. Build institutional capacity beyond Russell

The revealed theory of change is closer to: "Stuart Russell uses his extraordinary personal credibility as the textbook author to keep the provably-beneficial-AI paradigm alive in academic discourse and elite policy circles, while training a generation of researchers who carry safety awareness into industry, whether or not they specifically use CIRL."

This is actually a reasonable theory of change -- possibly more realistic than the stated one. The field-building through talent placement is demonstrably real. Whether the CIRL framework itself will matter depends on whether future AI systems are amenable to that approach.

Key Assumptions

Assumption 1: The standard model is the root cause of unsafe AI.

  • Evidence for: YouTube recommender algorithm, specification gaming examples, Russell's extensive catalog of fixed-objective failures.
  • Evidence against: Current LLMs don't obviously work on the "fixed objective" model. RLHF and constitutional AI already incorporate preference learning. The field may be organically moving beyond the standard model without CIRL.
  • Testable: Partially. If LLM alignment methods (not based on CIRL) prove adequate, this assumption loses force.
  • If wrong: CHAI's research program is less urgently needed, though still conceptually interesting.

Assumption 2: CIRL's uncertainty-based corrigibility solves the control problem.

  • Evidence for: Theoretical proofs that uncertain AI prefers shutdown. Intuitive appeal of the "humble machine" concept.
  • Evidence against: MIRI's fully updated deference critique -- once the AI learns enough, uncertainty disappears and corrigibility with it. Kasenberg's scalability objections. No empirical validation at scale.
  • Testable: Yes, in principle. But testing requires AI systems sophisticated enough to exhibit the relevant behaviors.
  • If wrong: CHAI's core technical contribution fails. The talent pipeline and policy advocacy remain valuable.

Assumption 3: Academic research and paradigm advocacy can actually redirect the AI industry.

  • Evidence for: Russell's textbook reaches 1500+ universities. His policy access is real (Senate testimony, EU AI Act influence, OECD co-chair).
  • Evidence against: AI labs are driven by competitive pressure and trillions in investment. Russell himself quotes an unnamed CEO saying they feel "trapped" in the race. Academic frameworks have had limited impact on commercial AI development historically.
  • Testable: Yes, over time. Track whether CIRL-adjacent ideas appear in industry safety practices.
  • If wrong: CHAI is producing good academic work with little real-world impact on the trajectory of AI development.

Assumption 4: The talent pipeline carries CHAI's ideas into practice.

  • Evidence for: Alumni at DeepMind, Anthropic, OpenAI, MIT, Stanford, FAR.AI.
  • Evidence against: Alumni may adopt different approaches once at industry labs. Rohin Shah (CHAI PhD) describes himself as more optimistic than the doom narrative and works on empirical alignment, not CIRL.
  • If wrong: CHAI's field-building is producing generally safety-aware researchers, but not specifically propagating the CIRL paradigm. This is still valuable, just not the stated theory of change.

Strengths

Intellectual credibility. Russell is arguably the single most credible person in the world to argue that AI needs a paradigm shift. He co-authored THE textbook. He has impeccable academic credentials. He cannot be dismissed as an outsider or alarmist.

Independence. Unlike many AI safety figures, Russell has no financial ties to frontier labs. No equity, no board seats, no consulting arrangements. His tenured position at UC Berkeley makes him immune to industry pressure. This is rare and valuable.

Talent production. CHAI has placed alumni in influential safety positions at every major lab and several leading universities. This network effect may compound over time.

Policy access. Russell's personal access to policymakers (heads of state, senators, EU commissioners, OECD, WEF, UN) is extraordinary for an academic. His Senate testimony and EU AI Act influence demonstrate concrete policy impact.

Theoretical foundations. CIRL is a mathematically rigorous framework that provides formal proofs of properties like deference to humans and shutdown acceptance. Even if imperfect, it provides a target for the field.

Longevity and stability. UC Berkeley embedding provides institutional stability. CHAI has operated continuously since 2016 with no existential funding crises, unlike many safety organizations.

Weaknesses and Risks

Key-person risk is extreme. CHAI is Stuart Russell. His departure, retirement, or incapacitation would eliminate most of the org's policy access, public profile, fundraising capacity, and intellectual leadership simultaneously. There is no visible succession plan. Dragan, the most plausible successor, has left for DeepMind. Russell is 63.

CIRL may be solving a problem that doesn't arise in the LLM paradigm. CIRL was designed for a world of RL-based agents with explicit reward functions. Modern LLMs trained via RLHF already incorporate a version of "learn human preferences through observation." The field may be naturally moving past the standard model without CIRL. CHAI risks being outrun by the very progress it advocated.

No evidence of practical adoption. After 10 years and $20M+, there is no documented case of CIRL being implemented in a production AI system. The framework remains purely academic.

Funder concentration. Open Philanthropy provides virtually all external funding. This creates dependency on a single funder's continued belief in CHAI's approach.

Limited engagement with LLM alignment. CHAI's publication record is dominated by traditional RL/IRL work. The org has been slow to engage with RLHF, constitutional AI, mechanistic interpretability, and other techniques that are actually being deployed at frontier labs.

Policy advocacy flows through one person. Russell's hundreds of meetings with policymakers are impressive but represent a single point of failure for CHAI's policy impact. IASEAI is a step toward institutionalizing this, but the first conference attracted almost no policymakers despite being policy-focused.

Academic incentives may limit impact. CHAI researchers are incentivized by publications and academic career advancement, which may favor publishable theoretical results over messy practical safety work.

Cross-References

Complementary with: MIRI (different approach to same problem -- MIRI focuses on deceptive alignment, CHAI on value alignment), FAR.AI (founded by CHAI alum Gleave, applies CHAI ideas to evaluation), Redwood Research (empirical safety testing), METR (AI evaluations, employs CHAI alum Lawrence Chan).

Talent pipeline to: Google DeepMind (Dragan, Shah, Dennis, Turner, Emmons initially), Anthropic (Emmons), OpenAI (Toyer, Campbell initially), Meta FAIR (Milli), MIT (Hadfield-Menell), Stanford (Sadigh), Princeton (Fisac), CMU (Bajcsy).

Funded by: Open Philanthropy (via BERI and UC Berkeley).

Russell-adjacent orgs: IASEAI (Russell is President), Kavli Center (Russell is Director), BERI (fiscal intermediary, co-founded by CHAI researcher Critch).

Intellectual relationship with MIRI: Russell and Yudkowsky have engaged substantively on the fully updated deference problem. Their crux: whether IRL can produce a utility function close enough to true human values that the resulting AI would be beneficial. MIRI says no (the meta-learning is still wrong enough to be catastrophic). Russell says the gap can be made small enough to be acceptable. This is the central unsettled question in CHAI's research program.

What Would Change This Assessment

Upward revision if:

  • CIRL-adjacent techniques were adopted by a frontier lab for a production system.
  • A clear successor to Russell emerged with comparable credibility and access.
  • CHAI published work demonstrating how CIRL applies to LLM alignment.
  • An independent evaluation confirmed specific policy outcomes traceable to CHAI advocacy.
  • Non-OP funders provided significant diversified funding.

Downward revision if:

  • LLM alignment methods (RLHF, constitutional AI, etc.) proved adequate without CIRL-type frameworks, suggesting CHAI's approach was addressing the wrong problem.
  • Russell retired or became incapacitated with no succession plan executed.
  • OP significantly reduced or ended CHAI funding.
  • Multiple CHAI alumni publicly stated that CIRL was not relevant to their current safety work.
  • CHAI continued to publish no progress reports or impact evaluations.

Self-Critique

Strongest limitation: I have very limited information on CHAI's internal culture, decision-making processes, and how research priorities are actually set. My assessment of key-person risk is based on external observation -- insiders might see more distributed leadership than is visible from outside.

Potential bias: The evidence base skews heavily toward Russell's public statements, which are inevitably more polished and confident than internal discussions would be. The absence of critical EA Forum/LessWrong discussions (due to fetching limitations) may mean I'm underweighting community skepticism about CHAI's approach.

Weakest claim: My claim that CIRL hasn't been adopted in practice rests on absence of evidence, not evidence of absence. Labs may be using CIRL-adjacent ideas under different names, and I may be underestimating the indirect influence of CHAI's framework.

What a thoughtful disagreer would say: "CHAI's value isn't in CIRL specifically -- it's in keeping the 'provably safe AI' research agenda alive in mainstream computer science. The paradigm-level argument (don't give machines fixed objectives) has already been absorbed by the field. You're judging CHAI by whether its specific technical framework got adopted, but the real impact was shifting the Overton window of AI safety within academia. That's hard to measure but genuinely important."

What information would most change my view: An internal assessment from a CHAI researcher or a detailed OP evaluation explaining what CHAI's counterfactual impact has been. Also, evidence of whether CIRL-type reasoning has influenced the design of RLHF or constitutional AI systems at any lab.

Connected to (19)

  • Anthropic (staff to): Scott Emmons
  • Google DeepMind (staff to): Anca Dragan
  • OpenAI (staff to): Sam Toyer
  • FAR.AI (spun off from): Adam Gleave
  • Google DeepMind (staff to): Alex Turner
  • Google DeepMind (staff to): Michael Dennis
  • Berkeley Existential Risk Initiative (collaborator): Andrew Critch
  • Google DeepMind (staff to): Rohin Shah
  • METR (staff to): Lawrence Chan
  • Kavli Center for Ethics, Science, and the Public (board overlap): Stuart Russell
  • OpenAI (staff to): Rosie Campbell
  • International Association for Safe and Ethical AI (board overlap): Stuart Russell
  • Carnegie Mellon University (staff to): Andrea Bajcsy
  • Encultured AI (staff to): Andrew Critch
  • Machine Intelligence Research Institute (collaborator)
  • Meta (staff to): Smitha Milli
  • Massachusetts Institute of Technology (staff to): Dylan Hadfield-Menell
  • Princeton University (staff to): Jaime Fernandez Fisac
  • Stanford University (staff to): Dorsa Sadigh

Sources (65)
Every URL that was read during research.
  1. About (humancompatible.ai)
  2. Center for Human-Compatible Artificial Intelligence (humancompatible.ai)
  3. Center for Human-Compatible Artificial Intelligence - Wikipedia (en.wikipedia.org)
  4. Stuart J. Russell - Wikipedia (en.wikipedia.org)
  5. Professor Stuart Russell on the flaws that make today's AI architecture unsafe & a new approach that could fix it (80000hours.org)
  6. Human Compatible - Wikipedia (en.wikipedia.org)
  7. People (humancompatible.ai)
  8. Jobs (humancompatible.ai)
  9. Unknown (humancompatible.ai)
  10. Unknown (humancompatible.ai)
  11. Progress Report (humancompatible.ai)
  12. 8th Annual Center for Human-Compatible AI Workshop (humancompatible.ai)
  13. Anca Dragan named Head of AI Safety and Alignment at Google DeepMind - EECS at Berkeley (eecs.berkeley.edu)
  14. Center for Human-Compatible AI (founderspledge.com)
  15. AI pioneer Stuart Russell and the Center for Human-Compatible Artificial Intelligence (inspire.berkeley.edu)
  16. Center for Human-Compatible Artificial Intelligence (humancompatible.ai)
  17. Stuart Russell - Avoiding the Cliff of Uncontrollable AI (AGI Governance, Episode 9) - Daniel Faggella (danfaggella.com)
  18. Transcript: Senate Hearing on Principles for AI Regulation (techpolicy.press)
  19. The long-term future of AI (people.eecs.berkeley.edu)
  20. AI Researcher Stuart Russell - Future of Life Institute (futureoflife.org)
  21. CHAI, Assistance Games, And Fully-Updated Deference (astralcodexten.com)
  22. AI Ethics: Inverse Reinforcement Learning to the Rescue? (dkasenberg.github.io)
  23. Center for Human-Compatible AI donations received (donations.vipulnaik.com)
  24. Berkeley Existential Risk Initiative (existence.org)
  25. Big Give 2025 (givingday.berkeley.edu)
  26. Mark Nitzberg (wilsoncenter.org)
  27. “We need to be careful what we optimize our AI systems for” | Heinrich Böll Stiftung | Washington, DC Office - USA, Canada, Global Dialogue (us.boell.org)
  28. Russell’s Human Compatible (overcomingbias.com)
  29. Review of Human Compatible by Stuart Russell (siliconreckoner.substack.com)
  30. Research Publications (chai.berkeley.edu)
  31. Stuart Russell -- Publications (people.eecs.berkeley.edu)
  32. Stuart Russell on The Diary Of A CEO Podcast (Transcript) (singjupost.com)
  33. Stuart Russell: The Foundations of Artificial Intelligence (thegradientpub.substack.com)
  34. Stuart Russell on AI Governance - AI Futures Podcast (S1E1) - Emerj Artificial Intelligence Research (emerj.com)
  35. AI Safety Is A Global Public Good | NOEMA (noemamag.com)
  36. Center for Human-Compatible Artificial Intelligence (CHAI) (givingwhatwecan.org)
  37. The Fallacy of Dumb Superintelligence: Why Melanie Mitchell Thinks AI Won’t Be as Dangerous as We Fear (buildingcreativemachines.substack.com)
  38. Stuart Russell: UC Berkeley (digidai.github.io)
  39. News (humancompatible.ai)
  40. Center for Human-Compatible AI (github.com)
  41. Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems (arxiv.org)
  42. Stuart Russell (time.com)
  43. Cooperative Inverse Reinforcement Learning (arxiv.org)
  44. Blog (humancompatible.ai)
  45. The path to safe, ethical AI: SRI highlights from the 2025 IASEAI conference in Paris - Schwartz Reisman Institute (srinstitute.utoronto.ca)
  46. Highlights from Paris: Attending the 2025 IASEAI Conference (airisk.mit.edu)
  47. How Stuart Russels's IASEAI conference failed to live up to its potential (FBB #8) (fieldbuilding.substack.com)
  48. IASEAI '25: Key Takeaways from the Inaugural AI Safety & Ethics Conference | Center for AI Policy | CAIP (centeraipolicy.org)
  49. Is ‘provably beneficial’ AI possible? - ITU (itu.int)
  50. Provably Beneficial Artificial Intelligence (berkeleysciencereview.com)
  51. Spotlights (humancompatible.ai)
  52. CHAI PhD Students Accept Positions at MIT, Princeton, and DeepMind (humancompatible.ai)
  53. Rohin Shah on DeepMind and trying to fairly hear out both AI doomers and doubters | 80,000 Hours (80000hours.org)
  54. About - Rosie Campbell (rosiecampbell.xyz)
  55. Stuart Russell, Curriculum Vitae (people.eecs.berkeley.edu)
  56. Stuart Russell -- Biography (people.eecs.berkeley.edu)
  57. Unknown (amacad.org)
  58. Navigating Transformative AI (openphilanthropy.org)
  59. 7th Annual Center for Human-Compatible AI Workshop (humancompatible.ai)
  60. Banning Lethal Autonomous Weapons: An Education (issues.org)
  61. Why Cheap Autonomous Weapons Should Be Banned - Interview with Stuart Russell (zeta-alpha.com)
  62. Stuart Russell (kavlicenter.berkeley.edu)
  63. New Kavli Center at UC Berkeley to foster ethics, engagement in science - Berkeley News (news.berkeley.edu)
  64. Paris AI Safety Breakfast #1: Stuart Russell - Future of Life Institute (futureoflife.org)
  65. Berkeley Professor, AI Expert Stuart Russell Named 2021 BBC Reith Lecturer (cdss.berkeley.edu)