
Center for AI Safety (CAIS)

Empirical Research

Hendrycks. Statement on AI Risk. Research.

Founded: 2022
HQ: San Francisco, CA
Team: 13
Structure: 501(c)(3) nonprofit
Model: Grants

Theory of Change

CAIS articulates its mission as reducing "societal-scale risks associated with AI" through three pillars: research, field-building, and advocacy.

Their most comprehensive risk framework identifies four categories: malicious use (bioterrorism, autonomous agents), AI race dynamics (military and corporate arms races, evolutionary pressures), organizational failures (accidents, safety culture), and rogue AI (proxy gaming, deception, power-seeking). Each has proposed interventions spanning technical research, regulation, and international coordination.

The theory has evolved significantly. Early CAIS (2022-2023) focused on technical safety research and public awareness (the extinction risk statement). By 2025, the primary contribution is the Superintelligence Strategy paper (co-authored with Eric Schmidt and Alexandr Wang), which proposes MAIM -- Mutual Assured AI Malfunction -- a deterrence regime where states sabotage each other's destabilizing AI projects. Components: deterrence through sabotage threats, nonproliferation to keep weaponizable capabilities from rogue actors, and competitiveness through economic and military AI development.

Hendrycks frames this shift in a March 2025 interview: "I don't think labs have an extremely large role in safety overall... They're kind of predetermined to race... can't really choose not to. Safety is much more of a broader problem. It's got some technical aspects, but I think that's a small part of it." He distinguishes alignment from safety: "China can have AIs that are totally aligned with them. The US can have AIs that are totally aligned with them. You still are going to have a strategic competition."

In the Lawfare Daily transcript (March 2025), Hendrycks describes specific mechanisms he has been socializing in Washington: CIA espionage cells monitoring rival AI programs, CyberCom preparing attacks on adversary data centers, and moving data centers outside cities for "city avoidance."

What They Do

Research. Strong publication record for a 3-year-old org with ~10-15 staff. Notable outputs: Circuit Breakers (NeurIPS 2024 -- required 20,000 jailbreak attempts to bypass), WMDP benchmark (ICML 2024), HarmBench (adopted by US and UK AI Safety Institutes for pre-deployment testing), safetywashing paper (NeurIPS 2024, showing most safety benchmarks correlate with general capabilities), Tamper-Resistant Safeguards for open-weight models, Representation Engineering (RepE), Humanity's Last Exam (co-created with Scale AI, 3,000+ expert contributors). Hendrycks also created MMLU -- the most widely used AI capability benchmark -- before founding CAIS. He explicitly opposes mechanistic interpretability in a published article, arguing RepE and circuit breakers are more practical than Anthropic-style neuron-level analysis.

Compute cluster. 80 A100 GPUs supporting ~350 researchers, enabling 109 cumulative papers with 4,000+ citations. Now restricted to Schmidt Sciences AI safety grantees. Arguably CAIS's highest-impact program by volume of output.

Advocacy. The one-sentence Statement on AI Risk (May 2023) -- signed by Altman, Amodei, Hassabis, Hinton, Bengio, and 1,000+ others -- mainstreamed extinction risk. CAIS Action Fund co-sponsored SB 1047 in California, building a coalition of 70+ academics and unions and citing support from 77% of CA voters, before Newsom's veto. DC launch event (July 2024) with bipartisan congressional keynotes. $270K in federal lobbying in 2024. Secured $10M congressional funding for the US AI Safety Institute. Hendrycks is currently socializing MAIM at the White House.

Field-building. ML Safety course (1,000+ participants), Philosophy Fellowship (7 months, 18 papers), AI & Society Fellowship (3 months, economists/lawyers/IR scholars), AI Safety textbook (Taylor & Francis), SafeBench competition ($250K prizes), AI Safety Newsletter with 43,000+ subscribers.

Key People

Dan Hendrycks -- Executive Director. Born 1994/95, Marshfield MO. Evangelical upbringing shaped his moral catastrophism. PhD UC Berkeley 2022. Created GELU and MMLU. TIME100 AI 2023. Advises xAI ($1/year) and Scale AI ($12/year), no equity. Co-authored Superintelligence Strategy with Eric Schmidt and Alexandr Wang. Co-founded Gray Swan AI but divested August 2024 amid conflict-of-interest criticism. p(doom) reportedly peaked at >80%, now ~50-50. Compensation: $314,534 (2024). Credits 80,000 Hours for career direction but distances from EA: "AI safety has outgrown the EA community."

Nick Beckstead -- Policy Lead (2024, $198K). Formerly CEO of FTX Future Fund and Open Philanthropy program officer. Has since departed to found the Secure AI Project. His career ties CAIS to both of its largest early EA funders, FTX and Open Philanthropy.

Jaan Tallinn -- Skype co-founder and primary funder of SFF (which gave CAIS ~$2.8M in 2024). He appears in a governance-adjacent role, which creates funder-governance overlap.

Team size: ~10-15 FTEs based on salary data. No public evidence of staff departures in 3+ years.

Money and Incentives

Total budget. 2024 revenue: $10.2M (contributions $9.6M, investment $384K, program services $257K). 2024 expenses: $7.1M. Total assets: $12.6M. Revenue peaked at $16.1M in 2023 during Open Phil grants. 93.8% dependent on contributions.

Funding sources:

  • FTX (2022): $6.5M received May-September 2022. Bankruptcy estate sought clawback; CAIS refused voluntary accounting. Resolution unknown; liabilities dropped from $5.45M (2023) to $1.03M (2024).
  • Open Philanthropy (2022-2023): 4 grants totaling $12.49M. The October 2023 grant was labeled an "exit grant" with "approximately one year of operational support." No further OP funding. No public explanation for ending the $12.5M relationship.
  • SFF / Jaan Tallinn (2024-2025): ~$2.8M in 2024, ~$1.8M in 2025.
  • Schmidt Sciences: In-kind compute (80-GPU A100 cluster restricted to Schmidt grantees), $10M AI Safety Science program. Eric Schmidt co-authored the flagship paper.
  • Unknown: ~$6-7M of 2024 contributions from unidentified sources.

Early funding was almost entirely EA-affiliated: $12.5M from Open Phil + $6.5M from FTX = ~$19M. CAIS now actively repositions toward the national security establishment while publicly distancing from EA.
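The funding arithmetic above can be checked with a short sketch. It uses only the rounded figures quoted in this section (in millions of USD), so the results will differ slightly from numbers computed on unrounded 990 data. Two assumptions are mine rather than the source's: SFF is treated as the only publicly identified 2024 contributor, and Schmidt Sciences' in-kind compute is excluded from contributions. The variable and function names are illustrative.

```python
# Figures in millions of USD, copied from the rounded values quoted above.
REVENUE_2024 = {
    "contributions": 9.6,
    "investment": 0.384,
    "program_services": 0.257,
}
# Assumption: SFF is the only publicly identified 2024 contributor.
IDENTIFIED_2024 = {"sff": 2.8}
EARLY_EA_FUNDING = {"open_philanthropy": 12.5, "ftx": 6.5}


def contribution_share(revenue):
    """Fraction of total revenue that comes from contributions."""
    return revenue["contributions"] / sum(revenue.values())


def unidentified_contributions(revenue, identified):
    """Contributions left over after subtracting identified donors."""
    return revenue["contributions"] - sum(identified.values())


total_revenue = sum(REVENUE_2024.values())
share = contribution_share(REVENUE_2024)
residual = unidentified_contributions(REVENUE_2024, IDENTIFIED_2024)
early_ea_total = sum(EARLY_EA_FUNDING.values())

print(f"2024 revenue: ${total_revenue:.1f}M")
print(f"Contribution share: {share:.1%}")
print(f"Unidentified 2024 contributions: ${residual:.1f}M")
print(f"Early EA-affiliated funding: ${early_ea_total:.1f}M")
```

With these rounded inputs the contribution share comes out near 94%, in line with the 93.8% figure, and the unidentified residual falls inside the ~$6-7M range. If other 2024 donors are later identified, adding them to `IDENTIFIED_2024` shrinks the residual accordingly.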

Incentive structure. Hendrycks advises xAI and Scale AI for nominal pay. Schmidt Sciences provides compute and co-authored the flagship policy paper. Gray Swan AI (co-founded by Hendrycks, divested 2024) would benefit commercially from safety mandates CAIS advocates. These relationships create structural incentives even when financial conflicts are minimized: Hendrycks' credibility as a safety voice depends on industry connections, while those companies gain safety legitimacy from his association.

What Others Say

Strongest case against MAIM: MIRI identifies 5 conditions MAIM must meet for deterrence and argues it falls short -- "breakout distance" between acceptable AI use and decisive strategic advantage is too short, monitoring is impractical, sabotage can only delay (not deny). IAPS estimates only ~25% chance of MAIM dynamics actually occurring. The AI Frontiers observability analysis argues the US and China fundamentally cannot monitor each other's AI development -- you can't deter what you can't detect.

Zvi Mowshowitz calls MAIM "not crazy" but notes "our planetary track record of following through in even the most obvious of situations is highly spotty."

Extinction statement critics argue tech leaders signing the statement benefit from inflated perceptions of AI power. Timnit Gebru called it a "DDoS attack on attention." A Harvard Data Science Review article characterizes the extinction narrative as "a bid for power" using dramatic forms from Greek tragedy.

xAI credibility problem. Hendrycks advises xAI, which released Grok 4 in July 2025 without any safety report despite Seoul summit commitments. Anthropic's Samuel Marks called this "reckless." When the Director of the Center for AI Safety advises a company that does not follow industry safety standards, the gap between advocacy and practice is impossible to ignore.

Even evals critics validate CAIS. A LessWrong post arguing AI evaluation regimes are harmful cites the CAIS extinction statement as "more useful than eval results."

What's Absent

No documented conflict-of-interest policy despite Hendrycks' advisory roles at xAI and Scale AI, Gray Swan co-founding, and Tallinn's funder-governance overlap. No public explanation for why Open Philanthropy ended funding after $12.5M. Approximately $6-7M of 2024 funding from unidentified sources. No evidence of independent board oversight. No staff departures with public statements in 3+ years. The specifics of Hendrycks' xAI advisory role -- hours, influence, actual impact -- are undocumented. Co-founder Oliver Zhang's current role is invisible.

Recommended Reading

  1. No Priors podcast with Hendrycks (March 2025) -- Most candid articulation of why he thinks safety is geopolitical, not technical. https://podscripts.co/podcasts/no-priors-artificial-intelligence-technology-startups/national-security-strategy-and-ai-evals-on-the-eve-of-superintelligence-with-dan-hendrycks

  2. MIRI's "Refining MAIM" critique (April 2025) -- Best-argued case that CAIS's central policy proposal has fundamental implementation flaws. https://intelligence.org/2025/04/11/refining-maim-identifying-changes-required-to-meet-conditions-for-deterrence/

  3. Fortune: xAI releases Grok 4 with no safety report (July 2025) -- The credibility gap. https://fortune.com/2025/07/17/elon-musk-xai-grok-4-no-safety-report/

  4. Boston Globe profile of Hendrycks (July 2023) -- Evangelical upbringing, 80K Hours influence, founding CAIS. https://www.bostonglobe.com/2023/07/06/opinion/ai-safety-human-extinction-dan-hendrycks-cais/

  5. Hendrycks on mechanistic interpretability (May 2025) -- CAIS's intellectual position against Anthropic's largest research bet. https://ai-frontiers.org/articles/the-misguided-quest-for-mechanistic-ai-interpretability

Claude’s analysis
An opinionated read. Read the brief first to form your own view.

Stated Theory of Change

CAIS claims to reduce societal-scale AI risks through research, field-building, and advocacy. Hendrycks' interviews articulate a more specific mechanism:

  1. Technical safety research creates practical tools (benchmarks, circuit breakers, RepE) that make systems empirically safer, following "defense in depth" rather than formal guarantees.
  2. Field-building (compute cluster, courses, fellowships, textbook, newsletter) expands the community working on safety.
  3. Policy advocacy shapes regulation and international governance. The flagship is MAIM -- a deterrence framework where states prevent rivals from developing destabilizing AI through credible threats of sabotage.

The causal chain: produce safety tools and human capital -> influence labs and governments -> reduce probability of AI-related catastrophe. The chain now emphasizes the geopolitical level over the technical level.

Revealed Theory of Change

What they actually optimize for: Hendrycks' personal influence and intellectual reach. CAIS is a platform for one researcher's worldview more than an institution. The research agenda follows his intellectual trajectory (benchmarks -> RepE -> circuit breakers -> geopolitics), the policy work reflects his network (xAI, Scale AI, Schmidt), and field-building creates an audience for his ideas (43K newsletter subscribers, textbook).

Divergence from stated theory: The three pillars are presented as roughly equal, but CAIS has pivoted heavily toward advocacy/policy. Hendrycks' 2025 interviews frame technical safety as "a small part" of the solution. Yet CAIS's most concrete, lasting impact has been through technical research (HarmBench adopted by AI Safety Institutes, circuit breakers surviving 20,000 jailbreak attempts, and Hendrycks' pre-CAIS MMLU as industry standard).

The national security pivot: CAIS has repositioned from EA-adjacent safety research to national security establishment influence. The MAIM paper with Schmidt and Wang is the visible marker. This is strategically sound -- after OP's exit grant and EA's reputational damage from FTX, the national security framing opens new funding and policy channels. But it means CAIS's theory of change now depends on the viability of great-power deterrence applied to AI, which is contested.

The advisory model: Hendrycks advises xAI and Scale AI for zero effective pay, presumably to influence frontier lab safety from inside. But xAI's actual safety practices (no Grok 4 system card, antisemitic outputs) suggest minimal influence. The revealed theory through advisory roles is closer to "lend credibility to the lab's safety claims" than "improve the lab's safety practices."

Key Assumptions

1. AI safety is primarily geopolitical, not technical.

  • Evidence for: Labs are competing fiercely and self-regulation fails. CAIS's safetywashing paper shows most safety benchmarks correlate with general capabilities rather than measuring safety separately. Hendrycks argues labs are "predetermined to race."
  • Evidence against: Technical alignment work at Anthropic, ARC, and others is producing real insights. If alignment is solvable through technical means, the geopolitical framing is a distraction.
  • Testable: If a lab achieves a significant alignment breakthrough that generalizes, this assumption weakens.
  • If wrong: CAIS has deprioritized the dimension that actually matters.

2. MAIM deterrence can be made stable.

  • Evidence for: Nuclear deterrence is widely credited with helping prevent great-power war for roughly 80 years. States respond to perceived existential threats.
  • Evidence against: MIRI's critique argues red lines are unmonitorable, sabotage only delays, and the "breakout distance" is extremely short. Unlike nuclear weapons, AI capabilities can't be reliably observed. IAPS estimates only a ~25% chance that MAIM dynamics actually emerge.
  • Testable: Whether states actually take MAIMing actions as AI advances over 2026-2030.
  • If wrong: CAIS's headline policy proposal fails.

3. A small, Hendrycks-centered org can have outsized influence.

  • Evidence for: The extinction statement was a genuine coup. MMLU became the industry benchmark. HarmBench was adopted by national AI Safety Institutes. 43K newsletter subscribers.
  • Evidence against: The xAI advisory role shows limits of personal influence without institutional leverage.
  • If wrong: CAIS produces papers and statements but doesn't change outcomes.

4. RepE/circuit breakers can substitute for mechanistic interpretability.

  • Evidence for: Circuit breakers survived 20,000 jailbreak attempts. RepE controls arbitrary concepts in model representations. Hendrycks makes a strong case that mech interp hasn't produced safety results after a decade.
  • Evidence against: Circuit breakers may have performance penalties. Mech interp may be necessary for novel failure modes that RepE can't anticipate.
  • If wrong: CAIS has contributed to the field underinvesting in the approach that works.

Strengths

  1. Research output per dollar. ~$7M/year in expenses producing multiple top-venue papers and safety tools adopted by national AI Safety Institutes, building on Hendrycks' pre-CAIS MMLU, the most widely used AI benchmark. Few orgs match this efficiency.

  2. Hendrycks' intellectual versatility. From activation functions to benchmarks to safety tools to deterrence theory -- the breadth of high-quality output is remarkable. His empirical, anti-formalist approach produces practical tools others actually use.

  3. Compute cluster as force multiplier. Free compute for 350 researchers producing 109 papers and 4,000+ citations. Field-building through infrastructure, not just persuasion.

  4. Willingness to take intellectual risks. Opposing mech interp, proposing MAIM, working with Schmidt and Wang -- these generate productive debate. Most safety orgs are intellectually cautious.

  5. Cross-partisan reach. CAIS engages from Jacobin to AEI, from EA forums to the Eisenhower Building. This breadth of audience is rare in AI safety.

Weaknesses and Risks

  1. Single point of failure. CAIS = Dan Hendrycks. No evident succession plan, institutional depth, or visible second-in-command. Oliver Zhang is invisible. If Hendrycks is discredited or departs, the org has no evident continuity.

  2. The xAI credibility gap. Hendrycks advocates for safety standards while advising a company that conspicuously fails to meet them. Every Grok incident where Hendrycks is named as safety adviser undermines the broader safety message.

  3. Funding opacity. ~$6-7M of 2024 contributions are unidentified. Combined with Tallinn's funder-governance overlap and the OP exit with no explanation, independence is questionable. For an org demanding AI transparency, this is ironic.

  4. MAIM may not work. MIRI's critique is compelling. If deterrence logic doesn't hold for AI (unmonitorable thresholds, sabotage that only delays), CAIS's headline policy contribution produces no actionable governance.

  5. EA distancing without intellectual reckoning. Taking $19M from EA sources then declaring "EA does not equal AI safety" is strategic but raises integrity questions. Has the theory of change genuinely shifted, or is this reputation management?

  6. Research follows Hendrycks' interests, not systematic priority-setting. No visible process for identifying highest-priority research questions. This produces novel work but may leave important areas unfunded.

Cross-References

  • Versus Anthropic: Direct opposition on mechanistic interpretability. CAIS argues top-down (RepE) beats bottom-up (mech interp). Both can't be right about where the field should focus.
  • Versus MIRI: MIRI wrote the most rigorous MAIM critique but shares the existential risk concern. CAIS's pragmatic policy approach contrasts with MIRI's theoretical tradition.
  • Versus frontier labs: Hendrycks says labs "can't really choose not to race." CAIS works around labs through policy rather than through them, except for advisory roles that appear ineffective at changing lab behavior.
  • Versus PauseAI: MAIM is not a pause framework -- it allows development within deterrence constraints. More politically feasible but less ambitious on risk reduction.
  • Compute cluster vs. MATS/SERI: CAIS provides infrastructure; others provide mentorship. Complementary, but Schmidt Sciences restriction limits reach.

What Would Change This Assessment

  • If Hendrycks' xAI advisory produced observable improvements (system cards, circuit breaker adoption), that would substantially raise the advisory model's credibility.
  • If OP explained its exit benignly (portfolio rebalancing, not values disagreement), that would reduce concern about CAIS's direction.
  • If a state actually took a MAIMing action, it would validate descriptive MAIM and increase CAIS policy urgency.
  • If tamper-resistant safeguards reached production quality with minimal performance penalty, that would vindicate the RepE agenda.
  • If CAIS published board membership, full donor list, and conflict-of-interest policies, governance concerns would diminish.

Self-Critique

Sources I should have checked: The ChinaTalk podcast (audio only, no transcript) for the deepest China/AI discussion. The Pirate Wires article (paywalled) for the most detailed Gray Swan conflict account. Full 990 PDFs for board and program expense details.

Potential bias: I may overweight the xAI credibility problem because it's concrete and documented, while underweighting CAIS's genuine technical contributions. The MAIM critique section draws heavily from MIRI, which has its own intellectual commitments. An IR scholar might assess MAIM differently. I may also underestimate how effectively the national security repositioning mainstreams safety concerns.

A thoughtful dissenter would say: "CAIS is exactly what the field needs -- pragmatic, empirically grounded, and meeting the policy establishment where it is. The xAI role gives Hendrycks insight into how frontier labs work, which informs better policy. MAIM is the first AI governance proposal taken seriously by the national security community. And the research output per dollar is the best in the field."

Weakest claim: That the xAI advisory role evidences credibility failure. It's possible Hendrycks has significant private influence that doesn't show in public reporting. If he fought hard against the Grok 4 release and was overruled, that's very different from not trying.

Information that would most change my view: Internal evidence of Hendrycks' influence at xAI. Did he push for specific safety measures? Did he object to Grok 4's release? The answer determines whether the advisory role is meaningful engagement or performative association.

Connected to (10)

  • Encode Justice: collaborator · Nathan Calvin
  • Gray Swan AI: spun off from · Dan Hendrycks, Andy Zou
  • Schmidt Sciences: compute provider
  • Secure AI Project: staff to · Nick Beckstead
  • Scale AI: advisor at · Dan Hendrycks
  • Open Philanthropy: staff from · Nick Beckstead
  • Oracle: compute provider
  • xAI: advisor at · Dan Hendrycks
  • ML Alignment Theory Scholars: board overlap · Oliver Zhang
  • UC Berkeley: staff from · Dan Hendrycks

Sources (93)
Every URL that was read during research.
  1. About Us | CAIS (safe.ai)
  2. Center for AI Safety - Wikipedia (en.wikipedia.org)
  3. Dan Hendrycks - Wikipedia (en.wikipedia.org)
  4. Work & Projects Summary | CAIS (safe.ai)
  5. Careers at Center for AI Safety | CAIS (safe.ai)
  6. Faster, Please! — The Podcast #75: Superintelligence and National Security: My Chat (+Transcript) with AI Expert Dan Hendrycks (aei.org)
  7. Statement on AI Risk - Wikipedia (en.wikipedia.org)
  8. Center for AI Safety (CAIS) (weka.io)
  9. Center for AI Safety Action Fund (CAIS AF) (action.safe.ai)
  10. roundups & rapid reactions (sciencemediacentre.org)
  11. Superintelligence and national security: My chat (+transcript) with AI expert Dan Hendrycks (fasterplease.substack.com)
  12. Elon Musk’s xAI safety whisperer joins Scale AI as an advisor | Fortune (fortune.com)
  13. Compute Cluster | CAIS (safe.ai)
  14. Dan Hendrycks - Avoiding an AGI Arms Race (The Trajectory Series 1: AGI Destinations Series, Episode 5) - Daniel Faggella (danfaggella.com)
  15. Superintelligence Strategy (nationalsecurity.ai)
  16. TIME100 AI 2023: Dan Hendrycks (time.com)
  17. AI Risks that Could Lead to Catastrophe | CAIS (safe.ai)
  18. On MAIM and Superintelligence Strategy (thezvi.substack.com)
  19. Eric Schmidt argues against a 'Manhattan Project for AGI' | TechCrunch (techcrunch.com)
  20. Dan Hendrycks wants to save us from an AI catastrophe. He’s not sure he’ll succeed. - The Boston Globe (bostonglobe.com)
  21. Why the collapse of Sam Bankman-Fried's FTX has split A.I. researchers | Fortune (fortune.com)
  22. FTX probes $6.5M in payments to AI safety group amid clawback crusade (cointelegraph.com)
  23. Superintelligence Strategy Response | Analysis of AI Safety & MAIM (nationalsecurityresponse.ai)
  24. Former Google CEO Eric Schmidt sounds alarm over a ‘Manhattan Project’ for superintelligent AI | Fortune (fortune.com)
  25. Center for AI Safety (CAIS) - InfluenceWatch (influencewatch.org)
  26. GELU, MMLU, & X-Risk Defense in Depth, with the Great Dan Hendrycks (cognitiverevolution.ai)
  27. Superintelligence Deterrence Has an Observability Problem (ai-frontiers.org)
  28. Community Perspective - Dan Hendrycks - AI2050 (ai2050.schmidtsciences.org)
  29. Humanity's Last Exam - Wikipedia (en.wikipedia.org)
  30. CAIS Compute Cluster (cluster.safe.ai)
  31. AI Deterrence Is Our Best Option (ai-frontiers.org)
  32. Refining MAIM: Identifying Changes Required to Meet Conditions for Deterrence - Machine Intelligence Research Institute (intelligence.org)
  33. Science of Trustworthy AI - Schmidt Sciences (schmidtsciences.org)
  34. AI Safety, Ethics, and Society | CAIS (safe.ai)
  35. Who’s Funding AI Regulation and Safety? (insidephilanthropy.com)
  36. Center for Artificial Intelligence Safety (founderspledge.com)
  37. Safe and Secure Innovation for Frontier Artificial Intelligence Models Act - Wikipedia (en.wikipedia.org)
  38. AISN #45: Center for AI Safety 2024 Year in Review (newsletter.safe.ai)
  39. Philosophy Fellowship 2023 | CAIS Project (safe.ai)
  40. AI and Society Fellowship — Center for AI Safety (safe.ai)
  41. Center for AI Safety hosts DC launch event featuring CAIS’ Dan Hendrycks, Jaan Tallinn, Sen. Brian Schatz, Rep. French Hill, and CNN’s Pamela Brown (washingtonainetwork.com)
  42. About (course.mlsafety.org)
  43. AI Safety Is a Narrative Problem (hdsr.mitpress.mit.edu)
  44. AI Safety, Ethics, and Society Textbook (aisafetybook.com)
  45. How existential risk became the biggest meme in AI (technologyreview.com)
  46. Statement on SB-1047 and Founders | Gray Swan News (grayswan.ai)
  47. The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | CAIS (safe.ai)
  48. SB 1047: Our Side Of The Story (astralcodexten.com)
  49. Submit Your Toughest Questions for Humanity's Last Exam | CAIS (safe.ai)
  50. The “AI Existential Risk” Industrial Complex (aipanic.news)
  51. No Priors: Artificial Intelligence | Technology | Startups - National Security Strategy and AI Evals on the Eve of Superintelligence with Dan Hendrycks Transcript and Discussion (podscripts.co)
  52. Dan Hendrycks (danhendrycks.com)
  53. AI Safety Newsletter #49: Superintelligence Strategy (newsletter.safe.ai)
  54. Frequently Asked Questions | CAIS (safe.ai)
  55. The AI Safety Newsletter | Center for AI Safety | CAIS (safe.ai)
  56. Navigating Transformative AI (openphilanthropy.org)
  57. The Nuclear-Level Risk of Superintelligent AI (time.com)
  58. Lawfare Daily: Dan Hendrycks on National Security in the Age of Superintelligent AI (lawfaremedia.org)
  59. SFF-2025 S-Process Recommendations Announcement | Survival and Flourishing Fund (survivalandflourishing.fund)
  60. Devising ML Metrics | CAIS (safe.ai)
  61. SFF-2024 S-Process Recommendations Announcement | Survival and Flourishing Fund (survivalandflourishing.fund)
  62. About Gray Swan (grayswan.ai)
  63. FTX Seeks Answers on Sam Bankman-Fried’s Payment to AI Nonprofit | PYMNTS.com (pymnts.com)
  64. Contact | CAIS (safe.ai)
  65. 2024 Impact Report | CAIS (safe.ai)
  66. NeurIPS Workshop 2024 (mlsafety.org)
  67. Center For Artificial Intelligence Safety Inc - Nonprofit Explorer - ProPublica (projects.propublica.org)
  68. Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? (arxiv.org)
  69. Research Projects | CAIS (safe.ai)
  70. AI Extinction Statement Press Release | CAIS (safe.ai)
  71. Superintelligence Strategy: Expert Version (arxiv.org)
  72. AISN #28: Center for AI Safety 2023 Year in Review (newsletter.safe.ai)
  73. Can Humanity Survive AI? (jacobin.com)
  74. Team: CAIS Action Fund (CAIS AF) (action.safe.ai)
  75. Schmidt Sciences joins global research effort to safeguard AI - Schmidt Sciences (schmidtsciences.org)
  76. xAI Faces Criticism from AI Researchers Over Safety Practices and Grok Model Issues (theaiinsider.tech)
  77. How Not to Worry About AI: The Rebellion Against “Extinction” (themachinerace.substack.com)
  78. Elon Musk released xAI’s Grok 4 without any safety reports—despite calling AI more ‘dangerous than nukes’ | Fortune (fortune.com)
  79. Syllabus (course.mlsafety.org)
  80. Crucial Considerations in ASI Deterrence — Institute for AI Policy and Strategy (iaps.ai)
  81. AI and the threat of "human extinction": What's really going on here? (salon.com)
  82. Nick Beckstead (nickbeckstead.com)
  83. Scale AI and CAIS Unveil Results of Humanity’s Last Exam, a Groundbreaking New Benchmark (scale.com)
  84. About (secureaiproject.org)
  85. Nick Beckstead (secureaiproject.org)
  86. CAIS and Scale AI Unveil Results of "Humanity's Last Exam," a Groundbreaking New Benchmark (prnewswire.com)
  87. Center for AI Safety (CAIS) (safe.ai)
  88. The Misguided Quest for Mechanistic AI Interpretability (ai-frontiers.org)
  89. CAIS Blog | Center for AI Safety (safe.ai)
  90. A 3-person policy nonprofit that worked on California’s AI safety law is publicly accusing OpenAI of intimidation tactics | Fortune (fortune.com)
  91. Everyone racing to adopt AI is claiming to be doing so ‘safely.’ This Pittsburgh startup wants to help companies actually follow through. (post-gazette.com)
  92. Notes on a Jacobin article about AI killing everyone (philosophybear.substack.com)
  93. Statement on AI Risk | CAIS (safe.ai)