Beijing Institute of AI Safety and Governance (Beijing-AISI)

Governance

Chinese government approach.

Founded: 2024
HQ: Beijing, China
Structure: government research institute (China municipal)
Model: Government Contracts

Theory of Change

Beijing-AISI's stated theory of change emerges from its director Yi Zeng's writings and speeches:

AI poses both near-term and long-term existential risks to humanity. "In the long-term, we haven't given superintelligence any practical reasons why they should protect humankind." Current AI "has no real ability to understand and is not truly intelligent" but "will make mistakes that humans would not make in ways that are difficult to anticipate."
Current alignment approaches are insufficient: "The current approach for making AI models ethical is to bind them with rule-based ethical principles and align such intelligent information processing systems with human values and behaviors. This is like building a castle in the air."
The solution requires brain-inspired moral AI -- giving AI genuine understanding, cognitive empathy, and moral intuition rather than enforcing rules from outside. This "Super Co-alignment" framework envisions humans and AI co-evolving their values together.
International cooperation is "the only way to ensure that AI remains globally safe, reliable, and controllable." No country can manage AI safety in isolation.
Beijing-AISI exists to provide a municipal-level safety and governance institution, analogous to UK/US AISIs, for Beijing.

The institutional theory of change is: create a formal AI safety body that can (a) conduct safety evaluations, (b) produce technical benchmarks and tools, (c) engage internationally with AISI counterparts, and (d) influence Chinese domestic policy toward taking frontier AI risks seriously.

What They Do

Technical outputs: 5 GitHub repositories -- PandaGuard (jailbreak attack/defense framework evaluating 49 LLMs), ForesightSafety-Bench (94-dimension safety benchmark covering catastrophic and existential risks, evaluating 20+ models), CogToM (theory of mind benchmark), plus Foresight ClawAudit (security tool for AI agent frameworks, March 2026). Several arXiv papers on brain-inspired moral AI and "Super Co-alignment."

International engagement: Yi Zeng has briefed the UN Security Council, contributed to the International AI Safety Report 2025, served on the UN High-Level Advisory Body on AI and UNESCO expert groups, co-signed the IDAIS-Beijing statement proposing 5 AI red lines, and signed both the CAIS extinction risk statement and the Pause Giant AI Experiments letter. Beijing-AISI hosted UK AISI for a bilateral meeting in October 2024.

Domestic positioning: Beijing-AISI is one of two municipal AI safety bodies (the other is Shanghai's, established July 2024). It is part of a network of Yi Zeng-led entities including the Center for Long-term AI, the Beijing Key Laboratory of Safe AI and Superalignment, and the Chinese AI Safety Network.

What it does not do (as far as publicly known): Conduct or publish safety evaluations of Chinese frontier AI models. Exercise any regulatory authority over AI developers.

Key People

Yi Zeng (Founding Dean/Director): Professor at CAS Institute of Automation. TIME100 AI 2023. Signed CAIS extinction risk statement and Pause letter. Briefed UN Security Council. Led drafting of Beijing AI Principles (2019). His career began when he saw Spielberg's A.I. as a student and resolved to "build a robot that can love the human species." He holds views on AI existential risk closely aligned with the Western x-risk community -- a rare position for a Chinese government-affiliated scientist. Recommended by Jaan Tallinn as having "good ideas about AGI and governance."

Wei Kai (Deputy Director): Director of CAICT AI Research Institute (under MIIT). CAICT is a major player in China's AI evaluation ecosystem and a CnAISDA member institution.

Team size: Unknown. Appears to leverage existing CAS researchers rather than having dedicated staff.

Money and Incentives

Funding: Entirely from Beijing municipal government via Chinese Academy of Sciences infrastructure. Zero Western philanthropic funding. Zero Coefficient Giving grants.

Budget: Unknown. No dedicated budget, staff, or separate funding appears to exist. For comparison, UK AISI operates on approximately $68M/year with 100+ FTE. CnAISDA (the national network) similarly has no dedicated resources.

Business model: Government research institute. No independent revenue, no product sales, no grants from international funders.

Incentive structure: Beijing-AISI operates within a system where the CCP's primary AI objective is economic growth and technological self-sufficiency. The mantra "failing to develop is the greatest threat to security" dominates Chinese AI policy. The CCP's support for AI safety bodies likely stems primarily from aspirations for global participation and diplomatic positioning rather than deeply held safety concerns. There is a fundamental tension between Yi Zeng's genuine concern about catastrophic AI risk and the CCP's primary interest in political content control and economic development.

Who pays for safety in China: The CAC (Cyberspace Administration of China) controls the binding AI standards. The March 2025 standards require testing AI models for threats to "core socialist values" and "national unity" -- political security is the top priority, not frontier AI risk. The same companies that signed voluntary safety commitments also had input into these political security standards. No Chinese frontier AI company has fulfilled its Seoul safety commitments.

What Others Say

China Media Project (strongest critic): "China's first priority is control for political ends." Testing chatbots on sensitive topics (Uyghur cultural preservation, Taiwan democratization) reveals CCP-aligned censorship built into every approved model. "Are we really on the same page as China when it comes to AI safety?" The answer, they argue, is no -- Chinese "AI safety" primarily means CCP information control.

Carnegie Endowment (most thorough analyst): "China is hoping it can have its cake and eat it too." Real evolution on catastrophic risks in Framework 2.0 (CBRN, loss of control), but "control over politically sensitive content has been the core driver of China's binding AI regulations." CnAISDA is "a pivotal moment" but "has yet to translate engagement into substantive AI safety-oriented domestic policies."

AI Frontiers (balanced assessment): CnAISDA "has so far taken little substantive action to address potentially global-scale risks. The truest test will be whether it can anchor a system-wide shift inside China toward not just speaking about frontier AI risks, but taking action to reduce them."

FLI AI Safety Index (Dec 2025): Chinese firms DeepSeek and Alibaba Cloud scored worst among all tested companies (D grades). No Chinese company had a publicly available safety framework.

Jack Clark (Anthropic co-founder): Argued "China cares about the same safety risks as us." China Media Project's response: "This belief deserves caution -- and context."

What's Absent

No published safety evaluation results from Beijing-AISI on any Chinese frontier model
No concrete budget, headcount, or staffing information
No Chinese-language website (English only) -- institution is outward-facing
No evidence of regulatory authority over any AI company
No explanation for Beijing-AISI's exclusion from CnAISDA membership
No evidence of formal relationships with Chinese frontier AI labs
No public criticism of CCP AI policy from Yi Zeng, despite his publicly stated x-risk views
No forum discussion in Western AI safety communities (zero LessWrong/EA Forum posts)
No researcher other than Yi Zeng publishing under a Beijing-AISI affiliation

Stated Theory of Change

Beijing-AISI's theory of change, as articulated by Yi Zeng, operates on two levels:

Technical level: Current alignment approaches (RLHF, rule-based constraints) are "building a castle in the air" because they lack moral understanding. The solution is brain-inspired moral AI that gives AI systems genuine self-awareness, cognitive empathy, and moral intuition. This "Super Co-alignment" framework envisions humans and AI co-evolving their values together toward a "Sustainable Symbiotic Society." This is a fundamentally different paradigm from the dominant Western alignment approaches.

Institutional level: Create a Beijing-based AI safety institute that can conduct safety evaluations, produce technical benchmarks, engage internationally with other AISIs, and influence Chinese domestic policy to take frontier AI risks seriously. International cooperation is "the only way" -- no country can manage AI safety alone.

The causal chain is: (1) Build institutional legitimacy through international engagement and technical output, (2) Use that legitimacy to shape domestic Chinese AI governance toward genuine safety, (3) Contribute to global coordination on frontier AI risks.

Revealed Theory of Change

Examining what Beijing-AISI actually does reveals a significant gap between aspiration and capacity:

What the institution prioritizes: International engagement and visibility. Yi Zeng participates in nearly every major international AI safety forum. Beijing-AISI's website is English-only. The institution hosted UK AISI. The technical outputs (PandaGuard, ForesightSafety-Bench) are published in English on international platforms.

What the institution does not do: Publish safety evaluations of Chinese frontier models. Exercise regulatory authority. Maintain a Chinese-language public presence. Operate with dedicated staff or budget.

Where this diverges from stated theory: The stated theory requires domestic influence on Chinese AI development. The revealed practice is almost entirely internationally-facing. The institution appears designed to represent China abroad rather than to constrain Chinese AI development at home. This mirrors CnAISDA's design, which Carnegie describes as prioritizing "international representation over domestic functions."

The optimistic reading: Yi Zeng is playing a long game, accumulating international legitimacy that can later be converted into domestic influence. ForesightSafety-Bench's catastrophic and existential risk dimensions are evidence that x-risk framing is being quietly normalized within Chinese institutional outputs. The IDAIS red lines and the April 2025 Politburo study session show that the safety conversation is reaching the top.

The pessimistic reading: Beijing-AISI is an institutional shell with one genuinely concerned scientist at its center, no resources, no mandate, and no ability to influence the companies whose models actually need safety evaluation. The CCP values it primarily as a diplomatic asset for international engagement, not as a safety regulator.

Key Assumptions

Assumption 1: International engagement leads to domestic policy change.

Evidence for: The Third Plenum decision (July 2024) called for AI safety oversight. The Politburo study session (April 2025) warned of "unprecedented risks." The AIIA safety commitments mirror international frameworks. Carnegie argues ideas are "diffusing" from international into domestic context.
Evidence against: CAC's binding standards still prioritize political content control. No Chinese company has fulfilled Seoul safety commitments. CnAISDA has no domestic functions.
Testable? Yes -- watch whether China creates binding frontier AI safety regulations (not content regulations) in the next 1-2 years.
If wrong: Beijing-AISI becomes permanently internationally-facing with no domestic impact.

Assumption 2: Brain-inspired moral AI is a viable path to alignment.

Evidence for: Yi Zeng's approach draws on neuroscience, moral development theory, and philosophy. It addresses genuine limitations of RLHF (superficial alignment without understanding).
Evidence against: No working prototype. The field of brain-inspired AI has existed for decades without producing aligned systems. The "Super Co-alignment" framework is highly speculative. Western alignment researchers would likely view it as insufficiently concrete.
Testable? Only on very long timescales. This is a research direction, not a near-term solution.
If wrong: Beijing-AISI's distinctive intellectual contribution is irrelevant to the near-term alignment problem.

Assumption 3: Yi Zeng's genuine x-risk concerns can survive and influence within the CCP system.

Evidence for: He has sustained this position for years while rising to major advisory roles. The CCP has tolerated and even elevated his engagement.
Evidence against: He cannot publicly criticize CCP AI policy. His institution was excluded from CnAISDA. The CCP's actual priorities (content control, economic development) are unchanged.
Testable? Yes -- if Beijing-AISI or Yi Zeng ever publicly challenge a specific Chinese AI development decision, that would be strong evidence.
If wrong: Yi Zeng remains a useful diplomatic figure with no real policy influence.

Assumption 4: China-West cooperation on AI safety is possible despite geopolitical tension.

Evidence for: IDAIS dialogues produced substantive joint statements. Chinese scientists participate in international safety reports. UK AISI engaged with Beijing-AISI.
Evidence against: BAAI is on the US Entity List. The Trump administration has deprioritized the US AISI. US-China tensions are structural and escalating. China did not join INAISI.
Testable? Watch whether any binding technical cooperation agreement emerges.
If wrong: Beijing-AISI's international engagement function becomes symbolic.

Strengths

Yi Zeng is genuinely concerned about AI risk in a way that is rare among government-affiliated Chinese scientists. His signing of the CAIS statement and Pause letter, his UN Security Council briefing, and his candid translated interviews all suggest authentic commitment. This is not performative.
Unique bridging position. Beijing-AISI is one of very few institutions that can credibly engage both the Western AI safety community and the Chinese government. This bridging function has irreplaceable value for global AI governance.
Technical outputs have genuine substance. ForesightSafety-Bench's inclusion of catastrophic and existential risk dimensions is meaningful. PandaGuard is a serious jailbreak research framework. These are not empty gestures.
The broader Chinese AI safety ecosystem is growing. Chinese researchers published approximately 26 frontier safety papers per month in 2025 (doubled from prior year). Superalignment and mechanistic interpretability are now popular research topics. Beijing-AISI is part of a rising tide.
Track record of shaping governance documents. Yi Zeng's successful drafting of the Beijing AI Principles (2019) and participation in multiple UN processes demonstrates ability to influence governance frameworks.

Weaknesses and Risks

Institutional shell with no operational capacity. No dedicated budget, staff, or regulatory authority. Beijing-AISI appears to function as Yi Zeng's label on his existing CAS research group. If Yi Zeng leaves or loses CCP favor, the institution has no independent existence.
Excluded from the national body. Being left out of CnAISDA means Beijing-AISI is not part of China's official AISI network. This limits its influence on national policy.
The "AI safety = CCP content control" problem is structural. Whatever Yi Zeng personally believes, the Chinese regulatory infrastructure defines and implements "AI safety" as political content security first. The March 2025 CAC standards, which are binding, require testing for threats to "core socialist values" -- not for autonomous replication, power-seeking, or the other IDAIS red lines.
No leverage over frontier labs. DeepSeek, Alibaba, and other Chinese frontier developers answer to the CCP, not to academic safety institutes. Beijing-AISI cannot require or compel any safety evaluation.
Brain-inspired moral AI is a decades-long research agenda. Even if theoretically correct, it cannot address near-term risks from current and next-generation frontier models.
International engagement faces structural barriers. BAAI on the Entity List, US-China tensions escalating, the Trump administration deprioritizing its own AISI -- the geopolitical environment is hostile to the cooperation that Beijing-AISI's theory of change requires.

Cross-References

UK AISI / US AISI: Beijing-AISI aspires to be China's equivalent, but with a fraction of the resources. UK AISI has $68M/year and 100+ FTE. Beijing-AISI has no dedicated budget or staff. The comparison reveals how early-stage China's institutional AI safety capacity is.

CnAISDA: The national AISI network that notably excludes Beijing-AISI. CnAISDA is Tsinghua-centric (led by Andrew Yao, Xue Lan, Fu Ying) while Beijing-AISI is CAS-based (Yi Zeng). Understanding this institutional rivalry is essential to understanding Beijing-AISI's position.

Concordia AI: A Beijing-based social enterprise that publishes the "AI Safety in China" newsletter and "State of AI Safety in China" report. Concordia AI is the key Western-accessible observer of China's AI safety ecosystem and has partnered with Carnegie on key analyses. It is the entity that translated Yi Zeng's interview.

Chinese frontier labs (DeepSeek, Alibaba, etc.): The entities whose models need safety evaluation. All scored poorly on FLI's safety index. None have fulfilled Seoul safety commitments. The gap between these companies' practices and the aspirations of Beijing-AISI/CnAISDA is the core challenge.

What Would Change This Assessment

Upward revision:

Beijing-AISI publishes safety evaluation results for a Chinese frontier model (would demonstrate real capability)
Beijing-AISI receives dedicated budget and staff (would demonstrate CCP commitment)
Beijing-AISI joins CnAISDA or is given a national mandate (would demonstrate institutional acceptance)
China creates binding frontier AI safety regulations distinct from content control (would validate the policy diffusion theory)
A major AI incident in China triggers regulatory response that empowers safety institutions

Downward revision:

Yi Zeng leaves Beijing-AISI or loses CCP backing
CnAISDA remains permanently symbolic with no domestic function
US-China tensions fully preclude cooperation on AI safety
Chinese frontier labs race ahead without any meaningful safety constraints
Beijing-AISI's technical outputs stop or are revealed to exclude politically sensitive dimensions

Self-Critique

What sources should I have checked but could not? Chinese-language primary sources (WeChat posts, Chinese media coverage, government documents in Chinese) are entirely absent from this analysis. I am dependent on English translations and Western analytical sources. This introduces a systematic bias toward how Western analysts frame Chinese AI safety.

Where is this analysis potentially biased? I may overstate Yi Zeng's influence because he is the most internationally visible Chinese AI safety figure. Within China, the Tsinghua-based figures (Andrew Yao, Xue Lan) may have far more actual policy influence. I also may underweight the CCP's genuine concern about AI risk because the evidence for political content control is more vivid and concrete.

What would a thoughtful person who disagrees say? An optimist would argue that I underestimate the significance of the Third Plenum decision, the Politburo study session, and the growing body of Chinese frontier safety research. These are genuine signals of a system that is evolving. A pessimist would argue that I am too generous -- Beijing-AISI is a diplomatic fig leaf with zero safety impact, and China will never prioritize frontier AI safety over economic development and political control.

Single weakest claim: My assessment of Yi Zeng's genuine concern about existential risk is based on translated interviews and international engagements. I cannot rule out the possibility that his international positioning is strategically calculated for career advancement. However, the consistency of his views over many years, across multiple contexts, makes genuine concern the most parsimonious explanation.

What information would most change my view? (1) Published safety evaluations of Chinese frontier models by Beijing-AISI, with results that identify real risks. (2) Evidence that Yi Zeng has ever privately or publicly pushed back against a CCP AI policy decision. (3) Budget and staffing data showing dedicated resources. Any of these would significantly increase my confidence in Beijing-AISI's operational reality.

Connected to (8)

Center for Long-term AI: staff from · Yi Zeng

Chinese Academy of Sciences (Institute of Automation): staff from · Yi Zeng

Peking University (PAIR Lab): collaborator · Yaodong Yang

China AI Safety and Development Association: advisor at · Yi Zeng

China Academy of Information and Communications Technology: staff from · Wei Kai

Concordia AI: collaborator

International Dialogues on AI Safety: collaborator · Yi Zeng

UK AI Security Institute: collaborator

Sources (38)

Every URL that was read during research.

1. What do we know about China’s new AI safety institute? - DigiChina (digichina.stanford.edu)
2. How Some of China’s Top AI Thinkers Built Their Own AI Safety Institute (carnegieendowment.org)
3. Chinese AI Safety Institute Counterparts - Institute for AI Policy and Strategy (iaps.ai)
4. Where’s China’s AI Safety Institute? (chinatalk.media)
5. Is China Serious About AI Safety? (ai-frontiers.org)
6. News (aiig.tsinghua.edu.cn)
7. Chinese AI Safety Network (chinese-ai-safety.institute)
8. Promoting Global AI Safety and Governance Capacity-building through International Cooperation (ai-ethics-and-governance.institute)
9. Yi ZENG (scai.gov.sg)
10. Beijing Institute of AI Safety and Governance (github.com)
11. Yi ZENG - Chinese Perspectives on AI Safety (chineseperspectives.ai)
12. Yi Zeng - Exploring 'Virtue' and Goodness Through Posthuman Minds [AI Safety Connect, Episode 2] - Daniel Faggella (danfaggella.com)
13. PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks (arxiv.org)
14. Beijing research team releases safety detection tool for OpenClaw (english.news.cn)
15. Super Co-alignment of Human and AI for Sustainable Symbiotic Society (arxiv.org)
16. How China Views AI Risks and What to do About Them (carnegieendowment.org)
17. AI Safety in China: 2024 in Review (aisafetychina.substack.com)
18. State of AI Safety in China (2025) Report Released (aisafetychina.substack.com)
19. AI Safety in China #25 (aisafetychina.substack.com)
20. AI Safety in China #19 (aisafetychina.substack.com)
21. TIME100 AI 2023: Yi Zeng (time.com)
22. Yi Zeng (AI researcher) - Wikipedia (en.wikipedia.org)
23. China’s Views on AI Safety Are Changing - Quickly (carnegieendowment.org)
24. ForesightSafety Bench: A Frontier Risk Evaluation and Governance Framework towards Safe AI (arxiv.org)
25. China Is Taking AI Safety Seriously. So Must the U.S. (time.com)
26. In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate? (carnegieendowment.org)
27. Concordia AI - Guiding the governance of AI for a long and flourishing future (concordia-ai.com)
28. Why does Beijing suddenly care about AI ethics? (technologyreview.com)
29. AIStackDecrypted | Paul Triolo | Substack (pstaidecrypted.substack.com)
30. China AI Safety & Development Association-0 Home (cnaisi.cn)
31. Opinion | Cooperation for AI safety must transcend geopolitical interference (scmp.com)
32. How China Sees AI Safety - China Media Project (chinamediaproject.org)
33. Fu Ying’s Remarks on AI Safety and Governance at the Paris AI Summit (ciss.tsinghua.edu.cn)
34. IDAIS-Beijing - International Dialogues on AI Safety (idais.ai)
35. Notes from the Paris AI Action Summit (geopolitechs.org)
36. Chinese Firms Fare Worst As Study Finds AI Majors Fail Safety Test (asiafinancial.com)
37. AI Safety in China #17 (aisafetychina.substack.com)
38. Building Altruistic and Moral AI Agent with Brain-inspired Emotional Empathy Mechanisms (arxiv.org)