Alignment Research Engineer Accelerator (ARENA)

Field-Building

Alignment engineer bootcamp. Open-source.

Founded: 2022
HQ: London, UK
Team: 4
Structure: charity (UK)
Model: Grants

Theory of Change

ARENA's stated theory of change is that producing more technically skilled ML engineers for AI safety reduces existential risk from AI. In their own words: "We are concerned that powerful AI poses an existential risk. We believe that more high-quality work in fields such as interpretability, LLM evaluations, and adversarial robustness can help alleviate this problem. Through our work, we hope to reduce the existential risk that advanced AI poses to humanity." (Mission page)

The mechanism is a 4-5 week intensive in-person bootcamp that teaches ML engineering skills (building transformers from scratch, mech interp with TransformerLens, RL including RLHF, LLM evaluations using UK AISI Inspect), integrates participants into the AI safety community via the LISA co-working space, and accelerates career transitions into safety roles.

ARENA explicitly positions itself as a pipeline intermediate -- upstream of research programs like MATS and Astra, downstream of foundational courses like BlueDot. Georg Lange (MATS reviewer) categorizes ARENA as an "upskilling program" -- a prerequisite step before elite research fellowships.

What They Do

ARENA runs 2-3 in-person ML bootcamps per year at LISA in Shoreditch, London. Each cohort: 27-33 participants from 280-370 applications (~8-11% acceptance rate). Eight cohorts completed since late 2022; cohort 8.0 planned for May-June 2026.
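As a quick sanity check on the cohort figures above, the sketch below bounds the acceptance rate and the annual in-person throughput from the quoted ranges. It assumes the extremes of the participant and application ranges can be paired freely, which the source does not state; the constant and function names are illustrative only.

```python
# Per-cohort ranges quoted in the text.
PARTICIPANTS = (27, 33)      # min, max accepted per cohort
APPLICATIONS = (280, 370)    # min, max applications per cohort
COHORTS_PER_YEAR = (2, 3)    # min, max cohorts per year

def acceptance_bounds(accepted, applied):
    """Widest possible acceptance-rate range, pairing opposite extremes."""
    return accepted[0] / applied[1], accepted[1] / applied[0]

rate_low, rate_high = acceptance_bounds(PARTICIPANTS, APPLICATIONS)
seats_low = COHORTS_PER_YEAR[0] * PARTICIPANTS[0]
seats_high = COHORTS_PER_YEAR[1] * PARTICIPANTS[1]

print(f"acceptance rate: {rate_low:.1%} to {rate_high:.1%}")
print(f"in-person seats per year: {seats_low} to {seats_high}")
```

The widest possible acceptance range is slightly broader than the quoted ~8-11%, which is consistent with the quoted figure describing actual per-cohort pairings rather than the extremes. The ~75-90 people/year figure used later in this profile falls inside the computed seat range.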

Curriculum: Week 0 (optional fundamentals), Week 1 (transformers & mech interp), Week 2 (RL), Week 3 (LLM evaluations), Week 4 (capstone projects). Evals exercises developed with Apollo Research using UK AISI's Inspect library. Major expansion for 8.0 adds entirely new chapter on alignment science (emergent misalignment, interpreting reasoning models, LLM psychology, investigator agents) plus AI control content.

Open-source curriculum: GitHub repository has 916 stars, 581 forks. Adapted independently by at least 5 programs worldwide: TARA (APAC -- Australia/Philippines/Japan/Singapore), Finnish Alignment Engineering Bootcamp, Zurich AI Safety, Stanford SAIA, Cambridge CBAI. The open-source curriculum may be ARENA's most scalable contribution.

Satisfaction scores: 9.1-9.7/10 across measured cohorts (4.0-7.0), trending upward. LISA environment rated 9.6-9.9/10. Participants estimate they would need ~9-10 weeks to self-study the 4-week taught content.

Career outcomes: At each cohort's end, typically 4 participants have confirmed safety positions, ~15 are actively applying. Named alumni placements include Apollo Research, Anthropic, METR, UK AISI, GovAI, OpenAI DC Evals, MATS, LASR, Pivotal, Google DeepMind. One ARENA 2.0 participant was hired by OpenAI DC Evals; another reached DeepMind final rounds.

All outcome data is self-reported. No independent evaluation or longitudinal tracking exists.

Key People

Callum McDougall -- Founder, now Strategy & Curriculum Developer. Cambridge MMath, Jane Street intern, Anthropic interp team (~6 months), now Google DeepMind interp researcher. Ran the first 3 ARENA iterations personally. Authored the new Chapter 4 (alignment science) for ARENA 8.0. Has stepped back from operational leadership to a TA/advisory role.

James Hindmarch -- Programme Lead since ARENA 5.0. Cambridge BA+MMath in logic. SERI MATS participant (John Wentworth). Built the evals curriculum. Full-time operational leader.

David Quarel -- Head TA. PhD student at ANU under Marcus Hutter. Consistently praised by participants as an exceptional teacher.

Core team of ~3-4 full-time staff plus 5-8 rotating TAs per cohort. Planning to hire a full-time content developer in 2026.

Money and Incentives

Funding (ARENA direct): $1,435,258 total from Coefficient Giving (CG, formerly Open Philanthropy) across 4 grants. Trajectory: $18,800 (Feb 2023) -> $98,186 (Nov 2023) -> $318,272 (Jul 2024) -> $1,000,000 (Jan 2025). The most recent grant is ~53x the size of the first, roughly 2 years later.

Funding (LISA infrastructure): ~$3.2M total from CG/OP across 3 grants supporting the co-working space and operations that ARENA uses. Combined ARENA + LISA funding from CG: approximately $4.6M.
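The funding totals above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the grant amounts listed in this section; the LISA figure is the approximate "~$3.2M" from the text, so the combined total is approximate too.

```python
# Grant dates and amounts as listed in the text (USD).
ARENA_GRANTS = [
    ("Feb 2023", 18_800),
    ("Nov 2023", 98_186),
    ("Jul 2024", 318_272),
    ("Jan 2025", 1_000_000),
]
LISA_TOTAL_APPROX = 3_200_000  # "~$3.2M" in the text; not an exact figure

arena_total = sum(amount for _, amount in ARENA_GRANTS)
# Ratio of the latest single grant to the first single grant.
growth_multiple = ARENA_GRANTS[-1][1] / ARENA_GRANTS[0][1]
combined_approx = arena_total + LISA_TOTAL_APPROX

print(f"ARENA direct total: ${arena_total:,}")
print(f"last grant / first grant: {growth_multiple:.1f}x")
print(f"ARENA + LISA (approx.): ${combined_approx / 1e6:.1f}M")
```

Note that the ~53x multiple compares the size of individual grants, not cumulative funding.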

Single-funder dependency: ARENA's sole funder is CG/Open Philanthropy. No grants from SFF, EA Funds, Manifund, or any other source were found. This is 100% funder concentration -- an existential vulnerability. If CG changes priorities, ARENA has no backup.

Legal structure: ARENA has no separate legal entity. It operates under LISA (Safe AI London Ltd, company 14942848, charity 1211693 registered Jan 2025). LISA was formerly "MATS London Ltd" (renamed Oct 2023).

What participants get: Travel, visas, accommodation, meals, compute -- all covered. No stipend or income replacement. No per-participant cost breakdown is public. TARA's remote adaptation cost $899 AUD/participant; a rough estimate for ARENA's London-based, all-costs-covered model is $5K-10K per participant.

No earned revenue. ARENA does not charge participants, sell products, or generate any non-grant income.

What Others Say

The signaling critique: A MATS research manager (former ARENA participant and TA) published "A Proposal for a Better ARENA" (January 2026) arguing: "The biggest benefit of the current program to the AI Safety community appears to be as a signaling mechanism for other programs." Evidence: many participants have already done AI safety research before ARENA; compressed 2-week versions (ARBOx) produce people who go on to elite fellowships. Proposes restructuring into research sprints instead of contained exercises. Author later conceded scalability is not a primary benefit and that removing structured learning could reduce diversity.

The bottleneck critique: "AI Safety Talent Needs in 2026" (MATS Research, 23 hiring manager interviews) finds: "A major bottleneck for technical organizations is the scarcity of senior people who can effectively supervise and mentor others. Even when funding exists, each senior researcher can only mentor a limited number of juniors before quality degrades." This directly challenges whether producing more junior talent is the right intervention. The same report found that "direct collaboration experience with a calibrated reference" is the dominant hiring signal -- which validates ARENA's community-integration goal even as it undercuts the pure upskilling argument.

The accessibility critique: TARA (APAC adaptation) notes that ARENA "requires some participants to quit jobs, relocate internationally, and commit months of uninterrupted time -- extremely difficult for professionals with careers, families, or financial obligations." ARENA reaches ~75-90 people/year at high per-person cost.

The curriculum gap: Leon Lang (ARENA 5.0 participant) observed: "While the program is called 'Alignment Research Engineering Accelerator,' there is actually almost no content on how to align AI systems." This is being addressed in the ARENA 8.0 curriculum expansion.

The automation threat: The 2026 talent needs report finds "senior engineers prove far more effective at using AI tools than junior engineers," suggesting automation may "raise rather than lower the bar for human contribution" -- potentially reducing the value of junior engineering skills that ARENA teaches.

Community reception: All ARENA-specific criticism found was constructive and came from within the AI safety community. No hostile external criticism exists, likely because ARENA is too small to attract broader attention. The EA Forum and LessWrong reception of impact reports and call-for-applicants posts is consistently positive.

What's Absent

No independent evaluation of whether ARENA alumni actually perform better in safety roles than counterfactual candidates.
No longitudinal tracking of alumni beyond immediate post-program outcomes.
No public budget breakdown showing per-participant costs.
No long-form interview where the founder speaks candidly about ARENA's strategy and limitations.
No negative participant testimonials in the public record (which is either remarkable or a sign that the public record is unrepresentative).
No data on what fraction of ARENA alumni end up in capabilities vs. safety roles.
No public explanation for the simultaneous departure of 3 founding LISA directors in July 2024.
ARENA 1.0 and 3.0 have no impact reports.

Stated Theory of Change

ARENA's stated causal chain: (1) AI poses existential risk, (2) more high-quality work in interpretability, evaluations, and adversarial robustness can reduce this risk, (3) there is a shortage of people with the ML engineering skills to do this work, (4) ARENA provides intensive training that produces these people more efficiently than self-study, (5) LISA co-location integrates these people into the safety community, accelerating their career transitions, (6) more skilled safety engineers working at organizations like Apollo, Anthropic, METR, and UK AISI collectively reduce AI risk.

The mechanism is specific: 4-5 weeks of pair programming through a structured curriculum (transformers from scratch, mech interp, RL/RLHF, evals), plus community integration at LISA, plus career acceleration through talks, networking, and capstone projects. ARENA explicitly targets the transition from "motivated and technically capable person" to "technically proficient AI safety practitioner ready for research programs or direct employment."

Revealed Theory of Change

ARENA's actions broadly align with its stated theory, but the revealed theory has some interesting divergences:

The program is more community on-ramp than skill bootcamp. When asked "What was the most valuable thing you gained?", participants across cohorts consistently cite personal connections and community belonging about as often as ML skills. In ARENA 7.0, "making personal connections" (26%) was comparable to "ML skills" (30%). Participants rate the LISA environment (9.9/10) higher than the exercises (8.7/10). The revealed theory of change may be: ARENA provides talented people with both a credential and a network that gets them into the AI safety ecosystem, where they then find their own path forward.

The open-source curriculum does more scaling than the in-person program. ARENA reaches ~75-90 people per year in-person. But the GitHub repository has 916 stars and 581 forks, and at least 5 independent programs worldwide adapt the materials. TARA trained 19 people at $899 AUD/person. The Finnish bootcamp trained 12 people on a budget of under $10K. ARENA's most scalable contribution is curriculum-as-public-good, not seats at LISA.

The founder moved to frontier labs. Callum McDougall went from founding ARENA to working at Anthropic (~6 months) and then DeepMind. He maintains curriculum involvement (authoring Chapter 4) but has stepped back from operations. This reveals something about career incentives: even the person who built an AI safety field-building program found that the best personal career move was joining a frontier lab. This is not a criticism -- contributing to interp research at DeepMind may be more impactful than running a bootcamp -- but it undercuts the notion that ARENA-type career paths are sufficient endpoints.

Curriculum evolution tracks community demand, not independent research agenda. ARENA added evals when evals became important (Apollo collaboration, UK AISI Inspect). It is adding alignment science and AI control content for 8.0 as these topics gain prominence. This is responsive and practical, but it means ARENA is a follower of the field's priorities, not a setter.

Key Assumptions

Assumption 1: The bottleneck for AI safety is skilled junior talent.

Evidence for: In 2021-2023, multiple community figures (Buck Shlegeris, the MLAB rationale, 80K Hours) identified engineering skills as a key bottleneck. Recruiters from OpenAI and Anthropic reached out to ARENA 2.0 asking about graduates.
Evidence against: By 2026, the MATS Research talent needs analysis (23 interviews) finds the bottleneck is senior mentorship capacity, not junior talent supply. Organizations are "hyper-selective" because they can't mentor enough people, not because good people don't exist.
Is it testable? Partially -- if the AI safety job market absorbs all of ARENA's graduates quickly, the bottleneck hypothesis holds. If many graduates struggle to find positions, it doesn't.
What changes if wrong? If senior mentorship is the real bottleneck, then ARENA should pivot toward mentorship capacity-building (training more research managers, not more junior engineers). The research sprints proposal from the "Better ARENA" post moves in this direction.

Assumption 2: Structured exercises produce better alignment engineers than self-study.

Evidence for: Participants estimate 9-10 weeks of self-study for 4 weeks of ARENA content (~2.3-2.5x acceleration). Pair programming and TAs reduce getting-stuck time. Finnish bootcamp found 41.8% average likelihood of completing the curriculum independently.
Evidence against: The skills taught in contained exercises "remove the crucial element of uncertainty and decision-making present in real research" (Proposal for a Better ARENA). ARENA may produce people who can follow exercises but not initiate and execute research independently.
Is it testable? Yes -- comparing research output quality of ARENA alumni vs. people who self-studied the same materials would directly test this.
What changes if wrong? ARENA should restructure toward research sprints, open-ended projects, and mentorship rather than structured exercises. The capstone week already moves in this direction but is only 1 of 5 weeks.
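The acceleration claim under Assumption 2 follows directly from the participants' own estimates. A minimal sketch of that arithmetic, using only figures from the text:

```python
TAUGHT_WEEKS = 4            # weeks of taught ARENA content
SELF_STUDY_WEEKS = (9, 10)  # participants' self-study estimates (min, max)

# Implied speed-up of the taught program over self-study.
speedup_low = SELF_STUDY_WEEKS[0] / TAUGHT_WEEKS
speedup_high = SELF_STUDY_WEEKS[1] / TAUGHT_WEEKS

print(f"implied speed-up: {speedup_low:.2f}x to {speedup_high:.2f}x")
```

Because both inputs are self-reported estimates, the resulting multiple measures perceived rather than measured acceleration.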

Assumption 3: ARENA graduates pursue AI safety rather than capabilities.

Evidence for: Selection criteria explicitly prioritize "likelihood to pursue AI safety rather than general AI development." Pre-program AI safety career confidence averages 7.7-8.3/10 (already very committed). Post-program confidence increases by 0.4-0.7 points on average.
Evidence against: No data exists on what fraction of alumni are in safety vs. capabilities roles 1-2 years later. Some alumni testimonials mention roles at Google DeepMind, Meta, and other capabilities-adjacent organizations. The skills taught (ML engineering, transformer architecture, RL) are directly applicable to capabilities work.
Is it testable? Yes, through longitudinal tracking. ARENA could survey alumni 1 and 2 years out.
What changes if wrong? If a large fraction of alumni end up in capabilities, ARENA's theory of change is partially undermined -- it would be training people for general AI development at AI safety community expense.

Assumption 4: In-person at LISA is worth the cost premium over remote/distributed alternatives.

Evidence for: LISA ratings of 9.6-9.9/10. Multiple participants say "I wouldn't have succeeded remotely." The Finnish remote bootcamp found pair programming was "really hard" online. Community connections are the #1 or #2 most valued outcome.
Evidence against: TARA achieved 9.43/10 satisfaction with a part-time local format at roughly 1/10th of ARENA's estimated per-participant cost. Remote self-study is free. The in-person model limits throughput to ~75-90/year.
Is it testable? A randomized experiment (in-person vs. remote with same curriculum) would test this, but is impractical.
What changes if wrong? If remote/hybrid can capture 80%+ of the value at 20% of the cost, ARENA should shift resources toward supporting satellite programs rather than expanding London cohorts.
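The cost comparison under Assumption 4 can be made explicit. The sketch below uses the $899 AUD TARA figure and this profile's rough $5K-10K estimate for ARENA; the exchange rate is an illustrative assumption, not a figure from any source, and the function name is hypothetical.

```python
TARA_COST_AUD = 899                # per participant, from the text
ARENA_COST_USD = (5_000, 10_000)   # this profile's rough estimate (min, max)
AUD_TO_USD = 0.65                  # ASSUMPTION: illustrative exchange rate

def cost_ratio(tara_aud, arena_usd, fx):
    """TARA per-participant cost as a fraction of ARENA's, both in USD."""
    return tara_aud * fx / arena_usd

ratio_vs_low = cost_ratio(TARA_COST_AUD, ARENA_COST_USD[0], AUD_TO_USD)
ratio_vs_high = cost_ratio(TARA_COST_AUD, ARENA_COST_USD[1], AUD_TO_USD)

# Threshold scenario from the text: 80% of the value at 20% of the cost.
value_per_dollar_multiple = 0.80 / 0.20

print(f"TARA cost as share of ARENA cost: {ratio_vs_high:.1%} to {ratio_vs_low:.1%}")
print(f"value per dollar in the 80%/20% scenario: {value_per_dollar_multiple:.1f}x")
```

Under the assumed exchange rate, the "1/10th the cost" description holds at the low end of the ARENA estimate and understates the gap at the high end. The result is only as good as the $5K-10K estimate, since no per-participant budget is public.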

Strengths

Operational excellence. Across 7+ iterations, ARENA consistently delivers very high satisfaction (9.1-9.7/10), well-structured curriculum, effective logistics, and responsive iteration based on feedback. The pair-programming matching algorithm and daily feedback systems show genuine process sophistication.
Curriculum as public good. The open-source curriculum (916 stars, 581 forks) has been independently adapted by 5+ programs worldwide. This multiplier effect means ARENA's impact extends far beyond its London cohorts.
Community integration via LISA. Hosting at LISA provides genuine access to working AI safety researchers (Apollo, LASR, MATS extension, independent researchers). The 9.9/10 LISA rating is not mere satisfaction inflation -- participants describe specific, concrete networking benefits.
Demonstrated placement pipeline. Alumni at Apollo Research, Anthropic, METR, UK AISI, GovAI, and others provide evidence that the pipeline functions. ARENA's brand is recognized by downstream programs (MATS, Astra) as a legitimate credential.
Responsive curriculum evolution. Adding evals (2024), collaborating with Apollo and UK AISI, and now adding alignment science and AI control for 8.0 shows willingness to evolve with the field rather than stagnate.
Scaling funder confidence. A ~53x increase in CG grant size over 2 years ($18.8K to $1M) is a strong vote of confidence from the most important AI safety funder.

Weaknesses and Risks

Existential single-funder dependency. 100% of funding comes from CG/Open Philanthropy. No SFF, EA Funds, or other funders. No earned revenue. No diversification strategy is apparent. If CG reduces AI safety funding, ARENA ceases to exist.
May be solving the 2021 problem, not the 2026 problem. The talent needs analysis finds that the bottleneck has shifted from "not enough skilled juniors" to "not enough senior mentors to absorb juniors." If this is correct, producing more bootcamp graduates may yield diminishing returns. The strongest version of this critique: ARENA produces people who then can't find positions because the ecosystem is mentorship-bottlenecked.
No independent outcome evaluation. All outcome data is self-reported satisfaction and confidence scores. No third party has assessed whether ARENA alumni actually perform better in safety roles than a counterfactual group. Without this, the claimed impact is unverifiable.
Signaling vs. genuine upskilling. The "Proposal for a Better ARENA" argues persuasively that ARENA's primary function is as a credentialing/signaling mechanism. If participants already have the skills to succeed (many come with prior AI safety research experience), ARENA provides the brand and network, not the skills. This is not worthless -- networks and credentials are genuinely valuable -- but it is different from what the theory of change claims.
Automation risk to the skills taught. The 2026 talent needs analysis finds that "senior engineers prove far more effective at using AI tools than junior engineers" and expects "reduced demand for junior technical execution roles" within 2-5 years. If AI tools commoditize the engineering skills ARENA teaches, the program's value proposition erodes.
No separate legal entity or independent governance. ARENA operates entirely under LISA with no board representation for ARENA staff. ARENA's continued existence depends on LISA's board, which is not ARENA-specific. The simultaneous departure of 3 founding LISA directors (unexplained) adds governance uncertainty.
Accessibility limitations. No stipend (unlike MATS), requirement to be in London for 5 weeks, and ~8-11% acceptance rate limit who can participate. The program structurally favors those with financial cushion and geographic flexibility.

Cross-References

MATS: ARENA is explicitly positioned as a feeder program for MATS. Ryan Kidd (MATS) sits on LISA's board. The two programs are complementary -- ARENA teaches engineering skills, MATS provides research mentorship. ARENA's theory of change depends partly on MATS-like programs existing to absorb graduates.
Apollo Research: Deep collaboration on evals curriculum. Apollo's Marius Hobbhahn speaks at ARENA regularly and sits on LISA's Advisory Board. Apollo is a natural employer for ARENA graduates.
UK AISI: Evals curriculum uses AISI's Inspect library. AISI researchers speak at ARENA. Multiple alumni placed at AISI.
TARA, Finnish bootcamp, Zurich AI Safety: Satellite programs using ARENA curriculum demonstrate the materials' quality and adaptability. These programs serve populations ARENA can't reach.
BlueDot Impact: Earlier in the pipeline than ARENA. BlueDot previously hosted ARENA's virtual program. The relationship is complementary.
LASR Labs: Co-located at LISA. Provides a natural next step for ARENA graduates seeking longer-term research experience.

What Would Change This Assessment

Longitudinal alumni tracking showing 70%+ in safety roles after 2 years would significantly strengthen the case for ARENA's impact.
Independent evaluation comparing ARENA alumni outcomes to a matched self-study cohort would resolve the signaling vs. upskilling question.
Funding diversification (even one additional major funder) would reduce the existential single-funder risk.
Evidence that ARENA graduates are systematically preferred over non-ARENA candidates by hiring managers would validate the program's value.
A candid interview with Callum McDougall or James Hindmarch discussing strategic pivots, mistakes, and the signaling critique would reveal leadership thinking.
Data showing that the 8.0 curriculum expansion (alignment science, AI control) produces measurably different outcomes than previous iterations would show responsiveness to the "teaches engineering, not alignment" critique.

Self-Critique

Sources I should have checked but didn't:

The EAGxBerlin 2023 talk by Callum McDougall (Spotify audio only) -- could contain candid founder reflections.
LinkedIn profiles of ARENA alumni to verify placement claims.
LISA job posting for CEO role -- would reveal current organizational state.

Where this analysis is potentially biased:

I may be giving too much weight to the talent needs critique because it is recent and data-backed, when ARENA's value might be in areas the talent needs analysis doesn't capture (community building, identity formation, testing fit).
I may undervalue the signaling function. If ARENA functions primarily as a credential that helps people enter the safety ecosystem, this is genuinely valuable even if it does not "upskill" in the narrow sense.
I am assessing ARENA's current theory of change against the 2026 bottleneck analysis, but the field may shift again. If AI capabilities outstrip safety capacity, demand for junior safety engineers may surge.

What would a thoughtful person who disagrees say? "You're applying a venture capital lens to what is essentially a community institution. ARENA's value isn't measured by whether it solves the right bottleneck on some abstract efficiency analysis. It creates a community of people who care about AI safety, gives them shared experiences and relationships, and makes the field feel real and accessible to newcomers. This community function is irreplaceable even if the engineering skills become commodity. You can't self-study your way into the LISA lunch conversations."

My single weakest claim: That ARENA may be "solving the 2021 problem, not the 2026 problem." The bottleneck analysis is based on 23 interviews at one point in time, and the field is changing fast. ARENA's responsiveness (adding alignment science, AI control content) suggests they are tracking these shifts.

What information would most change my view: Rigorous longitudinal data showing that ARENA alumni are significantly more likely to stay in AI safety and produce research output than a matched comparison group of self-study participants. This would resolve the signaling-vs-upskilling question and validate (or refute) the core theory of change.

Connected to (11)

Anthropic -- staff to · Callum McDougall

Apollo Research -- collaborator · Marius Hobbhahn

BlueDot Impact -- collaborator

FAR AI -- board overlap · Adam Gleave

Google DeepMind -- staff to · Callum McDougall

LASR Labs -- collaborator

London Initiative for Safe AI -- operates under

MATS -- collaborator · Ryan Kidd

Redwood Research -- collaborator

TARA -- collaborator

UK AI Safety Institute -- collaborator

Sources (52)

Every URL that was read during research.

1. ARENA – AI Safety Education (arena.education)
2. Callum McDougall | AI Safety - ARENA (arena.education)
3. Team - ARENA (arena.education)
4. Fundamentals (learn.arena.education)
5. Alumni | ARENA AI Safety - ARENA (arena.education)
6. Impact | ARENA AI Safety Education - ARENA (arena.education)
7. Mission - ARENA (arena.education)
8. FAQs - ARENA (arena.education)
9. Curriculum | AI Safety - ARENA (arena.education)
10. Applications | Apply to ARENA - ARENA (arena.education)
11. Chapter 1 | AI Safety - ARENA (arena.education)
12. Chapter 3 | AI Safety - ARENA (arena.education)
13. ARENA 7.0 Impact Report (greaterwrong.com)
14. ARENA 6.0 Impact Report (greaterwrong.com)
15. ARENA 5.0 Impact Report (greaterwrong.com)
16. ARENA 4.0 Impact Report (greaterwrong.com)
17. ARENA 2.0 - Impact Report (greaterwrong.com)
18. ARENA 8.0 - Call for Applicants (greaterwrong.com)
19. ARENA 7.0 - Call for Applicants (greaterwrong.com)
20. ARENA 6.0 - Call for Applicants (greaterwrong.com)
21. ARENA: AI interpretability (monicaspisar.com)
22. The London Initiative for Safe AI (safeai.org.uk)
23. About - The London Initiative for Safe AI (safeai.org.uk)
24. Navigating Transformative AI (openphilanthropy.org)
25. Navigating Transformative AI (openphilanthropy.org)
26. Funds (openphilanthropy.org)
27. Interviews on Improving the AI Safety Pipeline (greaterwrong.com)
28. Lessons from organizing a technical AI safety bootcamp (greaterwrong.com)
29. AI Safety Talent Needs in 2026: Insights for Field-Building Organizations (forum.nunosempere.com)
30. A Proposal for a Better ARENA: Shifting from Teaching to Research Sprints (greaterwrong.com)
31. I Reviewed Hundreds of AI Safety Applications. Here's What Actually Matters | Georg Lange (georglange.com)
32. ARENA (bluedot.org)
33. AI Safety has a scaling problem (greaterwrong.com)
34. Joly Scriven | AI Safety - ARENA (arena.education)
35. GitHub - callummcdougall/ARENA_3.0 (github.com)
36. AI Alignment Research Engineer Accelerator (ARENA): call for applicants (greaterwrong.com)
37. AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0 (greaterwrong.com)
38. TARA | Technical AI Safety Training in APAC (taraprogram.org)
39. ML Bootcamp - Zurich AI Safety (zurich.aisafety.ch)
40. Announcing the London Initiative for Safe AI (LISA) (greaterwrong.com)
41. Navigating Transformative AI (openphilanthropy.org)
42. Want to upskill in technical AI safety? Here are 67 useful resources | 80,000 Hours (80000hours.org)
43. ARENA (aiinterpretability.miraheze.org)
44. Using Our Materials | Explore Our Content - ARENA (arena.education)
45. Widening AI Safety's talent pipeline by meeting people where they are (greaterwrong.com)
46. How to work through the ARENA program on your own (greaterwrong.com)
47. Chapter 0 | AI Safety - ARENA (arena.education)
48. Chapter 2 | AI Safety - ARENA (arena.education)
49. Chapter 4 | AI Safety - ARENA (arena.education)
50. James Hindmarch | AI Safety - ARENA (arena.education)
51. David Quarel | AI Safety - ARENA (arena.education)
52. Now (jameshindmarch.org)