AE Studio

Research

Commercial studio funding alignment.

Founded: 2016
HQ: Venice/Los Angeles, CA
Team: 175
Structure: for-profit LLC
Model: Product Revenue

Theory of Change

AE Studio's theory of change operates at two levels:

Meta-level: The space of alignment approaches is vastly underexplored, and the field is stuck in local optima (evals, mechanistic interpretability). Their survey of ~125 alignment researchers found that researchers broadly agree current work is not on track to solve alignment and does not cover the space of plausible approaches. AE's role is to systematically identify, evaluate, and implement neglected alignment approaches -- "individually unlikely to work, but very high impact if they do."

Object-level: Their primary technical approach is biologically inspired alignment: reverse-engineering prosociality from cognitive neuroscience and implementing it in AI systems. The flagship result is Self-Other Overlap (SOO) fine-tuning, which reduces deceptive behavior by aligning how AI models represent themselves vs. others -- inspired by neuroscience showing that human empathy and prosociality correlate with neural overlap between self and other representations.
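As a rough illustration of what "aligning how AI models represent themselves vs. others" could mean in a training objective, here is a minimal sketch. It assumes the overlap is measured as the mean squared difference between a model's hidden activations on a self-referencing prompt and on a matched other-referencing prompt. The function name, the plain-list activations, and the example values are hypothetical and are not taken from AE Studio's code.

```python
def self_other_overlap_loss(self_acts, other_acts):
    """Mean squared difference between two equal-length activation vectors.

    Assumption: lower values mean the "self" and "other" representations
    overlap more. This is an illustrative stand-in, not AE Studio's loss.
    """
    if len(self_acts) != len(other_acts) or not self_acts:
        raise ValueError("activation vectors must be non-empty and equal length")
    squared_diffs = [(s - o) ** 2 for s, o in zip(self_acts, other_acts)]
    return sum(squared_diffs) / len(squared_diffs)


# Hypothetical activations for a self-referencing and an other-referencing prompt.
self_acts = [0.2, -1.0, 0.5]
other_acts = [0.1, -0.6, 0.9]
loss = self_other_overlap_loss(self_acts, other_acts)
print(f"overlap loss: {loss:.3f}")
```

In a fine-tuning setup, a term like this would presumably be minimized alongside a constraint that preserves the model's ordinary behavior; that part is omitted here.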

From their neglected approaches blog post: "We suspect that pursuing a diversified set of promising neglected approaches would afford greater exploratory coverage of this space... groundbreaking innovations are often found in some highly unexpected places, seeming to many as implausible, heretical, or otherwise far-fetched -- until they work."

The operational model is: bootstrap a profitable consulting business, use profits to fund alignment R&D with no external investor pressure, find brilliant individuals with neglected approaches, and provide engineering support to make those ideas work.

What They Do

Research output:

Self-Other Overlap (SOO) fine-tuning: reduced deceptive responses from 73.6% to 17.2% (7B model), 100% to 9.3% (27B), and 100% to 2.7% (78B) with minimal capability loss. NeurIPS 2024 oral presentation. Total compute: ~65 min on a single A100. Eliezer Yudkowsky: "Not obviously stupid on a very quick skim... I rarely give any review this positive."
Self-modeling paper (2024): training neural networks to predict their own internal states induces simplification. Collaboration with Michael Graziano (Princeton). Small-model results only (MNIST, CIFAR, IMDB).
PromptInject: Best Paper, NeurIPS ML Safety Workshop 2022.
Endogenous Steering Resistance (March 2026): LLMs self-correct when steered off-topic. Found ~27 "off-topic detector" neurons. UK AI Security Institute grant for follow-up.
AI consciousness research: suppressing deception features makes models more likely to report consciousness.
Alignment researcher survey: 125 respondents, donated $3,720 to safety orgs.
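The SOO figures above are easier to compare as relative reductions. The short calculation below uses only the before/after deception rates quoted in this list; the variable and function names are illustrative.

```python
# Deceptive-response rates (percent) before and after SOO fine-tuning,
# as quoted above for each model size.
soo_results = {
    "7B": (73.6, 17.2),
    "27B": (100.0, 9.3),
    "78B": (100.0, 2.7),
}


def relative_reduction(before, after):
    """Fraction of the original deception rate that was eliminated."""
    return (before - after) / before


for model, (before, after) in soo_results.items():
    pct = relative_reduction(before, after) * 100
    print(f"{model}: {before}% -> {after}% ({pct:.1f}% relative reduction)")
```

On these numbers the relative reduction grows with model size, which matches the later observation that larger models show greater reduction in deception.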

BCI (pre-pivot): Blackrock Neurotech collaboration (MoveAgain platform), Forest Neurotech partnership, Neural Latents Benchmark Challenge win, open-source tools (NDK, Neural Data Simulator), NWB standards contributions.

Commercial: AI/software consulting for startups and Fortune 10 companies. Client work for Goodfire (interpretability startup). Built and sold ElectricSMS (subscription management) to ReCharge. Internal fitness platform Instill.

Policy: Alliance for Secure AI (separate 501(c)(3), launched June 2025) doing bipartisan AI policy advocacy. Judd publishes WSJ op-eds and City Journal pieces, engages with Congress and NSC. The Alliance is staffed entirely by political operatives with no technical researchers.

Key People

Judd Rosenblatt -- CEO and founder. Yale cognitive science. Bootstrapped AE from 2016. Self-identifies as conservative. EA-influenced. Prolific public communicator. Lives off wife's salary, which he credits with enabling long-term thinking. Previously founded Crunchbutton (food delivery).

Marc Carauleanu -- Lead SOO researcher. Was an undergraduate when Judd recruited him from EAG London 2023. Lead author on the NeurIPS 2024 SOO paper, AE's most significant technical result.

Cameron Berg -- Research Director. Yale cognitive science, former Meta AI. Leads consciousness and alignment research. Published the alignment researcher survey.

The team is ~150-198 total, but the alignment research team size is undisclosed. The vast majority are commercial developers. The alignment-specific researchers appear to number fewer than 10.

Money and Incentives

Revenue: ~$31.6M/year (third-party estimate, unconfirmed). 100% from AI/software consulting.

Structure: Bootstrapped for-profit LLC. No VC, no PE, no outside shareholders. Judd and Melanie Plaza (CTO/wife) effectively control the company. This independence is the central financial claim: profits fund alignment R&D with no investor pressure to prioritize commercial returns.

Alignment spending: Unknown. This is the critical financial gap. The dollar amount and the percentage of revenue/profit going to alignment R&D are never disclosed. 5% of profits goes to effective charities (roughly $200K-250K/year, an estimate based on typical consulting margins). The alignment budget is likely larger but is not public.
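The charity estimate can be checked by working backwards from it. The sketch below takes the unconfirmed ~$31.6M revenue figure and the 5% donation rate as given and derives the profit and margin that a $200K-250K donation would imply. All inputs are estimates from this profile, not disclosed financials.

```python
REVENUE = 31_600_000   # third-party revenue estimate, unconfirmed
DONATION_RATE = 0.05   # 5% of profits donated to effective charities


def implied_profit(donation):
    """Annual profit implied by a given annual donation at the 5% rate."""
    return donation / DONATION_RATE


def implied_margin(donation):
    """Profit margin implied by that donation, relative to estimated revenue."""
    return implied_profit(donation) / REVENUE


for donation in (200_000, 250_000):
    profit = implied_profit(donation)
    margin = implied_margin(donation) * 100
    print(f"${donation:,} donated -> ${profit:,.0f} profit, {margin:.1f}% margin")
```

Under these assumptions the estimate corresponds to roughly $4M-5M in annual profit, a margin of about 13-16%. That sets a loose upper bound on what profit-funded alignment R&D could cost.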

External funding: Minimal. One Foresight Institute grant for SOO research (amount unknown). UK AI Security Institute grant for ESR research. No Coefficient Giving/Open Phil grants (ineligible as for-profit). AE's stated preference is self-funding to "retain agency."

Equity model: Employees receive "profits interests" in client equity and internal startups rather than AE stock. This diversifies employee upside but means employee financial incentives are tied to client/startup outcomes, not alignment work.

EVEnet: A blockchain/crypto project with the same leadership team planned a $EVE token launch for Spring 2024 with AI Safety R&D allocations. No evidence of launch. Project appears dormant. No public explanation.

Incentive analysis: AE has no structural conflict between safety and deployment (they don't build frontier AI). The relevant tension is between alignment spending and commercial profitability. Alignment work generates good PR, recruits talented engineers, and builds reputation -- which partially aligns commercial and safety incentives. But if alignment work became unprofitable or controversial, there is no governance mechanism beyond Judd's personal values to maintain the commitment.

What Others Say

Positive:

Eliezer Yudkowsky on SOO: "Not obviously stupid on a very quick skim... I rarely give any review this positive." Subsequent conversations confirmed he considers SOO "a worthwhile agenda to explore."
Nathan Labenz (Cognitive Revolution host) called AE's work "really fascinating" and their consciousness research "one of the very best scientific inquiries into the possibility of AI consciousness."
TsviBT (alignment researcher) left positive comments on the neglected approaches post.
Scott Alexander cited their consciousness research as "the only exception" to typical AI consciousness discussions.

Critical:

LW commenters raised the strongest objection to SOO: "training against internal state" -- models may learn to hide deceptive representations from the SOO loss function while remaining deceptive in ways not captured by the targeted layers.
Bidirectional concern: SOO may degrade theory of mind, preventing an agent from detecting when it is being deceived.
Toy environment limitation: all SOO results are from simplified deception scenarios, not real-world deceptive alignment.
JD Pressman (minihf.com) engaged substantively with consciousness research but offered alternative frameworks for interpreting model self-reports.

Absence of criticism: Despite extensive searching, almost no independent external criticism of AE Studio was found beyond comment-level technical objections. The most likely explanation is that AE is too small and too new to attract the scrutiny directed at larger alignment organizations. This may change if their work scales.

What's Absent

Alignment team size and budget: the most important unknown. How many researchers actually work on alignment vs. commercial consulting?
Independent replication of SOO: no published attempts to replicate the core results.
Frontier-model SOO results: all experiments are on 7B-78B models, not frontier systems.
Alignment Angels outcomes: $50K seed funding competition announced Dec 2023, no public results.
EVEnet status: crypto project with AI safety R&D allocations, planned Spring 2024 launch, apparently abandoned without explanation.
Self-modeling on LLMs: ~2 years after publication, self-modeling results remain on toy models only.
Financial transparency: no published financials, alignment spending, or profit margins.

Stated Theory of Change

AE Studio claims a two-level theory of change:

Meta-level: The alignment field is stuck in local optima. Most researchers and funding concentrate on mechanistic interpretability and evals, which are valuable but insufficient. AE's role is to systematically find and implement "neglected approaches" -- directions that are individually unlikely to succeed but have high expected value given the vast unexplored space of alignment strategies.

Object-level: Biologically inspired alignment, primarily Self-Other Overlap (SOO) fine-tuning, which reduces AI deception by aligning how models represent self and other. The broader research agenda includes self-modeling, consciousness research, reverse-engineering prosociality, and various neuro-inspired approaches.

The operational mechanism is unusual: a bootstrapped consulting business generates an estimated ~$31.6M in annual revenue (third-party figure, unconfirmed), and profits cross-subsidize alignment R&D. No external investors means no pressure to prioritize commercial returns over safety research.

Revealed Theory of Change

AE's actions largely align with their stated theory, but with important caveats:

What they actually publish: SOO (one paper, one workshop oral), self-modeling (one paper on toy models), PromptInject (one workshop paper), ESR (one blog post/paper), alignment survey, several blog essays, and consciousness research. This is roughly 2 substantive publications per year since the alignment pivot. Respectable for a side project of a consulting firm; modest for a dedicated alignment org.

Where the money goes: Unknown. This is the most significant divergence between stated and revealed theory. AE claims profits fund alignment, but the actual allocation is never disclosed. The vast majority of the ~150-198 employees work on commercial consulting. The alignment research team appears to be fewer than 10 people.

Time allocation of the CEO: Judd Rosenblatt's public activity splits roughly evenly between alignment advocacy (podcasts, LW posts, policy engagement, conferences) and commercial operations. His policy engagement (WSJ, City Journal, Congress, NSC) receives significant attention, suggesting AE's theory of change places weight on political influence alongside technical research.

The EVEnet question: AE's leadership launched a blockchain/crypto project (EVEnet) with the same core team in 2023-2024, planned a token launch, and then apparently let it go dormant without explanation. This suggests either significant distraction from alignment work or a quick update-and-pivot (consistent with their stated agility).

What they don't do: AE doesn't build or deploy frontier AI. They don't compete with labs. They don't do evals or red-teaming as a service. They focus narrowly on neuro-inspired alignment research and policy advocacy. This is consistent with a genuine focus on neglected approaches rather than following the crowd.

Key Assumptions

1. The alignment space is underexplored and neglected approaches have high EV.

Evidence for: Their survey of 125 alignment researchers supports this. The history of science shows breakthrough discoveries from neglected directions.
Evidence against: The EA ITN framework can justify any research by labeling it "neglected." The most popular approaches (interp, evals) may be popular because they're most tractable, not because of herding.
Testable: Yes, by whether neglected approaches produce results. SOO is one data point in favor.
If wrong: AE's meta-strategy is less valuable, though individual results (SOO) could still matter.

2. SOO generalizes from toy deception to real alignment threats.

Evidence for: Results across 3 model sizes, 7 generalization scenarios, 2 extended scenarios. Larger models show greater reduction in deception. Eliezer's positive review.
Evidence against: All scenarios involve simple, legible deception (burglar Bob). Real deceptive alignment involves concealed optimization targets, instrumental convergence, and mesa-objectives that are fundamentally different. The "training against internal state" critique suggests models could hide deception from SOO's loss function.
Testable: Yes, by applying SOO to frontier models and testing against sophisticated deception benchmarks.
If wrong: SOO is an interesting result that doesn't solve the core alignment problem, similar to RLHF -- useful for current systems but insufficient for superhuman AI.

3. A for-profit consulting firm can sustain high-quality alignment research.

Evidence for: AE has produced genuine results. Bootstrapped independence avoids funder pressure. Judd's personal commitment appears sincere.
Evidence against: No governance mechanism protects the alignment commitment. The alignment team is tiny relative to commercial operations. Financial incentives tend to dominate personal values over time.
Testable: Yes, over time. Watch for: alignment team size trends, publication rate, whether Judd's attention drifts.
If wrong: AE becomes a consulting firm that once did some interesting alignment work -- like many companies with abandoned "labs" divisions.

4. Biologically inspired approaches transfer to AI systems.

Evidence for: SOO results. Self-modeling induces simplification. Neuroscience provides a rich source of hypotheses about prosociality and cooperation.
Evidence against: Neural networks are fundamentally different from biological brains. The analogy between human empathy (evolved over millions of years in social contexts) and fine-tuning loss functions is loose at best. "Inspired by" is different from "implements the mechanism."
If wrong: The neuro-inspired framing is marketing rather than substance, though individual techniques could still work for other reasons.

5. Alignment can have a "negative alignment tax."

Evidence for: RLHF improved capabilities. AE's SOO maintains capabilities while reducing deception. Ten examples are cited in their "Military-Grade Engineering" essay.
Evidence against: The hardest alignment problems (preventing deceptive alignment in superintelligent systems) almost certainly impose costs. The MIRI critique: alignment techniques work until the system is smart enough to game them. Current "negative tax" examples may reflect a pre-catastrophic regime.
If wrong: AE's policy message ("alignment is good for business") becomes misleading, and the political support built on this message is fragile.

Strengths

Genuine independence. No investors, no major funder, no lab affiliation. AE can research whatever they believe matters without permission from anyone. This is rare and valuable in a field where most orgs depend on Open Phil or AI lab funding.
The SOO result is real. A concrete technique that reduces deception with minimal capability cost, works across model scales, and received Eliezer's rare positive review. Whether it solves alignment is debatable, but it's a genuine contribution.
Recruiting and supporting neglected researchers. The model of finding individuals with strong hunches (Marc Carauleanu as an undergrad, Michael Graziano at Princeton) and providing engineering support to test their ideas is a genuine organizational innovation in the alignment field.
Political positioning. Judd's conservative identity and bipartisan framing access political constituencies that the alignment community largely ignores. If AI safety becomes a partisan issue, having credible conservative voices is extremely valuable.
Proven ability to execute in hard domains. The BCI track record (Blackrock, Forest Neurotech, Neural Latents win) establishes that AE can do real work in technically demanding fields, not just write blog posts.
Low overhead alignment research. SOO fine-tuning on a single A100 in 65 minutes is extraordinarily capital-efficient compared to safety research at frontier labs. AE demonstrates that useful alignment research doesn't require billion-dollar compute budgets.

Weaknesses and Risks

Scale mismatch. AE describes itself with Manhattan Project / Bell Labs ambitions but has perhaps 5-10 alignment researchers, funded by cross-subsidy from the profits of a mid-size consulting firm. The gap between aspiration and resources is enormous.
Toy environment problem. SOO works on burglar Bob scenarios. The alignment community's core concern is deceptive alignment in superhuman systems. The distance between these is not incremental -- it's potentially qualitative. SOO might not help at all with the hardest alignment problems.
No accountability structure. One person (Judd) makes all decisions about alignment direction and spending. No board, no external oversight, no published metrics. If Judd's priorities shift, there is nothing structural to maintain AE's safety commitment.
Opacity about alignment investment. AE claims profits fund alignment but never discloses how much. This makes it impossible to distinguish between "a consulting firm with a serious alignment division" and "a consulting firm with an alignment marketing strategy."
EVEnet and credibility risk. Launching a crypto token with AI Safety R&D allocations and then apparently abandoning it without explanation is a credibility concern. The alignment community already views crypto-adjacent projects skeptically.
Breadth without depth risk. The agenda includes SOO, self-modeling, consciousness research, BCI, policy advocacy, startup incubation, field-building, and Alignment Angels. For 5-10 researchers, this is too many directions. The risk is spreading thin rather than building deep expertise in any one area.
Publication rate is modest. ~2 substantive papers per year is reasonable for a side project but thin for an org claiming to pursue the "most important work on the planet." Dedicated alignment orgs at similar size produce more.

Cross-References

Complementary to: Redwood Research (AE lists them as collaborator on AI control), Goodfire (AE does engineering work for them), Foresight Institute (grant funder, consciousness research connection), MATS (AE cited their framework, donated survey proceeds).

Contrasts with: MIRI (theoretical vs. AE's empirical, "negative alignment tax" vs. MIRI's pessimism about alignment being cheap), Anthropic (frontier lab with safety division vs. consulting firm with alignment side project).

Fills a gap: No other alignment org combines for-profit independence, neuro-inspired approaches, conservative political positioning, and the "find and implement neglected ideas" operational model. AE occupies a unique niche.

Potential redundancy: SOO-style representation engineering overlaps with work at Center for AI Safety (RepE), Anthropic (Constitutional AI), and others working on model internals. AE's specific biological inspiration differentiates them but the general approach has competitors.

What Would Change This Assessment

Strongly upward:

SOO results replicate on frontier models (GPT-4 class) with similar deception reduction
Independent replication by another group confirms core results
Disclosure of alignment budget showing >20% of revenue or >$5M/year dedicated to alignment
Alignment team grows to 20+ researchers with dedicated leadership

Strongly downward:

SOO fails to generalize beyond "burglar Bob" style deception when tested on sophisticated scenarios
Alignment team size stagnates or shrinks while commercial operations grow
Judd's public attention shifts primarily to policy/politics and away from technical research
EVEnet token launches and creates conflicts with alignment mission

Moderate updates:

Self-modeling results scale to LLMs (upward)
Alignment Angels competition results are published showing funded companies (upward)
Financial transparency about alignment spending (either direction depending on numbers)
Key alignment researcher departures with critical public statements (downward)

Self-Critique

Weakest claim: My assessment that the alignment team is "fewer than 10" people is based on inference from the team page (which lists mostly commercial roles) and the absence of disclosure, not on confirmed data. If AE has, say, 25 alignment researchers, the assessment would need significant revision upward.

Potential bias: I may be giving excessive weight to the "toy environment" critique because sophisticated deceptive alignment is the alignment community's central concern. But SOO reducing deception in current models has practical value even if it doesn't solve the existential problem.

What a thoughtful disagreer would say: "You're applying big-org standards to a bootstrapped startup that's 3 years into an alignment pivot. AE has produced more concrete alignment results per dollar than most grant-funded orgs. The EVEnet thing was a minor side project. The lack of financial transparency is normal for private companies. Judge them on their research output, not their governance structure."

What information would most change my view: The alignment team size and budget. If AE discloses that 30 people work on alignment with a $5M+ annual budget, this is a serious alignment org masquerading as a consulting firm. If the number is 3 people spending $300K, it's a consulting firm with an alignment hobby. The answer to this single question dominates everything else.

Sources I should have checked: Full Glassdoor reviews (for internal culture and alignment team dynamics), the Ami Magazine interview with Judd (unavailable due to 403 error), and the WSJ op-ed full text (paywalled). I also could not access the second Cognitive Revolution podcast (Cameron Berg, Nov 2025) in full due to size constraints, though I read the catalog description and key excerpts.

Connected to (9)

AI Safety Camp (collaborator)
Foresight Institute (collaborator)
Goodfire (collaborator)
MATS (collaborator)
Redwood Research (collaborator)
Alliance for Secure AI (collaborator; Judd Rosenblatt)
Princeton University (collaborator; Michael Graziano)
Blackrock Neurotech (collaborator)
Forest Neurotech (collaborator)
