Orthogonal

Research

Agent foundations, formal verification.

Founded: 2023
HQ: France
Team: 1
Structure: fiscally sponsored
Model: Donations

Theory of Change

Orthogonal pursues "formal-goal alignment": designing a fully formalized mathematical goal that would produce good outcomes when maximized by any optimizer, including a superintelligent one, then building an AI that pursues that goal. Founder Tamsin Leake:

"You do not align AI; you build aligned AI... I do not expect that current AI technology is of a kind that makes it easy to 'align'; I believe that the whole idea of building a strange non-agentic AI about which the notion of goal barely applies, and then to try and make it 'be aligned', was fraught from the start."

The approach explicitly rejects prosaic alignment (interpretability, RLHF, scalable oversight) as "net harmful when published" -- claiming these methods don't scale to superintelligence and create capability externalities. Instead, Orthogonal proposes building an entirely new AI system from scratch with alignment built into its mathematical foundations.

The concrete research agenda is QACI (Question-Answer Counterfactual Interval): a scheme where an AI scores actions by asking counterfactual questions about information blobs located across possible computational universes. A more recent direction is ESP (Epistemic State Prior), proposed as an alternative to Solomonoff-style priors for avoiding demonic hypotheses.
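
As a rough intuition for what "scoring actions" across many candidate universes could mean, the toy sketch below picks the action with the highest prior-weighted score. It is not Orthogonal's formalization: the `Hypothesis` class, the weights, and the score tables are all invented for illustration, and the sketch leaves out the counterfactual question-answer step entirely.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical stand-in for one candidate universe (illustration only)."""
    name: str
    weight: float                   # prior weight assigned to this hypothesis
    score: Callable[[str], float]   # how good an action looks under this hypothesis


def expected_score(action: str, hypotheses: Sequence[Hypothesis]) -> float:
    """Prior-weighted average score of an action across all hypotheses."""
    total_weight = sum(h.weight for h in hypotheses)
    return sum(h.weight * h.score(action) for h in hypotheses) / total_weight


def best_action(actions: Sequence[str], hypotheses: Sequence[Hypothesis]) -> str:
    """Return the action with the highest prior-weighted score."""
    return max(actions, key=lambda a: expected_score(a, hypotheses))


# Made-up numbers: two hypotheses that disagree about which action is better.
world_a = Hypothesis("world_a", 0.75, {"wait": 0.4, "act": 0.8}.get)
world_b = Hypothesis("world_b", 0.25, {"wait": 0.6, "act": 0.0}.get)
toy_hypotheses = [world_a, world_b]
toy_choice = best_action(["wait", "act"], toy_hypotheses)
```

In this toy setup the more heavily weighted hypothesis dominates the choice. The difficulty the brief describes lies in everything the sketch assumes away: where the hypotheses, weights, and scores come from.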

The org explicitly acknowledges a very low probability of success: "we still mostly die. I do not expect that our plan saves most timelines... but we will have significantly increased the ratio of worlds that survive."

What They Do

Orthogonal has produced approximately 6 public posts over its roughly 3-year existence, plus an unknown number of deliberately unlisted posts (hidden for exfohazard reasons). The core technical output is:

  • QACI formalization paper (Jun 2023, with Julia Persson): Full mathematical specification of the QACI alignment goal using set theory, quantum Turing machines, blob location functions, and counterfactual insertion.
  • Epistemic State Prior paper (Aug 2024): Proposes modeling a user's epistemic beliefs rather than using computational priors, to avoid malign Solomonoff induction. Described by the author as "pseudomath" and "sketches."
  • Strategy/philosophy posts: Theory of change explanation, LDT application to arms race dynamics, binary AI outcome framing.

Many posts are deliberately unlisted to prevent capability exfohazards. The QACI table of contents references multiple inaccessible posts. A Discord server (~296 members) may contain unpublished discussion.

No publications since August 2024. Last LessWrong comment by Tamsin Leake was December 2024 (about PauseAI, not Orthogonal research). No public update on organizational status. The org may have quietly wound down.
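
The timeline figures in this brief can be cross-checked with simple month arithmetic. The sketch below is a sanity check only. It assumes an assessment date of March 2026, which is the earliest month consistent with the "19+ months" publication gap cited later in the brief. That date is an inference, not a stated fact.

```python
def months_between(start: tuple[int, int], end: tuple[int, int]) -> int:
    """Whole months from one (year, month) pair to another."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


FOUNDING = (2023, 4)          # founding announcement, per the brief
LAST_PUBLICATION = (2024, 8)  # Epistemic State Prior paper, per the brief
ASSESSMENT = (2026, 3)        # ASSUMPTION: inferred from the "19+ months" gap
PUBLIC_POSTS = 6              # "approximately 6 public posts", per the brief

active_months = months_between(FOUNDING, LAST_PUBLICATION)
gap_months = months_between(LAST_PUBLICATION, ASSESSMENT)
lifetime_months = months_between(FOUNDING, ASSESSMENT)

# Average spacing between public posts, in months
cadence_while_active = active_months / PUBLIC_POSTS
cadence_over_lifetime = lifetime_months / PUBLIC_POSTS

print(active_months, gap_months, lifetime_months)
print(round(cadence_while_active, 1), round(cadence_over_lifetime, 1))
```

Under that assumption, the organization was publicly active for about 16 of its roughly 35 months, so any posting-rate figure depends heavily on which window is used.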

Key People

Tamsin Leake -- Founder and sole confirmed full-time member. Former indie game developer (France). Went through Refine incubator (Conjecture, Aug-Oct 2022). Online handle: "carado."

The founding announcement (Apr 2023) claimed "several promising researchers intending to work fulltime." Only Julia Persson (co-authored one paper) and "mesaoptimizer" (editorial assistance) have been publicly identified. No evidence the planned team materialized. The announced fellowship was never publicly launched.

Money and Incentives

Total confirmed funding: $30,000. This is the only publicly verifiable funding Orthogonal has received.

Source | Amount | Date | Notes
LTFF grant (to Tamsin Leake personally) | $30,000 | Feb 2023 | "6 months research stipend"
Every.org donations | Unknown | Ongoing | Tax-deductible via fiscal sponsorship
  • No Coefficient Giving / Open Philanthropy grants (zero across 480+ AI safety grants in their database)
  • No Survival and Flourishing Fund grants
  • No follow-on LTFF grant in the May 2023-March 2024 payout period
  • No 990 financial data (uses Every.org fiscal sponsorship, not own nonprofit)
  • No corporate or lab funding ties -- completely independent but also completely unsupported
  • No venture capital, government contracts, or compute credits

Legal structure: Fiscal sponsorship under Every.org (EIN 61-1913297). Note: "Orthogonal Productions Inc" (EIN 88-2978002) is a Minneapolis theater company, not this organization.

The $30K stipend covered approximately 6 months at a modest rate. How the work was funded after August 2023 is unknown. France-based cost of living may have extended runway.
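
The arithmetic behind the stipend figures is short enough to write out. The sketch below assumes the stipend period began in the grant month (February 2023) and was spent evenly. Neither detail is confirmed by the sources.

```python
from datetime import date

GRANT_USD = 30_000              # LTFF grant amount, per the brief
STIPEND_MONTHS = 6              # "6 months research stipend", per the brief
GRANT_START = date(2023, 2, 1)  # ASSUMPTION: stipend starts in the grant month


def add_months(d: date, months: int) -> date:
    """Return the first day of the month that is `months` after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


monthly_rate = GRANT_USD / STIPEND_MONTHS
stipend_end = add_months(GRANT_START, STIPEND_MONTHS)

print(f"${monthly_rate:,.0f} per month, running out around {stipend_end:%B %Y}")
```

This gives about $5,000 per month and an end date of August 2023, matching the date after which the funding source is unknown.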

What Others Say

Dan MacKinlay (independent researcher): "QACI remains speculative and hard to parse. There are a lot of axioms to buy into to make it look even remotely feasible. It's not obvious how to connect 'blobs' to real-world referents, or why this formulation really sidesteps Goodhart's Law. Even insiders hedge on whether it's the right line to pursue."

Anonymous LW commenter comparing QACI to HCH: Identified shared failure modes including need for inner alignment, risk of "Chinese whispers" fidelity loss, convergence problems across iterations, memetic selection pressure, amplified failure probability, and vulnerability to acausal tampering.

Community engagement: A comprehensive 2022-2023 survey of alignment approaches (75+ hours of work covering MIRI, ARC, Anthropic, DeepMind, Redwood, Conjecture, and others) does not mention Orthogonal, QACI, or Tamsin Leake at all. This is consistent with near-zero visibility in the broader alignment community, although a survey covering 2022-2023 overlaps only partly with an organization founded in 2023.

Manifold Markets: "Will QACI turn out to be a viable alignment plan?" prediction market shows roughly 20-51% probability (varying by market version), reflecting deep ambivalence.

No Refine evaluator (Steve Byrnes, Vanessa Kosoy, Evan Hubinger, Ramana Kumar, John Wentworth) has publicly commented on Tamsin's work. No MIRI researcher has publicly engaged with QACI despite Orthogonal explicitly claiming to share MIRI's worldview and seeking collaboration.

What's Absent

  • 19+ months with no publication (last: Aug 2024). No public explanation of status.
  • Team never materialized as described in the founding announcement. Remains effectively a solo project.
  • Fellowship never launched despite being prominently announced.
  • MIRI collaboration never happened despite being a stated goal.
  • No peer review of the QACI formalization by independent mathematicians, despite the approach's entire value proposition being mathematical rigor. Tamsin self-describes as "bad at math."
  • No endorsement from any major AI safety funder or established alignment researcher.
  • No comparison with davidad's Open Agency Architecture (OAA), the other major "formal alignment" approach, which has received ARIA funding.
  • Total funding received is opaque -- no 990s, no annual reports, no financial disclosures.

Recommended Reading

  1. Tamsin Leake's LessWrong profile and comments (https://www.greaterwrong.com/users/tamsin-leake) -- Most candid window into the founder's thinking. Detailed QACI explanations, exfohazard views, PauseAI advocacy, criticism of AI labs.

  2. Dan MacKinlay's agent foundations assessment (https://danmackinlay.name/notebook/agent_foundations.html) -- The strongest independent external critique of QACI.

  3. Orthogonal's theory of change (https://www.alignmentforum.org/posts/4XcADCLDDguyej2N7/orthogonal-s-formal-goal-alignment-theory-of-change) -- The core document explaining why this org exists and what it rejects about mainstream alignment.

  4. Formalizing the QACI alignment formal-goal (https://www.alignmentforum.org/posts/MR5wJpE27ymE7M7iv/formalizing-the-qaci-alignment-formal-goal) -- The actual technical output; form your own view of the math.

Claude's Analysis

An opinionated read. Read the brief first to form your own view.

Stated Theory of Change

Orthogonal claims that mainstream alignment research (interpretability, RLHF, scalable oversight, etc.) is fundamentally unable to produce alignment that scales to superintelligence, and is likely "net harmful when published" because it generates capability externalities. Instead, the only viable path is:

  1. Design a fully formalized mathematical goal (QACI/ESP) such that an unbounded optimizer maximizing it would take actions humans consider desirable.
  2. Build a new AI system from scratch that pursues this goal -- not retrofit alignment onto existing AI.
  3. Launch this system as a "one-shot singleton" that takes over and steers the world toward good outcomes.

The causal chain: Orthogonal formalizes QACI/ESP --> independent or AI-assisted implementation of a QACI-aligned optimizer --> that optimizer achieves decisive strategic advantage --> utopia (or at least, a meaningfully increased fraction of surviving timelines).

Revealed Theory of Change

The gap between stated theory and revealed behavior is wide:

Stated: Orthogonal is an organization with multiple researchers doing urgent, world-saving work. Revealed: Orthogonal is effectively a solo project by Tamsin Leake that produced approximately one substantive post every 3-4 months while active (plus unlisted work of unknown scope) and has published nothing in 19+ months.

Stated: QACI is "our best shot" at saving the world, requiring urgent progress. Revealed: The research program has shifted from QACI to ESP (which Tamsin describes as superseding QACI2), and then appeared to halt entirely. Recent comments advocate donating to PauseAI, suggesting a possible update toward "buy time" strategies over "solve alignment."

Stated: Collaboration with MIRI is a priority. Revealed: No evidence of any MIRI engagement. No endorsement from any established alignment researcher.

Stated: Exfohazard caution motivates unlisting posts. Revealed: This policy may also serve to shield the work from the kind of external criticism that could reveal fundamental problems. The posts that ARE public have received very limited engagement.

The most charitable reading is that significant work continues in private (Discord, unlisted posts). The least charitable reading is that the project has stalled due to a combination of funding constraints, team-building failure, and possible recognition that QACI's foundational problems are not easily resolvable.

Key Assumptions

Assumption 1: Agent foundations are the right framing for alignment.

  • Evidence for: Eliezer Yudkowsky and some MIRI-adjacent researchers believe current AI paradigms cannot be made safe; a "clean target" is needed. Orthogonal's pre-founding argument ("You won’t solve alignment without agent foundations") makes a detailed case.
  • Evidence against: The broader alignment community has largely moved toward empirical/prosaic approaches. MIRI's own leadership stopped pursuing technical alignment agendas by 2023. davidad/OAA represents an alternative formal approach with more institutional backing (ARIA). The field's most impactful safety work to date has been prosaic (RLHF, constitutional AI, evaluations).
  • Testable?: Only in the sense that if prosaic alignment proves sufficient, agent foundations were unnecessary. If prosaic alignment fails catastrophically, it will be too late to test.
  • If wrong: Orthogonal's entire research program is misdirected, and the claimed "net harm" of prosaic alignment work is itself net harmful (by discouraging useful work).

Assumption 2: QACI's mathematical formalization is correct and meaningful.

  • Evidence for: The QACI paper is unusually detailed for an alignment proposal; the mathematical specification is genuine, not hand-waving.
  • Evidence against: The founder self-describes as "bad at math." No independent mathematician has verified the formalization. The ESP pivot suggests QACI had unresolved problems. The blob location problem (mapping mathematical bitstrings to physical reality) remains open. External critics say it is "speculative and hard to parse" with failure modes similar to HCH.
  • Testable?: In principle, the math can be checked. In practice, the unlisted posts and custom notation make independent verification difficult.
  • If wrong: The core technical output is flawed, and years of work on QACI were wasted.

Assumption 3: A one-shot singleton is the most likely and best AI outcome.

  • Evidence for: Standard MIRI-style arguments about decisive strategic advantage and convergent instrumental goals.
  • Evidence against: Current AI development is multipolar and distributed. No single actor controls enough resources for a singleton. Slow takeoff scenarios (which many researchers now favor) make one-shot approaches less relevant.
  • If wrong: Orthogonal's entire framework (design a goal for a single all-powerful AI) is solving the wrong problem.

Assumption 4: A team of 1 person with $30K can make meaningful progress on the hardest problem in alignment.

  • Evidence for: Some of the most important alignment ideas have come from individual thinkers (Yudkowsky, Christiano). Small teams can move fast.
  • Evidence against: MIRI spent $20M+ over 15+ years with a much larger team and openly acknowledged failure to solve agent foundations. The mathematical difficulty of formalizing a complete alignment target is enormous. The lack of collaborators means no one is checking the work.
  • If wrong: Orthogonal's underfunding is not a constraint to be worked around but a fundamental blocker.
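
The scale comparison in Assumption 4 can be made concrete using only figures quoted in this brief. The sketch below treats the "$20M+", "15+ years", and "10+" MIRI figures as lower bounds and assumes constant headcount for both organizations. That is a crude simplification, so the ratios are rough upper bounds on Orthogonal's relative scale, not precise estimates.

```python
ORTHOGONAL_USD = 30_000          # total confirmed funding, per the brief
MIRI_USD_MIN = 20_000_000        # "$20M+", treated as a lower bound

# ASSUMPTION: constant headcount over the whole period for both organizations
ORTHOGONAL_PERSON_YEARS = 1 * 3  # team of 1, roughly 3 years
MIRI_PERSON_YEARS_MIN = 10 * 15  # "team of 10+" over "15+ years"

funding_ratio = ORTHOGONAL_USD / MIRI_USD_MIN
effort_ratio = ORTHOGONAL_PERSON_YEARS / MIRI_PERSON_YEARS_MIN

print(f"funding: about 1/{round(1 / funding_ratio)} of MIRI's")
print(f"person-years: about 1/{round(1 / effort_ratio)} of MIRI's")
```

By funding the ratio is at most about 1/667, and by person-years at most about 1/50. The "1/100th the scale" shorthand used elsewhere in this analysis sits between the two readings.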

Strengths

  1. Intellectual honesty about the difficulty: Tamsin is upfront about high P(doom) and low success probability. There is no salesmanship or hype.
  2. Genuine mathematical formalization: QACI is one of very few alignment proposals that actually specifies a formal goal in mathematical detail rather than gesturing at concepts. This is valuable regardless of whether QACI itself is correct.
  3. No incentive misalignment: Zero lab funding, zero corporate ties. Orthogonal cannot be accused of pulling its punches to protect a funder's business model.
  4. Philosophical clarity: The "backchaining" approach and rejection of prosaic alignment is clearly reasoned, even if one disagrees.
  5. Research evolution: The QACI-to-ESP pivot suggests genuine intellectual progress, not dogmatic attachment to a single idea.

Weaknesses and Risks

  1. Near-total lack of external validation: No established alignment researcher has endorsed QACI. No MIRI collaboration materialized. No Refine evaluator has commented publicly. No major funder has provided follow-on support. This is the most damning pattern.
  2. Team failure: The "several promising researchers" never appeared. The fellowship never launched. Three years in, this is still a solo project. Agent foundations problems are extremely hard -- MIRI's team of 10+ couldn't solve them. Expecting one person to succeed where MIRI failed is unrealistic.
  3. Publication gap: 19+ months without publication, in an org that claims short timelines make urgency paramount, suggests either the project has stalled or the research has hit a wall.
  4. Self-sealing epistemology: The combination of (a) claiming most alignment work is "net harmful," (b) unlisting posts for exfohazard reasons, and (c) operating in a small Discord creates a bubble resistant to correction. If the approach is fundamentally flawed, there is no external mechanism to surface that.
  5. Mathematical rigor gap: The founder acknowledges being "bad at math" while pursuing an approach whose entire value proposition is mathematical rigor. The ESP paper is self-described as "pseudomath." This tension has not been resolved by recruiting mathematically stronger collaborators.
  6. MIRI's precedent: MIRI pursued agent foundations for 15+ years with more funding, more people, and deeper mathematical talent, and their leadership eventually acknowledged they couldn't find a viable approach. Orthogonal does not adequately explain why QACI/ESP will succeed where MIRI's research program failed.

Cross-References

  • MIRI: Orthogonal explicitly claims to share MIRI's worldview. MIRI's trajectory (15+ years of agent foundations work, eventual shift away from technical alignment) is the most relevant precedent. Orthogonal appears to be repeating MIRI's early approach at 1/100th the scale.
  • davidad/Open Agency Architecture (OAA): Competing "formal alignment" approach with ARIA backing. OAA and QACI should be directly comparable but have not been compared. OAA has more institutional support and more mathematical collaborators.
  • Conjecture: Historical connection through Refine incubator. Conjecture's own trajectory (pivots, departures, controversy) provides context for the post-incubator independent research path.
  • PauseAI: Tamsin's recent advocacy for PauseAI funding suggests a possible strategic update -- if technical alignment timelines are too short, buying time through advocacy becomes higher-priority.

What Would Change This Assessment

  • Evidence of ongoing substantial research output (even in private/Discord) would significantly update upward. If Tamsin has been producing QACI/ESP work that simply isn't public, the assessment of "stalled" would be wrong.
  • An independent mathematician verifying and extending the QACI formalization would address the most serious technical concern.
  • MIRI researchers publicly engaging with QACI (positively or critically) would break the pattern of community silence.
  • A team of 2+ competent alignment researchers joining Orthogonal would address the "solo project" weakness.
  • If prosaic alignment approaches demonstrably fail at scale (e.g., a serious alignment failure from a frontier lab), the case for agent foundations approaches like QACI would strengthen dramatically.
  • If Tamsin publicly explains the publication gap -- whether research hit a wall, funding ran out, or priorities shifted -- this would allow a more accurate assessment.

Self-Critique

Weakest claim: My assessment that the project has "stalled" is based on public evidence only. If substantial work is happening in Discord or unlisted posts, this assessment could be significantly wrong. I have no visibility into the ~300-person Discord community.

Potential bias: I may be implicitly anchoring to the norm of "alignment orgs should publish frequently and visibly." Tamsin's exfohazard argument -- that secrecy is genuinely protective -- deserves more weight than I may be giving it. However, the absence of any update, even something as simple as "we're still working," over 19 months is hard to explain by exfohazard caution alone.

What a thoughtful disagreer would say: "You're evaluating a theoretical research project by the standards of a startup or policy org. Einstein published 4 papers in one year then was quiet for years. QACI-style work requires deep thinking, not quarterly output. The fact that nobody famous endorses it is exactly what you'd expect for a genuinely novel approach -- breakthroughs aren't born with consensus. And the criticism that QACI is 'speculative' is trivially true of every agent foundations approach, including ones you'd probably rate more favorably."

What I'd most want to know: Whether the Refine evaluators (Byrnes, Kosoy, Hubinger, et al.) have any private assessment of Tamsin's work. Their silence could mean "not worth commenting on" or "interesting but we haven't engaged enough to say."

Connected to (3)

Long-Term Future Fund: collaborator
Machine Intelligence Research Institute: collaborator
Conjecture: staff from (Tamsin Leake)
Sources (31)
Every URL that was read during research.
  1. Orthogonal (orxl.org)
  2. Orthogonal: A new agent foundations alignment organization (orxl.org)
  3. Orthogonal (carado.moe)
  4. Orthogonal: A new agent foundations alignment organization (greaterwrong.com)
  5. Orthogonal's Formal-Goal Alignment theory of change (greaterwrong.com)
  6. formalizing the QACI alignment formal-goal (greaterwrong.com)
  7. [missing post] (greaterwrong.com)
  8. Epistemic states as a potential benign prior (greaterwrong.com)
  9. How LDT helps reduce the AI arms race (greaterwrong.com)
  10. We're all in this together (greaterwrong.com)
  11. [missing post] (greaterwrong.com)
  12. [missing post] (greaterwrong.com)
  13. April 2023: Long-Term Future Fund grant recommendations | Effective Altruism Funds (funds.effectivealtruism.org)
  14. Long-Term Future Fund: May 2023 to March 2024 Payout recommendations (greaterwrong.com)
  15. Tamsin Leake (greaterwrong.com)
  16. [missing post] (greaterwrong.com)
  17. [missing post] (greaterwrong.com)
  18. (My understanding of) What Everyone in Technical Alignment is Doing and Why (greaterwrong.com)
  19. [missing post] (greaterwrong.com)
  20. How to Diversify Conceptual Alignment: the Model Behind Refine (greaterwrong.com)
  21. [missing post] (greaterwrong.com)
  22. Tamsin Leake (ea.greaterwrong.com)
  23. You won’t solve alignment without agent foundations (greaterwrong.com)
  24. [missing post] (greaterwrong.com)
  25. [missing post] (greaterwrong.com)
  26. Agent Foundations for Superintelligence-Robust Alignment (agentfoundations.study)
  27. Orthogonal Productions Inc - Nonprofit Explorer - ProPublica (projects.propublica.org)
  28. Tamsin Leake comments on Let's build definitely-not-conscious AI (greaterwrong.com)
  29. Tamsin Leake comments on Orienting to 3 year AGI timelines (greaterwrong.com)
  30. Agent foundations – The Dan MacKinlay stable of variably-well-consider’d enterprises (danmackinlay.name)
  31. I don't think MIRI "gave up" (greaterwrong.com)