Theory of Change
Haize Labs' stated theory of change rests on two premises: (1) frontier AI labs cannot be trusted to evaluate their own models because of misaligned incentives, and (2) third-party automated red-teaming can fill this gap at scale.
Leonard Tang, co-founder and CEO: "One of the reasons we started the company was because we felt that you could not take the frontier labs at their face value, when they said that they were the best and safest model. Somebody needs to be a third-party red teamer, third-party tester, third-party evaluator for models, that was totally divorced from any sort of conflict of interest."
Tang frames safety as "trust -- ensuring AI systems perform reliably as intended" and argues that existing safety approaches are "too heavy-handed and too general," advocating instead for context-specific, domain-customized safety testing. The company's tagline has evolved from "Bad day to be a language model" (launch, June 2024) to "Deploy 99.9% Reliable AI" (current), reflecting a shift from adversarial safety testing toward enterprise reliability.
The broader commercial strategy, articulated by an external analyst, is to "use safety testing as the entry point to work with large enterprises and open the door to the real opportunity -- becoming the ultimate LLM reliability platform."
What They Do
Red-teaming frontier models. Haize red-teamed OpenAI o1 over one month (August-September 2024) using automated pipelines, and took part in red-teaming Anthropic's Constitutional Classifiers prototypes (an estimated 3,000+ hours combined across Haize, Gray Swan, and UK AISI). Tang confirmed that OpenAI paid Haize for the o1 work.
Original attack research. Developed ACG (a 38x speedup over GCG for adversarial attacks), bijection learning (ICLR; 86.3% attack success rate on Claude 3.5 Sonnet; the attack grows stronger with model scale), Cascade (automated multi-turn jailbreaking), string compositions, and DSPy red-teaming (44% attack success rate, 4x the baseline).
Evaluation tools. Created Verdict (open-source judge-time compute scaling; GPT-4o+Verdict beats GPT-4o by 14.5%) and the Red-Teaming Resistance Leaderboard with HuggingFace.
Enterprise products. Suite includes Judges (evaluation), Haize (red-teaming), Monitor (production monitoring), Robustify (remediation), and Sphynx (hallucination testing).
The inverse scaling finding is their most scientifically significant result. Bijection learning demonstrates that more capable models are more vulnerable to this attack class, not less. Tang: "The bigger and more capable the model is, the more vulnerable it is to our bijection learning attack." This directly challenges the narrative that scaling capabilities improves safety.
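The attack success rates quoted in this section (86.3% for bijection learning; 44% and "4x baseline" for DSPy red-teaming) are reported without a definition. The sketch below assumes the common reading: attack success rate is the fraction of attack attempts that a judge marks as successful, and "4x baseline" is a simple ratio of two such rates. The names AttackResult, attack_success_rate, and implied_baseline are hypothetical and the data is toy data, not Haize's results or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackResult:
    """One attack attempt and whether a judge marked it successful (hypothetical schema)."""
    prompt_id: str
    succeeded: bool


def attack_success_rate(results):
    """Fraction of attempts judged successful; 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(r.succeeded for r in results) / len(results)


def implied_baseline(asr, multiplier):
    """Baseline rate implied by the statement 'asr is `multiplier` times the baseline'."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return asr / multiplier


# Toy data (assumption): 11 of 25 attempts succeed, which gives 44%.
toy_results = [AttackResult(f"p{i}", i < 11) for i in range(25)]
asr = attack_success_rate(toy_results)
baseline = implied_baseline(asr, 4.0)
print(f"ASR: {asr:.1%}, implied baseline: {baseline:.1%}")
```

Under this reading, a 44% rate at 4x baseline implies a baseline of roughly 11%. The result depends heavily on who or what acts as the judge and on how many attempts per prompt are counted, so figures from different papers are not directly comparable.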
Key People
Leonard Tang (Co-Founder, CEO, ~24 years old). Harvard CS+Math '23. Turned down Stanford PhD. 1,004 Google Scholar citations. Internships at NVIDIA, Snap, Cubist Systematic Strategies (quant research). Co-authored with Dan Hendrycks and worked loosely with Andy Zou (Gray Swan). Forbes 30 Under 30 AI (2025). Strong research pedigree but no prior startup or management experience. Public statements frame safety as reliability/trust rather than existential risk.
Co-founders Steve Li (Harvard, BAIR researcher, Roblox/Instagram) and Richard Liu (Harvard; almost no public information) round out the founding team. Nimit Kalra (research scientist, former quant, co-author on Verdict and Constitutional Classifiers papers) departed and is now working independently. Professor He He (NYU) joined as advisor in October 2025.
Team size: ~10-14 employees in New York City (74 Broad Street, Lower Manhattan).
Money and Incentives
Total funding: $12.5M seed round. An initial tranche of ~$7M closed in August 2024, led by General Catalyst, and the round was extended to $12.5M by December 2025. Post-money valuation: $100M, achieved seven months after founding. "Investors reportedly competing on price."
Legal structure: For-profit C-Corp. Zero philanthropic grants. Zero Coefficient Giving funding. No 990 data. Purely VC-funded. As a conventional for-profit corporation, its directors' fiduciary duties run to shareholders, not to a safety mission.
Revenue: Multi-million dollar contracts with frontier labs (OpenAI, Anthropic, Scale AI, AI21 Labs) and enterprises (Deloitte, MongoDB). No specific revenue figures disclosed.
Investors: General Catalyst, Soma Capital, AI Grant, Stage2 Capital, Pear VC, and others. Angel investors include founders/CEOs of Replit, Cognition (Devin), Pika, HuggingFace, and Okta. Several of these angels run AI companies that could become Haize customers, which creates potential conflicts of interest that Haize has not publicly addressed.
Incentive misalignment (critical): Haize claims to be "totally divorced from any sort of conflict of interest," but is paid by the labs it evaluates. Tang confirmed Haize was paid by OpenAI for o1 red-teaming. Tang also admitted: "We actually have no idea what our testing led into on the OpenAI side. For all we know, we could have just been a final eval dataset... or our data could have been directly baked into the post-training process. We actually have no idea." The labs control what findings are disclosed publicly (via NDAs) and whether findings influence deployment decisions. Haize has no mechanism to compel action on its findings.
Business trajectory: The company appears to be pivoting from pure red-teaming (safety-focused) toward a broader enterprise reliability platform (commercially focused). This follows the standard startup pattern of landing with a differentiated product then expanding into a larger market.
What Others Say
"Red-Teaming for GenAI: Silver Bullet or Security Theater?" (2024, academic survey): "Prior methods and practices of AI red-teaming diverge along several axes... gestures towards red-teaming as a panacea for every possible risk verge on security theater." Most critically: in no case analyzed did red-teaming result in a decision not to release a model. The paper does not name Haize but the critique applies directly to their field.
"AI red-teaming is a sociotechnical problem" (CACM): Argues commercial red teams face structural conflicts of interest, NDA constraints limit public disclosure, and the comparison to content moderation's well-documented problems is instructive. Notes that when labor is outsourced, "it becomes harder to trace and trickier to assert labor protections for it."
TechPolicy Press: Questions whether red-teaming "genuinely interrogates system risk or primarily fulfills externally imposed accountability rituals."
AngelsRound (bull case): "Addressing a rapidly growing market with regulatory tailwinds. Strong early revenue with multiple high-value contracts."
AngelsRound (bear case): "Small team, reliance on narrow set of high-value contracts, competing against AI companies that might develop in-house solutions."
Nathan Labenz (Cognitive Revolution): "I think we're headed in [the direction of truly independent testing], ultimately, and probably for good reason" -- implicitly acknowledging that current commercial red-teaming, including Haize, is not truly independent.
No prominent external voices have directly criticized Haize Labs as a specific company. The field-level criticism above applies to Haize structurally, through its business model, rather than by name.
What's Absent
- Tang's personal views on existential risk. His "Crystallizing AI Risk" blog post exists but could not be retrieved for this review (the site is JavaScript-rendered). All public statements frame safety as "trust" and "reliability," never as existential risk or catastrophic risk reduction.
- No visible engagement with the AI safety community's usual venues. No EA Forum, LessWrong, or Alignment Forum presence. No Coefficient Giving funding. Despite its published research and lab collaborations, Haize is not part of the AI safety ecosystem in any traditional sense.
- No whistleblower mechanism or dangerous-capability disclosure protocol. Tang was asked directly and acknowledged the gap.
- No governance documentation. No board composition, conflict of interest policy, or recusal procedures publicly available.
- No evidence their findings have ever changed a deployment decision. Tang admits they don't know how their findings are used.
- No policy engagement. No lobbying, testimony, or regulatory participation despite operating in a space where regulation would benefit them.
- Richard Liu (co-founder) has almost no public presence despite being a Forbes 30 Under 30 honoree.
Recommended Reading
Cognitive Revolution Podcast: "Red Teaming o1" (Sep 2024) -- The most candid source. Leonard Tang, Aidan Ewart, and Brian Huang discuss founding motivation, attack methods, the o1 red-teaming experience, evaluation limitations, and the inverse scaling finding. Key moment: Tang admits they have no idea how their findings are used. https://www.cognitiverevolution.ai/red-teaming-o1-part-1-2-automated-jailbreaking-w-haize-labs-leonard-tang-aidan-ewart-brian-huang/
"Red-Teaming for GenAI: Silver Bullet or Security Theater?" (arXiv, 2024) -- The strongest intellectual challenge to Haize's entire field. Finds red-teaming definitions are vague, methods diverge wildly, and no case resulted in a decision not to release a model. https://arxiv.org/html/2401.15897v3
Sherwood News: "The crash test dummies for new AI models" (Oct 2024) -- Best critical journalism on the field. Surfaces the independence paradox directly from Tang's own quotes. https://sherwood.news/tech/ai-regulation-red-teaming-model-safety-checks/
"Endless Jailbreaks with Bijection Learning" (ICLR) -- Haize's most important paper. The finding that bigger models are more vulnerable has deep implications. https://arxiv.org/html/2410.01294v1