Theory of Change
Metaculus's stated theory of change is that aggregating probabilistic predictions from a community of forecasters produces accurate foresight that improves decision-making on globally important topics -- AI timelines, biosecurity, geopolitics, climate. The platform creates "epistemic infrastructure" by defining precise questions, soliciting probability estimates, and aggregating them into community predictions that outperform individual experts.
However, the founder tells a different story. Anthony Aguirre, who started Metaculus at the same time as the Future of Life Institute (FLI) to serve FLI's planning needs, says the forecasts themselves have been of "surprisingly little" use:
"Once you've really carefully defined a question... whether that thing is 70% likely or 80% likely, nobody cares. It barely ever matters... But the process of getting to the point of knowing exactly what X is and having a well-defined question and keeping track of who makes good predictions and who doesn't... those things are really valuable." (AXRP, Feb 2025)
Current CEO Deger Turan (since April 2024) has reframed the theory around "epistemic security" -- using forecasting not just for prediction accuracy but as a tool for building shared world models, conditional policy analysis, and democratic deliberation. He acknowledges: "We already have really accurate forecasts, and we haven't had the Cambrian explosion of every single entity, corporation, government perpetually using forecasts, because the framing of the forecasts has not proven to be helpful just yet."
What They Do
Metaculus operates an online crowd forecasting platform (not a prediction market -- no money changes hands on predictions). Users submit probability estimates on precisely defined, resolvable questions. These are aggregated into a community prediction using recency-weighted medians. The platform has 2.9M+ predictions across 21K+ questions, with 9K+ resolved. All users are scored on calibration and accuracy via logarithmic scoring rules.
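The two mechanisms named above, a recency-weighted median and a logarithmic scoring rule, can be sketched as follows. This is a minimal illustration under stated assumptions, not Metaculus's production code: the exponential `decay` weighting and the function names are hypothetical, chosen only to show how weighting recent forecasts more heavily shifts the aggregate.

```python
import math

def weighted_median(values, weights):
    """Return the smallest value at which cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("no forecasts supplied")

def recency_weights(n, decay=0.5):
    """Hypothetical weighting (an assumption, not the platform's formula):
    the newest forecast gets weight 1, each older one is multiplied by `decay`."""
    return [decay ** (n - 1 - i) for i in range(n)]

def log_score(p, outcome):
    """Logarithmic score for a binary question: the natural log of the
    probability assigned to what actually happened (closer to 0 is better)."""
    return math.log(p if outcome else 1.0 - p)

# Probabilities for one binary question, listed oldest to newest.
forecasts = [0.30, 0.40, 0.80, 0.70]
weights = recency_weights(len(forecasts))
community = weighted_median(forecasts, weights)

print(community)                   # aggregate pulled toward the recent forecasts
print(log_score(community, True))  # score if the event happens
print(log_score(community, False)) # score if it does not
```

With these illustrative numbers the plain median would sit between 0.40 and 0.70, while the recency-weighted median lands on the newest forecast because it carries more than half of the total weight. The log score penalizes confident misses much more heavily than it rewards confident hits, which is why it is described as rewarding both calibration and accuracy.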
Key activities and outputs:
- AI Forecasting Benchmark Series: Year-long tournament comparing AI bots vs human forecasters. $175K in prizes. Results show AI rapidly catching up -- in Summer 2025, Mantic (AI bot) placed 8th out of 549 forecasters, the first bot in the top 10.
- FutureEval (Feb 2026): Benchmark projecting AI will pass community forecasters by April 2026 and pro forecasters by mid-2027.
- Pro Forecaster services: Top 2% of users are hired for paid private forecasting for clients including Bridgewater Associates, CDC, GiveWell, the Federation of American Scientists (FAS), and the Institute for Security and Technology (IST).
- Policy partnerships: FAS Climate Tipping Points Tournament, IST AI + Nuclear Landscape forecasting, CDC FluSight (3 years running).
- Open source: Entire codebase released June 2024 under BSD-2-Clause license. A third of contributors are now external.
Track record highlights: predicted Ukraine invasion 2 weeks early at high probability; COVID death forecasts with 12.2% error rate; outperformed FiveThirtyEight and all prediction markets on 2022 midterms (though this is "basically just one data point" per the analyst who measured it).
Key People
Anthony Aguirre -- Founder and Chairman. Theoretical physicist (UC Santa Cruz), co-founder and Executive Director of Future of Life Institute. Started Metaculus and FLI simultaneously around 2015. Now "pretty much full-time" at FLI, making him a part-time Chairman of Metaculus.
Deger Turan -- CEO since April 2024. Background in collective intelligence and NLP. Previously president of AI Objectives Institute (sociotechnical alignment nonprofit). Founded Cerebra Technologies (discourse analysis for 300M+ people). His vision emphasizes "epistemic security," conditional forecasts, and "mini-Metaculuses" (focused forecasting instances).
Gaia Dempsey -- Former CEO, now Special Advisor and board member. Oversaw the major Open Phil grants and PBC conversion. Transitioned to advisory role when Turan joined.
Team size is approximately 28 people, fully remote across 4 continents.
Money and Incentives
Total known funding: ~$13.29M (sum of the grants listed below)
| Source | Amount | Dates |
|---|---|---|
| Open Philanthropy/Coefficient Giving | $11,882,400 | 2019-2024 (5 grants) |
| Survival and Flourishing Fund (SFF) | $750,000 | 2022-2024 |
| EA Infrastructure Fund | $308,043 | 2021 |
| Metaplanet | $175,000 | 2024 |
| NSF | $150,000 | 2023 |
| FTX Future Fund | $20,000 | 2022 |
Single-funder concentration: Open Philanthropy/Coefficient Giving (Open Phil) provides ~89% of known grant funding ($11.88M of $13.29M). SFF's $750K adds some diversification, but the dependency remains extreme: if Open Phil shifts priorities, Metaculus faces an existential funding crisis. The three largest Open Phil grants ($5.5M, $3M, $2.75M) are all for "Platform Development."
Legal structure: Public Benefit Corporation (Delaware) since September 2022. As a PBC (for-profit), Metaculus has no obligation to publish financial data. No 990 filings exist. No public annual reports.
Revenue model: Estimated $3.9M/year total revenue (third-party estimate). Revenue breakdown between grants and commercial services (Pro Forecaster engagements, tournament hosting) is unknown. Enterprise clients include Bridgewater, CDC, GiveWell, FAS, IST.
Equity structure: Unknown. The PBC structure allows equity investors. No information about whether grants came with equity, or who holds ownership stakes.
Key incentive tensions: (1) Open Phil funds Metaculus as infrastructure for its own decisions, creating a principal-agent dynamic. (2) PBC structure means philanthropic funding could create value that accrues to equity holders rather than the mission. (3) AI lab credit donations for tournaments create soft dependencies on labs whose systems are being benchmarked.
What Others Say
Strongest criticism (scoring system): Ross Rheingans-Yoo showed mathematically that Metaculus's scoring creates "no-information arbitrage" -- users can earn points without adding useful information by predicting the median. The community-relative scoring component "actively incentivizes gaming." The critique was prompted by Zvi Mowshowitz creating an account and immediately quitting. Nuno Sempere confirmed in an arXiv paper that these incentive problems persist.
Strongest criticism (decision-relevance): Sempere estimated the value of Metaculus questions at ~$225K/year. Most questions "fail to directly influence decisions." Many are either too narrow to matter or too large-scope to be influenceable.
Strongest criticism (x-risk forecasting): Normaltech argues existential risk forecasts are "too unreliable to inform policy." Subjective probabilities "vary by orders of magnitude" between experts. There is no way to measure forecaster skill on unique/rare events. Even the best scoring rules require "a hundred million" forecasts to detect systematic overestimation of 1% tail risks. On this view, forecasters are essentially making up numbers.
Strongest criticism (AI predictions): Rethink Priorities found Metaculus AI forecasters were overconfident on numeric questions (only 2 of 16 resolved inside the 25%-75% interval, which should capture about half of outcomes if forecasts are well calibrated) and slightly biased toward predicting faster progress than occurred.
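To give a sense of how far 2 of 16 is from good calibration, here is a rough back-of-the-envelope check. It assumes the 16 questions resolve independently and that a calibrated 25%-75% interval captures the outcome half the time. Both are simplifying assumptions made for illustration, not part of the Rethink Priorities analysis, and the function name is hypothetical.

```python
from math import comb

def binom_tail_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n_questions = 16   # resolved numeric questions cited above
hits = 2           # resolutions that fell inside the 25%-75% interval
coverage = 0.5     # share of outcomes a calibrated 25%-75% interval should capture

expected_hits = n_questions * coverage
p_value = binom_tail_at_most(hits, n_questions, coverage)

print(expected_hits)      # 8.0
print(round(p_value, 4))  # 0.0021
```

Under those assumptions, a calibrated forecaster would see two or fewer hits roughly 0.2% of the time, so the result is hard to attribute to bad luck alone. Correlated questions would weaken this conclusion, since AI-progress questions plausibly move together.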
Defense: Metaculus demonstrably outperforms prediction markets and individual experts on many questions. Its COVID and Ukraine forecasts were genuinely useful. The reputation-based system solves the "honest forecasting" problem without financial stakes. And the 2022 midterm performance, while one data point, showed meaningful advantage over FiveThirtyEight and all prediction markets.
What's Absent
- No public financial statements or annual reports despite $12M+ in philanthropic funding.
- No documented case of an institution changing a policy or allocation based on a Metaculus forecast (only individual anecdotes).
- No public response to the Rheingans-Yoo scoring system critique.
- No information about equity structure or ownership.
- No independent board members confirmed -- all known board members are insiders.
- No systematic measurement of forecasting impact (an organization that quantifies everything has never quantified its own impact).
- Progress on the "mini-Metaculus" vision (central to Turan's strategy) is unknown 1.5+ years after announcement.
- User demographics and active user counts are unpublished -- fewer than 1,500 people have individually forecast on the most prominent AGI question.
Recommended Reading
Anthony Aguirre on AXRP (Feb 2025) -- Most candid source. Founder admits predictions have been of "surprisingly little" use to FLI. The value is in question definition, not the numbers themselves. https://axrp.net/episode/2025/02/09/episode-38_7-anthony-aguirre-future-of-life-institute.html
Ross Rheingans-Yoo: "Metaculus has some issues" (Feb 2021) -- Strongest substantive criticism. Mathematical proof that the scoring system incentivizes gaming rather than information injection. https://blog.rossry.net/metaculus/
Deger Turan on Cognitive Revolution (Aug 2024) -- 2-hour interview with the new CEO. Full vision for "epistemic security" and the strategic pivot toward decision-relevance. https://www.cognitiverevolution.ai/scaling-superforecasting-ai-forecasting-tournaments-road-to-epistemic-security-with-deger-turan/
Normaltech: "AI existential risk probabilities are too unreliable to inform policy" (Jul 2024) -- Fundamental critique of using crowd forecasts for policy on unprecedented events. Directly challenges Metaculus's most visible use case. https://www.normaltech.ai/p/ai-existential-risk-probabilities
Nuno Sempere: "An Estimate of the Value of Metaculus Questions" (Oct 2021) -- Attempts to quantify decision-relevance. The $225K/year estimate vs. $12M+ in funding is illuminating. https://nunosempere.com/blog/2021/10/22/an-estimate-of-the-value-of-metaculus-questions/