
Only three AI models finished above starting capital in a 500-day startup survival test
Quick Answer
In a 500-day simulated startup survival test, only three AI models finished above their starting capital, while most went bankrupt.
Quick Take
A simple rule-based heuristic with no AI outperformed nearly all of the AI models tested, which points to significant limitations in how current AI agents manage a business over a long horizon.
Key Points
- Researchers at Princeton University built CEO-Bench, in which AI agents run a fictional software company for 500 simulated days.
- Most current models went broke, and a simple rule-based heuristic with no AI beat nearly all of them.
- Only three AI models finished above their starting capital in the test.
- The results suggest that current AI agents struggle to sustain a business over a long run of decisions.
- The study raises questions about how ready AI agents are to run real-world startups.
Article Excerpt
From source RSS / original summary
Researchers at Princeton University built CEO-Bench, a test where AI agents have to run a fictional software company for 500 simulated days. Most current models go broke, and a simple rule-based heuristic with no AI beats nearly all of them. The article "Only three AI models finished above starting capital in a 500-day startup survival test" appeared first on The Decoder.
More from The Decoder
An AI model programmed nonstop for 19 days on a single MirrorCode task that cost $2,600 to run
Epoch AI's MirrorCode benchmark reveals Claude Opus 4.7 as the leader with a 56% solve rate, reconstructing a 16,000-line toolkit in 14 hours. Despite this, all models tested struggle with the most complex tasks, highlighting limitations in current AI capabilities. The single task consumed $2,600 over 19 days, raising questions about cost-effectiveness in AI development.

