The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs
Quick Take
The paper introduces CLEAR, a method for allocating a fixed inference budget across LLM queries. In resource-scarce regimes it improves global accuracy by up to 3x over uniform allocation. By reallocating tokens from insolvent queries to solvable queries near their emergence thresholds, CLEAR improves the Pareto frontier of total token cost versus mean accuracy on reasoning tasks.
Key Points
- CLEAR abandons insolvent queries and reallocates their budget to solvable queries near their emergence thresholds.
- The method improves the Pareto frontier of total token cost versus mean accuracy.
- In resource-scarce regimes, CLEAR achieved up to a 3x improvement in global accuracy over uniform allocation.
- The approach treats budget allocation as a global constrained optimization problem, with a shadow price that equalizes marginal utility across queries.
- It models per-query reasoning utility with a shifted-surge function.
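The constrained-optimization view in the points above can be written out as a generic budgeted allocation problem. This is only a sketch under assumed notation: the symbols $u_i$ (utility of query $i$), $b_i$ (tokens given to query $i$), $B$ (total budget), and $\lambda$ (shadow price) are introduced here for illustration and are not taken from the paper.

```latex
% Assumed notation: u_i = per-query utility, b_i = tokens for query i,
% B = total token budget, \lambda = shadow price (Lagrange multiplier).
\begin{aligned}
\max_{b_1,\dots,b_N \ge 0} \quad & \sum_{i=1}^{N} u_i(b_i)
\qquad \text{s.t.} \quad \sum_{i=1}^{N} b_i \le B, \\[4pt]
\mathcal{L}(b,\lambda) \;=\; & \sum_{i=1}^{N} u_i(b_i) \;-\; \lambda \Big( \sum_{i=1}^{N} b_i - B \Big), \\[4pt]
u_i'(b_i^{*}) \;=\; & \lambda \quad \text{for every funded query } (b_i^{*} > 0), \\[4pt]
b_i^{*} \;=\; & 0 \quad \text{whenever } \max_{b \ge 0} \big[\, u_i(b) - \lambda b \,\big] \le 0 .
\end{aligned}
```

The third line is the usual equal-marginal-utility condition, and the last line is one way to express abandonment: a query is dropped when no budget level pays for itself at the going price. Because a utility with an emergence threshold is not concave, the first-order condition is necessary but not sufficient, so the net-gain check in the last line is needed as well.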
Article Excerpt
From source RSS / original summary (arXiv:2606.03092v1). Abstract: Inference-time scaling has emerged as a critical avenue for enhancing Large Language Models' performance, yet real-world deployment is constrained by strict computational budgets. In this work, we formulate inference budget allocation as a global constrained optimization problem governed by economic principles.
By modeling per-query reasoning utility with a shifted-surge function, we derive an optimal allocation policy based on a global shadow price that equilibrates marginal utility under resource scarcity. Based on this theory, we propose Constrained Latent-utility Equilibrium Allocation for Reasoning (CLEAR). It performs rational abandonment and reallocates resources from insolvent queries to solvable queries near their emergence thresholds.
Extensive experiments on several reasoning tasks with different traffic streams demonstrate that CLEAR significantly improves the Pareto frontier of total token cost versus mean accuracy. In resource-scarce regimes, CLEAR achieves up to a 3x improvement in global accuracy compared to uniform allocation.
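To make the shadow-price idea in the abstract concrete, here is a minimal Python sketch of price-based allocation. It is not the paper's implementation. The excerpt does not give the form of the shifted-surge function, so `utility` below is an assumed stand-in (zero up to a threshold, then saturating), and `Query`, `best_budget`, `allocate`, and all parameter values are hypothetical names and numbers chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Query:
    threshold: float  # tokens needed before any utility emerges (assumed)
    ceiling: float    # best achievable accuracy, at most 1.0 (assumed)
    scale: float      # how quickly utility saturates past the threshold (assumed)


def utility(q: Query, budget: float) -> float:
    """Assumed stand-in for per-query utility, NOT the paper's shifted-surge form."""
    if budget <= q.threshold:
        return 0.0
    return q.ceiling * (1.0 - math.exp(-(budget - q.threshold) / q.scale))


def best_budget(q: Query, price: float, grid) -> int:
    """Pick the budget with the highest net gain u(b) - price * b.

    Returning 0 means the query is abandoned: no budget on the grid
    pays for itself at this price. Ties go to the smaller budget.
    """
    best_b, best_net = 0, 0.0
    for b in grid:
        net = utility(q, b) - price * b
        if net > best_net:
            best_b, best_net = b, net
    return best_b


def allocate(queries, total_budget, max_budget=4000, step=10, iters=60):
    """Bisect on a single global price until total spend fits the budget."""
    grid = range(step, max_budget + 1, step)

    def spend(price):
        return [best_budget(q, price, grid) for q in queries]

    alloc = spend(0.0)
    if sum(alloc) <= total_budget:
        return alloc, 0.0  # budget is not binding, so the price is zero

    # Utilities are at most 1 and budgets at least `step` >= 1 token,
    # so at price 1.0 every query is abandoned and spend is 0.
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(spend(mid)) <= total_budget:
            hi = mid  # feasible: try a lower price
        else:
            lo = mid  # over budget: raise the price
    return spend(hi), hi


if __name__ == "__main__":
    # Hypothetical traffic: an easy query, a medium one, and a very costly one.
    queries = [Query(50, 0.9, 100), Query(400, 0.8, 200), Query(3000, 0.5, 500)]
    alloc, price = allocate(queries, total_budget=1000)
    print("allocation:", alloc, "price:", price)
```

The design choice worth noting is that one scalar price coordinates all queries: each query decides independently how much to buy, and raising the price both shrinks funded budgets and pushes expensive queries to zero. A discrete grid is used so the non-concave threshold region needs no special handling.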