Diverse Evidence, Better Forecasts: Multi-Agent Deliberation Under Information Asymmetry
Quick Answer
The InfoDelphi framework improves multi-agent forecasting through designed information asymmetry, outperforming the strongest single-agent and multi-agent baselines by 12-18% in Brier score on the PolyGym benchmark.
Quick Take
InfoDelphi partitions evidence into a shared public set and disjoint private subsets, so each agent holds exclusive knowledge that can reach the others only through deliberation. The authors show theoretically that this reduces inter-agent error correlation, and on PolyGym the framework beats the strongest single-agent and multi-agent baselines in both Brier score and accuracy, pointing to input diversity as the key enabler of effective multi-agent reasoning.
Key Points
- InfoDelphi combines relevance-aware evidence routing, rationale-based iterative deliberation, and confidence-weighted aggregation.
- It improves accuracy by 4-8 percentage points over the strongest single-agent and multi-agent baselines.
- The framework addresses the herding problem in multi-agent systems.
- Removing information asymmetry eliminates most of the gains from deliberation.
- The PolyGym benchmark consists of 375 binary forecasting questions derived from real-world prediction markets.
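As a rough illustration of the confidence-weighted aggregation named in the key points, the sketch below averages agent probabilities using each agent's self-reported confidence as its weight. The summary does not give the paper's actual weighting rule, so the function name, the confidence scale, and the fallback to a plain mean are all assumptions made for illustration.

```python
from typing import Sequence


def confidence_weighted_forecast(
    probabilities: Sequence[float],
    confidences: Sequence[float],
) -> float:
    """Weighted mean of agent probabilities, weighted by self-reported confidence.

    Hypothetical scheme for illustration only; not the paper's stated rule.
    """
    if len(probabilities) != len(confidences) or not probabilities:
        raise ValueError("need one confidence per probability and at least one agent")
    total = sum(confidences)
    if total <= 0:
        # Assumed fallback: unweighted mean when no agent reports any confidence.
        return sum(probabilities) / len(probabilities)
    return sum(p * c for p, c in zip(probabilities, confidences)) / total


# Three agents: the most confident one pulls the final forecast toward its view.
forecast = confidence_weighted_forecast([0.8, 0.6, 0.3], [0.9, 0.5, 0.1])
```

Under this scheme a confident agent holding private evidence can move the final forecast more than a hesitant one, which is one plausible reason to weight by confidence rather than take a plain average.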
Article Content
From source RSS / original summary
arXiv:2607.01661v1 Announce Type: new
Abstract: [Multi-agent LLM systems] are increasingly used for forecasting future events, as deliberation among multiple LLMs is believed to improve reasoning and calibration. Yet existing approaches overlook a critical design choice: what information each agent receives. When all agents are given identical evidence, deliberation collapses into herding rather than genuine belief revision, leaving multi-agent systems little better than a single agent.
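One textbook way to see why herding limits the value of adding agents: if n forecasts each have error variance sigma2 and every pair of errors has correlation rho, the variance of their average is sigma2 * (rho + (1 - rho) / n). The sketch below evaluates that identity. It is a standard result for equally correlated errors, not the paper's own theorem, and the numbers are made up for illustration.

```python
def ensemble_error_variance(sigma2: float, n_agents: int, rho: float) -> float:
    """Variance of the mean of n equally noisy forecasts with pairwise correlation rho."""
    if n_agents < 1:
        raise ValueError("n_agents must be at least 1")
    return sigma2 * (rho + (1.0 - rho) / n_agents)


# Illustrative values only: five agents, same per-agent noise, different correlation.
herded = ensemble_error_variance(sigma2=0.04, n_agents=5, rho=0.9)
diverse = ensemble_error_variance(sigma2=0.04, n_agents=5, rho=0.2)
```

When rho is close to 1 the averaged error is barely smaller than a single agent's, however many agents are added. Lower correlation is what lets averaging help.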
We identify this as a fundamental gap and propose designed information asymmetry to close it: by partitioning evidence into shared public and disjoint private subsets, each agent holds exclusive knowledge that can only reach others through deliberation. We theoretically show that this decomposition reduces inter-agent error correlation, and instantiate it in InfoDelphi, a framework combining relevance-aware evidence routing, rationale-based iterative deliberation, and confidence-weighted aggregation.
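The public/private split described above can be sketched as follows. This is a minimal illustration under stated assumptions: the relevance scores, the choice to make the top-ranked items public, and the round-robin assignment of the rest are hypothetical stand-ins, since the abstract does not specify how InfoDelphi's relevance-aware routing works.

```python
from typing import Dict, List, Tuple


def partition_evidence(
    items: List[Tuple[str, float]],
    n_agents: int,
    n_public: int,
) -> Tuple[List[str], Dict[int, List[str]]]:
    """Split evidence into one shared public set and disjoint private sets.

    `items` holds (evidence_id, relevance) pairs. The scoring and the
    round-robin rule are assumptions of this sketch, not the paper's method.
    """
    if n_agents < 1 or n_public < 0:
        raise ValueError("n_agents must be >= 1 and n_public must be >= 0")
    # Highest-relevance items first; sorted() is stable, so ties keep input order.
    ranked = sorted(items, key=lambda pair: pair[1], reverse=True)
    public = [eid for eid, _ in ranked[:n_public]]
    private: Dict[int, List[str]] = {agent: [] for agent in range(n_agents)}
    # Deal the remaining items out one at a time so no item is shared.
    for i, (eid, _) in enumerate(ranked[n_public:]):
        private[i % n_agents].append(eid)
    return public, private


evidence = [("e1", 0.9), ("e2", 0.8), ("e3", 0.7), ("e4", 0.6), ("e5", 0.5)]
public, private = partition_evidence(evidence, n_agents=2, n_public=1)
```

Each agent would then see the public list plus only its own private list, so anything in another agent's private list can reach it only through the rationales exchanged during deliberation.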
On PolyGym, a benchmark of 375 binary forecasting questions derived from real-world prediction markets, InfoDelphi outperforms the strongest single-agent and multi-agent baselines by 12-18% in Brier score and 4-8 percentage points in accuracy. More detailed experiments confirm that removing information asymmetry eliminates most deliberation gains, establishing diversity of input as the key enabler of effective multi-agent reasoning.
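For readers unfamiliar with the two metrics, here is a small sketch of how they are usually computed for binary questions. The Brier score is the mean squared difference between the forecast probability and the 0/1 outcome, so lower is better. The `relative_improvement` helper assumes the reported 12-18% is a relative reduction in Brier score, which the abstract does not state explicitly. The sample data is invented.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def accuracy(probs: Sequence[float], outcomes: Sequence[int], threshold: float = 0.5) -> float:
    """Share of questions where the thresholded forecast matches the outcome."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    hits = sum(int(p >= threshold) == y for p, y in zip(probs, outcomes))
    return hits / len(probs)


def relative_improvement(baseline: float, model: float) -> float:
    """Relative reduction in a lower-is-better score (assumed reading of the 12-18%)."""
    return (baseline - model) / baseline


# Invented forecasts and outcomes for four binary questions.
probs = [0.9, 0.2, 0.6, 0.4]
outcomes = [1, 0, 0, 1]
score = brier_score(probs, outcomes)
acc = accuracy(probs, outcomes)
```

The Brier score rewards calibration as well as being on the right side of 0.5, which is why a system can gain more in Brier score than in accuracy.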