When Planning Fails Despite Correct Execution: On Epistemic Calibration for LLM-Based Multi-Agent Systems
Quick Take
The study introduces the Epistemic Planning Calibration Agentic Workflow (EPC-AW) to mitigate epistemic miscalibration in LLM-based multi-agent systems, improving system-level success by an average of 9.75%. Rather than verifying feasibility directly, the approach assesses whether a plan remains supported under varying information conditions, which targets latent misjudgments that surface only after planning.
Key Points
- EPC-AW addresses latent epistemic miscalibration in planning for LLM-based systems.
- Implements Information-consistency-based Plan Selection for stable evaluations across agents.
- Utilizes Consistency-guided Epistemic State Refinement to adapt calibration over time.
- Experiments show an average improvement of 9.75% in system-level success.
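The second key point, selecting plans whose evaluations are stable across agents, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `select_stable_plan`, the use of per-agent support scores in [0, 1], the standard-deviation measure of disagreement, and the `max_spread` threshold are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def select_stable_plan(evaluations, max_spread=0.2):
    """Pick the plan whose per-agent support scores agree most reliably.

    evaluations: dict mapping a plan id to a list of support scores in [0, 1],
    one per agent (hypothetical representation, not taken from the paper).
    max_spread: assumed cutoff on the population standard deviation; plans
    whose scores disagree more than this are treated as unstable and skipped.
    Returns the id of the selected plan, or None if no plan is stable.
    """
    best_id, best_key = None, None
    for plan_id, scores in evaluations.items():
        if not scores:
            continue  # no evaluations, nothing to judge
        spread = pstdev(scores)
        if spread > max_spread:
            continue  # agents disagree too much about this plan
        # Prefer higher mean support; break ties by lower disagreement.
        key = (mean(scores), -spread)
        if best_key is None or key > best_key:
            best_id, best_key = plan_id, key
    return best_id


if __name__ == "__main__":
    candidate_scores = {
        "plan_a": [0.9, 0.1, 0.9],  # high support from two agents, but unstable
        "plan_b": [0.7, 0.7, 0.6],  # moderate support, consistent
        "plan_c": [0.5, 0.5, 0.5],  # fully consistent, lower support
    }
    print(select_stable_plan(candidate_scores))
```

In this toy input, `plan_a` has the highest individual scores but is filtered out because the agents disagree sharply, which is the kind of case the stability criterion is meant to catch.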
Article Excerpt
From source RSS / original summary
arXiv:2605.23414v1 Announce Type: new
Abstract: LLM-based multi-agent systems can fail even when planned actions are executed correctly because agents may misjudge their knowledge when evaluating plan feasibility, a phenomenon we term epistemic miscalibration in planning.
Unlike execution errors, epistemic miscalibration is latent during planning, as generated plans can remain self-consistent and executable without observable errors; the miscalibration is also dynamic, as new information can alter feasibility assessments, potentially obscuring past miscalibration signals and causing them to recur over time.
To address this, we propose the Epistemic Planning Calibration Agentic Workflow (EPC-AW), which assesses whether plans remain supported under varying information conditions rather than directly verifying feasibility. EPC-AW employs Information-consistency-based Plan Selection, selecting plans whose evaluations are stable across agents, together with Consistency-guided Epistemic State Refinement to adapt calibration over time by leveraging past discrepancies to guide future planning.
Experiments show that EPC-AW improves system-level success by an average of 9.75%.
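The abstract describes Consistency-guided Epistemic State Refinement only at a high level: past discrepancies guide future planning. One way to picture that idea is a running per-agent correction, sketched below. The class `EpistemicState`, its `bias` table, the exponential-moving-average update, and the `rate` parameter are hypothetical choices for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class EpistemicState:
    """Tracks how far each agent's past scores drifted from the group consensus.

    All names and the update rule are assumptions for illustration only.
    """
    bias: dict = field(default_factory=dict)  # agent id -> running discrepancy
    rate: float = 0.5                          # assumed smoothing factor in (0, 1]

    def calibrate(self, agent, raw_score):
        """Adjust a raw support score by the agent's recorded discrepancy."""
        adjusted = raw_score - self.bias.get(agent, 0.0)
        return min(1.0, max(0.0, adjusted))  # clamp to [0, 1]

    def refine(self, agent, raw_score, consensus):
        """Fold a newly observed discrepancy into the agent's running bias."""
        discrepancy = raw_score - consensus
        previous = self.bias.get(agent, 0.0)
        self.bias[agent] = (1 - self.rate) * previous + self.rate * discrepancy


if __name__ == "__main__":
    state = EpistemicState(rate=0.5)
    # An agent scored a plan 0.9 while the group consensus was 0.5.
    state.refine("agent_1", raw_score=0.9, consensus=0.5)
    # Its next raw score is discounted by the recorded overconfidence.
    print(state.calibrate("agent_1", 0.9))
```

Because the bias is a moving average, older discrepancies fade but are not discarded immediately, which loosely mirrors the abstract's concern that new information can obscure past miscalibration signals.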