When Planning Fails Despite Correct Execution: On Epistemic Calibration for LLM-Based Multi-Agent Systems

arXiv cs.AI·Zehao Wang, Shilong Jin, Zhao Cao, Lanjun Wang

5/25/2026

·~1 min·5/25/2026·en·3

Quick Answer

The study introduces the Epistemic Planning Calibration Agentic Workflow (EPC-AW) to mitigate epistemic miscalibration in LLM-based multi-agent systems, improving system-level success by 9.75%.

Quick Take

The study introduces the Epistemic Planning Calibration Agentic Workflow (EPC-AW) to mitigate epistemic miscalibration in LLM-based , improving system-level success by 9.75%. This approach focuses on assessing plan support under varying information conditions rather than direct feasibility verification, addressing latent misjudgments that can affect planning outcomes.

Key Points

EPC-AW addresses latent epistemic miscalibration in planning for LLM-based systems.
Implements Information-consistency-based Plan Selection for stable evaluations across agents.
Utilizes Consistency-guided Epistemic State Refinement to adapt calibration over time.
Experiments show an average improvement of 9.75% in system-level success.

Paper Resources

Read Paperarxiv.org View PDFarxiv.org

Article Excerpt

From source RSS / original summary

arXiv:2605. 23414v1 Announce Type: new Abstract: LLM-based can fail even when planned actions are executed correctly because agents may misjudge their knowledge when evaluating plan feasibility, a phenomenon we term epistemic miscalibration in planning.

Unlike execution errors, epistemic miscalibration is latent during planning, as generated plans can remain self-consistent and executable without observable errors; the miscalibration is also dynamic, as new information can alter feasibility assessments, potentially obscuring past miscalibration signals and causing them to recur over time.

To address this, we propose the Epistemic Planning Calibration Agentic Workflow (EPC-AW), which assesses whether plans remain supported under varying information conditions rather than directly verifying feasibility. EPC-AW employs Information-consistency-based Plan Selection, selecting plans whose evaluations are stable across agents, together with Consistency-guided Epistemic State Refinement to adapt calibration over time by leveraging past discrepancies to guide future planning.

Experiments show that EPC-AW improves system-level success by an average of 9. 75%.

Read on arxiv.org

Want this in your inbox every morning?

Daily brief at your local 8am — bilingual EN/中文, free.

Subscribe — it's free

More from arXiv cs.AI

See more →

arXiv cs.AI·Mihnea C. Moldoveanu, Joel A. C. Baum

4d ago

FeaturedOriginal

Adversarial Social Epistemology for Assemblies of Humans and Large Language Models

AI Summary

The paper introduces Adversarial Social Epistemology (ASE) to analyze how agents manipulate trust in public communications, highlighting mechanisms that undermine the reliability of testimony and inference. It critiques existing frameworks like epistemic bubbles and misinformation diffusion, proposing a new language for understanding trust breaches and auditing inferential chains in densely interactive environments involving humans and large language models.

#LLM #Agent #Inference #Policy

When Planning Fails Despite Correct Execution: On Epistemic Calibration for LLM-Based Multi-Agent Systems

Quick Answer

Quick Take

Key Points

Paper Resources

Article Excerpt

Want this in your inbox every morning?

More from arXiv cs.AI

Adversarial Social Epistemology for Assemblies of Humans and Large Language Models

Information Limits and Attractor Dynamics in Economies of Frontier LLM Agents: A Pre-Registered Test

Onnes: A Physics-Grounded LLM Simulator for Cryogenic Fault Diagnosis in Quantum Computing Infrastructure

Quick Answer

Quick Take

Key Points

Paper Resources

Article Excerpt

Want this in your inbox every morning?

More from arXiv cs.AI

Adversarial Social Epistemology for Assemblies of Humans and Large Language Models

Information Limits and Attractor Dynamics in Economies of Frontier LLM Agents: A Pre-Registered Test

Onnes: A Physics-Grounded Multi-Agent LLM Simulator for Cryogenic Fault Diagnosis in Quantum Computing Infrastructure

Onnes: A Physics-Grounded LLM Simulator for Cryogenic Fault Diagnosis in Quantum Computing Infrastructure