PathoSage: Towards Multi-Source Evidence Adjudication in Pathology via Experience-Aware Agentic Workflow
Quick Answer
PathoSage is a three-stage framework for patch-level pathology reasoning that separates knowledge retrieval, evidence collection, and evidence adjudication, mitigating VQA hallucinations and classifier disagreement.
Quick Take
PathoSage's Structured Evidence Deliberation component evaluates heterogeneous tool evidence independently, analyzes conflicts, and generates the final judgment in a fresh context to reduce anchoring bias. In the reported experiments, the framework outperforms strong pathology MLLM and agentic baselines.
Key Points
- PathoSage separates knowledge retrieval, evidence collection, and adjudication for improved reasoning.
- Structured Evidence Deliberation independently evaluates evidence, reducing anchoring bias.
- The framework outperforms existing MLLM and agentic baselines in experiments.
- A training-free Beta-Bernoulli experience system with continuous credit assignment models long-term tool reliability.
- PathoSage mitigates visual question answering (VQA) hallucinations and classifier disagreement.
Article Content
From source RSS / original summary
arXiv:2606.07549v1 (Announce Type: new)
Abstract: Recent advances in Multimodal Large Language Models (MLLMs) and agent workflows have shown strong promise for computational pathology, yet reliable patch-level reasoning remains challenging. End-to-end pathology MLLMs often hallucinate morphological features, while recent agentic systems usually merge tool outputs and retrieved knowledge into a shared context, making decisions vulnerable to conflicting evidence and context contamination.
We propose PathoSage, a three-stage framework that explicitly separates knowledge retrieval, evidence collection, and evidence adjudication for patch-level pathology multimodal reasoning. Its core component, Structured Evidence Deliberation, independently evaluates heterogeneous evidence from tools, performs conflict analysis, and generates the final judgment in a fresh context to reduce anchoring bias.
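The adjudication step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no interfaces, so `Evidence`, `deliberate`, and the `assess` callback are hypothetical names, and summing support scores per claim is an assumed stand-in for whatever conflict analysis the paper actually uses. The point it shows is structural: each item is scored in isolation, and the final verdict is computed only from those structured scores, not from the raw tool outputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Evidence:
    source: str  # hypothetical tool name, e.g. "classifier_a"
    claim: str   # label this piece of evidence supports
    detail: str  # raw tool output (never shown to the final step)


def deliberate(
    evidence: List[Evidence],
    assess: Callable[[Evidence, str], float],
    knowledge: str,
) -> Dict[str, object]:
    """Adjudicate evidence in three steps (illustrative assumption only)."""
    if not evidence:
        raise ValueError("at least one piece of evidence is required")

    # Step 1: score every item in isolation, so one tool's output
    # cannot anchor the evaluation of another.
    scored = [(item, assess(item, knowledge)) for item in evidence]

    # Step 2: conflict analysis. Here: total support per claim, with a
    # conflict flagged when the tools back more than one claim.
    totals: Dict[str, float] = {}
    for item, support in scored:
        totals[item.claim] = totals.get(item.claim, 0.0) + support
    conflict = len(totals) > 1

    # Step 3: final judgment in a "fresh context": only the structured
    # totals are used, not the raw details gathered earlier.
    verdict = max(totals, key=lambda claim: totals[claim])
    return {"verdict": verdict, "conflict": conflict, "totals": totals}
```

In a real agent, `assess` would presumably be a model or tool call that rates one piece of evidence against the retrieved knowledge; in this sketch any function returning a score in [0, 1] will do.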
We further introduce a training-free Beta-Bernoulli experience system with continuous credit assignment to model long-term tool reliability and construct similarity-weighted priors for future . Experiments show that PathoSage effectively mitigates VQA hallucinations and classifier disagreement, outperforming strong pathology MLLM and agentic baselines. Our results highlight explicit evidence adjudication and reliability-aware tool modeling as key ingredients for robust pathology agents.
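One way to read the Beta-Bernoulli experience system is sketched below. The abstract does not state the update rule, so everything here is an assumption: a credit in [0, 1] is treated as a fractional success count, the Beta(1, 1) starting prior is arbitrary, and cosine similarity between case embeddings is a guess at what "similarity-weighted" means. `ToolExperience`, `prior_for`, and `cosine` are hypothetical names.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


@dataclass
class ToolExperience:
    """Training-free reliability record for one tool (illustrative)."""
    alpha0: float = 1.0  # assumed uniform Beta(1, 1) starting prior
    beta0: float = 1.0
    records: List[Tuple[List[float], float]] = field(default_factory=list)

    def update(self, embedding: Sequence[float], credit: float) -> None:
        # Continuous credit assignment: credit in [0, 1] counts as a
        # fractional success, and 1 - credit as a fractional failure.
        if not 0.0 <= credit <= 1.0:
            raise ValueError("credit must lie in [0, 1]")
        self.records.append((list(embedding), credit))

    def posterior(self) -> Tuple[float, float]:
        # Long-term reliability over all past cases, unweighted.
        alpha = self.alpha0 + sum(c for _, c in self.records)
        beta = self.beta0 + sum(1.0 - c for _, c in self.records)
        return alpha, beta

    def prior_for(self, query: Sequence[float]) -> Tuple[float, float]:
        # Similarity-weighted prior: past cases that resemble the new
        # one contribute more pseudo-counts; negative similarity is
        # clipped to zero.
        alpha, beta = self.alpha0, self.beta0
        for embedding, credit in self.records:
            weight = max(0.0, cosine(query, embedding))
            alpha += weight * credit
            beta += weight * (1.0 - credit)
        return alpha, beta


def expected_reliability(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)
```

Because the parameters are updated by counting alone, no gradient step or fine-tuning is involved, which is consistent with the "training-free" description. How the paper actually assigns credit and measures similarity is not given in the abstract.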