Faithful by Construction: Claim-Anchored Attribution for Multi-Document Summarization
Quick Answer
The CAMS framework anchors every summary sentence to support-checked claims that link back to source spans, lifting multi-source attribution accuracy by roughly two-thirds while matching strong baselines on summary quality.
Quick Take
CAMS targets two weaknesses of end-to-end LLM summarizers: hallucination, and attributions that are coarse and generated post hoc. It reports substantially better faithfulness and citation precision on MultiNews, analyzes conflict handling on DiverseSumm, and tests zero-shot transfer on WCEP. The pipeline encourages factual faithfulness rather than guaranteeing it.
Key Points
- CAMS extracts atomic claims with token-level provenance from source documents.
- It clusters equivalent claims across documents and flags inter-source conflicts.
- The framework lifts multi-source attribution accuracy by roughly two-thirds.
- CAMS maintains summary quality while enhancing faithfulness and citation precision.
- The evaluation uses a two-regime protocol that separates reference-free citation quality from gold-aligned localization accuracy.
Article Content
From source RSS / original summary
arXiv:2606.23989v1 Announce Type: new
Abstract: End-to-end large language models (LLMs) produce fluent multi-document summaries but remain prone to hallucination, and the attributions they offer are typically coarse (whole documents or passages) and generated post hoc, leaving each summary statement hard to verify. We revisit the modular Extract–Select–Rewrite paradigm and recast its intermediate representation as the unit of attribution.
We present CAMS, a Claim-Anchored Multi-document Summarization framework that (i) extracts atomic claims with token-level provenance from every source document, (ii) clusters equivalent claims across documents while flagging inter-source conflicts, (iii) selects a support-aware and salient subset, and (iv) rewrites the selection into a summary in which every sentence is anchored to a support-checked claim that links back to one or more source spans.
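The four stages above can be sketched as a small data flow. This is a minimal illustration, not the authors' implementation: the `Span` and `Claim` types, the text-normalization equivalence key, and the `min_sources` rule are all assumptions made here for the example. A real system would presumably use learned models for extraction, equivalence, and support checking, and conflict flagging is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Token-level provenance: a half-open token range in one source document."""
    doc_id: str
    start: int
    end: int


@dataclass
class Claim:
    """An atomic claim plus the source spans it was extracted from."""
    text: str
    spans: list


def normalize(text):
    # Hypothetical equivalence key: lowercase, drop punctuation, collapse spaces.
    # This stands in for a paraphrase or entailment model.
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def cluster_claims(claims):
    """Group claims with the same normalized text and merge their spans."""
    groups = defaultdict(list)
    for claim in claims:
        groups[normalize(claim.text)].append(claim)
    return [
        Claim(group[0].text, [s for c in group for s in c.spans])
        for group in groups.values()
    ]


def select_claims(clusters, min_sources=2):
    """Keep clusters backed by at least `min_sources` distinct documents."""
    return [
        c for c in clusters
        if len({s.doc_id for s in c.spans}) >= min_sources
    ]


def render(selected):
    """Emit one sentence per selected claim, followed by its source anchors."""
    lines = []
    for c in selected:
        anchors = ", ".join(f"{s.doc_id}[{s.start}:{s.end}]" for s in c.spans)
        lines.append(f"{c.text} ({anchors})")
    return lines


# Toy input: two documents agree on one claim, one claim has a single source.
claims = [
    Claim("The bridge closed on Monday.", [Span("d1", 4, 10)]),
    Claim("the bridge closed on monday", [Span("d2", 0, 5)]),
    Claim("Repairs will cost $2M.", [Span("d1", 20, 25)]),
]
summary = render(select_claims(cluster_claims(claims)))
print(summary)  # ['The bridge closed on Monday. (d1[4:10], d2[0:5])']
```

Because each `Claim` carries its spans from extraction onward, the anchors in the output come from the data structure and are not reconstructed after generation, which is the sense in which content is localized before it is realized.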
Because content is localized before it is realized, the pipeline is attribution-oriented by construction and faithfulness-oriented by construction: it structurally preserves fine-grained, multi-source traceability while using support-aware selection, constrained rewriting, and verification to encourage, rather than guarantee, factual faithfulness.
We evaluate quality, faithfulness, and localization on MultiNews, analyze conflict handling on DiverseSumm, and test zero-shot transfer on WCEP, using a two-regime protocol that separates reference-free citation quality from gold-aligned localization accuracy, and we add an evaluator-decoupled audit that tests citation precision with a support model never used for selection or verification.
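The two regimes can be illustrated with two small metric functions. The abstract does not give the exact metric definitions, so both formulas below are plausible readings assumed for this sketch: citation precision as the fraction of cited (sentence, evidence) pairs that a support model accepts, and localization accuracy as the fraction of sentences whose cited document set exactly matches a gold set. The `overlap_support` function is a toy stand-in for the independent support model used in the evaluator-decoupled audit.

```python
def citation_precision(cited_pairs, supports):
    """Reference-free regime: share of (sentence, evidence) pairs judged supported.

    `supports` is any callable (sentence, evidence) -> bool. For a decoupled
    audit it should be a model that was not used for selection or verification.
    """
    if not cited_pairs:
        return 0.0
    hits = sum(1 for sentence, evidence in cited_pairs if supports(sentence, evidence))
    return hits / len(cited_pairs)


def localization_accuracy(predicted, gold):
    """Gold-aligned regime: share of sentences whose cited docs match gold exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have one entry per sentence")
    if not gold:
        return 0.0
    matches = sum(1 for p, g in zip(predicted, gold) if set(p) == set(g))
    return matches / len(gold)


def overlap_support(sentence, evidence, threshold=0.6):
    # Toy support model (an assumption): token overlap of sentence with evidence.
    s = set(sentence.lower().split())
    e = set(evidence.lower().split())
    return bool(s) and len(s & e) / len(s) >= threshold


pairs = [
    ("the bridge closed monday", "officials said the bridge closed monday morning"),
    ("repairs cost two million", "the mayor declined to comment"),
]
precision = citation_precision(pairs, overlap_support)          # 0.5
accuracy = localization_accuracy([{"d1", "d2"}, {"d1"}],
                                 [{"d1", "d2"}, {"d3"}])        # 0.5
```

Passing the support model in as an argument makes the decoupling explicit: the same pairs can be rescored with a different judge without touching the pipeline that produced them.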
CAMS matches strong end-to-end and span-attribution baselines on summary quality while substantially improving faithfulness and citation precision, lifting multi-source attribution accuracy by roughly two-thirds, and exposing a controllable faithfulness–coverage trade-off that end-to-end models leave implicit.
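One way to picture a controllable faithfulness-coverage trade-off is a threshold on per-claim support scores: raising it keeps only well-supported claims but drops some salient ones. The sketch below assumes that mechanism, uses mean support of the kept claims as a rough proxy for faithfulness, and runs on made-up toy scores. None of this is taken from the paper's method or results.

```python
def sweep(claims, thresholds):
    """Sweep a support threshold over (support_score, is_salient) claims.

    Returns one row per threshold: (threshold, kept, coverage, mean_support).
    Coverage is the share of salient claims that survive the threshold.
    """
    total_salient = sum(1 for _, salient in claims if salient)
    rows = []
    for t in thresholds:
        kept = [(score, salient) for score, salient in claims if score >= t]
        covered = sum(1 for _, salient in kept if salient)
        coverage = covered / total_salient if total_salient else 0.0
        mean_support = sum(score for score, _ in kept) / len(kept) if kept else 0.0
        rows.append((t, len(kept), coverage, mean_support))
    return rows


# Toy scores for illustration only, not experimental data.
toy_claims = [(0.95, True), (0.80, True), (0.60, True), (0.40, False), (0.30, True)]
rows = sweep(toy_claims, [0.25, 0.5, 0.9])
for threshold, kept, coverage, mean_support in rows:
    print(f"t={threshold:.2f} kept={kept} coverage={coverage:.2f} support={mean_support:.2f}")
```

On these toy values coverage falls from 1.00 to 0.25 as the threshold rises, while the mean support of what remains goes up. An explicit knob like this can be set per use case, which an end-to-end model does not expose.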