Causal Evidence for Attention Head Imbalance in Modality Conflict Hallucination

arXiv cs.AI·Jinrui Jiang, Zhangtai Wu, Zhen Wu, Xinyu Dai

Quick Take

Head-level causal analysis across five open-source MLLMs links modality-conflict hallucination to an imbalance between attention heads that drive the failure and heads that resist it.

Key Points

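Path patching identifies two groups of attention heads with opposing causal roles: hallucination-driving and hallucination-resisting.
Driving effects are broadly distributed and carry greater aggregate weight; resisting effects concentrate in a few high-importance heads.
MACI suppresses the identified driving heads only when a conflict is detected, and achieves the largest hallucination reduction among the compared inference-time baselines on the MMMC benchmark.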

[Submitted on 19 May 2026]

Abstract: Modality-conflict hallucination occurs when multimodal large language models (MLLMs) prioritize erroneous textual premises over contradictory visual evidence. To understand why visual evidence fails to prevail during generation, we take a mechanistic perspective and examine which internal components drive or resist this failure. We perform head-level causal analysis using path patching across five open-source MLLMs and identify two groups of attention heads with opposing causal roles: hallucination-driving heads and hallucination-resisting heads. We find a consistent asymmetry: driving effects are more broadly distributed and carry greater aggregate weight, whereas resisting effects concentrate in a small number of high-importance heads. Ablation experiments further confirm that these groups exert opposing effects during generation: distributed driving influence and localized resistance together form an imbalanced routing structure that biases generation toward the erroneous premise. Motivated by this finding, we propose MACI (Modality-conflict-Aware Causal Intervention), a conditional intervention that suppresses causally identified hallucination-driving heads only when conflict is detected. Across five MLLMs, MACI achieves the largest hallucination reduction among compared inference-time baselines on the MMMC benchmark with a favorable hallucination-accuracy trade-off, and transfers zero-shot to the SCI-SemanticConflict test.
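
The conditional suppression idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the sign convention (positive score means hallucination-driving), the top-k selection rule, the zero scaling factor, and the externally supplied conflict flag are all assumptions made for illustration, and the abstract does not say how MACI detects conflict or selects heads.

```python
import numpy as np


def top_driving_heads(effects, k):
    """Pick up to k heads with the largest positive causal effect.

    Assumption: a positive score marks a hallucination-driving head and a
    negative score marks a hallucination-resisting head.
    """
    order = np.argsort(effects)[::-1][:k]
    return [int(i) for i in order if effects[i] > 0]


def suppress_heads(head_outputs, driving_heads, conflict_detected, scale=0.0):
    """Scale the selected heads' outputs, but only when a conflict is flagged.

    head_outputs: array of shape (num_heads, seq_len, head_dim)
    conflict_detected: flag from a hypothetical conflict detector
    """
    if not conflict_detected:
        return head_outputs  # no conflict: leave the forward pass untouched
    gated = head_outputs.copy()  # do not modify the caller's array
    gated[list(driving_heads)] *= scale
    return gated


# Toy per-head effect scores (made up for this example).
effects = np.array([0.1, 0.6, -0.9, 0.05, -0.1, 0.4, 0.0, 0.2])
driving = top_driving_heads(effects, k=2)

rng = np.random.default_rng(0)
heads = rng.normal(size=(8, 4, 16))  # (num_heads, seq_len, head_dim)

out_conflict = suppress_heads(heads, driving, conflict_detected=True)
out_clean = suppress_heads(heads, driving, conflict_detected=False)
```

In a real model the same gating would sit inside the attention module, before the per-head outputs are merged by the output projection. Keeping the intervention conditional means inputs without a conflict pass through unchanged.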

Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2605.19250 [cs.AI]
	(or arXiv:2605.19250v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2605.19250 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Jinrui Jiang
[v1] Tue, 19 May 2026 01:47:53 UTC (311 KB)

— Originally published at arxiv.org
