The Structural Attention Tax: How Retrieval Format Hijacks In-Context Learning Independent of Content

arXiv cs.CL·Yuqi Zhang, Di Zhang

6/11/2026


Quick Answer

The study identifies a "structural attention tax" in retrieval-augmented generation (RAG) systems: knowledge graph triples capture 2-3x more attention per token than semantically equivalent natural-language text, compressing the attention paid to in-context demonstrations by up to 42%.
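To make "attention per token" concrete, here is a minimal sketch of how a per-token attention ratio between two context segments could be computed. The function name, the segment labels, and the weights are all hypothetical toy values chosen for illustration. They are not data from the paper, and the paper's own metric may be defined differently.

```python
def attention_per_token(weights, segments):
    """Return each segment's total attention mass divided by its token count.

    weights:  attention weights from one query position over the context tokens
    segments: mapping of segment name -> (start, end) token span, end exclusive
    """
    per_token = {}
    for name, (start, end) in segments.items():
        mass = sum(weights[start:end])
        per_token[name] = mass / (end - start)
    return per_token


# Toy attention distribution over 8 context tokens (invented values, sums to 1).
weights = [0.0625] * 4 + [0.1875] * 4
# Hypothetical layout: 4 demonstration tokens followed by 4 retrieved tokens.
segments = {"demonstrations": (0, 4), "retrieved": (4, 8)}

per_token = attention_per_token(weights, segments)
ratio = per_token["retrieved"] / per_token["demonstrations"]
print(ratio)
```

Normalising by token count is what lets segments of different lengths be compared. A longer segment would otherwise look more attended simply because it has more tokens.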

Quick Take

The effect is independent of content relevance, so two things need attention: the quality of what is retrieved and the attention captured by its format alone. The summary points to a large performance gap between retrieval strategies, observed across models such as Mistral-7B and LLaMA-3-8B.

Key Points

Knowledge graph triples capture 2-3x more attention per token than semantically equivalent natural-language text.
Attention to in-context demonstrations can be compressed by up to 42% because of format alone.
Task-matched BM25 retrieval outperforms ConceptNet retrieval by over 30 percentage points.
Five structure-aware mitigation strategies were derived, including prompt modifications.
Format flattening was validated through accuracy and attention-level evidence.
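As an illustration of what format flattening could look like, the sketch below rewrites knowledge graph triples as plain sentences before they are placed in a prompt. This summary does not describe the paper's actual procedure, so the templates, function names, and fallback wording are assumptions. Only the relation names (IsA, UsedFor, AtLocation) are real ConceptNet relations.

```python
# Hypothetical sentence templates keyed by ConceptNet-style relation names.
TEMPLATES = {
    "IsA": "{head} is a kind of {tail}.",
    "UsedFor": "{head} is used for {tail}.",
    "AtLocation": "{head} is found at {tail}.",
}
# Assumed fallback for relations without a template.
DEFAULT_TEMPLATE = "{head} is related to {tail}."


def flatten_triple(head, relation, tail):
    """Render one (head, relation, tail) triple as a natural-language sentence."""
    template = TEMPLATES.get(relation, DEFAULT_TEMPLATE)
    sentence = template.format(head=head, tail=tail)
    # Capitalise only the first character; the rest is left unchanged.
    return sentence[0].upper() + sentence[1:]


def flatten_triples(triples):
    """Join flattened triples into a single passage of running text."""
    return " ".join(flatten_triple(*triple) for triple in triples)


triples = [("knife", "UsedFor", "cutting"), ("dog", "IsA", "animal")]
passage = flatten_triples(triples)
print(passage)
```

The point of the rewrite is to drop the delimiters and repeated slot patterns of the triple format while keeping the same facts in the context.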

Source Excerpt

arXiv:2606.11198v1. Abstract: Retrieval-augmented generation (RAG) systems inject external knowledge to improve outputs, yet the format of injected content, as distinct from its semantic relevance, can independently distort the model's attention distribution.

We identify and formalise a phenomenon we term the structural attention tax: knowledge graph (KG) triples, due to their relational delimiters and repeated slot patterns, capture 2-3x more attention per token than semantically equivalent natural-language text ($\hat{o}(\mathrm{KG}) \approx 0.70$ vs. …
