HierBias: Context-Conditioned Hierarchical Media Bias Detection with Multi-Task Type Classification

arXiv cs.CL·Kaining Li, Ruichen Yan, Yuxin Dong

·~2 min·6/26/2026·en·0

Quick Answer

HierBias is a hierarchical media bias detector that improves sentence-level classification by incorporating document context. Evaluated on BABE and BASIL, it achieves 0.853 F1 and 0.723 MCC, surpassing the previous state-of-the-art bias detector by 2.6% F1 and 4.3% MCC.
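
For readers unfamiliar with the two reported metrics, here is a minimal sketch of how F1 and MCC (Matthews correlation coefficient) are computed from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def f1_and_mcc(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (F1, MCC) for a binary confusion matrix."""
    # F1 is the harmonic mean of precision and recall; it ignores true negatives.
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # MCC uses all four cells, so it stays informative under class imbalance.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return f1, mcc


# Hypothetical counts for 100 sentences (not results from the paper).
f1, mcc = f1_and_mcc(tp=40, fp=10, fn=10, tn=40)
print(f"F1 = {f1:.3f}, MCC = {mcc:.3f}")
```

Because MCC also accounts for true negatives, it is usually lower than F1 on the same predictions, which is consistent with the gap between the two reported scores.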

Key Points

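HierBias introduces a context-conditioned bias probability that formally models document context in sentence-level bias prediction.
The model pairs a sentence-level RoBERTa encoder with a cross-sentence Transformer aggregator and two output heads: binary bias detection and four-class bias type classification.
It achieves 0.853 F1 and 0.723 MCC on the BABE and BASIL benchmarks.
It surpasses the state-of-the-art bias detector by 2.6% F1 and 4.3% MCC (McNemar's test, p < 0.05).
Jointly training bias detection and bias type classification improves sample efficiency on small annotated corpora, according to a multi-task generalization bound.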
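
The joint training mentioned in the last point can be pictured as a weighted sum of two losses, one per output head. The sketch below is an assumption for illustration only: the source does not state the loss form, the weighting, or how sentences without a type label are handled, and the names `joint_loss` and `type_weight` are hypothetical.

```python
import math


def joint_loss(p_bias, y_bias, type_probs, y_type, type_weight=0.5):
    """Per-sentence multi-task loss (illustrative, not the paper's formula).

    p_bias      predicted probability that the sentence is biased
    y_bias      gold binary label (0 or 1)
    type_probs  predicted distribution over the four bias types
    y_type      gold type index, or None when no type label exists
    type_weight assumed trade-off between the two tasks
    """
    eps = 1e-12  # guards against log(0)
    # Binary cross-entropy for the detection head.
    bce = -(y_bias * math.log(p_bias + eps)
            + (1 - y_bias) * math.log(1 - p_bias + eps))
    # Assumption: the type head is only supervised when a type label exists.
    ce = -math.log(type_probs[y_type] + eps) if y_type is not None else 0.0
    return bce + type_weight * ce


# Hypothetical predictions from an untrained, maximally uncertain model.
loss = joint_loss(p_bias=0.5, y_bias=1, type_probs=[0.25] * 4, y_type=2)
print(round(loss, 4))
```

In a sketch like this, both losses backpropagate into the shared encoder, which is the mechanism by which the auxiliary type labels could help the detection task on small corpora.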

[Submitted on 29 Apr 2026]

Abstract: Media bias detection is a critical task for ensuring fair and balanced information dissemination, yet existing sentence-level approaches classify each sentence independently, ignoring inter-sentence contextual signals that human annotators naturally exploit. We present HierBias, a hierarchical context-conditioned media bias detector that formally models document context in bias prediction. We introduce the context-conditioned bias probability and prove theoretically that leveraging document context strictly reduces the Bayes error of sentence-level classification when inter-sentence mutual information is non-zero. A multi-task generalization bound further establishes that jointly training binary bias detection and fine-grained bias type classification improves sample efficiency on small annotated corpora. Architecturally, HierBias pairs a sentence-level RoBERTa encoder with a cross-sentence Transformer aggregator and dual output heads for binary detection and four-class type classification. Evaluated on BABE and BASIL, HierBias achieves 0.853 F1 and 0.723 MCC, surpassing the state-of-the-art bias detector by +2.6% F1 and +4.3% MCC (McNemar's test, p < 0.05). Ablation experiments confirm that each theoretical component contributes independently and consistently.
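
The intuition behind the Bayes-error claim can be sketched with a standard textbook argument. The notation below is assumed for illustration and is not taken from the paper, whose own definitions, proof, and conditions may differ. Let Y in {0, 1} be the bias label of a sentence X, and let C denote its document context.

```latex
\begin{align*}
\eta(x) &= P(Y = 1 \mid X = x), \qquad
\eta(x, c) = P(Y = 1 \mid X = x,\, C = c), \\
R^{*}_{\mathrm{sent}} &= \mathbb{E}_{X}\bigl[\min\{\eta(X),\, 1 - \eta(X)\}\bigr], \\
R^{*}_{\mathrm{ctx}} &= \mathbb{E}_{X, C}\bigl[\min\{\eta(X, C),\, 1 - \eta(X, C)\}\bigr], \\
\eta(x) &= \mathbb{E}\bigl[\eta(X, C) \mid X = x\bigr]
\;\Longrightarrow\;
R^{*}_{\mathrm{ctx}} \le R^{*}_{\mathrm{sent}}.
\end{align*}
```

The last step follows from Jensen's inequality, because the map t -> min{t, 1 - t} is concave. Equality holds, for example, when the label is conditionally independent of the context given the sentence. This general argument only yields a non-strict inequality; the strict reduction under non-zero inter-sentence mutual information is the paper's own result, as stated in the abstract.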

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2606.26100 [cs.CL]
	(or arXiv:2606.26100v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2606.26100 arXiv-issued DOI via DataCite

Submission history

From: Kaining Li
[v1] Wed, 29 Apr 2026 18:33:42 UTC (86 KB)

— Originally published at arxiv.org
