When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models
Quick Take
This study finds that large language models (LLMs) exhibit 'global narrative dominance': on culturally grounded questions about Bengali culture, they favor globally dominant narratives over local context. Using the new CulturalNB dataset, the authors show that asking a question in English rather than Bangla reduces local perspective coverage and increases institutional framing, which points to a need for better grounding in cultural context.
Key Points
- The CulturalNB dataset includes 717 manually curated Bengali cultural instances with parallel Bangla-English question-answer pairs.
- Nine state-of-the-art LLMs were evaluated for cross-lingual consistency and bias metrics.
- English questions increased global substitution and institutional framing in responses.
- Local evidence improved factual consistency but did not eliminate epistemic shifts.
- Cultural failures in LLMs stem from narrative prioritization, not just missing knowledge.
Article Content
From the original arXiv summary (arXiv:2605.30481v1, announce type: new). Abstract: Large language models (LLMs) are widely used as cross-lingual knowledge interfaces. However, culturally grounded questions often reflect globally dominant narratives rather than local contexts. We study this failure mode as "global narrative dominance" in Bangla, a low-resource cultural context.
We introduce CulturalNB, a dataset of 717 manually curated Bengali cultural instances with parallel Bangla-English question-answer pairs and supporting evidence, metadata, and sociocultural annotations. Using question-only and evidence-based prompting, we evaluate nine state-of-the-art LLMs with human judges and two independent LLM judges across metrics for cross-lingual consistency, language anchoring, global substitution, institutional bias, and epistemic perspective coverage.
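As a rough illustration of the setup described above, the sketch below models one parallel instance and the two prompting modes (question-only and evidence-based). The record layout, the field names, the `build_prompt` helper, and the prompt wording are all assumptions made for illustration; the summary does not describe the dataset's actual schema or the prompts the authors used.

```python
from dataclasses import dataclass, field


@dataclass
class CulturalInstance:
    """Hypothetical record layout; field names are illustrative only."""
    question_bn: str   # question in Bangla
    question_en: str   # parallel question in English
    answer_bn: str     # reference answer in Bangla
    answer_en: str     # reference answer in English
    evidence: str      # supporting local evidence passage
    metadata: dict = field(default_factory=dict)  # e.g. topic, annotations


def build_prompt(inst: CulturalInstance, lang: str, with_evidence: bool) -> str:
    """Build a question-only or evidence-based prompt in the given language."""
    if lang not in ("bn", "en"):
        raise ValueError("lang must be 'bn' or 'en'")
    question = inst.question_bn if lang == "bn" else inst.question_en
    if with_evidence:
        # Evidence-based mode: prepend the supporting passage.
        return f"Evidence:\n{inst.evidence}\n\nQuestion: {question}"
    # Question-only mode: the model sees only the question.
    return f"Question: {question}"
```

Each instance then yields four prompts (two languages times two modes), so answers to the same underlying question can be compared across languages with the evidence condition held fixed.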
Results show that questions asked in English systematically increase global substitution and institutional framing while reducing local perspective coverage. Local evidence improves factual consistency and perspective coverage, but does not eliminate language-induced epistemic shifts. These findings suggest that cultural failures in LLMs are not only missing-knowledge errors but also failures of grounding and narrative prioritization.
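One simple way to quantify a language-induced shift like the one reported above is to compare how often judges flag a behavior in English responses versus Bangla responses. The sketch below does this for binary judge labels. The judgment format, the metric key, and the toy values are hypothetical and do not come from the paper, which may define and aggregate its metrics differently.

```python
def rate(flags: list[bool]) -> float:
    """Fraction of responses in which the behavior was flagged."""
    if not flags:
        raise ValueError("no judgments to aggregate")
    return sum(flags) / len(flags)


def language_shift(judgments: list[dict], metric: str) -> float:
    """English rate minus Bangla rate for one binary metric.

    A positive value means the behavior is flagged more often when the
    question is asked in English.
    """
    en = [bool(j[metric]) for j in judgments if j["lang"] == "en"]
    bn = [bool(j[metric]) for j in judgments if j["lang"] == "bn"]
    return rate(en) - rate(bn)


# Toy judgments, invented purely to show the calculation.
toy = [
    {"lang": "en", "global_substitution": True},
    {"lang": "en", "global_substitution": True},
    {"lang": "en", "global_substitution": False},
    {"lang": "en", "global_substitution": True},
    {"lang": "bn", "global_substitution": False},
    {"lang": "bn", "global_substitution": True},
    {"lang": "bn", "global_substitution": False},
    {"lang": "bn", "global_substitution": False},
]
shift = language_shift(toy, "global_substitution")  # 0.75 - 0.25 = 0.5
```

Running the same comparison separately for question-only and evidence-based prompts would show whether local evidence narrows the gap without closing it, which is the pattern the abstract describes.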