Multilingual Polarization Detection Using Transformer-Based Models with Class Weighting and Threshold Tuning
Quick Answer
This study presents a transformer-based approach for multilingual polarization detection, achieving F1 macro scores of 0.7901 for English and 0.7910 for Swahili in binary detection.
Quick Take
The approach uses class-weighted loss functions to counter label imbalance and per-label threshold tuning to optimize the multi-label subtasks, and it places competitively on the SemEval-2026 Task 9 leaderboard across all three subtasks for English and Swahili.
Key Points
- Uses RoBERTa-base for English and AfroXLMR-base for Swahili.
- Achieves test-set macro F1 scores of 0.7901 (English) and 0.7910 (Swahili) for binary detection (Subtask 1).
- Class-weighted loss functions address severe label imbalance.
- Per-label threshold tuning optimizes the multi-label subtasks, where macro F1 ranges from 0.4615 to 0.5830.
- Error analysis shows the models struggle most with the dehumanization and lack-of-empathy labels.
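The class-weighting idea in the points above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary does not say how the weights were derived, so the inverse-frequency heuristic and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions made for this example.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count).

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood.

    probs[i] maps each class to its predicted probability for example i.
    The sum is normalized by the total weight of the targets, so the loss
    stays on the same scale as the unweighted version.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Toy imbalanced data: three non-polarized (0) examples, one polarized (1).
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels)
probs = [{0: 0.9, 1: 0.1}, {0: 0.8, 1: 0.2}, {0: 0.7, 1: 0.3}, {0: 0.6, 1: 0.4}]
loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights, the single misclassified minority example contributes as much to the loss as the three majority examples combined, which is the effect class weighting is meant to have under imbalance.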
Article Excerpt
From source RSS / original summary (arXiv:2606.30857v1). Abstract: This paper describes our submission to SemEval-2026 Task 9 on detecting multilingual, multicultural, and multievent online polarization. We address all three subtasks: binary polarization detection, polarization type classification, and manifestation identification for English and Swahili.
Our approach leverages transformer-based models (RoBERTa-base for English, AfroXLMR-base for Swahili) with class-weighted loss functions to address severe label imbalance and per-label threshold tuning to optimize multi-label classification. On the test set, we achieve F1 macro scores of 0.7901 (English) and 0.7910 (Swahili) for Subtask 1, 0.4615 (English) and 0.4808 (Swahili) for Subtask 2, and 0.4791 (English) and 0.5830 (Swahili) for Subtask 3, which give competitive performance on the leaderboard, demonstrating the effectiveness of our methods for handling imbalanced multi-label polarization detection. Our error analysis reveals that models struggle with dehumanization detection and lack of empathy.
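Per-label threshold tuning, as named in the abstract, can be illustrated with a small grid search on validation data. This is a sketch under stated assumptions: the abstract does not give the search procedure, so the 0.05-step grid, the choice of per-label F1 as the objective, and the names `f1_score` and `tune_thresholds` are hypothetical.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for one label; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(val_probs, val_labels, grid=None):
    """Pick, for each label column, the grid threshold with the best validation F1.

    val_probs and val_labels are lists of rows with one column per label.
    Ties are broken in favor of the lowest threshold on the grid.
    """
    grid = grid or [i / 20 for i in range(1, 20)]  # assumed grid: 0.05 .. 0.95
    thresholds = []
    for j in range(len(val_labels[0])):
        truth = [row[j] for row in val_labels]
        scores = [row[j] for row in val_probs]
        best = max(
            grid,
            key=lambda t: f1_score(truth, [int(s >= t) for s in scores]),
        )
        thresholds.append(best)
    return thresholds


# Toy validation set with two labels; the second label gets systematically
# lower scores, so a shared 0.5 cutoff would miss all of its positives.
val_probs = [[0.91, 0.22], [0.83, 0.31], [0.33, 0.12], [0.21, 0.04]]
val_labels = [[1, 1], [1, 1], [0, 0], [0, 0]]
thresholds = tune_thresholds(val_probs, val_labels)
```

Thresholds chosen this way should be fixed on held-out validation data and then applied unchanged to the test set; tuning them on test data would inflate the reported scores.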