Cross-Platform Chinese Offensive Comment Detection via Dual-Threshold Hard Example Mining

arXiv cs.CL·Ruixing Ren, Junhui Zhao, Fangfang Wang

6/29/2026

Quick Answer

This paper introduces a dual-threshold hard example mining method to enhance cross-platform offensive comment detection in Chinese social media.

Quick Take

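The method targets the performance drop that occurs when an offensive comment detector trained on one platform is deployed on another. A clean-Chinese-base RoBERTa model is first fine-tuned on COLD as a binary baseline, then evaluated on a three-class fine-labeled test set covering Weibo, Xiaohongshu, Tieba, and Zhihu. Hard examples mined from unlabeled corpora with two confidence thresholds are manually labeled in small numbers and used for a second round of fine-tuning, which the authors report yields significant gains on all four platforms.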

Key Points

Introduces dual-threshold hard example mining for offensive comment detection.
Fine-tunes clean-Chinese-base RoBERTa on COLD as a binary baseline and builds a three-class fine-labeled test set.
Quantifies domain distances using Jaccard and Proxy-A Distance metrics.
Reports significant performance gains across Weibo, Xiaohongshu, Tieba, and Zhihu.
Needs only a small set of manually labeled hard examples for cross-platform adaptation.
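
The two domain-distance measures named above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the character-level vocabulary and the function names `char_vocab`, `jaccard_distance`, and `proxy_a_distance` are assumptions, and the Proxy-A Distance uses the standard definition 2(1 - 2ε), where ε is the held-out error of a classifier trained to tell source from target samples.

```python
from typing import Iterable, Set


def char_vocab(comments: Iterable[str]) -> Set[str]:
    """Collect the set of non-whitespace characters used in a corpus.

    Character-level vocabulary is an assumption made for this sketch;
    a word- or subword-level vocabulary would work the same way.
    """
    return {ch for text in comments for ch in text if not ch.isspace()}


def jaccard_distance(vocab_a: Set[str], vocab_b: Set[str]) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B| (0 = identical sets, 1 = disjoint sets)."""
    union = vocab_a | vocab_b
    if not union:
        return 0.0  # two empty vocabularies are treated as identical
    return 1.0 - len(vocab_a & vocab_b) / len(union)


def proxy_a_distance(domain_classifier_error: float) -> float:
    """Return 2 * (1 - 2 * err) for a source-vs-target classifier error.

    An error of 0.5 (domains indistinguishable) gives 0; an error of 0
    (domains perfectly separable) gives the maximum value of 2.
    """
    return 2.0 * (1.0 - 2.0 * domain_classifier_error)
```

A larger value of either measure suggests the target platform is further from the source data, which is where a baseline would be expected to degrade most.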

[Submitted on 26 Jun 2026]

Abstract: Cross-platform deployment of offensive comment detection for Chinese social media suffers from performance degradation. The paper proposes a dual-threshold hard example mining method to address this. First, the clean-Chinese-base RoBERTa is fine-tuned on COLD to establish a binary baseline for fair comparison. Second, a three-class fine-labeled test set covering Weibo, Xiaohongshu, Tieba, and Zhihu is constructed, domain distances from the source are quantified using Jaccard and Proxy-A Distance, and the degradation bottleneck of the baseline under domain shift is systematically revealed. On this basis, a dual-threshold hard example mining strategy is proposed: high- and low-confidence error-prone samples are filtered from unlabeled corpora by prediction confidence. The model is then fine-tuned a second time under implicit contexts with only a small set of manually labeled hard examples, achieving low-cost cross-platform domain adaptation. Experiments show significant performance gains for the optimized model across the four platforms.
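
The confidence-based filtering step described in the abstract might look roughly like the sketch below. The abstract does not give the threshold values, the confidence definition, or how the labeling budget is split, so everything here is an assumption for illustration: confidence is taken as max(p, 1 - p) for a binary baseline, and `high=0.95`, `low=0.60`, `budget`, and the names `Candidate` and `mine_hard_candidates` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    text: str
    p_offensive: float  # baseline probability for the offensive class
    confidence: float   # max(p, 1 - p)
    bucket: str         # "high" or "low"


def mine_hard_candidates(
    texts: Sequence[str],
    p_offensive: Sequence[float],
    high: float = 0.95,   # hypothetical upper threshold
    low: float = 0.60,    # hypothetical lower threshold
    budget: int = 100,    # hypothetical number of samples sent to annotators
) -> List[Candidate]:
    """Select unlabeled samples for manual labeling using two thresholds."""
    if not 0.5 <= low < high <= 1.0:
        raise ValueError("expected 0.5 <= low < high <= 1.0")
    if len(texts) != len(p_offensive):
        raise ValueError("texts and p_offensive must have the same length")

    high_bucket: List[Candidate] = []
    low_bucket: List[Candidate] = []
    for text, p in zip(texts, p_offensive):
        conf = max(p, 1.0 - p)
        if conf >= high:
            # Very confident predictions: on a shifted domain these may be
            # confidently wrong, so a sample of them is worth checking.
            high_bucket.append(Candidate(text, p, conf, "high"))
        elif conf <= low:
            # Predictions near the decision boundary.
            low_bucket.append(Candidate(text, p, conf, "low"))

    low_bucket.sort(key=lambda c: c.confidence)    # most uncertain first
    high_bucket.sort(key=lambda c: -c.confidence)  # most confident first

    half = budget // 2
    return low_bucket[:half] + high_bucket[:budget - half]
```

For example, with probabilities `[0.98, 0.55, 0.75, 0.03]` the third sample falls between the two thresholds and is skipped, while the other three are returned for annotation. The returned candidates would then be labeled by hand and used for the second fine-tuning round.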

Comments:	10 pages, 7 figures
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
MSC classes:	68T50, 68U15, 91F10
ACM classes:	I.2.7; I.2.6; H.3.4
Cite as:	arXiv:2606.27629 [cs.CL]
	(or arXiv:2606.27629v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2606.27629 arXiv-issued DOI via DataCite

Submission history

From: Junhui Zhao
[v1] Fri, 26 Jun 2026 00:56:11 UTC (583 KB)

— Originally published at arxiv.org
