Distinguishing Right from Wrong in Debates: Attribution Analysis of Chinese Harmful Memes

arXiv cs.CL·Weiming Wang, Junyu Lu, Han Wang, Xiaokun Zhang, Zewen Bai, Bo Xu, Liang Yang, Hongfei Lin

5/26/2026

Quick Take

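This study introduces Ex-ToxiCN-MM, the first explanation dataset for Chinese harmful memes, built to address two obstacles in this area: dependence on deep cultural context and semantic ambiguity. The authors also propose RIKE, an attribution analysis framework with an Attribution Knowledge Enhancement (AKE) module and a Relative Intent Reasoning (RIR) module, and report that it outperforms mainstream baseline models across multiple metrics on the task of attributing harmful memes in Chinese.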

Key Points

The Ex-ToxiCN-MM dataset provides opposing "harmful" and "non-harmful" interpretations for each Chinese meme.
The RIKE framework performs attribution analysis with two modules: Attribution Knowledge Enhancement (AKE) and Relative Intent Reasoning (RIR).
The method outperforms mainstream baseline models across multiple metrics in quantitative and qualitative experiments.
C-HarmKB, a knowledge base of Chinese cultural concepts and offensive vocabulary, supplies models with prior knowledge.
The code, dataset, and C-HarmKB are open-sourced on GitHub.

Article Content

From source RSS / original summary

arXiv:2605.24344v1. Abstract: Research on harmful meme detection has garnered significant attention, resulting in the development of numerous datasets and methods. However, progress in detecting Chinese harmful memes lags considerably, primarily due to two challenges: first, accurately assessing a meme's harmfulness depends heavily on understanding deep cultural context; second, many memes are semantically ambiguous, making harmfulness highly subjective.

To address these issues, we focus on the interpretable detection of Chinese harmful memes by constructing the first Chinese harmful meme explanation dataset, Ex-ToxiCN-MM. This dataset offers opposing interpretations, categorized as "harmful" and "non-harmful", for each meme, aiming to rigorously evaluate a model's ability to discern and comprehend ambiguous, culturally grounded content.
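As a rough illustration of what "opposing interpretations for each meme" could look like in practice, the sketch below models one record and a simple accuracy check. All field names, the presence of a gold `label`, and the `choose` callback are assumptions made for illustration; the summary does not describe the dataset's actual schema or evaluation protocol.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MemeExample:
    # Hypothetical schema: none of these field names come from the paper.
    meme_id: str
    text: str                        # text extracted from the meme image
    harmful_interpretation: str      # explanation arguing the meme is harmful
    non_harmful_interpretation: str  # explanation arguing it is not
    label: str                       # assumed gold verdict: "harmful" or "non-harmful"


def gold_interpretation(ex: MemeExample) -> str:
    """Return the interpretation that matches the assumed gold label."""
    if ex.label == "harmful":
        return ex.harmful_interpretation
    return ex.non_harmful_interpretation


def pairwise_accuracy(examples: List[MemeExample],
                      choose: Callable[[MemeExample], str]) -> float:
    """Fraction of memes for which the model picks the gold side.

    `choose` stands in for any model that reads both interpretations
    and returns "harmful" or "non-harmful".
    """
    if not examples:
        return 0.0
    correct = sum(1 for ex in examples if choose(ex) == ex.label)
    return correct / len(examples)
```

A model that always answers "harmful" would score only as well as the share of harmful memes in the sample, which is one reason paired interpretations can make a useful stress test for ambiguous content.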

We built a specialized knowledge base of Chinese cultural concepts and offensive vocabulary, C-HarmKB, to supply models with essential prior knowledge. To address the ambiguity and lack of background knowledge in meme attribution, we have developed a comprehensive attribution analysis framework, RIKE, which includes an Attribution Knowledge Enhancement module (AKE) and a Relative Intent Reasoning module (RIR).
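One simple way to picture supplying a model with prior knowledge is to look up knowledge-base terms that occur in the meme text and prepend their notes to the prompt. The sketch below does this with plain substring matching. It is not the AKE module: the summary does not say how C-HarmKB is structured or queried, so the dictionary format, the function names, and the prompt wording are all hypothetical.

```python
from typing import Dict


def retrieve_knowledge(meme_text: str, kb: Dict[str, str]) -> Dict[str, str]:
    """Return knowledge-base entries whose term appears in the meme text.

    `kb` maps a term to a short background note (an assumed format).
    """
    return {term: note for term, note in kb.items() if term in meme_text}


def build_prompt(meme_text: str, kb: Dict[str, str]) -> str:
    """Assemble a prompt that pairs the meme text with any matching notes."""
    hits = retrieve_knowledge(meme_text, kb)
    lines = [f"- {term}: {note}" for term, note in sorted(hits.items())]
    background = "\n".join(lines) if lines else "- (no matching entries)"
    return (
        f"Meme text: {meme_text}\n"
        f"Background knowledge:\n{background}\n"
        "Question: which interpretation, harmful or non-harmful, fits better?"
    )
```

Substring matching is used here only because it needs no extra dependencies; a real system might use retrieval over embeddings or multimodal features instead.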

Extensive quantitative and qualitative experiments demonstrate that our method outperforms mainstream baseline models across multiple metrics in the task of attributing harmful memes in Chinese. The code, Ex-ToxiCN-MM dataset, and Chinese Harmful Semantic Knowledge Base (C-HarmKB) involved in this study have been open-sourced at https://github.com/wimiw123/Ex-ToxiCN-MM



