Detecting AI-Generated Content on Social Media with Multi-modal Language Models
Quick Answer
This paper shows that a new multi-modal vision-language model effectively detects AI-generated content on social media, achieving state-of-the-art performance on public benchmarks.
Quick Take
A compact vision-language model, trained on continuously curated multi-modal social media data, detects and explains AI-generated content and reaches state-of-the-art detection performance on public benchmarks. It targets three weaknesses of existing detectors: poor generalization to new generation models, reliance on single modalities, and lack of interpretable explanations. When the authors deployed it for post recommendation, they observed positive downstream impacts on user engagement.
Key Points
- Model achieves state-of-the-art detection performance on public benchmarks.
- Addresses poor generalization to new generation models, reliance on single modalities, and lack of interpretable explanations.
- Demonstrates robust detection and explanation capabilities on internal social media datasets across multiple platforms.
- Deployment for post recommendation showed positive downstream impacts on user engagement.
- Continuously curates diverse multi-modal social media data for training.
Article Excerpt
From source RSS / original summary: arXiv:2606.11200v1, Announce Type: new
Abstract: Generative AI has enabled the creation of photorealistic images and videos that are increasingly disseminated on social media, often used for spam, misinformation, manipulation, and fraud. Existing AI-generated content (AIGC) detection methods face challenges including poor generalization to new generation models, reliance on single modalities, and lack of interpretable explanations.
We present our pipeline that mitigates these issues by continuously curating diverse multi-modal social media data and training a compact vision-language model for detection and explanation. Our model achieves state-of-the-art detection performance on public benchmarks and demonstrates robust detection and explanation capabilities on internal social media datasets across multiple platforms.
We deployed our model for post recommendation on social media platforms and observed positive downstream impacts on user engagement, demonstrating that it is feasible to perform effective AIGC detection in dynamic, real-world social media environments.