Yuvion VL: A Multimodal Foundation Model for Adversarial Content and AI Safety
Quick Answer
Yuvion VL is a family of multimodal large language models built for content and AI safety; its authors report industry-leading safety performance for the 32B variant.
Quick Take
Yuvion VL-32B is reported to surpass comparably sized open-source models and the best closed-source commercial models on safety tasks while maintaining comparable general capabilities. It is trained with a three-stage pipeline, and Confuse-then-Contrast Fine-Tuning sharpens its ability to separate visually similar cases that carry different safety implications.
Key Points
- Yuvion VL addresses multimodal adversarial risks in AI safety.
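- Uses a three-stage training pipeline: continued pretraining, instruct post-training, and reasoning post-training.
- Confuse-then-Contrast Fine-Tuning mines model-specific confusions and builds multi-image contrastive groups so the model can separate visually similar cases with different safety implications.
- Yuvion VL-32B is reported to outperform comparably sized open-source models and the best closed-source commercial models on safety benchmarks.
- Introduces Yuvion VL RiskEval (YVRE), a collection of open and internal benchmarks covering content and AI safety, adversarial robustness, and real-world capability requirements.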
Article Content
From source RSS / original summary: arXiv:2606.25034v1. Announce type: new.
Abstract: General-purpose models often struggle to reliably identify and understand real-world multimodal risks, largely due to the inherent multimodal adversarial nature of content and AI safety. We present Yuvion VL, a family of multimodal large language models purpose-built for content and AI safety, with both instruction-tuned and reasoning-oriented variants.
Yuvion VL addresses this gap by treating safety as an inherently adversarial and multimodal problem and designing the entire pipeline around adversarial robustness. For data construction, we develop an automated pipeline integrating adversarial-aware data synthesis with multi-stage quality control, producing large-scale, high-quality multimodal samples augmented with domain knowledge and reasoning annotations.
For training, we adopt a three-stage pipeline that includes continued pretraining for risk-concept cross-modal alignment, instruct post-training for production-grade safety tasks, and reasoning post-training for enhanced interpretability and performance in complex tasks.
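The stage ordering described above can be pictured as a simple sequential runner. This is only a minimal sketch: the stage names and goals paraphrase the abstract, while `Stage`, `PIPELINE`, `run_pipeline`, and `fake_train` are hypothetical names invented for illustration, and the assumption that each stage starts from the previous stage's checkpoint is ours, not something the abstract states.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Stage:
    name: str  # hypothetical identifier for the stage
    goal: str  # goal as paraphrased from the abstract


# Stage order follows the abstract; everything else here is illustrative.
PIPELINE = [
    Stage("continued_pretraining", "risk-concept cross-modal alignment"),
    Stage("instruct_post_training", "production-grade safety tasks"),
    Stage("reasoning_post_training", "interpretability and complex-task performance"),
]


def run_pipeline(checkpoint: str, train_stage: Callable[[str, Stage], str]) -> List[str]:
    """Run the stages in order, feeding each output checkpoint to the next stage."""
    history = [checkpoint]
    for stage in PIPELINE:
        checkpoint = train_stage(checkpoint, stage)
        history.append(checkpoint)
    return history


def fake_train(checkpoint: str, stage: Stage) -> str:
    """Placeholder trainer: does no training, only records which stage ran."""
    return f"{checkpoint}+{stage.name}"
```

Calling `run_pipeline("base", fake_train)` returns the list of checkpoint names produced along the way; a real trainer would replace `fake_train` with an actual training call.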
We further introduce Confuse-then-Contrast Fine-Tuning, a contrastive framework that mines model-specific confusions and constructs multi-image contrastive groups to enforce explicit discrimination of fine-grained visual-semantic elements, enabling the model to distinguish between visually similar cases with different safety implications in adversarial safety tasks.
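To make the idea of mining confusions and forming contrastive groups concrete, here is a rough sketch under stated assumptions. It is not the paper's method: the abstract does not say how confusions are detected or how groups are assembled. The sketch assumes a confusion is simply a misclassified sample and that a group pairs it with one correctly classified image of its true label and one of the label it was mistaken for. `Sample`, `mine_confusions`, and `build_contrastive_groups` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Sample:
    image_id: str
    label: str      # ground-truth safety label
    predicted: str  # label produced by the current model checkpoint


def mine_confusions(samples: List[Sample]) -> Dict[Tuple[str, str], List[Sample]]:
    """Group misclassified samples by (true label, predicted label)."""
    confusions = defaultdict(list)
    for s in samples:
        if s.predicted != s.label:
            confusions[(s.label, s.predicted)].append(s)
    return dict(confusions)


def build_contrastive_groups(samples: List[Sample]) -> List[Dict[str, object]]:
    """Build one multi-image group per confused sample (illustrative assumption)."""
    # Index correctly classified images by label to serve as references.
    correct = defaultdict(list)
    for s in samples:
        if s.predicted == s.label:
            correct[s.label].append(s.image_id)

    groups = []
    for (true_label, wrong_label), confused in mine_confusions(samples).items():
        positives = correct.get(true_label, [])
        negatives = correct.get(wrong_label, [])
        if not negatives:
            continue  # no reference image for the class it was confused with
        for s in confused:
            groups.append({
                "anchor": s.image_id,
                "same_label": positives[:1],
                "confused_with": negatives[:1],
            })
    return groups
```

Each group could then be presented to the model as a multi-image example in which it must explain why the anchor differs from the image it was confused with; how the paper actually formulates that objective is not given in the abstract.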
To support rigorous evaluation, we introduce Yuvion VL RiskEval (YVRE), a collection of benchmarks covering diverse open and internal evaluations, with a focus on content and AI safety, adversarial robustness, and real-world capability requirements. Experiments show that Yuvion VL-32B achieves industry-leading safety performance, surpassing comparably sized open-source models and the best closed-source commercial models, while maintaining comparable general capabilities.