DeceptionX: Explainable Deception Detection with Multimodal Large Language Models
Quick Take
DeceptionX is a multimodal large language model framework for deception detection that replaces black-box classification with an interpretable Observe-Think-Summarize process. It is trained on DeceptChain, a dataset that synthesizes audiovisual cues into structured reasoning data, and the authors report that it outperforms existing methods on real-world benchmarks while improving interpretability.
Key Points
- DeceptionX shifts deception detection to an interpretable reasoning process.
- Utilizes DeceptChain, a dataset with fine-grained audiovisual evidence.
- Employs a three-stage training pipeline to enhance model generalization.
- Outperforms existing MLLM baselines on standard real-world benchmarks.
- Bridges the gap between accuracy and interpretability in deception detection.
Article Content
From source RSS / original summary
arXiv:2606.11385v1 Announce Type: new
Abstract: Deception detection is a critical and highly challenging task within affective computing and behavioral analysis. Existing deep learning methods typically treat this task as a straightforward classification problem; however, this black-box approach lacks interpretability and fails to capture the complex logical deduction processes utilized by human experts when identifying lies.
While Multimodal Large Language Models (MLLMs) have shown potential, applying them effectively requires a bridge between low-level audiovisual cues and high-level logical reasoning. In this paper, we propose DeceptionX, a novel MLLM framework that shifts the paradigm of deception detection from black-box classification to an interpretable Observe-Think-Summarize reasoning process.
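To make the Observe-Think-Summarize idea concrete, here is a minimal sketch of how a three-stage response could be split into its parts. The abstract does not specify an output format, so the tag names (`<observe>`, `<think>`, `<summarize>`), the `Reasoning` class, and the `parse_reasoning` function are all hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass

# Hypothetical tag names; the abstract does not describe the real output format.
STAGES = ("observe", "think", "summarize")


@dataclass
class Reasoning:
    observe: str    # audiovisual cues the model reports
    think: str      # the reasoning that connects the cues
    summarize: str  # the final verdict


def parse_reasoning(text: str) -> Reasoning:
    """Extract the three stages from a tagged model response."""
    parts = {}
    for stage in STAGES:
        match = re.search(rf"<{stage}>(.*?)</{stage}>", text, re.DOTALL)
        if match is None:
            raise ValueError(f"missing <{stage}> section")
        parts[stage] = match.group(1).strip()
    return Reasoning(**parts)
```

Rejecting a response that lacks any stage is one possible design choice: it keeps a bare verdict without supporting observations from being treated as an interpretable answer.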
To address the scarcity of high-quality reasoning data, we first constructed DeceptChain, a high-quality dataset developed through a human-in-the-loop process. This dataset synthesizes fine-grained visual and auditory evidence (such as micro-expressions and vocal tremors) into structured chain-of-thought reasoning data. Furthermore, we propose a three-stage training pipeline and a Discrepancy-Aware Redundancy Elimination (DARE) strategy for DeceptionX to further enhance the model's generalization capabilities.
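As an illustration of what "synthesizing fine-grained evidence into structured chain-of-thought data" might look like, the sketch below models one training sample and renders it as a text target. The schema is an assumption made for this example: the field names, the `Cue` and `ReasoningSample` classes, and the `to_target` layout are hypothetical, and the actual DeceptChain format is not described in the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    modality: str     # assumed values: "visual" or "audio"
    start_s: float    # cue onset within the clip, in seconds
    end_s: float      # cue offset within the clip, in seconds
    description: str  # e.g. "brief lip press" or "vocal tremor"


@dataclass
class ReasoningSample:
    clip_id: str
    cues: List[Cue]
    rationale: str    # human-reviewed reasoning linking the cues
    label: str        # assumed values: "truthful" or "deceptive"

    def to_target(self) -> str:
        """Render the sample as an Observe / Think / Summarize text target."""
        lines = ["Observe:"]
        # List cues in temporal order so the evidence reads chronologically.
        for cue in sorted(self.cues, key=lambda c: c.start_s):
            span = f"{cue.start_s:.1f}-{cue.end_s:.1f}s"
            lines.append(f"- [{cue.modality} {span}] {cue.description}")
        lines.append(f"Think: {self.rationale}")
        lines.append(f"Summarize: {self.label}")
        return "\n".join(lines)
```

Keeping timestamps on each cue is a choice made for this sketch; it lets a reviewer in a human-in-the-loop process check each claimed observation against the clip.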
Extensive experiments demonstrate that DeceptionX not only outperforms existing MLLM baselines and state-of-the-art methods on standard real-world benchmarks but also provides transparent, expert-level reasoning paths, bridging the critical gap between accuracy and interpretability in multimodal deception detection.