DraDDP: A Multimodal Multi-Party Dialogue Discourse Parsing Dataset
Quick Take
According to its authors, DraDDP is the first publicly available English multimodal dataset for multi-party dialogue discourse parsing, with 495 dialogue segments, 6,374 utterances, and 9.1 hours of parallel video from American TV dramas. The authors benchmark the task on DraDDP, and their experiments indicate that multimodal information helps capture dialogue structures and relation types.
Key Points
- DraDDP includes 6,374 utterances across various multi-party interactions.
- The dialogues come from American TV dramas and are paired with 9.1 hours of parallel video.
- Comprehensive benchmarks reveal the impact of different modalities on dialogue parsing.
- The authors plan to publicly release the dataset, annotation guidelines, and code for future research.
Article Excerpt
From source RSS / original summary: arXiv:2606.00012v1, announce type: new.
Abstract: Multi-party dialogue discourse parsing aims to identify dependency structures and relation types between utterances in conversations. Previous studies are mostly limited to textual modality or two-party dialogue, failing to meet the multimodal and multi-party settings. In this paper, we construct the first publicly available English multimodal dataset DraDDP for multi-party dialogue discourse parsing, based on American TV dramas.
DraDDP contains 495 dialogue segments with 6,374 utterances and 9.1 hours of parallel video content, covering rich multi-party interaction scenarios. Moreover, we establish comprehensive benchmarks by evaluating this task on DraDDP and conducting in-depth analysis on the impact of different modalities. Experimental results demonstrate the value of multimodal information in capturing dialogue structures and relation types.
We will publicly release the dataset, annotation guidelines, and code to promote future research in multimodal dialogue understanding.
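To make the task concrete, here is a minimal Python sketch of what a discourse parse over a multi-party dialogue might look like: each utterance attaches to an earlier one through a labeled link. The class names, the helper functions, the relation labels, and the rule that links always point from an earlier utterance to a later one are all assumptions made for illustration. They are not DraDDP's actual schema or label set, and the toy dialogue is invented rather than taken from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    index: int      # position in the dialogue segment
    speaker: str
    text: str


@dataclass(frozen=True)
class DiscourseLink:
    head: int       # index of the earlier utterance being responded to
    dependent: int  # index of the later utterance that attaches to it
    relation: str   # relation type; the labels used below are placeholders


def is_valid_parse(utterances, links):
    """Check that every link stays in range and points from an earlier
    utterance to a later one (an assumption of this sketch)."""
    n = len(utterances)
    return all(0 <= link.head < link.dependent < n for link in links)


def parents(links, dependent):
    """Return the (head, relation) pairs attached to one utterance."""
    return [(l.head, l.relation) for l in links if l.dependent == dependent]


# Toy three-speaker exchange, invented for illustration (not from DraDDP).
utterances = [
    Utterance(0, "A", "Did anyone see my keys?"),
    Utterance(1, "B", "They are on the table."),
    Utterance(2, "C", "I have not seen them."),
    Utterance(3, "A", "Thanks, found them."),
]
links = [
    DiscourseLink(0, 1, "Question-Answer"),
    DiscourseLink(0, 2, "Question-Answer"),
    DiscourseLink(1, 3, "Acknowledgement"),
]

print(is_valid_parse(utterances, links))
print(parents(links, 3))
```

In a multi-party setting, a single question can receive answers from several speakers, which is why utterance 0 is the head of two links here. A parser's job is to predict both the links and their relation labels.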