Answer Self-Consistency with Margin-Triggered Question Re-Arbitration for the CVPR 2026 VidLLMs Challenge
Quick Take
The ASC-MQRA framework targets video question answering in Track 2 of the CVPR 2026 VidLLMs Challenge by combining answer self-consistency (ASC) with a conditional re-arbitration module (MQRA). The final submission used ASC alone, reaching 81.16% average accuracy on test and substantially improving over single-pass inference. MQRA improved on ASC on validation but slightly degraded test performance, so it was left out of the submission; the validation gain still indicates that low-margin vote distributions are a useful uncertainty signal.
Key Points
- ASC-MQRA combines answer self-consistency with margin-triggered question re-arbitration.
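- ASC alone achieved 81.16% average accuracy and 80.91% category-wise macro average accuracy on test.
- Low-margin examples often retain the ground-truth answer among the top candidates, which motivates re-arbitrating only over those candidates.
- The final submission used ASC alone because MQRA slightly degraded test performance, which the authors attribute to sensitivity to the size and category distribution of the triggered subset.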
- Code available at https://github.com/data-analytics-labo/ASC-MQRA.
Article Content
Original summary from source RSS (arXiv:2606.04323v1, announce type: new)
Abstract: In this report, we present our solution for Track 2 of the CVPR 2026 VidLLMs Challenge. This track evaluates visual relational reasoning in videos, where models must infer relations that are not always explicitly visible. We propose Answer Self-Consistency with Margin-Triggered Question Re-Arbitration (ASC-MQRA), a training-free test-time reasoning framework built on a multimodal reasoning model.
The core ASC component performs multiple stochastic video question-answering runs and aggregates their answer choices through answer-level self-consistency. This substantially improves over single-pass inference and forms our final test submission. We further study MQRA, a conditional re-arbitration module for low-margin examples where the first-stage vote distribution indicates uncertainty.
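To make the aggregation step concrete, here is a minimal sketch of answer-level self-consistency as majority voting over sampled answer choices. It is an illustration under assumptions, not the authors' implementation: the function name `aggregate_votes`, the letter-style answer labels, and the alphabetical tie-break are all hypothetical, and the sampled answers are toy data standing in for the outputs of repeated stochastic model runs.

```python
from collections import Counter


def aggregate_votes(answers):
    """Majority-vote over answer choices sampled from repeated runs.

    `answers` is a non-empty list of answer labels, one per stochastic run.
    Returns the winning label and the full vote counts.
    """
    counts = Counter(answers)
    # Highest count wins; ties are broken alphabetically so the result is
    # deterministic (the tie-break rule is an assumption for this sketch).
    winner = min(counts, key=lambda label: (-counts[label], label))
    return winner, counts


# Toy example: five hypothetical runs on one multiple-choice question.
sampled = ["B", "A", "B", "C", "B"]
winner, counts = aggregate_votes(sampled)
print(winner, dict(counts))
```

The vote counts are returned alongside the winner because the same distribution is what a later uncertainty check would inspect.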
Our vote-margin analysis shows that low-margin examples often retain the ground-truth answer among the top candidates, motivating MQRA to narrow the candidate set and re-watch the video only over the retained candidates. On validation, MQRA further improves over ASC, indicating that low-margin vote distributions can provide a useful uncertainty signal.
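The triggering logic described above can be sketched as follows. The abstract does not give the exact margin definition, threshold, or number of retained candidates, so everything here is an assumption: the margin is taken as the top vote share minus the runner-up vote share, and `margin_threshold` and `keep` are hypothetical parameters. The second-stage re-watch of the video over the retained candidates is not modeled.

```python
from collections import Counter


def vote_margin(answers):
    """Return (margin, ranked) for a non-empty list of sampled answers.

    The margin is assumed to be (top votes - runner-up votes) / total votes.
    """
    counts = Counter(answers)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ranked[0][1]
    second = ranked[1][1] if len(ranked) > 1 else 0
    return (top - second) / len(answers), ranked


def select_candidates(answers, margin_threshold=0.2, keep=2):
    """Decide whether to trigger re-arbitration and which candidates to keep.

    Returns (candidates, triggered). When the margin is high, the majority
    answer is accepted as-is; when it is low, the top `keep` answers are
    retained for a second, narrowed pass.
    """
    margin, ranked = vote_margin(answers)
    if margin >= margin_threshold:
        return [ranked[0][0]], False
    return [label for label, _ in ranked[:keep]], True


# Confident vote: no re-arbitration.
print(select_candidates(["A", "A", "A", "A", "B"]))
# Split vote: re-arbitrate between the two leading candidates.
print(select_candidates(["A", "B", "A", "B", "C"]))
```

Narrowing to the leading candidates only helps when the ground-truth answer is still among them, which is the property the vote-margin analysis reports for low-margin examples.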
On test, however, MQRA slightly degrades performance relative to ASC, suggesting that re-arbitration is sensitive to the size and category distribution of the triggered subset. Our final test submission therefore uses ASC without re-arbitration, achieving 72.73 average accuracy and 78.34 category-wise macro average accuracy on validation, and 81.16 average accuracy and 80.91 category-wise macro average accuracy on test.
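The two reported metrics can differ noticeably when question categories are unevenly sized. The sketch below assumes the usual definitions, which the abstract does not spell out: average accuracy is the fraction of all questions answered correctly, and category-wise macro average accuracy is the unweighted mean of per-category accuracies. The record fields and category names are hypothetical, and the data is a toy example, not challenge data.

```python
from collections import defaultdict


def average_accuracy(records):
    """Fraction of all records whose prediction matches the gold answer."""
    return sum(r["pred"] == r["gold"] for r in records) / len(records)


def macro_accuracy(records):
    """Unweighted mean of per-category accuracies."""
    per_category = defaultdict(list)
    for r in records:
        per_category[r["category"]].append(r["pred"] == r["gold"])
    per_category_acc = [sum(hits) / len(hits) for hits in per_category.values()]
    return sum(per_category_acc) / len(per_category_acc)


# Toy data: a large category at 3/4 correct and a small one at 1/1 correct.
records = [
    {"category": "spatial", "pred": "A", "gold": "A"},
    {"category": "spatial", "pred": "B", "gold": "B"},
    {"category": "spatial", "pred": "C", "gold": "C"},
    {"category": "spatial", "pred": "A", "gold": "D"},
    {"category": "causal", "pred": "B", "gold": "B"},
]
print(average_accuracy(records))  # 4 of 5 correct overall
print(macro_accuracy(records))    # mean of 0.75 and 1.0
```

Because the macro average weights every category equally, a method that changes answers on a small or skewed subset of questions can move the two metrics in different directions.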
This report details our prompting strategy, implementation setup, ablation studies, and diagnostic analyses. The code is available at https://github.com/data-analytics-labo/ASC-MQRA