Answer Self-Consistency with Margin-Triggered Question Re-Arbitration for the CVPR 2026 VidLLMs Challenge

arXiv cs.CV·Tomoya Miyazawa, Hiroyasu Okuno

6/4/2026

Quick Take

ASC-MQRA is a training-free, test-time reasoning framework for video question answering, submitted to Track 2 of the CVPR 2026 VidLLMs Challenge. It combines answer self-consistency (ASC) over multiple stochastic runs with MQRA, a conditional re-arbitration module for low-margin examples. MQRA improved on ASC on validation but slightly degraded it on test, so the final submission uses ASC alone, reaching 81.16 average accuracy on test and substantially improving over single-pass inference.

Key Points

ASC-MQRA combines answer self-consistency with margin-triggered question re-arbitration.
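The ASC-only submission achieved 81.16 average accuracy and 80.91 category-wise macro average accuracy on test.
Low-margin examples often retain the ground-truth answer among the top candidates, which motivates re-arbitrating over a narrowed candidate set.
The final submission uses ASC alone because MQRA slightly degraded test performance, which the authors attribute to sensitivity to the size and category distribution of the triggered subset.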
Code available at https://github.com/data-analytics-labo/ASC-MQRA.

Article Content

From source RSS / original summary

arXiv:2606.04323v1. Abstract: In this report, we present our solution for Track 2 of the CVPR 2026 VidLLMs Challenge. This track evaluates visual relational reasoning in videos, where models must infer relations that are not always explicitly visible. We propose Answer Self-Consistency with Margin-Triggered Question Re-Arbitration (ASC-MQRA), a training-free test-time reasoning framework built on a multimodal reasoning model.

The core ASC component performs multiple stochastic video question-answering runs and aggregates their answer choices through answer-level self-consistency. This substantially improves over single-pass inference and forms our final test submission. We further study MQRA, a conditional re-arbitration module for low-margin examples where the first-stage vote distribution indicates uncertainty.
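As a rough illustration of answer-level self-consistency, the sketch below takes the answer choices from several stochastic runs and returns the majority choice. The function name `aggregate_votes`, the option letters, and the tie-breaking rule (the first-seen answer wins) are assumptions made for this example, not details taken from the report.

```python
from collections import Counter
from typing import Sequence


def aggregate_votes(answers: Sequence[str]) -> tuple[str, Counter]:
    """Majority vote over answer choices from repeated stochastic runs.

    Hypothetical helper: ties go to the answer seen first, which is an
    assumption of this sketch rather than the report's rule.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    votes = Counter(answers)
    # most_common orders equal counts by first occurrence.
    winner, _ = votes.most_common(1)[0]
    return winner, votes


# Toy example: five runs on one multiple-choice question.
final_answer, vote_counts = aggregate_votes(["B", "A", "B", "C", "B"])
print(final_answer, dict(vote_counts))
```

The returned vote counts are kept alongside the winner because the vote distribution is what a later uncertainty check would inspect.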

Our vote-margin analysis shows that low-margin examples often retain the ground-truth answer among the top candidates, motivating MQRA to narrow the candidate set and re-watch the video only over the retained candidates. On validation, MQRA further improves over ASC, indicating that low-margin vote distributions can provide a useful uncertainty signal.
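One way to picture the margin trigger is sketched below. The margin definition (gap between the top two vote shares), the threshold of 0.3, and keeping the top two candidates are all assumptions chosen for illustration; the summary does not state the values the authors used, and the second video-watching pass itself is not shown.

```python
from collections import Counter
from typing import Optional


def vote_margin(votes: Counter) -> float:
    """Gap between the top two vote shares, in [0, 1] (assumed definition)."""
    total = sum(votes.values())
    if total == 0:
        raise ValueError("vote distribution is empty")
    ranked = votes.most_common(2)
    top = ranked[0][1]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return (top - runner_up) / total


def retained_candidates(votes: Counter, threshold: float = 0.3,
                        keep_top_k: int = 2) -> Optional[list[str]]:
    """Return the narrowed candidate set for a low-margin example.

    Returns None when the margin is at or above the threshold, meaning the
    first-stage majority answer is kept without a second pass.
    """
    if vote_margin(votes) >= threshold:
        return None
    return [answer for answer, _ in votes.most_common(keep_top_k)]


print(retained_candidates(Counter("AAABB")))  # low margin: re-arbitrate over A and B
print(retained_candidates(Counter("AAAAB")))  # high margin: no second pass
```

In this sketch, only examples that return a candidate list would be sent back to the model with the question restricted to those options.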

On test, however, MQRA slightly degrades performance relative to ASC, suggesting that re-arbitration is sensitive to the size and category distribution of the triggered subset. Our final test submission therefore uses ASC without re-arbitration, achieving 72.73 average accuracy and 78.34 category-wise macro average accuracy on validation, and 81.16 average accuracy and 80.91 category-wise macro average accuracy on test.
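The two reported metrics can differ noticeably when question categories are unevenly sized. The sketch below uses the standard definitions, which are assumed here to match the challenge's scoring: average accuracy weights every question equally, while the category-wise macro average weights every category equally. The category names and records are made-up toy data.

```python
from collections import defaultdict
from typing import Iterable


def average_accuracy(records: Iterable[tuple[str, bool]]) -> float:
    """Fraction of correct answers over all questions."""
    records = list(records)
    return sum(correct for _, correct in records) / len(records)


def macro_average_accuracy(records: Iterable[tuple[str, bool]]) -> float:
    """Mean of per-category accuracies, each category weighted equally."""
    per_category = defaultdict(list)
    for category, correct in records:
        per_category[category].append(correct)
    accuracies = [sum(flags) / len(flags) for flags in per_category.values()]
    return sum(accuracies) / len(accuracies)


# Toy data (hypothetical categories): one large and one small category.
toy_records = [
    ("spatial", True), ("spatial", True), ("spatial", False),
    ("causal", True),
]
print(average_accuracy(toy_records))
print(macro_average_accuracy(toy_records))
```

Here the small category is answered perfectly, so the macro average comes out higher than the plain average.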

This report details our prompting strategy, implementation setup, ablation studies, and diagnostic analyses. The code is available at https://github.com/data-analytics-labo/ASC-MQRA.

