Look-Closer-Then-Diagnose: Confidence-Aware Ultrasound VQA via Active Zooming
Quick Take
A new framework improves ultrasound visual question answering (VQA) by combining interactive zooming with uncertainty-aware rewards.
Key Points
- Introduces a Zoom-then-Diagnose paradigm for lesion-focused reasoning.
- Implements uncertainty-aware rewards to boost model confidence.
- Achieves a 39.3% improvement in lesion localization across datasets.
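To make the first two points concrete, here is a minimal sketch of what a zoom step and a confidence-sensitive reward could look like. This summary does not give the paper's actual formulation, so everything here is an assumption for illustration: the names `zoom` and `confidence_aware_reward` are hypothetical, the image is a plain nested list, and the reward is a standard Brier-style score swapped in as a stand-in for whatever the authors use.

```python
from typing import List, Tuple

# (x0, y0, x1, y1), normalized to [0, 1]; an assumed box convention
Box = Tuple[float, float, float, float]


def zoom(image: List[List[int]], box: Box) -> List[List[int]]:
    """Crop a row-major image to a normalized box (hypothetical zoom step)."""
    h, w = len(image), len(image[0])
    x0, y0, x1, y1 = box
    # Clamp the start index and keep at least one pixel in each dimension.
    c0 = min(int(x0 * w), w - 1)
    r0 = min(int(y0 * h), h - 1)
    c1 = max(c0 + 1, int(x1 * w))
    r1 = max(r0 + 1, int(y1 * h))
    return [row[c0:c1] for row in image[r0:r1]]


def confidence_aware_reward(correct: bool, confidence: float) -> float:
    """Brier-style reward (an assumption, not the paper's formula).

    Confident correct answers score near 1; confident wrong answers
    are penalized more than hesitant wrong answers.
    """
    outcome = 1.0 if correct else 0.0
    return 1.0 - (confidence - outcome) ** 2


# Toy 4x4 "image": zoom into the bottom-right quadrant.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
crop = zoom(image, (0.5, 0.5, 1.0, 1.0))
```

With this kind of reward, a wrong answer given at 0.9 confidence scores lower than the same wrong answer given at 0.5, which is the general behavior an uncertainty-aware reward is meant to encourage.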