Learning Emergent Modular Representations in Multi-modality Medical Vision Foundation Models
Quick Take
DEX is a multi-modality medical vision foundation model that learns modular representations, balancing specialization of individual modules against coordination between them.
Key Points
- Addresses non-IID (not independent and identically distributed) feature statistics in medical imaging.
- Introduces a modular network with dynamic expert adaptation.
- Demonstrates improved performance across 26 downstream tasks.