Robust Cross-Domain Generalization Using Unlabeled Target Data with Source-Domain Supervision
Quick Take
A self-supervised pretraining method improves cross-domain generalization for pediatric wrist fracture assessment by learning from unlabeled images acquired with a TeleMED portable probe, achieving over 6% Dice improvement on the target domain versus the baseline. Segmentation labels come only from the source domain (a Philips Lumify probe), so no target-domain annotation is needed and the two datasets stay strictly separate. The authors present this as a label-efficient, privacy-preserving framework that can be extended to multi-center studies.
Key Points
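- Proposed method combines masked image modeling (MIM) and contrastive learning to learn target-domain representations without labels, plus a confidence-aware infusion head that adaptively integrates predictions.
- Achieved over 6% Dice improvement on the target domain versus the baseline, evaluated on 318 images from 62 pediatric POCUS videos.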
- Utilizes labeled source data while adapting to unlabeled target data for better generalization.
- Addresses challenges of domain shifts in ultrasound imaging across different devices.
- Framework is extendable to multi-center studies and federated learning setups.
Article Content
From source RSS / original summary
arXiv:2605.29122v1 Announce Type: new
Abstract: It is often desirable to generalize medical imaging AI models trained with dense annotations to data acquired from different ultrasound scanners or clinical sites; however, retraining these models with new annotations is often difficult and costly. We examine this challenge in pediatric wrist fracture assessment using point-of-care ultrasound (POCUS), where fractures are common and can be effectively triaged via ultrasound.
AI has shown radiologist-level performance for fracture detection, often aided by high-quality bony structure segmentation. However, due to significant domain shifts, models perform poorly on data from other centers or probes, and obtaining segmentation labels across devices is impractical due to manual annotation effort and data privacy concerns. To address this, we propose a target-informed self-supervised pretraining and model-ensemble strategy.
Specifically, our approach combines masked image modeling (MIM) and contrastive learning to learn target-domain structural representations without labels, and introduces a confidence-aware infusion head to adaptively integrate predictions. The source dataset, collected with a Philips Lumify probe, contained dense labels, while the target dataset, acquired with a TeleMED portable probe, was unlabeled. The datasets were kept strictly separate throughout the entire process.
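The two pretraining signals named above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the mask ratio, the temperature, the reconstruction target, or how the two losses are weighted, so `mask_ratio`, `temperature`, `weight`, and all function names below are hypothetical. InfoNCE is used here as one common form of contrastive loss; the paper may use a different one.

```python
import math
import random


def random_patch_mask(grid_h, grid_w, mask_ratio=0.6, rng=None):
    """Return a grid_h x grid_w grid of booleans; True marks a patch hidden from the encoder.

    mask_ratio is an assumed hyperparameter, not a value from the abstract.
    """
    rng = rng or random.Random()
    n_patches = grid_h * grid_w
    n_masked = round(mask_ratio * n_patches)
    masked = set(rng.sample(range(n_patches), n_masked))
    return [[(r * grid_w + c) in masked for c in range(grid_w)] for r in range(grid_h)]


def masked_mse(pred, target, mask):
    """Mean squared reconstruction error computed over masked patches only.

    Each patch is reduced to one scalar here to keep the sketch short.
    """
    errors = [
        (p - t) ** 2
        for pred_row, target_row, mask_row in zip(pred, target, mask)
        for p, t, m in zip(pred_row, target_row, mask_row)
        if m
    ]
    return sum(errors) / len(errors) if errors else 0.0


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE loss: view_a[i] and view_b[i] embed two augmentations of image i.

    Every other row of view_b acts as a negative for view_a[i].
    """
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    total = 0.0
    for i, a_i in enumerate(a):
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def pretraining_loss(mim_loss, contrastive_loss, weight=1.0):
    """Weighted sum of the two self-supervised terms (the weighting is an assumption)."""
    return mim_loss + weight * contrastive_loss
```

Neither function touches a segmentation label, which is what allows this stage to run on the unlabeled target-domain images alone. In practice the embeddings and reconstructions would come from an encoder and decoder, which are omitted here.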
Our method used labeled source data for supervised training and leveraged target-domain pretraining to improve generalization. On 318 images from 62 pediatric POCUS videos, this approach significantly improved cross-device performance, achieving over 6% Dice improvement on the target domain versus the baseline.
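For readers unfamiliar with the metric, the Dice score for a binary segmentation is 2|P ∩ T| / (|P| + |T|), where P is the predicted mask and T the ground truth. The sketch below is a generic implementation, not the paper's evaluation code; the function name and the convention of returning 1.0 when both masks are empty are assumptions. The abstract does not state whether the "over 6%" gain is in absolute Dice points or relative to the baseline.

```python
def dice_score(pred, target):
    """Dice coefficient for two binary masks given as nested lists of 0/1 values."""
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy 2x3 masks: 3 predicted pixels, 3 true pixels, 2 of them overlapping.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
example_dice = dice_score(pred_mask, true_mask)  # 2 * 2 / (3 + 3) = 0.666...
```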
These results demonstrate a label-efficient and privacy-preserving approach for cross-device-robust ultrasound AI, offering a framework that can be extended to multi-center studies or federated learning setups.