DOME: Learning Transferable Domain Variables from Sparse Supervision for Test-Time Adaptation
Quick Answer
DOME introduces a novel domain encoder for test-time adaptation, effectively modeling sample-specific domains with sparse supervision.
Quick Take
By leveraging vision-language pretraining and injecting explicit per-sample domain cues into downstream models, DOME reports state-of-the-art performance on ImageNet-C, ImageNet-R, and ImageNet-Sketch, outperforming more complex methods while using only a basic entropy-minimization strategy.
Key Points
- DOME models each sample's domain in a zero-shot manner for robust adaptation.
- It uses vision-language pretraining to extract dense, continuous representations.
- A momentum-updated sparse domain bank provides disentangled supervision.
- Achieves state-of-the-art results on ImageNet-C, ImageNet-R, and ImageNet-Sketch.
- Demonstrates that structured domain representation is key to effective adaptation.
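To make the "momentum-updated" bank in the key points concrete, here is a minimal sketch of an exponential-moving-average update on a keyed bank of vectors. It is not the authors' implementation: the function name `update_bank`, the string keys, the default `momentum` value, and the reading of "sparse" as "only the entry for the observed key is touched" are all assumptions made for illustration.

```python
def update_bank(bank, key, feature, momentum=0.9):
    """Momentum (EMA) update of a single bank entry.

    Hypothetical helper: `bank` maps a domain key to a prototype vector.
    Only the entry for `key` changes; every other entry is left as-is.
    """
    if key not in bank:
        # First observation of this key: store the feature directly.
        bank[key] = list(feature)
    else:
        # Blend the old prototype with the new feature.
        bank[key] = [
            momentum * old + (1.0 - momentum) * new
            for old, new in zip(bank[key], feature)
        ]
    return bank


bank = {}
update_bank(bank, "fog", [1.0, 0.0])    # initializes the "fog" entry
update_bank(bank, "fog", [0.0, 1.0])    # moves it slightly toward the new feature
update_bank(bank, "sketch", [0.5, 0.5])  # a different key; "fog" is unaffected
```

A high momentum keeps each entry stable against noisy individual samples, which is the usual reason to prefer a moving average over overwriting the entry.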
Article Excerpt
From source RSS / original summary
arXiv:2606.07646v1 Announce Type: new
Abstract: Test-time adaptation (TTA) aims to align a model to shifting test domains using only unlabeled streaming data. Most existing methods implicitly infer a single global domain distribution, ignoring the multidimensional and sample-specific nature of real-world domain shifts, leading to fragile adaptation. We propose DOME, an effective domain encoder that explicitly models each sample's domain in a zero-shot manner.
DOME leverages vision-language pretraining to extract dense, continuous representations, parameterizes domains as distributional variables, and introduces a momentum-updated sparse domain bank for disentangled supervision. By injecting these explicit domain cues into downstream models, even a basic entropy-minimization TTA strategy achieves state-of-the-art performance across ImageNet-C, ImageNet-R, and ImageNet-Sketch, outperforming complex TTA approaches.
Our results demonstrate that robust adaptation stems not from intricate adaptation algorithms, but from explicit, structured domain representation.
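The "basic entropy-minimization TTA strategy" mentioned in the abstract can be illustrated with a small, dependency-free sketch. It is a generic illustration of entropy minimization, not DOME's code: the names `adapt_step` and `bias`, the learning rate, and the choice to adapt a per-class bias added to fixed logits are all assumptions. In a real system the gradient would flow into model parameters rather than a standalone bias vector.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i), in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_grad(logits):
    """Gradient of H(softmax(z)) w.r.t. z: dH/dz_i = -p_i * (log p_i + H)."""
    probs = softmax(logits)
    h = entropy(probs)
    return [-p * (math.log(p) + h) if p > 0.0 else 0.0 for p in probs]


def adapt_step(bias, logits, lr=0.5):
    """One gradient-descent step on the prediction entropy.

    Hypothetical setup: `bias` is the only adaptable parameter and is
    added to the (fixed) logits of one unlabeled test sample.
    """
    shifted = [z + b for z, b in zip(logits, bias)]
    grad = entropy_grad(shifted)
    return [b - lr * g for b, g in zip(bias, grad)]


logits = [2.0, 1.0, 0.5]   # made-up model outputs for one test sample
bias = [0.0, 0.0, 0.0]

before = entropy(softmax(logits))
bias = adapt_step(bias, logits)
after = entropy(softmax([z + b for z, b in zip(logits, bias)]))

print(f"entropy before: {before:.4f}, after: {after:.4f}")
```

No labels are used anywhere: the update only sharpens the model's own prediction, which is why this objective fits the unlabeled streaming setting the abstract describes.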