MAMVI: 3D Test-Time Adaptation via Masked Multi-View Point Clouds
Quick Answer
MAMVI replaces per-view sequential optimization with a unified single-step adaptation for 3D point cloud models, cutting test-time adaptation latency while keeping accuracy at or near the state of the art.
Quick Take
MAMVI (Masked Multi-View Test-Time Adaptation) aggregates losses across several masked views of a point cloud and adapts the model in a single backward pass. It reports state-of-the-art accuracy on ShapeNet-C and ScanObjectNN-C, remains competitive on ModelNet-40C, and delivers 4.9-8.9 times faster inference than existing multi-view methods, which makes it a candidate for real-time applications.
Key Points
- MAMVI replaces sequential optimization with a single-step adaptation process.
- Uses a hybrid masking strategy: fixed mask ratios for stability and Beta-distributed ratios for diversity.
- Achieves state-of-the-art accuracy on ShapeNet-C and ScanObjectNN-C benchmarks.
- Delivers 4.9-8.9 times faster inference compared to existing multi-view methods.
- Code available at https://github.com/Inseok-kong/MAMVI.
Article Content
From source RSS / original summary (arXiv:2606.12939v1). Abstract: 3D point cloud models suffer significant performance degradation under distribution shifts caused by sensor noise, occlusions, and environmental changes. Test-time adaptation (TTA) has emerged as a practical paradigm for mitigating this issue during inference. Recently, leveraging multi-view augmentation has shown promise in improving 3D TTA performance.
However, existing multi-view approaches are often constrained by sequential optimization that treats each view independently. This sequential optimization leads to substantial inference latency due to repetitive optimization steps, making real-time adaptation impractical. To address this, we propose Masked Multi-View Test-Time Adaptation (MAMVI), which replaces sequential optimization with a unified single-step adaptation.
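The contrast between per-view sequential optimization and a unified single step can be sketched with a toy model. This is a minimal illustration, not the authors' implementation: the scalar parameter `w`, the quadratic stand-in loss, and the function names `view_grad`, `sequential_adapt`, and `unified_adapt` are all hypothetical. It only shows that the sequential scheme takes one optimizer step per view, while the unified scheme averages the per-view gradients and takes one step.

```python
from typing import List, Tuple


def view_grad(w: float, target: float) -> float:
    """Gradient of the stand-in per-view loss (w - target) ** 2 (an assumption)."""
    return 2.0 * (w - target)


def sequential_adapt(w: float, targets: List[float], lr: float) -> Tuple[float, int]:
    """One optimizer step per view: cost grows linearly with the number of views."""
    steps = 0
    for t in targets:
        w -= lr * view_grad(w, t)
        steps += 1
    return w, steps


def unified_adapt(w: float, targets: List[float], lr: float) -> Tuple[float, int]:
    """Average the per-view gradients, then take a single optimizer step."""
    mean_grad = sum(view_grad(w, t) for t in targets) / len(targets)
    return w - lr * mean_grad, 1


# Three hypothetical views, each pulling the parameter toward a different target.
view_targets = [1.0, 2.0, 3.0]
w_seq, n_seq = sequential_adapt(0.0, view_targets, lr=0.1)
w_uni, n_uni = unified_adapt(0.0, view_targets, lr=0.1)
print(f"sequential: w={w_seq:.3f} after {n_seq} steps")
print(f"unified:    w={w_uni:.3f} after {n_uni} step")
```

In a real network each step also involves a backward pass through the model, so reducing the step count from one per view to one per sample is where the latency saving would come from.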
Specifically, MAMVI utilizes a hybrid masking strategy that combines fixed ratios for stability with Beta-distributed sampling for diversity. By aggregating losses across multiple views, MAMVI performs adaptation through a single backward pass based on multi-view consensus. Additionally, a confidence-based adaptive learning rate is used to dynamically adjust the adaptation intensity for each sample.
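The masking and learning-rate ideas above can be sketched in plain Python. This is a hedged illustration under stated assumptions, not the paper's code: the specific fixed ratios, the Beta parameters, uniform random point dropping, and the rule that scales the learning rate by the maximum probability of the view-averaged prediction (higher confidence gives a larger step) are all assumptions, as are the function names.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def hybrid_mask_ratios(fixed: Sequence[float], n_random: int,
                       alpha: float, beta: float,
                       rng: random.Random) -> List[float]:
    """Fixed ratios (stability) plus n_random ratios drawn from Beta(alpha, beta) (diversity)."""
    return list(fixed) + [rng.betavariate(alpha, beta) for _ in range(n_random)]


def mask_points(points: Sequence[Point], ratio: float,
                rng: random.Random) -> List[Point]:
    """Drop about ratio * N points uniformly at random, keeping at least one point."""
    n = len(points)
    n_drop = max(0, min(n - 1, round(ratio * n)))
    keep = sorted(rng.sample(range(n), n - n_drop))
    return [points[i] for i in keep]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_scaled_lr(base_lr: float,
                         view_logits: Sequence[Sequence[float]]) -> float:
    """Assumed rule: scale base_lr by the max probability of the view-averaged prediction."""
    probs = [softmax(l) for l in view_logits]
    n_cls = len(probs[0])
    consensus = [sum(p[c] for p in probs) / len(probs) for c in range(n_cls)]
    return base_lr * max(consensus)


rng = random.Random(0)
cloud = [(float(i), 0.0, 0.0) for i in range(100)]  # placeholder point cloud
ratios = hybrid_mask_ratios([0.3, 0.6], n_random=2, alpha=2.0, beta=2.0, rng=rng)
views = [mask_points(cloud, r, rng) for r in ratios]
print([len(v) for v in views])
print(confidence_scaled_lr(1e-3, [[2.0, 0.5, 0.1], [1.5, 0.7, 0.2]]))
```

Passing an explicit `random.Random` instance keeps the sampled ratios reproducible, which makes it easier to compare the fixed and Beta-sampled views in isolation.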
Extensive experiments on ModelNet-40C, ShapeNet-C, and ScanObjectNN-C demonstrate that MAMVI achieves state-of-the-art accuracy on ShapeNet-C and ScanObjectNN-C. Moreover, it remains competitive on ModelNet-40C while delivering 4.9-8.9 times faster inference, making it highly suitable for real-time applications. Our code is available at https://github.com/Inseok-kong/MAMVI.