DarkVGGT: Seeing Through Darkness Using Thermal Geometry without Daylight Tax
Quick Answer
DarkVGGT is an RGB-T (RGB plus thermal) feed-forward geometry framework that uses physics-aware thermal modeling to make 3D scene estimation robust in low-light conditions.
Quick Take
DarkVGGT shows consistent improvements over existing feed-forward geometry baselines in both depth and camera pose estimation on low-visibility RGB-T benchmarks. It targets the main weakness of RGB-only approaches: their reliance on visible-light appearance, which degrades severely in the dark.
Key Points
- DarkVGGT employs physics-aware thermal modeling for robust 3D estimation.
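- Physics-inspired thermal factorization extracts emissive-dominant, geometry-consistent thermal cues and isolates sparse reflective residuals.
- Geometry-shared thermal routing separates modality-invariant geometric structure from thermal-specific patterns and injects reliability-aware guidance into the RGB stream.
- The framework improves depth and camera pose estimation in low-light scenarios.
- Experiments show consistent gains over existing feed-forward geometry baselines.
- Performance in well-lit environments is largely preserved.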
Article Content
From source RSS / original summary
arXiv:2606.11326v1 Announce Type: new
Abstract: Recent feed-forward 3D reconstruction methods have demonstrated strong performance and flexibility in efficient end-to-end scene geometry estimation from image streams. However, their reliance on visible-light appearance makes them vulnerable in dark and low-visibility environments, where RGB cues are severely degraded and geometric evidence becomes ambiguous.
To address this challenge, we propose DarkVGGT, an RGB-T feed-forward geometry framework that uses physics-aware thermal modeling for robust 3D estimation in low-light scenes. DarkVGGT introduces two complementary modules. First, physics-inspired thermal factorization extracts emissive-dominant, geometry-consistent thermal cues while isolating sparse reflective residuals that may introduce geometric ambiguity.
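The abstract does not say how the factorization is computed, so the following is only an illustrative sketch of the general idea of splitting a thermal image into a smooth, emissive-dominant part and a sparse residual. It swaps in a simple stand-in technique (a 3x3 median filter plus hard thresholding), not DarkVGGT's actual module. The function name `factorize_thermal` and the threshold `tau` are hypothetical.

```python
from statistics import median


def factorize_thermal(img, tau=0.5):
    """Split a 2D thermal image into (emissive, reflective) parts.

    ASSUMPTION: illustrative stand-in only, not the paper's method.
    - The local 3x3 median serves as a smooth reference for each pixel.
    - Pixels deviating from that reference by more than tau are treated
      as sparse "reflective" residuals; everything else stays emissive.
    The two outputs always sum back to the input image.
    """
    h, w = len(img), len(img[0])
    emissive = [[0.0] * w for _ in range(h)]
    reflective = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # 3x3 neighbourhood, clipped at the image border
            window = [
                img[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            deviation = img[r][c] - median(window)
            # Hard threshold: keep only large, isolated deviations
            if abs(deviation) > tau:
                reflective[r][c] = deviation
            emissive[r][c] = img[r][c] - reflective[r][c]
    return emissive, reflective


# Toy input: a uniform 5x5 scene with one bright glint in the centre
scene = [[1.0] * 5 for _ in range(5)]
scene[2][2] = 4.0
emissive, reflective = factorize_thermal(scene)
```

On this toy input the glint ends up entirely in the residual, and the emissive part is uniform again. A learned factorization would presumably be far more expressive than this fixed filter.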
Second, geometry-shared thermal routing isolates modality-invariant geometric structures from thermal-specific patterns, selectively injecting reliability-aware structural guidance into the RGB stream. Together, these components enable accurate thermal-informed geometry estimation under degraded RGB conditions while largely preserving performance in well-lit environments.
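One way to picture reliability-aware injection is as a gated residual update of the RGB features. The sketch below is a guess at the general pattern under stated assumptions, not the architecture from the paper: the function `inject_thermal_guidance`, the per-token reliability scores in [0, 1], and the `sharpness` parameter are all hypothetical, and the shared thermal features are assumed to have been separated from thermal-specific ones already.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def inject_thermal_guidance(rgb_feat, shared_thermal_feat,
                            rgb_reliability, thermal_reliability,
                            sharpness=8.0):
    """Gated residual injection of thermal guidance into RGB features.

    ASSUMPTION: illustrative sketch only. Each argument is a list with
    one scalar per token; reliabilities lie in [0, 1].
    The gate opens when thermal is reliable AND more reliable than RGB,
    so unreliable thermal input (reliability 0) leaves RGB untouched.
    """
    n = len(rgb_feat)
    if not (len(shared_thermal_feat) == len(rgb_reliability)
            == len(thermal_reliability) == n):
        raise ValueError("all inputs must have the same length")
    fused = []
    for f_rgb, f_th, q_rgb, q_th in zip(
            rgb_feat, shared_thermal_feat,
            rgb_reliability, thermal_reliability):
        gate = q_th * sigmoid(sharpness * (q_th - q_rgb))
        fused.append(f_rgb + gate * f_th)
    return fused


# Token 0: thermal unreliable (gate closed); token 1: dark scene, thermal reliable
fused = inject_thermal_guidance(
    rgb_feat=[1.0, 1.0],
    shared_thermal_feat=[2.0, 2.0],
    rgb_reliability=[1.0, 0.0],
    thermal_reliability=[0.0, 1.0],
)
```

In this toy call the first token passes through unchanged, while the second receives nearly the full thermal contribution. A gate of this kind adds nothing whenever thermal reliability is zero, which mirrors the abstract's goal of helping in the dark without disturbing the RGB stream unnecessarily.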
Experiments on low-visibility RGB-T benchmarks demonstrate consistent improvements in both depth and camera pose estimation over existing feed-forward geometry baselines.