OrthoTrack: Continuous 6-DoF UAV Trajectory Estimation Anchored in Public Orthophotos
Quick Take
OrthoTrack is a training-free system that estimates continuous 6-DoF UAV trajectories from public orthophotos and surface models, running in real time on a single GPU. It outperforms all evaluated baselines by a large margin, including those given oracle scale and alignment, and yields absolute, metrically scaled poses without GPS. The work also introduces the MovingDrone Dataset for benchmarking.
Key Points
- OrthoTrack uses public orthophotos and surface models for 6-DoF pose estimation.
- It propagates map-anchored correspondences using optical flow for real-time performance.
- The system outperforms all baselines, even those with oracle scale and alignment.
- MovingDrone Dataset pairs UAV sequences with dense 6-DoF ground truth data.
- Deployment to new regions requires no site-specific adaptation.
Article Content
From source RSS / original summary (arXiv:2606.25245v1)
Abstract: Continuous 6-DoF pose estimation is essential for autonomous UAV operations. Yet, existing visual odometry and SLAM methods accumulate drift and yield only relative, up-to-scale trajectories. Single-frame geo-localization, in turn, discards temporal continuity and remains too slow for real-time use.
We present OrthoTrack, a training-free system that estimates continuous 6-DoF UAV trajectories using only publicly available orthophotos and surface models as a map prior. OrthoTrack matches keyframes against the orthophoto and lifts correspondences to metric 3D via the surface model. It then propagates these map-anchored correspondences to intermediate frames with optical flow, producing absolute, metrically scaled poses at every frame without GPS or post-hoc alignment.
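The lifting step described above can be illustrated with a short Python sketch. This is not the authors' code. It assumes a north-up orthophoto with square pixels, a surface model sampled on exactly the same grid, and nearest-neighbour height lookup. The function name `lift_to_3d` and its parameters are hypothetical.

```python
import numpy as np

def lift_to_3d(pixels, origin_xy, pixel_size, dsm):
    """Map orthophoto pixel indices (col, row) to metric (east, north, height).

    Assumptions (illustrative, not taken from the paper):
    - the orthophoto is north-up with square pixels of `pixel_size` metres,
    - `origin_xy` is the map coordinate of the raster's upper-left corner,
    - `dsm` is a height raster co-registered on the same pixel grid.
    """
    pixels = np.asarray(pixels, dtype=float)
    cols, rows = pixels[:, 0], pixels[:, 1]
    # Pixel centres: easting grows with the column, northing shrinks with the row.
    east = origin_xy[0] + (cols + 0.5) * pixel_size
    north = origin_xy[1] - (rows + 0.5) * pixel_size
    # Nearest-neighbour height lookup, clamped to the raster bounds.
    r = np.clip(np.round(rows).astype(int), 0, dsm.shape[0] - 1)
    c = np.clip(np.round(cols).astype(int), 0, dsm.shape[1] - 1)
    return np.column_stack([east, north, dsm[r, c]])

# Toy 3x4 height raster with 0.5 m pixels.
dsm = np.arange(12, dtype=float).reshape(3, 4)
points = lift_to_3d([[1, 2]], origin_xy=(1000.0, 2000.0), pixel_size=0.5, dsm=dsm)
```

Each matched UAV-image keypoint then has a metric 3D counterpart. One common way to turn such 2D-3D pairs into a camera pose is a PnP solver with outlier rejection. The abstract does not say which solver OrthoTrack uses.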
We also introduce the MovingDrone Dataset, a large-scale benchmark pairing photorealistic UAV sequences with dense 6-DoF ground truth and co-registered multi-modal geodata including multi-temporal orthophotos. On MovingDrone and real-world benchmarks, OrthoTrack runs in real time on a single GPU. It outperforms all baselines by a large margin, even those receiving oracle scale and alignment. By relying on publicly available geodata, OrthoTrack enables deployment to new regions without site-specific adaptation.
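For context on "oracle scale and alignment": up-to-scale trajectories from visual odometry are typically compared to ground truth only after a best-fit similarity transform computed from the ground truth itself. The Python sketch below shows one standard way to do this (Umeyama's closed-form alignment). It assumes that this is what the baselines receive, which the abstract does not spell out. The function names are hypothetical.

```python
import numpy as np

def umeyama_align(src, dst):
    """Least-squares similarity (s, R, t) such that dst ~= s * R @ src + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    # Cross-covariance between the centred point sets.
    cov = xd.T @ xs / len(src)
    U, S, Vt = np.linalg.svd(cov)
    # Flip the last axis if needed so that R is a proper rotation.
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0
    R = U @ D @ Vt
    var_s = (xs ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

def ate_rmse(est, gt):
    """Position RMSE after oracle similarity alignment of est onto gt."""
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    s, R, t = umeyama_align(est, gt)
    aligned = s * est @ R.T + t
    return float(np.sqrt(((aligned - gt) ** 2).sum(axis=1).mean()))
```

A method that outputs absolute, metrically scaled poses can be scored without this step, because its positions are already in the map frame.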


