EgoTraj: Real-World Egocentric Human Trajectory Dataset for Multimodal Prediction

arXiv cs.CV·Ahmad Yehia, Abduallah Mohamed, Tianyi Wang, Jiseop Byeon, Kun Qian, Junfeng Jiao, Christian Claudel

Quick Take

EgoTraj is a new egocentric dataset for predicting human trajectories in urban environments.

Key Points

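Contains 75 sequences of human navigation recorded by multiple Meta Quest Pro wearers.
Includes synchronized RGB video with ground-truth 6-degree-of-freedom head poses, per-frame 3D eye gaze vectors, and scene annotations.
Benchmark results point to its utility for AR-based perception, navigation, and assistive systems.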

[Submitted on 18 May 2026]

Abstract: Accurately forecasting human trajectories from an egocentric perspective plays a central role in applications such as humanoid robotics, wearable sensing systems, and assistive navigation. However, progress in this direction remains limited due to the scarcity of egocentric trajectory datasets collected in real-world environments. Addressing this need, we introduce EgoTraj, an egocentric multimodal open dataset recorded using Meta Quest Pro (MQPro). EgoTraj contains 75 sequences of human navigation collected from multiple MQPro wearers in real-world urban environments. Each recording provides synchronized RGB video along with ground-truth data, including continuous time-synchronized 6-degree-of-freedom head poses, per-frame 3D eye gaze vectors, and scene annotations. To the best of our knowledge, EgoTraj differs from typical egocentric trajectory datasets by capturing long-horizon, self-directed navigation across diverse urban routes with broad participant diversity. To demonstrate the potential of the dataset, we benchmark several state-of-the-art methods for egocentric trajectory prediction and conduct ablation studies to analyze the contributions of gaze, scene, and motion cues. The results highlight the utility of EgoTraj for AR-based perception, navigation, and assistive systems. The EgoTraj dataset, code, and EgoViz Dashboard are publicly available at this https URL.
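
As an illustration of the kind of data the abstract describes, the following is a minimal sketch of how one time-synchronized sample (head pose plus gaze) might be represented and cut into observation/prediction windows for trajectory forecasting. The `FrameSample` layout, the field names, and the `make_windows` helper are assumptions made for this example; they are not the EgoTraj file format or the authors' code.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]


@dataclass(frozen=True)
class FrameSample:
    """One time-synchronized sample (hypothetical layout, not the EgoTraj schema)."""
    timestamp_s: float
    head_position: Vec3      # translation part of the 6-DoF head pose
    head_orientation: Quat   # rotation part, as a unit quaternion (w, x, y, z)
    gaze_direction: Vec3     # per-frame 3D eye gaze vector


def make_windows(
    frames: List[FrameSample], n_obs: int, n_pred: int
) -> List[Tuple[List[FrameSample], List[Vec3]]]:
    """Pair every run of n_obs observed samples with the next n_pred head positions."""
    total = n_obs + n_pred
    windows = []
    # An empty range (sequence shorter than one window) yields no windows.
    for start in range(len(frames) - total + 1):
        observed = frames[start:start + n_obs]
        future = [f.head_position for f in frames[start + n_obs:start + total]]
        windows.append((observed, future))
    return windows
```

A predictor would consume the observed samples (motion and gaze cues) and be scored against the future head positions; the window lengths are free parameters of this sketch.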

Comments:	21 pages, 14 figures. Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
ACM classes:	I.2.10; I.4.8; I.5.4
Cite as:	arXiv:2605.19004 [cs.CV]
	(or arXiv:2605.19004v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2605.19004 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Ahmad Yehia
[v1] Mon, 18 May 2026 18:26:51 UTC (44,620 KB)

— Originally published at arxiv.org
