From Simulation to Real-World: An In-Field 6D Pose Dataset and Baseline for Robotic Strawberry Harvesting
Quick Answer
This study introduces the first real-world 6D pose dataset for robotic strawberry harvesting, comprising 12,040 images collected in agricultural fields.
Quick Take
Alongside the real-world data, the authors build a synthetic dataset with scene-level realism and domain randomization. Their experiments still show a significant sim-to-real gap in 6D pose estimation, so real field data is needed to evaluate performance reliably.
Key Points
- First real-world 6D pose dataset for strawberries with 12,040 images.
- Synthetic dataset rendered in NVIDIA Isaac Sim with scene-level realism and domain randomization.
- Significant sim-to-real gap identified in pose estimation performance.
- Baseline 6D pose estimation results across backbone encoders, provided as a reference for future work.
- Real-world dataset will be available upon acceptance of the paper.
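For readers new to the term, a 6D pose is a 3D rotation plus a 3D translation, and a sim-to-real gap shows up as larger pose errors on real images than on synthetic ones. The summary does not say which metrics the paper reports, so the sketch below uses two generic ones as an assumption: the geodesic rotation angle and the Euclidean translation distance. The function names (`rotation_error_deg`, `translation_error`, `rot_z`) are hypothetical and not taken from the paper.

```python
import math

def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices (nested lists)."""
    # trace(R_pred @ R_gt^T) equals the sum of element-wise products of the two matrices.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error(t_pred, t_gt):
    """Euclidean distance between predicted and ground-truth positions (same units as the inputs)."""
    return math.dist(t_pred, t_gt)

def rot_z(deg):
    """Helper for the example: rotation about the z-axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Illustrative values only, not results from the paper.
rot_err = rotation_error_deg(rot_z(30.0), rot_z(10.0))            # about 20 degrees
trans_err = translation_error((0.0, 0.0, 0.5), (0.03, 0.04, 0.5))  # about 0.05
```

Averaging such errors separately over a synthetic test split and a real test split, then comparing the two averages, is one simple way a gap of this kind could be expressed.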
Article Excerpt
From source RSS / original summary
arXiv:2606.11381v1
Abstract: Robotic strawberry harvesting requires precise 6D pose estimation; however, collecting 6D pose ground truth in real agricultural fields is inherently challenging. Existing 6D pose estimation methods have therefore relied solely on synthetic data that lacks scene-level realism, leaving their performance under real agricultural field conditions unquantified.
In this work, we present, to the best of our knowledge, the first real-world 6D pose ground truth dataset of strawberries collected in actual agricultural fields (12,040 images). We also introduce a synthetic dataset rendered in NVIDIA Isaac Sim, featuring scene-level realism and domain randomization. Nevertheless, our experiments reveal that a significant sim-to-real gap persists, underscoring the necessity of real agricultural field data for reliable evaluation.
We further quantify the sim-to-real gap through baseline 6D pose estimation results across backbone encoders, serving as a reference for future work. The real-world dataset will be made available upon acceptance.