ELDOR: A Dataset and Benchmark for Illegal Gold Mining in the Amazon Rainforest
Quick Take
ELDOR is a UAV benchmark dataset for monitoring illegal gold mining impacts in the Amazon rainforest.
Key Points
- Contains manually annotated orthomosaic imagery covering over 2,500 hectares, with pixel-level semantic labels.
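- Establishes four benchmark tasks: semantic segmentation, segmentation-derived recognition, direct multi-label classification, and class-presence recognition with vision-language models.
- Shows that current methods still struggle with rare small-scale mining structures and fine-grained recovery classes.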
Authors: Kangning Cui, Surendra Bohara, Suraj Prasai, Zishan Shao, Wei Tang, Martin Pillaca, Edwin Flores, Zhen Yang, Gregory Larsen, Evan Dethier, David Lutz, Jean-Michel Morel, Miles Silman, Victor Pauca, Fan Yang
Abstract: Illegal gold mining in the Amazon rainforest causes deforestation, water contamination, and long-term ecosystem disruption, yet remains difficult to monitor at fine spatial scales. Satellite imagery supports large-scale observation, but often misses small mining-related structures and subtle land-cover transitions, especially under frequent cloud cover. We introduce ELDOR, a large-scale UAV benchmark for monitoring environmental and landscape disturbance from illegal gold mining in the rainforest. ELDOR contains manually annotated orthomosaic imagery covering over 2,500 hectares, with pixel-level semantic labels for both mining-related activities and surrounding ecological structures. With this unified annotation source, we establish four benchmark tasks: semantic segmentation, segmentation-derived recognition, direct multi-label classification, and class-presence recognition with vision-language models. Across these tasks, we compare generic and remote-sensing-specific segmentation models, vision foundation model-related segmentation methods, direct multi-label classification methods, and vision-language models under a controlled closed-set protocol. Results show that current methods still struggle with rare small-scale mining structures and fine-grained recovery classes, suggesting the need for context-aware and multimodal modeling. To support domain analysis and practical use, we further build an interactive explorer for domain experts that provides a unified interface for data exploration and model inference.
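The abstract does not spell out how the segmentation-derived recognition task turns pixel-level labels into class-presence labels. As a rough illustration only, the sketch below assumes the simplest possible rule: a class counts as present in a tile if at least `min_pixels` pixels carry its label. The function name `presence_from_mask`, the threshold parameter, and the class IDs are all hypothetical and are not taken from ELDOR.

```python
from collections import Counter

# Hypothetical class IDs for illustration; ELDOR's actual label set is not given here.
CLASS_NAMES = {0: "background", 1: "forest", 2: "mining_pond", 3: "mining_structure"}


def presence_from_mask(mask, num_classes, min_pixels=1):
    """Return a multi-hot list: 1 if a class covers at least min_pixels pixels."""
    # Count how many pixels carry each class ID across the whole 2D mask.
    counts = Counter(label for row in mask for label in row)
    return [1 if counts.get(c, 0) >= min_pixels else 0 for c in range(num_classes)]


# A tiny 3x4 mask: mostly forest, a pond region, and a single structure pixel.
mask = [
    [1, 1, 1, 2],
    [1, 1, 2, 2],
    [1, 3, 2, 2],
]

print(presence_from_mask(mask, num_classes=4))                # [0, 1, 1, 1]
print(presence_from_mask(mask, num_classes=4, min_pixels=2))  # [0, 1, 1, 0]
```

Raising `min_pixels` drops the one-pixel structure class from the presence vector, which hints at why rare, small-scale classes are sensitive to how presence is defined and to small segmentation errors.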
Comments: 70 pages, 35 figures, 28 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2605.15397 [cs.CV] (or arXiv:2605.15397v1 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2605.15397 (arXiv-issued DOI via DataCite, pending registration)
Submission history
From: Kangning Cui
[v1] Thu, 14 May 2026 20:30:25 UTC (20,410 KB)
— Originally published at arxiv.org