Overhead Wildlife Locator (OWL): Benchmarking Weakly Supervised Learning for Aerial Wildlife Surveys

arXiv cs.CV·Isai Daniel Chacón, Zhongqi Miao, Bruno Demuro, Caleb Robinson, Rahul Dodhia, Lasha Otarashvili, Jason Holmberg, Kirk Larsen, Howard Frederick, Nathan J. Pamperin, Pablo Arbeláez, Juan M. Lavista Ferres

6/15/2026·~2 min read

Quick Answer

The Overhead Wildlife Locator (OWL) is a weakly supervised density-estimation framework for aerial wildlife surveys. Its OWL-D variant reaches 0.934 AP on the Delplanque dataset, ahead of the published HerdNet baseline at 0.840 AP.

Quick Take

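OWL is benchmarked against POLO, YOLOv11n, and YOLOv11l on five public aerial datasets, and OWL-D records the highest AP on four of the five. In an operational test on the Alaska Department of Fish and Game's 2022 Central Arctic Caribou census, OWL-C fine-tuned on the 2017 Porcupine Caribou Herd split attained F1 = 0.965 on a held-out patch test set, with a signed count error of +3.1%. The motivation is annotation cost: bounding boxes are reported to be up to seven times slower and three times more expensive to produce than point-level labels.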

Key Points

OWL features three variants: OWL-C, OWL-T, and OWL-D for diverse aerial scenarios.
OWL-D sets a new state of the art on the Delplanque dataset with 0.934 AP, against HerdNet's 0.840.
OWL-T leads with 0.978 AP on the SheepCounter UAV dataset.
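The framework trains on point-level labels; bounding-box annotations are reported to be up to seven times slower and three times more expensive to produce.
Code, model weights, and annotated caribou patch datasets (Porcupine Caribou Herd 2017 and Central Arctic Herd 2022) are publicly available on GitHub.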


Article Content

From source RSS / original summary

arXiv:2606.13911v1

Abstract: Automated aerial wildlife surveys increasingly rely on deep learning, yet standard object detectors require bounding-box annotations, reported to be up to seven times slower and three times more expensive to produce than point-level labels.

To address this bottleneck, we introduce the Overhead Wildlife Locator (OWL), a weakly supervised density-estimation framework with three variants: OWL-C, a fully convolutional model for high-throughput screening; OWL-T, a Swin-augmented hybrid for heterogeneous, cluttered scenes; and OWL-D, built on a frozen DINOv3 ViT-H+/16 encoder with a DPT-style fusion decoder.
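
The abstract does not say how OWL turns point labels into a training target. As a minimal sketch of one common approach in point-supervised density estimation, assuming a fixed Gaussian kernel per animal (the function names, `sigma`, and the coordinates below are hypothetical and not taken from the paper), each point can be rendered as a normalized blob so that the map sums to the animal count:

```python
import math


def points_to_density(points, height, width, sigma=2.0):
    """Render (row, col) point labels as a density map that sums to the count.

    Assumption: one fixed-width Gaussian per animal, normalized over the patch.
    This is a generic technique, not the confirmed OWL target construction.
    """
    density = [[0.0] * width for _ in range(height)]
    for pr, pc in points:
        # Unnormalized Gaussian centered on the annotated point.
        kernel = [
            [math.exp(-((r - pr) ** 2 + (c - pc) ** 2) / (2.0 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        total = sum(map(sum, kernel))
        # Normalize so each animal contributes exactly 1 to the map's sum.
        for r in range(height):
            for c in range(width):
                density[r][c] += kernel[r][c] / total
    return density


def count_from_density(density):
    """Estimated count is the sum (discrete integral) of the density map."""
    return sum(map(sum, density))


# Hypothetical point annotations for a 64x64 patch.
example_points = [(10, 12), (11, 14), (40, 50)]
example_density = points_to_density(example_points, height=64, width=64)
example_count = count_from_density(example_density)
```

Normalizing each kernel over the patch keeps the count exact even for animals near the border. In practice this would usually be vectorized with an array library rather than written with nested lists.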

We benchmark all three against POLO, YOLOv11n, and YOLOv11l across five public aerial datasets, from sparse fixed-wing savanna surveys to dense UAV paddock imagery, and against the published HerdNet baseline on its native Delplanque split. OWL-D sets a new state of the art on Delplanque (0.934 AP vs. HerdNet's 0.840) and records the highest AP on four of the five datasets. Performance is regime-dependent: on the extreme-density SheepCounter UAV dataset the hybrid OWL-T leads (0.978 AP) and the convolutional variants attain the lowest counting error, whereas the foundation-based OWL-D degrades, indicating which variant suits which survey type.

We further validate operational readiness on the Alaska Department of Fish and Game's 2022 Central Arctic Caribou census: under cross-herd and cross-temporal transfer, OWL-C fine-tuned on the 2017 Porcupine Caribou Herd split attains F1 = 0.965 on a held-out patch test set, with a signed count error of +3.1% aggregated across the released test patches. We release the OWL code, model weights, and the annotated Porcupine Caribou Herd 2017 (PCH) and Central Arctic Herd 2022 (CAH) patches, the first open patch-level datasets for large-scale caribou aerial surveys, at https://github.com/microsoft/MegaDetector-Overhead.
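
The two operational metrics quoted above, F1 and signed count error, can be illustrated with a short sketch. It assumes that "aggregated" means total predicted count minus total true count, divided by total true count; the abstract does not spell out the formula, and the function names and numbers below are hypothetical rather than results from the paper.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def signed_count_error(pred_counts, true_counts):
    """Relative count error aggregated over patches.

    Assumption: sum counts over all patches first, then compare totals.
    A positive value means overcounting, a negative value undercounting.
    """
    total_true = sum(true_counts)
    if total_true == 0:
        raise ValueError("true counts sum to zero; relative error is undefined")
    return (sum(pred_counts) - total_true) / total_true


# Hypothetical per-patch counts, for illustration only.
example_f1 = f1_score(tp=8, fp=2, fn=2)
example_error = signed_count_error([12, 0, 31], [10, 1, 30])
```

Keeping the sign matters for census work: a method that overcounts and one that undercounts by the same magnitude bias a population estimate in opposite directions, which an absolute error would hide.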

