No One Knows the State of the Art in Geospatial Foundation Models
Quick Take
Current geospatial foundation models lack standardization, hindering effective comparison and innovation.
Key Points
- 46 cross-paper disagreements found in model evaluations.
- 94 out of 126 papers use unique pretraining configurations.
- 39% of GFM papers do not release model weights.
Abstract: Geospatial foundation models (GFMs) have been proposed as generalizable backbones for disaster response, land-cover mapping, food-security monitoring, and other high-stakes Earth-observation tasks. Yet the published work about these models does not give reviewers or users enough information to tell which model fits a given task. We argue that nobody knows what the current state of the art is in geospatial foundation models. The methods may be useful, but the GFM literature does not standardize evaluations, training and testing protocols, released weights, or pretraining controls well enough for anyone to compare or rank them. In a 152-paper audit, we find 46 cross-paper disagreements of at least 10 points for the same model, benchmark, and protocol; 94/126 papers with extractable pretraining data use a configuration no other paper uses; and 39% of GFM papers release no model weights. This lack of community standards can be solved. We propose six concrete expectations: named-license weight release, shared core evaluations, copied-versus-rerun baseline annotations, variance reporting, one shared evaluation harness, and data-vs-architecture-vs-algorithm controls. These gaps are a coordination failure, not a fault of any individual lab; the authors of this paper, like many others in the GFM community, have contributed to them. Rather than just critiquing the community, we aim to provide concrete steps toward a shared understanding of how to innovate GFMs.
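The six expectations are protocol-level proposals, not code the paper ships. As a rough illustration of one of them, variance reporting, the sketch below shows what reporting a mean and standard deviation over multiple seeds (rather than a single point score) could look like in practice. This is a minimal sketch under stated assumptions: the model name, benchmark name, and `evaluate` function are hypothetical placeholders, not artifacts from the paper.

```python
# Minimal sketch of the "variance reporting" expectation: run the same
# evaluation under several seeds and report mean +/- standard deviation
# with the seed list, instead of a single point score.
# All names here are hypothetical placeholders, not from the paper.
import random
import statistics


def evaluate(model_name: str, benchmark: str, seed: int) -> float:
    """Placeholder for a real evaluation run (fine-tune + test).

    A real harness would train/evaluate the model; here we simulate
    seed-to-seed variation around a fixed score for illustration.
    """
    rng = random.Random(seed)
    return 70.0 + rng.uniform(-2.0, 2.0)  # stand-in for a mIoU/accuracy score


def report(model_name: str, benchmark: str, seeds=(0, 1, 2, 3, 4)) -> str:
    scores = [evaluate(model_name, benchmark, s) for s in seeds]
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)
    # Publishing the seed list alongside mean +/- std lets other papers
    # rerun the exact protocol rather than copying a point estimate.
    return f"{model_name} on {benchmark}: {mean:.1f} +/- {std:.1f} (seeds={list(seeds)})"


print(report("gfm-base", "landcover-benchmark"))
```

Reporting scores this way would directly address the 46 cross-paper disagreements the audit found, since a 10-point gap is interpretable only once seed-to-seed variance for the same model and protocol is known.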
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY) |
| Cite as: | arXiv:2605.12678 [cs.CV] (or arXiv:2605.12678v1 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.12678 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Isaac Corley
[v1]
Tue, 12 May 2026 19:29:51 UTC (69 KB)
— Originally published at arxiv.org
