ChangeFlow -- Latent Rectified Flow for Change Detection in Remote Sensing
Quick Take
ChangeFlow introduces a generative framework for efficient change detection in remote sensing images.
Key Points
- Models change detection in latent space using rectified flow.
- Achieves 80.4% average F1 score across four benchmarks.
- Maintains competitive inference speed compared to existing methods.
📖 Reader Mode
~2 min readAbstract:Remote sensing change detection (RSCD) aims to localise changes between two images of the same geographic region. In practice, change masks often follow region-level annotation conventions rather than purely local appearance differences, making them context-dependent and occasionally ambiguous. Most state-of-the-art methods utilise per-pixel discriminative classification, which produces a single prediction per input and fails to explicitly model the changed region as a coherent whole. A natural alternative is generative formulation, which can model a distribution of plausible masks, enabling sampling to capture ambiguity and encourage global consistency. However, existing generative RSCD approaches typically lag behind strong discriminative baselines due to the high computational cost of pixel-space generation and the complexity of their conditioning mechanisms. To address the limitations of prior discriminative and generative methods, we propose ChangeFlow, a generative framework that reformulates change detection as the synthesis of a change mask in latent space via rectified flow. ChangeFlow is guided by a structured yet lightweight conditioning signal, and its stochastic design naturally supports sampling-based prediction ensembling. Namely, aggregating multiple predicted change masks improves robustness, while sample agreement provides a practical confidence estimation that highlights ambiguous regions. Across four benchmarks, ChangeFlow achieves an average F1 of 80.4\%, improving by 1.3 points on average over the previous best method, while maintaining inference speed comparable to recent strong baselines. Project page: this https URL
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2605.15375 [cs.CV] |
| (or arXiv:2605.15375v1 [cs.CV] for this version) | |
| https://doi.org/10.48550/arXiv.2605.15375 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Blaž Rolih [view email]
[v1]
Thu, 14 May 2026 20:04:16 UTC (13,745 KB)
— Originally published at arxiv.org
Want this in your inbox every morning?
Daily brief at your local 8am — bilingual EN/中文, free.
More from arXiv cs.CV
See more →GeoSym127K: Scalable Symbolically-verifiable Synthesis for Multimodal Geometric Reasoning
GeoSym127K introduces a scalable neuro-symbolic framework for enhanced geometric reasoning in multimodal models.