RIDE: Retinex-Informed Decoupling for Exposing Concealed Objects
Quick Take
RIDE is a Retinex-based method for concealed object segmentation that decomposes an image into illumination and reflectance components within the same spatial domain.
Key Points
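- Targets concealed object segmentation tasks: camouflaged object detection, polyp segmentation, transparent object detection, and industrial defect inspection.
- Uses homogeneous image decomposition via Retinex theory, splitting an image into illumination and reflectance.
- Formalizes a Discriminability Gap Theorem: matching appearance in the composite image does not require matching in both component spaces.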
Abstract: Concealed Object Segmentation (COS) encompasses a family of dense-prediction tasks, including camouflaged object detection, polyp segmentation, transparent object detection, and industrial defect inspection, where targets are visually entangled with their surroundings through different physical mechanisms. Existing methods either operate directly on RGB images or employ heterogeneous decompositions (e.g., Fourier, wavelet) that redistribute spatial evidence across scale/frequency coefficients, making pixel-aligned cues less direct. We introduce a fundamentally different perspective: homogeneous image decomposition via Retinex theory, which factorizes an image into illumination and reflectance components within the same spatial domain. Our key insight is that visual entanglement enforces appearance matching in the composite space, but this does not necessitate simultaneous matching in both component spaces, a phenomenon we formalize as the Discriminability Gap Theorem. Crucially, we show that across diverse COS sub-tasks, the underlying physical processes systematically anti-correlate illumination and reflectance differences, yielding theoretical guarantees that Retinex decomposition preserves or strictly improves total foreground-background discriminability across the full physical regime, with anti-correlation maximizing the gain. Building on this, we propose RIDE comprising: (i) a Task-Driven Retinex Decomposition module that learns segmentation-optimal factorizations end-to-end; (ii) a Discriminability Gap Attention mechanism that adaptively exploits where decomposition helps; and (iii) a Camouflage-Breaking Contrastive loss operating in reflectance feature space.
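The gap argument in the abstract can be illustrated numerically. What follows is a minimal sketch, not the paper's method: it assumes the classical multiplicative Retinex model (image = reflectance × illumination, so the logarithms add) and uses a hand-built toy scene with known components instead of a learned decomposition. The helper names `mean_gap` and `build_scene` and all pixel values are hypothetical, chosen so that the object and background look identical in the composite image.

```python
import numpy as np


def mean_gap(log_map, mask):
    """Absolute difference between mean foreground and mean background values."""
    return abs(log_map[mask].mean() - log_map[~mask].mean())


def build_scene(size=8):
    """Toy scene (hypothetical values): the object is more reflective but less lit."""
    mask = np.zeros((size, size), dtype=bool)
    mask[2:6, 2:6] = True                      # square "concealed" object
    reflectance = np.where(mask, 0.8, 0.4)     # object reflects twice as much
    illumination = np.where(mask, 0.5, 1.0)    # object receives half the light
    image = reflectance * illumination         # assumed Retinex model: I = R * L
    return image, illumination, reflectance, mask


image, illumination, reflectance, mask = build_scene()

# Work in the log domain, where log I = log R + log L.
gap_image = mean_gap(np.log(image), mask)
gap_illum = mean_gap(np.log(illumination), mask)
gap_refl = mean_gap(np.log(reflectance), mask)
gap_components = gap_illum + gap_refl

print(f"composite gap:        {gap_image:.4f}")
print(f"illumination gap:     {gap_illum:.4f}")
print(f"reflectance gap:      {gap_refl:.4f}")
print(f"summed component gap: {gap_components:.4f}")
```

In this toy case the composite gap vanishes while each component gap equals log 2, because the illumination and reflectance differences have opposite signs and cancel in the product. More generally, the triangle inequality |Δl| + |Δr| ≥ |Δl + Δr| means the summed component gap can never fall below the composite gap under this model, and it is strictly larger whenever the two differences have opposite signs. The paper's actual theorem statement and conditions may differ from this simplified reading.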
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2605.15450 [cs.CV] (or arXiv:2605.15450v1 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2605.15450 (arXiv-issued DOI via DataCite, pending registration)
Submission history
From: Chunming He
[v1] Thu, 14 May 2026 22:19:52 UTC (7,650 KB)
— Originally published at arxiv.org