M2Retinexformer: Multi-Modal Retinexformer for Low-Light Image Enhancement
Quick Take
M2Retinexformer enhances low-light images by integrating depth cues, luminance priors, and semantic features within a progressive refinement pipeline.
Key Points
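- Incorporates depth, luminance, and semantic cues, extracted at multiple scales and fused through cross-attention.
- Uses adaptive gating to balance illumination-guided self-attention against cross-attention, based on the reliability of the auxiliary cues.
- Reports overall improvements over Retinexformer and recent state-of-the-art methods on the LOL, SID, SMID, and SDSD benchmarks.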
Abstract: Low-light image enhancement is challenging due to complex degradations, including amplified noise, artifacts, and color distortion. While Retinex-based deep learning methods have achieved promising results, they primarily rely on single-modality RGB information. We propose M2Retinexformer (Multi-Modal Retinexformer), a novel framework that extends Retinexformer by incorporating depth cues, luminance priors, and semantic features within a progressive refinement pipeline. Depth provides geometric context that is invariant to lighting variations, while luminance and semantic features offer explicit guidance on brightness distribution and scene understanding. Modalities are extracted at multiple scales and fused through cross-attention, with adaptive gating dynamically balancing illumination-guided self-attention and cross-attention based on the reliability of auxiliary cues. Evaluations on the LOL, SID, SMID, and SDSD benchmarks demonstrate overall improvements over Retinexformer and recent state-of-the-art methods. Code and pretrained weights are available at this https URL.
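The gating idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`attention`, `gated_fusion`), the single-head attention, the sigmoid gate over concatenated branch outputs, and the use of a per-token illumination weight to scale the self-attention values are all assumptions made for illustration. The abstract only states that a gate balances illumination-guided self-attention and cross-attention according to the reliability of the auxiliary cues.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Single-head scaled dot-product attention: q is (N, d), k and v are (M, d).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def gated_fusion(rgb_tokens, aux_tokens, illum, w_gate, b_gate):
    """Blend a self-attention branch and a cross-attention branch per token.

    rgb_tokens: (N, d) image features
    aux_tokens: (M, d) auxiliary features (e.g. depth, luminance, semantics)
    illum:      (N, 1) per-token illumination weight (hypothetical guidance)
    w_gate:     (2d, 1) gate weights, b_gate: scalar gate bias (hypothetical)
    """
    # Assumption: illumination guidance is modelled by scaling the values.
    self_out = attention(rgb_tokens, rgb_tokens, rgb_tokens * illum)
    # Cross-attention: image tokens query the auxiliary modality.
    cross_out = attention(rgb_tokens, aux_tokens, aux_tokens)
    # Assumption: the gate is a sigmoid over both branch outputs.
    gate_in = np.concatenate([self_out, cross_out], axis=-1)
    gate = 1.0 / (1.0 + np.exp(-(gate_in @ w_gate + b_gate)))  # (N, 1)
    # gate near 1 trusts the auxiliary cues; gate near 0 falls back to RGB.
    fused = gate * cross_out + (1.0 - gate) * self_out
    return fused, gate
```

In this sketch, a strongly negative gate bias drives the output toward the self-attention branch alone, which is the behaviour one would want when the auxiliary cues are unreliable (for example, a noisy depth estimate in a very dark region).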
| Comments: | Accepted at 2026 IEEE International Conference on Image Processing (ICIP) |
| Subjects: | Computer Vision and Pattern Recognition (cs.CV) |
| Cite as: | arXiv:2605.12556 [cs.CV] (or arXiv:2605.12556v1 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.12556 (arXiv-issued DOI via DataCite) |
Submission history
From: Youssef Aboelwafa
[v1] Mon, 11 May 2026 12:13:13 UTC (591 KB)
— Originally published at arxiv.org
