A Data Efficiency Study of Synthetic Fog for Object Detection Using the Clear2Fog Pipeline
Quick Take
The Clear2Fog pipeline improves object detection in foggy conditions by training models on physics-based synthetic fog rendered from clear-weather data.
Key Points
- C2F simulates fog on clear datasets, ensuring sensor consistency.
- Models trained on mixed-density fog outperform fixed-density training, even with 25% less data.
- Fine-tuning on real fog with a tenfold-higher learning rate overcomes synthetic biases.
Abstract
Object detection in adverse weather is critical for the safety of autonomous vehicles; however, the scarcity of labelled, real-world foggy data remains a significant bottleneck. In this paper, we propose Clear2Fog (C2F), an end-to-end, physics-based pipeline that simulates fog on clear-weather datasets while ensuring sensor-level consistency across camera and LiDAR. By using monocular depth estimation and a novel atmospheric light estimation method, C2F overcomes structural artifacts and chromatic biases common in existing techniques. A human perceptual study confirms C2F's physical realism, with the generated images being preferred 92.95% of the time over an established method. Utilising a training set of 270,000 images from the Waymo Open Dataset, we conduct an extensive data efficiency study to investigate how environmental diversity influences model robustness. Our findings reveal that models trained on mixed-density fog datasets at 75% scale outperform those trained on fixed-density datasets at 100% scale. Furthermore, we investigate the sim-to-real transfer by fine-tuning pre-trained models on real-world foggy data. We demonstrate that a tenfold increase over the default fine-tuning learning rate successfully overcomes negative transfer from synthetic biases, resulting in a 1.67 mAP improvement over real-only baselines. The C2F pipeline provides a scalable framework for enhancing the reliability of autonomous systems in adverse weather and demonstrates the potential of diverse synthetic datasets for efficient model training.
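The abstract's depth-driven, physics-based fog simulation is, in the fog-synthesis literature, typically the Koschmieder atmospheric scattering model: I(x) = J(x)·t(x) + A·(1 − t(x)), with transmission t(x) = exp(−β·d(x)). The sketch below illustrates that standard model, not C2F's actual implementation: the density range, the scalar airlight value, and all names are illustrative assumptions, and the paper's novel atmospheric light estimation method is not reproduced here.

```python
import numpy as np

def synthesize_fog(image, depth, beta, atmospheric_light):
    """Render synthetic fog with the standard atmospheric scattering model.

    I(x) = J(x) * t(x) + A * (1 - t(x)),  where  t(x) = exp(-beta * d(x))

    image:             HxWx3 float array in [0, 1], the clear scene J
    depth:             HxW metric depth map d (e.g. from a monocular estimator)
    beta:              scattering coefficient; larger beta means denser fog
    atmospheric_light: scalar or 3-vector A (placeholder; C2F estimates this)
    """
    transmission = np.exp(-beta * depth)[..., None]  # t(x), broadcast over RGB
    return image * transmission + atmospheric_light * (1.0 - transmission)

# Mixed-density training data: sample beta per image instead of fixing it.
rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))           # stand-in for a clear camera frame
depth = rng.uniform(5.0, 80.0, (4, 6))  # stand-in metric depth in metres
beta = rng.uniform(0.02, 0.12)          # illustrative density range (assumed)
foggy = synthesize_fog(image, depth, beta, atmospheric_light=0.8)
```

Sampling β per image, as in the final lines, is one way to realise the mixed-density training sets that the study finds outperform fixed-density ones at smaller scale.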
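On the sim-to-real side, the abstract's key recipe is a tenfold increase over the default fine-tuning learning rate when adapting a synthetic-fog-pretrained detector to real foggy data. A minimal PyTorch sketch of that adjustment follows; the detector architecture, default rate, and optimizer settings are assumptions, since the abstract does not specify them.

```python
import torch
import torchvision

# Illustrative only: the paper's detector and its default fine-tuning rate
# are not named in the abstract, so both are assumed here.
DEFAULT_FT_LR = 1e-4                    # assumed default fine-tuning rate
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=10 * DEFAULT_FT_LR,              # tenfold increase over the default
    momentum=0.9,
)
```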
| Comments: | Project code and experimental configs available at this https URL |
| Subjects: | Computer Vision and Pattern Recognition (cs.CV) |
| Cite as: | arXiv:2605.12608 [cs.CV] (or arXiv:2605.12608v1 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.12608 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Mohamed Ahmed Mohamed
[v1] Tue, 12 May 2026 18:01:00 UTC (1,588 KB)
— Originally published at arxiv.org