No Free Lunch for Synthetic Images under Data Scarcity Conditions

arXiv cs.CV · Borja Arroyo Galende, Alejandro Almodóvar, Patricia A. Apellániz, Juan Parras, Silvia Uribe, Santiago Zazo

6/9/2026

Quick Answer

This study evaluates the trade-offs of fidelity, privacy, and utility in synthetic data generation using VAE, GAN, and DDPM models under data scarcity.

Quick Take

The study finds that GAN and DDPM maintain higher fidelity and downstream utility than VAE when differential privacy is applied during training, with VAE degrading more rapidly as the privacy constraints tighten. The authors take this as evidence that generative models need to be evaluated along several dimensions at once.

Key Points

Evaluates VAE, GAN, and DDPM under data scarcity and privacy constraints.
GAN and DDPM show greater robustness in fidelity and utility than VAE.
Study spans three datasets: MNIST, OCTMNIST, and OrganAMNIST.
Differential privacy mechanisms significantly affect model performance.
Highlights need for multidimensional evaluation of generative models.

Article Excerpt

From source RSS / original summary

arXiv:2606.07640v1. Announce Type: new. Abstract: This study investigates the trade-offs between fidelity, privacy, and utility in synthetic data generation under conditions of data scarcity and privacy sensitivity. We propose an evaluation framework that jointly assesses these three dimensions and apply it to three widely used generative models, VAE, GAN, and DDPM. The evaluation spans three image datasets, MNIST, OCTMNIST, and OrganAMNIST, encompassing both general-purpose and medical imaging domains.

Notable differences arise between the three models in their behaviour when differential privacy mechanisms are introduced during training. GAN and DDPM demonstrate greater robustness, maintaining higher fidelity and downstream utility across a range of noise levels, while VAE degrades more rapidly as privacy constraints increase. This study highlights the importance of a multidimensional evaluation of deep generative models, also noting that their behaviour significantly differs when privacy techniques are applied.
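The abstract does not say which differential privacy mechanism is used, so the following is only a minimal sketch of one common choice, a DP-SGD-style update: each per-sample gradient is clipped to a fixed L2 norm, the clipped gradients are summed, and Gaussian noise scaled by a noise multiplier is added before averaging. The function name `dp_noisy_gradient` and the parameters `clip_norm` and `noise_multiplier` are hypothetical and not taken from the paper.

```python
import math
import random


def dp_noisy_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Return a clipped, noised, averaged gradient (DP-SGD-style sketch).

    Assumption: this illustrates a generic mechanism, not the paper's method.
    """
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        # Scale factor is at most 1, so each gradient ends with L2 norm <= clip_norm.
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        for i, x in enumerate(grad):
            total[i] += x * scale
    # Noise standard deviation grows with the noise multiplier.
    std = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, std) for t in total]
    return [v / len(per_sample_grads) for v in noisy]


if __name__ == "__main__":
    data_rng = random.Random(0)
    grads = [[data_rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(32)]
    # Sweep a few hypothetical noise levels and print the resulting gradient norm.
    for sigma in (0.0, 0.5, 1.0, 2.0):
        g = dp_noisy_gradient(grads, 1.0, sigma, random.Random(1))
        print(sigma, math.sqrt(sum(x * x for x in g)))
```

Raising `noise_multiplier` strengthens the privacy guarantee but makes each update noisier, which is the kind of pressure under which the abstract reports VAE degrading faster than GAN and DDPM.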

