RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models

Authors: Tanqiu Jiang, Changjiang Li, Fenglong Ma, Ting Wang

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive evaluation using benchmark datasets and models demonstrates that, with the same privacy guarantee, RAPID significantly outperforms state-of-the-art approaches by large margins in generative quality, memory footprint, and inference cost, suggesting that retrieval-augmented DP training represents a promising direction for developing future privacy-preserving generative models.
Researcher Affiliation Academia Tanqiu Jiang, Changjiang Li, Fenglong Ma, Ting Wang (Stony Brook University; Pennsylvania State University)
Pseudocode Yes Algorithm 1: Training latent feature extractor. Algorithm 2: RAPID.
Open Source Code Yes The code is available at: https://github.com/TanqiuJiang/RAPID.
Open Datasets Yes Specifically, we consider the following 4 settings: i) EMNIST (Cohen et al., 2017) (public) and MNIST (Deng, 2012) (private), ii) ImageNet32 (Deng et al., 2009) (public) and CIFAR10 (Krizhevsky et al., 2009) (private), iii) FFHQ32 (Karras et al., 2019) (public) and CelebA32 (Liu et al., 2015) (private), and iv) FFHQ64 (Karras et al., 2019) (public) and CelebA64 (Liu et al., 2015) (private). More details of these datasets are deferred to Table 5.
Dataset Splits Yes Table 5 summarizes the setting of public and private datasets in our experiments.

| Pre-training D_pub^pre | Trajectory knowledge base D_pub^ref | Private dataset D_prv |
|---|---|---|
| EMNIST (50K) | EMNIST (10K) | MNIST |
| ImageNet32 (1.2M) | ImageNet32 (70K) (Darlow et al., 2018) | CIFAR10 |
| FFHQ32 (60K) | FFHQ32 (10K) | CelebA32 |
| FFHQ64 (60K) | FFHQ64 (10K) | CelebA64 |
Hardware Specification Yes To simulate settings with modest compute resources, all the experiments are performed on a workstation running one Nvidia RTX 6000 GPU.
Software Dependencies No We use Opacus (Yousefpour et al., 2021), a DP-SGD library, for DP training and privacy accounting. Following prior work (Dockhorn et al., 2023), we fix the setting of δ as 10⁻⁵ for the CIFAR10 and MNIST datasets and 10⁻⁶ for the CelebA dataset so that δ is smaller than the reciprocal of the number of training samples.
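The δ convention quoted above (δ strictly smaller than the reciprocal of the number of training samples) can be sketched as a small helper. This is an illustrative sketch, not code from the paper; the dataset sizes in the assertions are assumed standard train-split sizes, not values reported by the authors.

```python
import math

def delta_exponent(n_samples: int) -> int:
    """Smallest exponent k such that delta = 10**-k is strictly
    below 1/n_samples, matching the quoted convention."""
    return math.floor(math.log10(n_samples)) + 1

# Illustrative train-set sizes (assumed, not taken from the paper):
assert delta_exponent(60_000) == 5    # MNIST   -> delta = 1e-5
assert delta_exponent(50_000) == 5    # CIFAR10 -> delta = 1e-5
assert delta_exponent(202_599) == 6   # CelebA  -> delta = 1e-6
```

Working in exponents avoids floating-point comparisons of tiny values; the returned k gives the largest power-of-ten δ below 1/N.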
Experiment Setup Yes We primarily use the latent diffusion model (Rombach et al., 2022) as the underlying diffusion model and DDIM (Song et al., 2020) as the default sampler. To make a fair comparison, we fix the default batch size as 64 for RAPID and DP-LDM; we do not modify the batch size (i.e., 8,192) for DPDM because the impact of batch size on its performance is so significant that it stops generating any recognizable images with smaller batch sizes. By default, we fix the sampling timesteps as 100 across all the methods. For MNIST, we train the diffusion model under three privacy settings ϵ = {0.2, 1, 10}, corresponding to the low, medium, and high privacy budgets; for the other datasets, we vary the privacy budget as ϵ = {1, 10}. Table 6. Hyper-parameters for training autoencoders under different settings. Table 7. Hyper-parameters for diffusion models under different settings.
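The evaluation settings quoted above can be collected into a single config sketch. The dict below is an illustrative summary, with hypothetical key names; the values (sampler, timesteps, batch sizes, privacy budgets) come directly from the quoted setup.

```python
# Sketch of the quoted evaluation grid; key names are illustrative.
EVAL_CONFIG = {
    "diffusion_model": "latent diffusion (Rombach et al., 2022)",
    "sampler": "DDIM",
    "sampling_timesteps": 100,
    # DPDM keeps its original large batch size; smaller batches
    # reportedly stop it from generating recognizable images.
    "batch_size": {"RAPID": 64, "DP-LDM": 64, "DPDM": 8192},
    "epsilon_grid": {
        "MNIST": [0.2, 1, 10],   # low / medium / high privacy budgets
        "CIFAR10": [1, 10],
        "CelebA32": [1, 10],
        "CelebA64": [1, 10],
    },
}
```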