Pre-trained Perceptual Features Improve Differentially Private Image Generation
Authors: Frederik Harder, Milad Jalali, Danica J. Sutherland, Mijung Park
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We successfully generate reasonable samples for CIFAR-10, CelebA, MNIST, and Fashion MNIST in high-privacy regimes. We also theoretically analyze the effect of privatizing our loss function, helping understand the privacy-accuracy trade-offs in our method. Our code is available at https://github.com/ParkLabML/DP-MEPF. |
| Researcher Affiliation | Academia | Frederik Harder EMAIL Max Planck Institute for Intelligent Systems & University of Tübingen Milad Jalali Asadabadi EMAIL University of British Columbia Danica J. Sutherland EMAIL University of British Columbia & Alberta Machine Intelligence Institute Mijung Park EMAIL University of British Columbia & Alberta Machine Intelligence Institute |
| Pseudocode | No | The paper describes the method "DP-MEPF" in Section 3.1 titled "MMD with perceptual features as a feature map" with detailed textual explanations and a diagram (Figure 1), but it does not present any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/ParkLabML/DP-MEPF. |
| Open Datasets | Yes | Datasets. We considered four image datasets of varying complexity. We started with the commonly used datasets MNIST (LeCun & Cortes, 2010) and Fashion MNIST (Xiao et al., 2017)... We also looked at the more complex CelebA (Liu et al., 2015) dataset... We also study CIFAR-10 (Krizhevsky, 2009)... |
| Dataset Splits | No | The paper mentions using well-known datasets like MNIST, Fashion MNIST, CelebA, and CIFAR-10, and evaluates models on 'downstream accuracies' and 'real data test sets'. However, it does not explicitly provide details on how these datasets were split into training, validation, or test sets (e.g., percentages, sample counts, or references to specific predefined splits within these datasets). |
| Hardware Specification | No | The paper states, 'We implemented our code for all the experiments in PyTorch (Paszke et al., 2019)...' but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | We implemented our code for all the experiments in PyTorch (Paszke et al., 2019), using the autodp package (Wang et al., 2019) for the privacy analysis... We computed FID scores with the pytorch_fid package (Seitzer, 2020)... We computed the downstream accuracy on MNIST and Fashion MNIST using the logistic regression and MLP classifiers from scikit-learn (Pedregosa et al., 2011). For CIFAR-10, we used ResNet9 taken from FFCV (Leclerc et al., 2022). |
| Experiment Setup | Yes | For each dataset, we tune the generator learning rate (LRgen) and moving average learning rate (LRmavg) from the choices 10^-k and 3·10^-k with k ∈ {3, 4, 5}... After some initial unstructured experimentation, hyperparameters are chosen with identical values across datasets, shown in Table 8. Table 7: Learning rate hyperparameters across datasets. Table 8: Hyperparameters fixed across datasets. Table 9: Hyperparameters of DP-GAN for CIFAR-10 and CelebA. |
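The core mechanism the report refers to (DP-MEPF's "MMD with perceptual features as a feature map") matches a noise-privatized mean embedding of real-data features against the mean embedding of generated-data features. The sketch below illustrates that idea only; it is not the authors' implementation, the function names (`mean_embedding`, `privatize`, `mepf_loss`) are hypothetical, and the calibration of the noise scale to a target (ε, δ) via autodp is omitted. It assumes per-example feature vectors are normalized to unit norm, so the mean embedding's L2 sensitivity is 2/n.

```python
import numpy as np

def mean_embedding(features):
    """Average feature vector over a batch (the mean embedding)."""
    return features.mean(axis=0)

def privatize(mu, n, noise_multiplier, rng):
    """Gaussian mechanism on the mean embedding.

    Assumes each feature vector has unit L2 norm, giving the mean
    an L2 sensitivity of 2/n; sigma-to-(eps, delta) calibration is
    left out of this sketch.
    """
    sensitivity = 2.0 / n
    sigma = noise_multiplier * sensitivity
    return mu + rng.normal(0.0, sigma, size=mu.shape)

def mepf_loss(private_real_mu, fake_features):
    """Squared L2 distance between the privatized real mean embedding
    and the generated data's mean embedding (an MMD-style objective)."""
    diff = private_real_mu - mean_embedding(fake_features)
    return float(np.dot(diff, diff))
```

Because the real-data mean embedding is privatized once, it can be reused across every generator update without additional privacy cost, which is one reason the paper can report results in high-privacy regimes.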