Pre-trained Perceptual Features Improve Differentially Private Image Generation
Authors: Frederik Harder, Milad Jalali, Danica J. Sutherland, Mijung Park
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We successfully generate reasonable samples for CIFAR-10, CelebA, MNIST, and Fashion MNIST in high-privacy regimes. We also theoretically analyze the effect of privatizing our loss function, helping understand the privacy-accuracy trade-offs in our method. Our code is available at https://github.com/ParkLabML/DP-MEPF. |
| Researcher Affiliation | Academia | Frederik Harder EMAIL Max Planck Institute for Intelligent Systems & University of Tübingen Milad Jalali Asadabadi EMAIL University of British Columbia Danica J. Sutherland EMAIL University of British Columbia & Alberta Machine Intelligence Institute Mijung Park EMAIL University of British Columbia & Alberta Machine Intelligence Institute |
| Pseudocode | No | The paper describes the method "DP-MEPF" in Section 3.1 titled "MMD with perceptual features as a feature map" with detailed textual explanations and a diagram (Figure 1), but it does not present any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/ParkLabML/DP-MEPF. |
| Open Datasets | Yes | Datasets. We considered four image datasets of varying complexity. We started with the commonly used datasets MNIST (LeCun & Cortes, 2010) and Fashion MNIST (Xiao et al., 2017)... We also looked at the more complex CelebA (Liu et al., 2015) dataset... We also study CIFAR-10 (Krizhevsky, 2009)... |
| Dataset Splits | No | The paper mentions using well-known datasets like MNIST, Fashion MNIST, CelebA, and CIFAR-10, and evaluates models on 'downstream accuracies' and 'real data test sets'. However, it does not explicitly provide details on how these datasets were split into training, validation, or test sets (e.g., percentages, sample counts, or references to specific predefined splits within these datasets). |
| Hardware Specification | No | The paper states, 'We implemented our code for all the experiments in PyTorch (Paszke et al., 2019)...' but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | We implemented our code for all the experiments in PyTorch (Paszke et al., 2019), using the autodp package (Wang et al., 2019) for the privacy analysis... We computed FID scores with the pytorch_fid package (Seitzer, 2020)... We computed the downstream accuracy on MNIST and Fashion MNIST using the logistic regression and MLP classifiers from scikit-learn (Pedregosa et al., 2011). For CIFAR-10, we used ResNet9 taken from FFCV (Leclerc et al., 2022). |
| Experiment Setup | Yes | For each dataset, we tune the generator learning rate (LRgen) and moving average learning rate (LRmavg) from the choices 10^-k and 3·10^-k with k ∈ {3, 4, 5}... After some initial unstructured experimentation, hyperparameters are chosen with identical values across datasets, shown in Table 8. Table 7: Learning rate hyperparameters across datasets. Table 8: Hyperparameters fixed across datasets. Table 9: Hyperparameters of DP-GAN for CIFAR-10 and CelebA. |
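The core mechanism the report refers to (DP-MEPF's "MMD with perceptual features as a feature map") matches a noise-privatized mean embedding of real-data features against the mean embedding of generated-data features. The sketch below illustrates that idea only; it is not the authors' implementation, the function names (`mean_embedding`, `privatize`, `mepf_loss`) are hypothetical, and the calibration of the noise scale to a target (ε, δ) via autodp is omitted. It assumes per-example feature vectors are normalized to unit norm, so the mean embedding's L2 sensitivity is 2/n.

```python
import numpy as np

def mean_embedding(features):
    """Average feature vector over a batch (the mean embedding)."""
    return features.mean(axis=0)

def privatize(mu, n, noise_multiplier, rng):
    """Gaussian mechanism on the mean embedding.

    Assumes each feature vector has unit L2 norm, giving the mean
    an L2 sensitivity of 2/n; sigma-to-(eps, delta) calibration is
    left out of this sketch.
    """
    sensitivity = 2.0 / n
    sigma = noise_multiplier * sensitivity
    return mu + rng.normal(0.0, sigma, size=mu.shape)

def mepf_loss(private_real_mu, fake_features):
    """Squared L2 distance between the privatized real mean embedding
    and the generated data's mean embedding (an MMD-style objective)."""
    diff = private_real_mu - mean_embedding(fake_features)
    return float(np.dot(diff, diff))
```

Because the real-data mean embedding is privatized once, it can be reused across every generator update without additional privacy cost, which is one reason the paper can report results in high-privacy regimes.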