Differentially Private Image Classification from Features

Authors: Harsh Mehta, Walid Krichene, Abhradeep Guha Thakurta, Alexey Kurakin, Ashok Cutkosky

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental With DP-FC, we obtain new SOTA results on ImageNet-1k, CIFAR-100 and CIFAR-10 across all values of ε typically considered. Most remarkably, on ImageNet-1k, we obtain top-1 accuracy of 88% under a DP guarantee of (8, 8×10⁻⁷) and 84.3% under (0.1, 8×10⁻⁷). Table 1: Compilation of our best private Top-1 test accuracies. We report median and standard deviation across 5 training runs with different seeds.
Researcher Affiliation Collaboration Harsh Mehta EMAIL Google Research Walid Krichene EMAIL Google Research Abhradeep Thakurta EMAIL Google Research Alexey Kurakin EMAIL Google Research Ashok Cutkosky EMAIL Boston University
Pseudocode Yes Algorithm 1 Differentially Private Newton's Method. Algorithm 2 Differentially Private Least Squares. Algorithm 3 Differentially Private SGD with Feature Covariance (DP-FC) Method. Algorithm 4 Generalized First Order Differentially Private Algorithm.
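The algorithms listed above all share the standard differentially private aggregation step: clip each per-example gradient, sum, and add Gaussian noise calibrated to the clipping norm. A minimal numpy sketch of that shared step (illustrative only; the paper's actual implementation is in Jax and the function name here is hypothetical):

```python
import numpy as np

def dp_gradient_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """One DP aggregation step: clip each example's gradient to L2 norm
    <= clip_norm, sum, add Gaussian noise proportional to clip_norm,
    then average. This is the pattern shared by DP-SGD-style methods."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

rng = np.random.default_rng(0)
grads = [np.array([3.0, 4.0]), np.array([0.5, 0.0])]  # norms 5.0 and 0.5
# with noise_multiplier=0 this reduces to the clipped mean
g = dp_gradient_step(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
```

With `noise_multiplier=0` the first gradient is rescaled to norm 1 while the second passes through unchanged, so the output is simply the clipped average.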
Open Source Code Yes Code: https://github.com/google-research/google-research/tree/master/dp_transfer
Open Datasets Yes We use 3 datasets for private finetuning, namely 1) ImageNet-1k (Deng et al., 2009) with 1k classes and 1.3M images 2) CIFAR-10 and 3) CIFAR-100. We also refer to these as the private dataset for which we want a privacy guarantee.
Dataset Splits Yes We fine-tune on the ImageNet train split and present the Top-1 accuracies we obtain from the official test split. ... For tuning hyperparameters, we held out 5% of the training set as our validation set and report top-1 accuracies on the test set by using the tuned hyperparameters.
Hardware Specification Yes We conduct all our experiments in Jax (Bradbury et al., 2018; Frostig et al., 2018), a framework that leverages just-in-time compilation using XLA and does auto-vectorization of the backward pass. We leverage this functionality throughout our experiments. Finally, we conduct our experiments on the TPUv4 architecture. ... Finally, the ViT-G/14 model was pre-trained using 2048 TPUv3 chips.
Software Dependencies No We conduct all our experiments in Jax (Bradbury et al., 2018; Frostig et al., 2018), a framework that leverages just-in-time compilation using XLA and does auto-vectorization of the backward pass. ... Our implementation relies on the TensorFlow Privacy codebase for conversion of (ε, δ) and clipping norm C to/from noise multiplier σ. The paper mentions software tools like Jax and TensorFlow Privacy but does not provide specific version numbers for these dependencies.
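The (ε, δ) → σ conversion mentioned above is done in the paper via TensorFlow Privacy's accountant, which gives tight values via privacy-loss accounting. As a hedged stand-in, the classical analytic bound for the Gaussian mechanism (σ ≥ C·√(2 ln(1.25/δ))/ε) illustrates the relationship; the function below is an assumption for exposition, not TensorFlow Privacy's API:

```python
import math

def gaussian_sigma(epsilon, delta, sensitivity=1.0):
    """Classical analytic Gaussian-mechanism bound:
    sigma >= sensitivity * sqrt(2 ln(1.25/delta)) / epsilon.
    Real DP-SGD accounting (as in TensorFlow Privacy) composes over
    steps and yields tighter sigmas; this is only illustrative."""
    if epsilon <= 0 or not (0.0 < delta < 1.0):
        raise ValueError("require epsilon > 0 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

# e.g. the paper's strictest setting, (epsilon, delta) = (0.1, 8e-7),
# demands far more noise than (8, 8e-7)
sigma_loose = gaussian_sigma(epsilon=8.0, delta=8e-7)
sigma_tight = gaussian_sigma(epsilon=0.1, delta=8e-7)
```

Note that σ scales inversely with ε, which is why accuracy at ε = 0.1 is the harder regime reported in the paper.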
Experiment Setup Yes D.3 Hyperparameter Tuning: All of our results consider full-batch setting with a constant learning rate (when applicable). Additionally, following Mehta et al. (2022), we set initial weights to 0.0. ... Table 7: Fine-tuning hyperparams for DP-Adam. All models are trained in full-batch setting with a constant learning rate and no dropout. When training the models with DP, we replace the global clipping with per example clipping norm as specified in the table. Following Mehta et al. (2022), we set initial weights to 0.0, bias to -10.0 and train with sigmoid cross-entropy loss.
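The reported setup (full batch, constant learning rate, no dropout, zero-initialized weights, bias −10.0, sigmoid cross-entropy, per-example clipping under DP) can be collected into a config sketch. All field names here are hypothetical, chosen for readability rather than taken from the paper's codebase:

```python
# Hypothetical config mirroring the described fine-tuning setup;
# keys are illustrative, not from the dp_transfer codebase.
finetune_config = {
    "batch_mode": "full",                 # full-batch training
    "learning_rate_schedule": "constant", # constant LR when applicable
    "dropout_rate": 0.0,                  # no dropout
    "init_weights": 0.0,                  # weights initialized to 0.0
    "init_bias": -10.0,                   # bias initialized to -10.0
    "loss": "sigmoid_cross_entropy",
    "clipping": "per_example",            # per-example clip norm under DP
}
```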