Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline and validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Augment then Smooth: Reconciling Differential Privacy with Certified Robustness

Authors: Jiapeng Wu, Atiyeh Ashari Ghomi, David Glukhov, Jesse C. Cresswell, Franziska Boenisch, Nicolas Papernot

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the effectiveness of DP-CERT on multiple image classification datasets, including MNIST (LeCun et al., 2010), Fashion-MNIST (Xiao et al., 2017), and CIFAR10 (Krizhevsky & Hinton, 2009).
Researcher Affiliation | Collaboration | Jiapeng Wu (Layer 6 AI), Atiyeh Ashari Ghomi (Layer 6 AI), David Glukhov (University of Toronto & Vector Institute), Jesse C. Cresswell (Layer 6 AI), Franziska Boenisch (CISPA), Nicolas Papernot (University of Toronto & Vector Institute)
Pseudocode | Yes | Algorithm 1: Standard DPSGD, adapted from Abadi et al. (2016). Require: private training set D = {(x_i, y_i) | i ∈ [N_prv]}, loss function L(θ_t; x, y). Parameters: learning rate λ_t, noise scale ρ, group size B, gradient norm bound C.
Open Source Code | Yes | Code is available at github.com/layer6ailabs/dp-cert.
Open Datasets | Yes | We evaluate the effectiveness of DP-CERT on multiple image classification datasets, including MNIST (LeCun et al., 2010), Fashion-MNIST (Xiao et al., 2017), and CIFAR10 (Krizhevsky & Hinton, 2009).
Dataset Splits | Yes | The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples, as does Fashion-MNIST. The CIFAR10 dataset consists of 60,000 RGB images from 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images of size 32 × 32 × 3.
Hardware Specification | Yes | All experiments were conducted on a cluster of 8 Nvidia V100 GPUs.
Software Dependencies | Yes | All the training and inference procedures are implemented based on PyTorch v1.13.0 (Paszke et al., 2019) and Opacus v1.3.0 (Yousefpour et al., 2021).
Experiment Setup | Yes | We set the learning rate as 0.001 and train the models for 10 epochs. The rest of the hyperparameters are the same as used by Bu et al. (2022b). For evaluation, we use CERTIFY with parameters n = 10,000, n0 = 100, and α = 0.001, following previous work (Cohen et al., 2019; Salman et al., 2019). We set the number of augmentations K to 2 for MNIST and Fashion-MNIST, and 1 for CIFAR10, as they bring a better trade-off between certified accuracy and efficiency (see Figure 4).
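The Algorithm 1 row above describes standard DPSGD (Abadi et al., 2016): clip each per-sample gradient to norm bound C, average over the group, and add Gaussian noise scaled by the noise multiplier ρ. A minimal NumPy sketch of one such update step follows; the function name `dpsgd_step` and the per-sample-gradient interface are illustrative assumptions, not the paper's implementation (which uses Opacus).

```python
import numpy as np

def dpsgd_step(theta, per_sample_grads, lr, noise_scale, clip_norm, rng):
    """One DPSGD update (sketch): per-sample clipping to norm C,
    Gaussian noise with std rho*C on the summed gradient, then descent."""
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds clip_norm (C)
        clipped.append(g / max(1.0, norm / clip_norm))
    # Gaussian mechanism: noise std proportional to noise_scale * clip_norm
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        0.0, noise_scale * clip_norm, size=theta.shape)
    return theta - lr * noisy_sum / len(per_sample_grads)
```

With `noise_scale = 0` this reduces to plain SGD with per-sample clipping, which makes the clipping behavior easy to check in isolation.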
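The experiment-setup row uses CERTIFY (Cohen et al., 2019) with n = 10,000, n0 = 100, and α = 0.001. The sketch below shows how those parameters are typically used in randomized-smoothing certification: n0 samples select a candidate class, n samples give a one-sided Clopper-Pearson lower bound on its probability at level α, and the certified L2 radius is σ·Φ⁻¹(p_A). The `certify` helper and its `sample_fn` interface are hypothetical, for illustration only.

```python
import numpy as np
from scipy.stats import beta, norm

def certify(sample_fn, n0, n, alpha, sigma):
    """Randomized-smoothing certification sketch in the style of
    Cohen et al. (2019). sample_fn(k) returns k class predictions of the
    base classifier under Gaussian input noise with std sigma."""
    guess = np.bincount(sample_fn(n0)).argmax()      # selection round (n0 draws)
    counts = np.bincount(sample_fn(n))               # estimation round (n draws)
    nA = counts[guess] if guess < len(counts) else 0
    # One-sided Clopper-Pearson lower bound on p_A at confidence 1 - alpha
    pA = beta.ppf(alpha, nA, n - nA + 1) if nA > 0 else 0.0
    if pA > 0.5:
        return guess, sigma * norm.ppf(pA)           # certified L2 radius
    return -1, 0.0                                   # abstain
```

A base classifier that is highly consistent under noise yields a large certified radius, while a near-uniform one makes CERTIFY abstain, which is why larger n tightens the bound at the cost of inference time.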