Degradation Attacks on Certifiably Robust Neural Networks
Authors: Klas Leino, Chi Zhang, Ravi Mangal, Matt Fredrikson, Bryan Parno, Corina Pasareanu
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation is designed to measure the susceptibility of state-of-the-art robust models and certified run-time defenses to utility degradation attacks. For our experiments, we consider GloRo Nets (Leino et al., 2021) and Randomized Smoothed models (Cohen et al., 2019). These approaches lead to models with the best known verified robust accuracies (VRA), with respect to the ℓ2 metric, on a variety of popular image classification datasets like MNIST (LeCun et al., 2010), CIFAR-10 (Krizhevsky, 2009), and ImageNet (Deng et al., 2009). |
| Researcher Affiliation | Academia | Klas Leino, Chi Zhang, Ravi Mangal, Matt Fredrikson, Bryan Parno, Corina Păsăreanu (all Carnegie Mellon University; emails redacted) |
| Pseudocode | Yes | Algorithm 2.1: Prediction with a certified run-time defense Algorithm 3.1: Degradation attack algorithm Algorithm 3.2: Smoothed projected gradient descent attack (SPGD) Algorithm B.1: Degradation attack algorithm |
| Open Source Code | Yes | Our code is available at https://github.com/ravimangal/degradation-attacks. |
| Open Datasets | Yes | on a variety of popular image classification datasets like MNIST (LeCun et al., 2010), CIFAR-10 (Krizhevsky, 2009), and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | To measure the efficacy of degradation attacks on a particular model with a certified run-time defense and a dataset, we construct two subsets of the test set for the given dataset, which we refer to as test_R and test_A. The former, test_R, is the set of test inputs on which the model is accurate and certified to be (ϵ, ℓp)-locally robust. |
| Hardware Specification | Yes | All our experiments were run on an NVIDIA TITAN RTX GPU with 24 GB of RAM, and a 4.2GHz Intel Core i7-7700K with 32 GB of RAM. |
| Software Dependencies | No | We implemented our attacks in Python, using TensorFlow and PyTorch. No version numbers are specified for Python, TensorFlow, or PyTorch. |
| Experiment Setup | Yes | For MNIST (ϵ = 0.3), the model has two convolution layers and two fully-connected layers (2C2F); for MNIST (ϵ = 1.58), the model is 4C3F; and for CIFAR-10 (ϵ = 0.141), the model is 6C2F. For CIFAR-10, the Randomized Smoothed model uses a 110-layer residual network as the base classifier, and for ImageNet, a ResNet-50 model is used as the base classifier. Our SPGD procedure is given in Algorithm 3.2, and it bears close similarity to the SmoothAdv PGD procedure for attacking smoothed classifiers presented by Salman et al. (2019). Algorithm 3.2 specifies step size η, number of steps N, number of samples n, and noise parameter σ. |
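The SPGD parameters quoted above (step size η, steps N, samples n, noise σ) follow the usual pattern for PGD against a smoothed classifier: estimate the gradient of the expected loss under Gaussian input noise by averaging over n samples, take a normalized ascent step, and project back into the ℓ2 ball of radius ϵ. The sketch below illustrates that loop under strong simplifying assumptions: it is not the authors' Algorithm 3.2, it attacks a toy linear softmax classifier (so the gradient is analytic rather than via autograd), and the function name `spgd_attack` is hypothetical.

```python
import numpy as np

def spgd_attack(W, b, x, y, eps, eta, n_steps, n_samples, sigma, rng):
    """Illustrative smoothed-PGD sketch against a linear softmax classifier.

    Ascends the cross-entropy loss averaged over Gaussian noise samples
    (the smoothing distribution), projecting each iterate back into the
    l2 ball of radius eps around the original input x.
    """
    n_classes, d = W.shape
    onehot = np.eye(n_classes)[y]
    delta = np.zeros(d)
    for _ in range(n_steps):
        grad = np.zeros(d)
        for _ in range(n_samples):
            noise = rng.normal(0.0, sigma, size=d)
            logits = W @ (x + delta + noise) + b
            p = np.exp(logits - logits.max())
            p /= p.sum()
            # For softmax(Wx + b), d(cross-entropy)/dx = W^T (p - onehot)
            grad += W.T @ (p - onehot)
        grad /= n_samples
        # Normalized ascent step, as in standard l2 PGD
        delta += eta * grad / (np.linalg.norm(grad) + 1e-12)
        # Project back into the l2 ball of radius eps
        norm = np.linalg.norm(delta)
        if norm > eps:
            delta *= eps / norm
    return x + delta
```

In the paper's setting the base classifier is a deep network, so the analytic gradient above would be replaced by backpropagation through the noised forward passes, as in Salman et al.'s SmoothAdv.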