Robust Feature Inference: A Test-time Defense Strategy using Spectral Projections

Authors: Anurag Singh, Mahalakshmi Sabanayagam, Krikamol Muandet, Debarghya Ghoshdastidar

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments on CIFAR-10, CIFAR-100, Tiny ImageNet and ImageNet datasets for several robustness benchmarks, including the state-of-the-art methods in RobustBench, show that RFI consistently improves robustness across adaptive and transfer attacks.
Researcher Affiliation | Academia | Anurag Singh (EMAIL), CISPA Helmholtz Center for Information Security, Saarbrücken, Germany; Mahalakshmi Sabanayagam (EMAIL), School of Computation, Information and Technology, Technical University of Munich, Germany; Krikamol Muandet (EMAIL), CISPA Helmholtz Center for Information Security, Saarbrücken, Germany; Debarghya Ghoshdastidar (EMAIL), School of Computation, Information and Technology, Technical University of Munich, Germany
Pseudocode | Yes | Algorithm 1: Robust Feature Inference (RFI)
Require: The model h trained on D_train := {(x_i, y_i)}_{i=1}^n such that h(x) = β^T φ(x), where φ : X → R^p and β ∈ R^{p×C}, and the number of top robust features to select, K.
1: Compute the covariance Σ_train ← (1/n) Φ Φ^T, where Φ := [φ(x_1), ..., φ(x_n)] ∈ R^{p×n}.
2: Compute the eigendecomposition Σ_train = U diag(λ) U^T, where the columns of U are u_i ∈ R^p.
   {Top-K most robust features Û ∈ R^{p×K} (steps 3–7)}
3: Û ← {}; β_c ← c-th column of β
4: for c ← 1 to C do
5:   For all i, compute the robustness score s_c(u_i) ← λ_i (β_c^T u_i)^2
6:   Û ← Union(Û, u_i) if s_c(u_i) is among the top-K scores s_c(·)
7: end for
   {Robust Feature Inference on the test set X_test (steps 8–9)}
8: β̂ = Û Û^T β
9: ∀ x_t ∈ X_test, h(x_t) = β̂^T φ(x_t)
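Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration of the quoted pseudocode, not the authors' released implementation; the function and variable names are my own.

```python
import numpy as np

def rfi_project(Phi, beta, K):
    """Sketch of Robust Feature Inference (RFI), following Algorithm 1.

    Phi  : (p, n) feature matrix [phi(x_1), ..., phi(x_n)]
    beta : (p, C) linear classifier weights, so h(x) = beta.T @ phi(x)
    K    : number of top robust features to select per class
    Returns the projected weights beta_hat = U_hat U_hat^T beta.
    """
    p, n = Phi.shape
    # Step 1: empirical feature covariance Sigma_train = (1/n) Phi Phi^T
    Sigma = Phi @ Phi.T / n
    # Step 2: eigendecomposition Sigma = U diag(lam) U^T (eigh -> ascending)
    lam, U = np.linalg.eigh(Sigma)
    # Steps 3-7: robustness score s_c(u_i) = lam_i * (beta_c^T u_i)^2;
    # keep the union over classes of the top-K scoring eigenvectors.
    selected = set()
    for c in range(beta.shape[1]):
        scores = lam * (beta[:, c] @ U) ** 2       # score of each u_i for class c
        selected |= set(np.argsort(scores)[-K:])   # indices of the top-K features
    U_hat = U[:, sorted(selected)]                 # (p, |selected|)
    # Steps 8-9: project the classifier onto the selected robust subspace
    return U_hat @ (U_hat.T @ beta)
```

At test time one would then classify with `beta_hat.T @ phi(x_t)` instead of `beta.T @ phi(x_t)`; the feature extractor φ itself is left untouched.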
Open Source Code | Yes | We also open-source our implementation at https://github.com/Anurag14/RFI.
Open Datasets | Yes | We evaluate RFI on CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), Robust CIFAR-10 (Ilyas et al., 2019), Tiny ImageNet (Le & Yang, 2015) and ImageNet (Russakovsky et al., 2015) datasets.
Dataset Splits | Yes | In the case of Tiny ImageNet, due to computation time we subsampled 100 training samples per class, instead of using the full training set, for computing the transformation matrix Û of the feature covariance, and evaluated the clean and robust performances on the 10,000 validation samples.
Hardware Specification | Yes | We use PyTorch (Paszke et al., 2019) for all our experiments and a single Nvidia DGX A100 to run all of our experiments.
Software Dependencies | No | We use PyTorch (Paszke et al., 2019) for all our experiments and a single Nvidia DGX A100 to run all of our experiments.
Experiment Setup | Yes | For training with all these baseline adversarial training methods, we set the batch size to 128. We use SGD with momentum as the optimizer, setting the momentum to 0.1 and the weight decay to 0.0002. We train for 200 epochs with a learning-rate schedule that starts at 0.1 for the first 100 epochs and then reduces the learning rate by 90 percent every 50 epochs.
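The learning-rate schedule quoted above can be written as a small helper. This is a hypothetical sketch of the described schedule, not taken from the paper's code:

```python
def learning_rate(epoch, base_lr=0.1):
    """Piecewise-constant schedule from the setup: base_lr for the first
    100 epochs, then a 90% reduction every 50 epochs thereafter
    (drops at epochs 100 and 150 over a 200-epoch run)."""
    if epoch < 100:
        return base_lr
    # Number of 90% reductions applied by this epoch: one at epoch 100,
    # then one more for each completed 50-epoch block after that.
    drops = 1 + (epoch - 100) // 50
    return base_lr * (0.1 ** drops)
```

Over the stated 200-epoch run this yields 0.1 for epochs 0–99, 0.01 for epochs 100–149, and 0.001 for epochs 150–199.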