Foiling Explanations in Deep Neural Networks

Authors: Snir Vitrack Tamam, Raz Lapid, Moshe Sipper

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our method's performance on two benchmark datasets, CIFAR100 and ImageNet, using four different pretrained deep-learning models: VGG16-CIFAR100, VGG16-ImageNet, MobileNet-CIFAR100, and Inception-v3-ImageNet. We find that the XAI methods can be manipulated without the use of gradients or other model internals. AttaXAI successfully manipulates an image such that several XAI methods output a specific explanation map. To our knowledge, this is the first such method in a black-box setting, and we believe it has significant value where explainability is desired, required, or legally mandatory. The code is available at https://github.com/razla/Foiling-Explanations-in-Deep-Neural-Networks.
Researcher Affiliation | Collaboration | Snir Vitrack Tamam, Department of Computer Science, Ben-Gurion University, Israel; Raz Lapid, Department of Computer Science, Ben-Gurion University, Israel & DeepKeep, Israel; Moshe Sipper, Department of Computer Science, Ben-Gurion University, Israel
Pseudocode | Yes | A schematic of our algorithm is shown in Figure 3, with full pseudocode provided in Algorithm 1 (AttaXAI) and Algorithm 2 (experimental setup, per dataset and model).
Open Source Code | Yes | The code is available at https://github.com/razla/Foiling-Explanations-in-Deep-Neural-Networks.
Open Datasets | Yes | We compare our method's performance on two benchmark datasets, CIFAR100 and ImageNet (Deng et al., 2009).
Dataset Splits | Yes | Assessing the algorithm over a particular configuration of model, dataset, and explanation technique involves running it over 100 pairs of randomly selected images. Algorithm 2 (experimental setup, per dataset and model): for i ← 1 to 100, randomly choose a pair of images x and x_target from the dataset.
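The per-configuration evaluation loop of Algorithm 2 can be sketched as follows. This is a minimal illustration, not the paper's implementation; `explain` and `attack` are hypothetical callables standing in for the chosen XAI method and the AttaXAI optimizer.

```python
import random

def run_experiment(dataset, model, explain, attack, n_pairs=100):
    """Sketch of Algorithm 2: evaluate the attack on randomly selected image pairs.

    `explain(model, image)` and `attack(model, image, target_map)` are assumed
    interfaces for illustration; they are not part of the released code.
    """
    results = []
    for _ in range(n_pairs):
        # Randomly choose a pair of images x and x_target from the dataset
        x, x_target = random.sample(dataset, 2)
        target_map = explain(model, x_target)  # explanation map to imitate
        x_adv = attack(model, x, target_map)   # black-box optimization of x
        results.append((x, x_adv, target_map))
    return results
```

Each configuration (model, dataset, explanation technique) reruns this loop over 100 fresh random pairs.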
Hardware Specification | No | The paper mentions using pretrained deep-learning models and running experiments, but does not specify the hardware (e.g., CPU, GPU models, or memory) used for the experiments or for running the AttaXAI algorithm.
Software Dependencies | No | The generation of the explanations was achieved using the Captum repository (Kokhlikyan et al., 2020), a unified and generic model-interpretability library for PyTorch. While Captum and PyTorch are mentioned, specific version numbers for these software components are not provided.
Experiment Setup | Yes | In order to balance the two contradicting terms in Equation 5, we chose hyperparameters that empirically proved to work, in terms of forcing the optimization process to find a solution that satisfies both objectives, following the work done in (Dombrowski et al., 2019): α = 1e11, β = 1e6 for ImageNet, and α = 1e7, β = 1e6 for CIFAR100. After every generation the learning rate was decreased through multiplication by a factor of 0.999. We tested drawing the population samples both independent and identically distributed (iid) and through Latin hypercube sampling (LHS). Algorithm 1 (AttaXAI) includes parameters such as G (maximum number of generations), λ (population size), σ (initial standard-deviation value), α (explanation-loss weight), β (prediction-loss weight), η_x̂ (mean learning rate), and η_σ (standard-deviation learning rate).
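The weighted two-term objective and the per-generation learning-rate decay can be sketched as below. The exact form of Equation 5 is not reproduced in this report, so the squared-L2 distances here are an assumption for illustration only; the α and β values are the ImageNet settings quoted above.

```python
import numpy as np

ALPHA = 1e11  # explanation-loss weight (ImageNet setting from the paper)
BETA = 1e6    # prediction-loss weight

def fitness(expl_adv, expl_target, pred_adv, pred_orig, alpha=ALPHA, beta=BETA):
    """Weighted sum of explanation loss and prediction loss.

    Squared-L2 distances are an illustrative assumption, not the paper's
    exact Equation 5.
    """
    expl_loss = np.sum((expl_adv - expl_target) ** 2)  # match the target explanation map
    pred_loss = np.sum((pred_adv - pred_orig) ** 2)    # preserve the model's prediction
    return alpha * expl_loss + beta * pred_loss

def decay_lr(lr, factor=0.999):
    # After every generation the learning rate is multiplied by 0.999
    return lr * factor
```

A perfect attack drives the explanation loss toward zero while leaving the prediction loss near zero, so the large α keeps the optimizer focused on imitating the target map without abandoning the original prediction.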