Attribution-based Explanations that Provide Recourse Cannot be Robust

Authors: Hidde Fokkema, Rianne de Heide, Tim van Erven

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We further illustrate our main impossibility result with experiments and analytical examples that show cases in which the well-known attribution methods SmoothGrad (Smilkov et al., 2017), Integrated Gradients (Sundararajan et al., 2017), LIME (Ribeiro et al., 2016) and SHAP (Lundberg and Lee, 2017) fail to be recourse sensitive. We also provide an analytical example in which counterfactual explanations fail to be continuous. We then reflect on our impossibility result in Section 4, and discuss possible ways around it.
Researcher Affiliation | Academia | Hidde Fokkema, EMAIL, Korteweg-de Vries Institute for Mathematics, University of Amsterdam, Science Park 107, 1098 XG Amsterdam, The Netherlands; Rianne de Heide, EMAIL, Department of Mathematics, Vrije Universiteit Amsterdam, De Boelelaan 1111, 1081 HV Amsterdam, The Netherlands; Tim van Erven, EMAIL, Korteweg-de Vries Institute for Mathematics, University of Amsterdam, Science Park 107, 1098 XG Amsterdam, The Netherlands
Pseudocode | No | The paper describes methods mathematically and provides analytical examples, but it does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | All the code to reproduce the experiments and figures in this paper can be found in a GitHub repository: github.com/HiddeFok/recourse-robust-explanations-impossible
Open Datasets | No | A total of 53 grayscale figures were created from the User Icon picture, found on www.iconarchive.com. Each figure consists of two components, the person and a background. The figures have varying contrasts between these two components. We labeled each figure by hand according to this contrast.
Dataset Splits | No | The paper describes the creation and labeling of a custom 'Profile Picture Toy Dataset' and how a threshold parameter was chosen for a perfect classifier, but it does not specify any training, validation, or test dataset splits.
Hardware Specification | Yes | All experiments were run locally on an Apple MacBook Pro M1 13", 2020 with 8GB of RAM.
Software Dependencies | Yes | For LIME we used version 0.2.0.1 and for SHAP version 0.40.0. ... Finally, for some of the picture manipulation we used the scikit-image (van der Walt et al., 2011) package, version 1.0, under the BSD 3-Clause License.
Experiment Setup | Yes | The classification function is given by ... The attribution methods based on gradients were calculated analytically. The attributions for the Vanilla Gradients, SmoothGrad and Integrated Gradients are given by ... where we have chosen x_0 = 0 as the baseline. ... By increasing the threshold from the minimum value of all quadratic differences to the maximum value, the parameter with the highest accuracy was chosen. This led to the choice λ_thres = 5961.34, which achieved perfect accuracy across both classes.
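The threshold selection described under Experiment Setup can be illustrated with a minimal sketch. It is not the authors' code: the function name `choose_threshold`, the evenly spaced grid of candidates, the rule "predict class 1 when the score exceeds the threshold", and the example scores are all assumptions made for illustration. The summary does not state the step size of the sweep or how ties were broken.

```python
import numpy as np


def choose_threshold(scores, labels, num_candidates=1001):
    """Sweep thresholds from min(scores) to max(scores) and return the
    (threshold, accuracy) pair with the highest accuracy.

    Hypothetical helper: `scores` stands in for the per-image quadratic
    differences, `labels` for the hand-assigned binary labels.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Assumed: an evenly spaced grid between the smallest and largest score.
    candidates = np.linspace(scores.min(), scores.max(), num_candidates)

    # Accuracy of the rule "predict 1 if score > threshold" per candidate.
    accuracies = np.array(
        [np.mean((scores > t).astype(int) == labels) for t in candidates]
    )

    # np.argmax returns the first maximum, so ties go to the lowest threshold.
    best = int(np.argmax(accuracies))
    return float(candidates[best]), float(accuracies[best])


if __name__ == "__main__":
    # Toy, well-separated scores (not data from the paper).
    toy_scores = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    toy_labels = [0, 0, 0, 1, 1, 1]
    threshold, accuracy = choose_threshold(toy_scores, toy_labels)
    print(f"threshold={threshold:.3f}, accuracy={accuracy:.2f}")
```

On separable data like the toy example, any threshold between the two groups yields perfect accuracy, so the value returned depends on the grid and on the tie-breaking rule.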