A Dual-Perspective Approach to Evaluating Feature Attribution Methods
Authors: Yawei Li, Yang Zhang, Kenji Kawaguchi, Ashkan Khakzar, Bernd Bischl, Mina Rezaei
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply these metrics to mainstream attribution methods, offering a novel lens through which to analyze and compare feature attribution methods. Our code is provided at https://github.com/sandylaker/soco.git. Through extensive validation and benchmarking, we verify the correctness of the proposed metrics and showcase our metrics' potential to shed light on existing attribution methods. |
| Researcher Affiliation | Academia | Yawei Li* (LMU Munich; Munich Center for Machine Learning), Yang Zhang* (National University of Singapore), Kenji Kawaguchi (National University of Singapore), Ashkan Khakzar (University of Oxford), Bernd Bischl (LMU Munich; Munich Center for Machine Learning), Mina Rezaei (LMU Munich; Munich Center for Machine Learning) |
| Pseudocode | Yes | Algorithm 1: Soundness evaluation at predictive level v; Algorithm 2: Completeness evaluation at attribution threshold t; Algorithm 3: Soundness evaluation with accuracy (s_m) as performance indicator; Algorithm 4: Completeness evaluation |
| Open Source Code | Yes | Our code is provided at https://github.com/sandylaker/soco.git. |
| Open Datasets | Yes | We perturb 70% of pixels in each image in the CIFAR-10 (Krizhevsky et al., 2009a) training and test datasets. We create two semi-natural datasets D_S^(1) and D_S^(2) from CIFAR-100 (Krizhevsky et al., 2009b). We employ a VGG16 (Simonyan & Zisserman, 2015) pre-trained on ImageNet (Deng et al., 2009) and conduct feature attribution on the ImageNet validation set. |
| Dataset Splits | Yes | We perturb 70% of pixels in each image in the CIFAR-10 (Krizhevsky et al., 2009a) training and test datasets. We employ a VGG16 (Simonyan & Zisserman, 2015) pre-trained on ImageNet (Deng et al., 2009) and conduct feature attribution on the ImageNet validation set. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used, such as GPU or CPU models. It only mentions training details for the model. |
| Software Dependencies | No | We use the implementations of Grad CAM, Deep SHAP, IG, IG ensembles in Captum (Kokhlikyan et al., 2020). This mentions a software library but does not provide specific version numbers for it or any other key software components used. |
| Experiment Setup | Yes | Training is conducted using the Adam (Kingma & Ba, 2015) optimizer with a learning rate of 0.001 and a weight decay of 0.0001. The batch size used for training is 256, and we train the model for 35 epochs. |
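The reported training configuration (Adam, learning rate 0.001, weight decay 0.0001, batch size 256, 35 epochs) can be sketched in PyTorch. This is a minimal illustration only: the tiny linear model and random tensors are stand-ins, since the paper's actual architectures and CIFAR-derived datasets are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Stand-in classifier and data; the paper trains on CIFAR-10-derived
# datasets, but any model/dataset pair follows the same recipe.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
dataset = TensorDataset(
    torch.randn(512, 3, 32, 32),          # dummy 32x32 RGB images
    torch.randint(0, 10, (512,)),         # dummy labels for 10 classes
)

# Hyperparameters as reported in the checklist above.
loader = DataLoader(dataset, batch_size=256, shuffle=True)
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)
criterion = nn.CrossEntropyLoss()

for epoch in range(35):
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
```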