SHAP-XRT: The Shapley Value Meets Conditional Independence Testing
Authors: Jacopo Teneggi, Beepul Bharti, Yaniv Romano, Jeremias Sulam
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our results with simulated as well as real imaging data. We now present three experiments, of increasing complexity, that showcase how the SHAP-XRT procedure can be used in practice to explain machine learning predictions, contextualizing the Shapley value from a statistical viewpoint. |
| Researcher Affiliation | Academia | Jacopo Teneggi (EMAIL): Department of Computer Science, Johns Hopkins University; Mathematical Institute for Data Science (MINDS), Johns Hopkins University. Beepul Bharti (EMAIL): Department of Biomedical Engineering, Johns Hopkins University; Mathematical Institute for Data Science (MINDS), Johns Hopkins University. Yaniv Romano (EMAIL): Departments of Electrical Engineering and of Computer Science, Technion - Israel Institute of Technology. Jeremias Sulam (EMAIL): Department of Biomedical Engineering, Johns Hopkins University; Mathematical Institute for Data Science (MINDS), Johns Hopkins University. |
| Pseudocode | Yes | Algorithm 1 Shapley Explanation Randomization Test (SHAP-XRT) procedure SHAP-XRT(model f : R^n → [0, 1], sample x ∈ R^n, feature j ∈ [n], subset C ⊆ [n] \ {j}, test statistic T, number of null draws K ∈ N, number of reference samples L ∈ N) |
| Open Source Code | Yes | All code to reproduce experiments will be made publicly available. |
| Open Datasets | Yes | Finally, we revisit an experiment from Teneggi et al. (2022a) on the BBBC041 dataset (Ljosa et al., 2012), which comprises 1425 images of healthy and infected human blood smears of size 1200×1600 pixels, and is publicly available at https://bbbc.broadinstitute.org/BBBC041. |
| Dataset Splits | Yes | We split the original dataset into a training and validation split using an 80/20 ratio, respectively. This way, we train our model on 589 positive and 608 negative images, and validate on 112 positive and 116 negative images. |
| Hardware Specification | Yes | All experiments were run on an NVIDIA Quadro RTX 5000 with 16 GB of RAM memory on a private server with 96 CPU cores. |
| Software Dependencies | Yes | All scripts were run on PyTorch 1.11.0, Python 3.8.13, and CUDA 10.2. |
| Experiment Setup | Yes | We train both models for one epoch on m i.i.d. samples with a batch size of 64. We note that we use Adam (Kingma & Ba, 2014) with a learning rate of 0.001, and SGD with a learning rate of 0.01, for f_CNN and f_FCN respectively, to achieve optimal validation accuracy. We optimize all parameters of the network for 25 epochs using binary cross-entropy loss and the Adam optimizer, with a learning rate of 0.0001 and a learning rate decay of 0.2 every 10 epochs. |
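The SHAP-XRT signature quoted in the Pseudocode row can be illustrated with a minimal Python sketch of a CRT-style test. This is not the paper's implementation: the masking scheme, the marginal (rather than conditional) resampling of feature j, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def masked_prediction(f, x, keep, references):
    """Mean of f over copies of x whose features outside `keep` are
    imputed from rows of `references` (an assumed masking scheme)."""
    X = references.copy()
    idx = sorted(keep)
    X[:, idx] = x[idx]
    return f(X).mean()

def shap_xrt_pvalue(f, x, j, C, references, K=100, seed=0):
    """CRT-style p-value for feature j given coalition C (illustrative only).

    Null copies replace x_j with values resampled from the reference set,
    i.e. from the marginal of feature j; the paper's test instead samples
    from the conditional distribution of x_j given the other features.
    """
    rng = np.random.default_rng(seed)
    with_j, without_j = set(C) | {j}, set(C)
    # Observed statistic: gap in masked predictions with vs. without j.
    t_obs = (masked_prediction(f, x, with_j, references)
             - masked_prediction(f, x, without_j, references))
    t_null = np.empty(K)
    for k in range(K):
        x_tilde = x.copy()
        x_tilde[j] = references[rng.integers(len(references)), j]
        t_null[k] = (masked_prediction(f, x_tilde, with_j, references)
                     - masked_prediction(f, x_tilde, without_j, references))
    # One-sided p-value with the standard +1 finite-sample correction.
    return (1 + np.sum(t_null >= t_obs)) / (K + 1)
```

A small p-value suggests that adding feature j to coalition C changes the model's output more than would be expected under the resampling null.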