Explaining Explainability: Recommendations for Effective Use of Concept Activation Vectors

Authors: Angus Nicolson, Lisa Schut, Alison Noble, Yarin Gal

TMLR 2025

Reproducibility assessment (variable, result, and supporting LLM response):

Research Type: Experimental
  "Our experiments are performed on natural images (ImageNet), skin lesions (ISIC 2019), and a new synthetic dataset, Elements. Elements is designed to capture a known ground-truth relationship between concepts and classes. We release this dataset to facilitate further research in understanding and evaluating interpretability methods."

Researcher Affiliation: Academia
  Angus Nicolson, Institute of Biomedical Engineering, University of Oxford; Lisa Schut, OATML, Department of Computer Science, University of Oxford; Alison J. Noble, Institute of Biomedical Engineering, University of Oxford; Yarin Gal, OATML, Department of Computer Science, University of Oxford

Pseudocode: No
  The paper describes its methods through mathematical equations and definitions but does not include any clearly labelled 'Pseudocode' or 'Algorithm' blocks.

Open Source Code: No
  The text states: "We release this dataset to facilitate further research in understanding and evaluating interpretability methods." This refers to the Elements dataset; it does not explicitly state that the source code for the paper's methodology is released, nor does it provide a link to a code repository.

Open Datasets: Yes
  "Our experiments are performed on natural images (ImageNet), skin lesions (ISIC 2019), and a new synthetic dataset, Elements. Elements is designed to capture a known ground-truth relationship between concepts and classes. We release this dataset to facilitate further research in understanding and evaluating interpretability methods." ImageNet (Deng et al., 2009); ISIC 2019 (Tschandl et al., 2018; Codella et al., 2017; Combalia et al., 2019)

Dataset Splits: No
  For the ISIC 2019 dataset, the paper mentions "training until convergence of validation loss to achieve an area under the receiver operating characteristic curve (AUC) of 0.91 on the validation split." For the Elements dataset, it mentions "giving a validation accuracy of 99.98% for the standard dataset." While these statements imply the use of validation splits, specific percentages or sample counts for the training, validation, and test sets are not provided.

Hardware Specification: No
  The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.

Software Dependencies: No
  The paper mentions the "TorchVision package in PyTorch" and the "Adam optimiser (Kingma & Ba, 2015)" but does not specify version numbers for PyTorch or any other software components, which a reproducible description of ancillary software requires.

Experiment Setup: Yes
  "We train the model using Adam (Kingma & Ba, 2015) with a learning rate of 1e-3 until the training accuracy is greater than 99.99%, giving a validation accuracy of 99.98% for the standard dataset. We finetuned a ViT-B16 model pretrained on ImageNet for 50 epochs on the spatially dependent version of Elements (i.e. there are some classes which depend on the location of the objects as well as which concepts are present). We used an exponentially decaying learning rate with an initial learning rate of 0.0001 and a γ of 0.95."
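The quoted ViT finetuning settings (initial learning rate 0.0001, exponential decay with γ = 0.95, 50 epochs) can be sketched with standard PyTorch components. This is a minimal illustration, not the authors' released code: the `nn.Linear` model is a placeholder standing in for the pretrained ViT-B16, and the training pass itself is elided.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import ExponentialLR

# Placeholder model; the paper finetunes a ViT-B16 pretrained on ImageNet
# (e.g. torchvision.models.vit_b_16 with ImageNet weights).
model = nn.Linear(16, 3)

# Settings quoted in the paper: initial lr 0.0001, exponential decay, gamma = 0.95.
optimizer = optim.Adam(model.parameters(), lr=1e-4)
scheduler = ExponentialLR(optimizer, gamma=0.95)

for epoch in range(50):  # "finetuned ... for 50 epochs"
    # ... one training pass over the spatially dependent Elements dataset ...
    scheduler.step()  # lr after epoch e is 1e-4 * 0.95**(e + 1)

final_lr = optimizer.param_groups[0]["lr"]
```

With γ = 0.95 the learning rate falls to roughly 7.7% of its initial value by epoch 50, a gentle decay consistent with finetuning a pretrained backbone.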