Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence

Authors: Frederik Pahde, Maximilian Dreyer, Moritz Weckbecker, Leander Weber, Christopher J. Anders, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate various CAV methods in terms of their alignment with the true concept direction and their impact on CAV applications, including concept sensitivity testing and model correction for shortcut behavior caused by data artifacts. We demonstrate the benefits of pattern-based CAVs using the Pediatric Bone Age, ISIC2019, and FunnyBirds datasets with VGG, ResNet, ReXNet, EfficientNet, and Vision Transformer as model architectures.
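The pattern-based CAVs evaluated above can be illustrated with a minimal sketch, assuming the covariance-based ("pattern") formulation in which the concept direction is estimated from the covariance between latent activations and binary concept labels rather than from a classifier's weight vector. All names below are illustrative, not the paper's actual implementation:

```python
import numpy as np

def pattern_cav(activations, concept_labels):
    """Covariance-based ("pattern") concept direction (illustrative sketch).

    activations:    (n_samples, n_features) latent activations
    concept_labels: (n_samples,) binary labels (1 = concept present)

    The direction is the per-feature covariance with the concept label,
    normalized to unit length.
    """
    a = activations - activations.mean(axis=0)
    t = concept_labels - concept_labels.mean()
    pattern = a.T @ t / (len(t) - 1)  # per-feature covariance with the label
    return pattern / np.linalg.norm(pattern)

# Toy usage: the concept adds a fixed offset along one latent feature.
rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 8))
labels = rng.integers(0, 2, size=200)
acts[labels == 1, 0] += 2.0  # concept signal lives in feature 0
cav = pattern_cav(acts, labels)  # should point mostly along feature 0
```

The key design point is that a covariance-based direction recovers the concept's signal direction even when a discriminative weight vector would be rotated by correlated noise in the other features.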
Researcher Affiliation | Collaboration | (1) Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute; (2) Department of Electrical Engineering and Computer Science, Technische Universität Berlin; (3) BIFOLD (Berlin Institute for the Foundations of Learning and Data)
Pseudocode | No | The paper uses mathematical equations to describe the optimization tasks (Eq. 1, 2, 3) and does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/frederikpahde/pattern-cav
Open Datasets | Yes | We demonstrate the benefits of pattern-based CAVs using the Pediatric Bone Age, ISIC2019, and FunnyBirds datasets with VGG, ResNet, ReXNet, EfficientNet, and Vision Transformer as model architectures. ... Specifically, we insert artificial concepts into ISIC2019 (Codella et al., 2018; Tschandl et al., 2018; Combalia et al., 2019) ... and a Pediatric Bone Age dataset (Halabi et al., 2019) ... Lastly, we use FunnyBirds (Hesse et al., 2023) ...
Dataset Splits | Yes | Details for our controlled Clever Hans datasets ... train/val/test split: 80%/10%/10%. ... We synthesize 500 training samples and 100 test samples per class, totaling 5,000 training and 1,000 test samples. The training set is further split into training/validation splits (90%/10%).
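The quoted 80%/10%/10% split can be reproduced with a short helper. This is a hypothetical sketch; the paper does not publish its exact splitting code, and the function name and seed are assumptions:

```python
import random

def three_way_split(items, fracs=(0.8, 0.1, 0.1), seed=0):
    """Shuffle and partition items into train/val/test subsets.

    Illustrative helper matching the quoted 80/10/10 ratio; not the
    authors' actual splitting code.
    """
    assert abs(sum(fracs) - 1.0) < 1e-9
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n = len(items)
    n_train = int(fracs[0] * n)
    n_val = int(fracs[1] * n)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

train, val, test = three_way_split(range(1000))
# yields 800 / 100 / 100 items
```

Fixing the shuffle seed is what makes such a split reproducible across runs, which is exactly the property the reproducibility variable above is probing for.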
Hardware Specification | Yes | We ran all model training and correction jobs on GPUs of type NVIDIA Ampere A100 with 40 GB RAM.
Software Dependencies | No | The paper mentions 'timm' and 'torchvision' as sources for pre-trained models and 'zennit' for attribution heatmaps, but does not provide specific version numbers for these software libraries or for other key dependencies such as Python or PyTorch.
Experiment Setup | Yes | Table 4: Model training details including the pre-trained checkpoint, optimizer, learning rate (LR), number of epochs, and milestones, after which the learning rate is divided by 10. ... Model correction is performed with RR-ClArC for 10 epochs with the initial training learning rate (see Table 4) divided by 10. To balance between classification loss and the added loss term L_RR, we weigh the latter term with λ ∈ {10^5, 10^6, ..., 10^10}.
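The λ-weighted objective described in the experiment setup can be sketched as a plain weighted sum of the classification loss and the right-reason term. This is a NumPy stand-in, not the authors' PyTorch training code; `loss_rr` is a placeholder scalar for the RR-ClArC penalty, whose exact form is defined in the paper:

```python
import numpy as np

def corrected_loss(logits, targets, loss_rr, lam=1e6):
    """Total model-correction objective: cross-entropy + lam * L_RR.

    Illustrative sketch of the quoted setup, where lam is swept over
    {1e5, 1e6, ..., 1e10} to balance the two terms.
    """
    z = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(targets)), targets].mean()
    return ce + lam * loss_rr

# Toy usage: two samples, two classes, a small right-reason penalty.
logits = np.array([[2.0, 0.5], [0.1, 1.5]])
targets = np.array([0, 1])
total = corrected_loss(logits, targets, loss_rr=1e-7, lam=1e6)
```

Because L_RR is typically orders of magnitude smaller than the classification loss, sweeping λ over such a wide range is what lets the correction term meaningfully influence training without destroying task accuracy.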