Avoiding Leakage Poisoning: Concept Interventions Under Distribution Shifts
Authors: Mateo Espinosa Zarlenga, Gabriele Dominici, Pietro Barbiero, Zohreh Shams, Mateja Jamnik
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our analysis reveals a weakness in current state-of-the-art CMs, which we term leakage poisoning, that prevents them from properly improving their accuracy when intervened on for OOD inputs. To address this, we introduce MixCEM, a new CM that learns to dynamically exploit leaked information missing from its concepts only when this information is in-distribution. Our results across tasks with and without complete sets of concept annotations demonstrate that MixCEMs outperform strong baselines by significantly improving their accuracy for both in-distribution and OOD samples in the presence and absence of concept interventions. |
| Researcher Affiliation | Collaboration | 1University of Cambridge 2Università della Svizzera Italiana 3IBM Research 4Leap Laboratories Inc. |
| Pseudocode | No | The paper describes methods and objectives mathematically and in prose, and includes graphical models (Figure 6, Figure 7) and training objectives. However, it does not contain a clearly labeled pseudocode or algorithm block with structured steps formatted like code. |
| Open Source Code | Yes | Our code and experiment configs can be found at https://github.com/mateoespinosa/cem |
| Open Datasets | Yes | Datasets We study these questions on the following tasks: (1) CUB (Wah et al., 2011), a bird classification task with 200 classes and 112 concepts selected by Koh et al. (2020), (2) AwA2 (Xian et al., 2018), an animal classification task with 50 classes and 85 concepts, (3) CelebA (Liu et al., 2018), a face recognition task with 256 classes and 6 concepts selected by Espinosa Zarlenga et al. (2022), and (4) CIFAR-10 (Krizhevsky et al., 2009), a classification task with 10 classes and with 143 concepts obtained in an unsupervised manner by Oikarinen et al. (2023). |
| Dataset Splits | Yes | AwA2: The train-validation-test data splits are produced via a random 60%-20%-20% split, and samples are randomly cropped and flipped during training as in CUB. CUB: For this task, and its incomplete version, we use the same train-validation-test splits as in (Koh et al., 2020). |
| Hardware Specification | Yes | We executed all experiments on a shared GPU cluster with four Nvidia Titan Xp GPUs and 40 Intel(R) Xeon(R) E5-2630 v4 CPUs (at 2.20GHz) with 125GB of RAM. |
| Software Dependencies | Yes | Our experiments were run on PyTorch 1.11.0 (Paszke et al., 2019) and facilitated by PyTorch Lightning 1.9.5 (Falcon, 2019). For our plots, we used matplotlib 3.5.1 (Hunter, 2007) and the open-sourced distribution of draw.io. |
| Experiment Setup | Yes | During training, we use the standard categorical cross-entropy loss as Ltask. ...we use a batch size of 64 for all CUB-based tasks... we use a batch size of 512 for all other tasks. Similarly, when possible, we fix the initial learning rate lr to values used by previous works and decay it during training by a factor of 10 if the training loss reaches a plateau after 10 epochs. Specifically, we use lr = 0.01 for all tasks except for CelebA, where we use lr = 0.05... we use a weight decay of 0.000004... All models were trained for a total of E epochs, where E = 150 for all datasets except for CIFAR-10, where it is E = 50. We use early stopping by tracking the validation loss and stopping training if an improvement in validation loss has not been seen after (patience) × (val_freq) epochs, where patience = 5 and val_freq, the frequency at which we evaluate our model on the validation set, is val_freq = 5. |
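The optimization schedule quoted in the Experiment Setup row (initial lr decayed 10× on a 10-epoch training-loss plateau, weight decay 0.000004, early stopping after patience × val_freq epochs without validation improvement) can be sketched in plain PyTorch. This is an illustrative reconstruction, not the authors' code: the model, the placeholder losses, and the loop scaffolding are assumptions, while the hyperparameter values are taken from the paper.

```python
import torch

# Illustrative stand-in model; the paper uses per-dataset backbones.
model = torch.nn.Linear(10, 2)

# lr = 0.01 (0.05 for CelebA) and weight decay 0.000004, as reported.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.000004)

# Decay lr by a factor of 10 when the tracked loss plateaus for 10 epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=10)

# Early stopping: validate every val_freq epochs and stop after `patience`
# consecutive checks without improvement, i.e. patience * val_freq epochs.
patience, val_freq = 5, 5
best_val, bad_checks = float("inf"), 0

for epoch in range(150):  # E = 150 (E = 50 for CIFAR-10)
    train_loss = 1.0  # placeholder: one training epoch would go here
    scheduler.step(train_loss)
    if (epoch + 1) % val_freq == 0:
        val_loss = 1.0  # placeholder: one validation pass would go here
        if val_loss < best_val:
            best_val, bad_checks = val_loss, 0
        else:
            bad_checks += 1
            if bad_checks >= patience:
                break
```

With the constant placeholder losses, the plateau scheduler fires and the early-stopping counter reaches its limit, so the loop exits well before the epoch budget, which is the behavior the paper describes.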