Do Concept Bottleneck Models Respect Localities?

Authors: Naveen Janaki Raman, Mateo Espinosa Zarlenga, Juyeon Heo, Mateja Jamnik

TMLR 2025

Reproducibility assessment (variable, result, and supporting evidence from the paper):

Research Type: Experimental
Evidence: "Our contributions are as follows: we (1) construct three metrics to quantify whether models respect localities found in datasets, (2) conduct thorough experiments across a variety of architectural and training choices to characterize when models respect localities, and (3) propose theoretical models to better understand the impact of dataset construction on locality. We construct experiments to assess whether CBMs respect localities in both (a) controlled testbed where we can understand the impact of various factors and (b) real-world datasets where we can understand the performance of concept-based models in practice. We evaluate concept-based models on two non-synthetic datasets: Caltech-UCSD Birds (CUB) (Wah et al., 2011) and Common Objects in Context (COCO) (Lin et al., 2014)."

Researcher Affiliation: Academia
Evidence: Naveen Raman (EMAIL), Carnegie Mellon University; Mateo Espinosa Zarlenga (EMAIL), University of Cambridge; Juyeon Heo (EMAIL), University of Cambridge; Mateja Jamnik (EMAIL), University of Cambridge

Pseudocode: No
Explanation: The paper describes an algorithm within the proof of Theorem 6.1, stating "Our algorithm proceeds as follows:". However, the algorithm is presented in paragraph form using descriptive text and mathematical notation, not as a structured pseudocode or algorithm block with formal steps.

Open Source Code: Yes
Evidence: "We include our code and dataset construction details here: https://github.com/naveenr414/Spurious-Concepts"

Open Datasets: Yes
Evidence: "We evaluate concept-based models on two non-synthetic datasets: Caltech-UCSD Birds (CUB) (Wah et al., 2011) and Common Objects in Context (COCO) (Lin et al., 2014)."

Dataset Splits: No
Evidence: "For the 1- and 2-object datasets, we construct 256 training examples, while for the 4- and 8-object datasets we construct 1024 training examples. For each dataset we vary the fraction of concept combinations from 25% to 100%. For example, 25% corresponds to sampling 25% of the values for c(i) seen in the training dataset and filtering the training dataset to only use tuples (x(i), c(i), y(i)) with corresponding c(i). 100% corresponds to using the original training dataset. We ensure that perturbations do not impact concept labels by creating a set of data points for each concept combination c."
Explanation: While the paper specifies the number of training examples for the synthetic datasets and describes varying the fraction of concept combinations, it does not explicitly provide test or validation splits for any of the datasets used (synthetic, CUB, or COCO).
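The subsampling procedure quoted above (keep only training tuples whose concept vector c(i) falls in a sampled fraction of the observed concept combinations) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the (x, c, y) tuple representation are assumptions.

```python
import random

def filter_by_concept_fraction(dataset, fraction, seed=0):
    """Keep only examples whose concept vector c is among a randomly
    sampled `fraction` of the concept combinations seen in `dataset`.

    `dataset` is a list of (x, c, y) tuples, where c is a hashable
    concept vector (e.g., a tuple of binary concept labels).
    With fraction=1.0 the original training dataset is returned.
    """
    rng = random.Random(seed)
    combos = sorted({c for _, c, _ in dataset})   # unique c seen in training
    k = max(1, round(fraction * len(combos)))     # e.g., 25% of combinations
    kept = set(rng.sample(combos, k))
    return [(x, c, y) for x, c, y in dataset if c in kept]
```

For example, filtering a 4-combination dataset with fraction=0.5 retains the examples belonging to 2 of the 4 concept combinations.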

Hardware Specification: Yes
Evidence: "We run all experiments for three seeds on an NVIDIA TITAN Xp GPU, with the total number of hours ranging from 100 to 200. We run experiments on a Debian Linux platform with 4 CPUs and 1 GPU."

Software Dependencies: No
Explanation: The paper mentions running experiments on a "Debian Linux platform" but does not specify the software libraries, frameworks (e.g., PyTorch, TensorFlow), or version numbers used in the implementation.

Experiment Setup: Yes
Evidence: "We train our synthetic models for 50 epochs, our COCO model for 25 epochs, and our CUB model for 100 epochs, selecting each based on the time at which accuracy and loss reach convergence. For CUB and COCO we select a learning rate of 0.005, while for the synthetic experiments, we select 0.05, selecting this through manual inspection of model performance. We construct a set of concept predictors which vary in depth from 3 to 7 layers. For ℓ2 regularization, we vary the weight decay parameter in {0.0004, 0.004, 0.04}."
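The experiment setup quoted above amounts to a grid over datasets, predictor depths, weight decays, and seeds. The sketch below enumerates that grid under the reported values; the structure and names (CONFIGS, experiment_grid) are illustrative assumptions, not the paper's code.

```python
from itertools import product

# Per-dataset training settings reported in the paper.
CONFIGS = {
    "synthetic": {"epochs": 50, "lr": 0.05},
    "coco":      {"epochs": 25, "lr": 0.005},
    "cub":       {"epochs": 100, "lr": 0.005},
}
DEPTHS = range(3, 8)                   # concept predictors with 3-7 layers
WEIGHT_DECAYS = [0.0004, 0.004, 0.04]  # l2 regularization sweep
SEEDS = [0, 1, 2]                      # three seeds per configuration

def experiment_grid():
    """Yield one run specification per (dataset, depth, weight_decay, seed)."""
    for name, cfg in CONFIGS.items():
        for depth, wd, seed in product(DEPTHS, WEIGHT_DECAYS, SEEDS):
            yield {"dataset": name, "depth": depth,
                   "weight_decay": wd, "seed": seed, **cfg}
```

Enumerating the grid gives 3 datasets x 5 depths x 3 weight decays x 3 seeds = 135 runs, consistent with a sweep in the 100-200 GPU-hour range reported under Hardware Specification.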