Selective Concept Bottleneck Models Without Predefined Concepts

Authors: Simon Schrodi, Julian Schur, Max Argus, Thomas Brox

TMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluated UCBM on diverse image classification tasks and compared it to relevant baselines. We show that UCBMs outperform prior work and narrow the gap to their black-box counterparts, while relying on substantially fewer concepts globally in their classification (Section 3.1). Then, we demonstrate the interpretability qualitatively as well as through a user study (Section 3.2).
Researcher Affiliation Academia Simon Schrodi (EMAIL), University of Freiburg; Julian Schur (EMAIL), University of Freiburg and Karlsruhe Institute of Technology; Max Argus (EMAIL), University of Freiburg; Thomas Brox (EMAIL), University of Freiburg
Pseudocode No The paper describes mathematical formulations for its methods (e.g., Equation 1 for unsupervised concept discovery, Equation 6 for the final interpretable classifier, and Equations 3, 4, 5 for concept selection mechanisms), but does not contain explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured step-by-step procedures in a code-like format.
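Although the paper provides no pseudocode, the general shape of a concept-bottleneck classifier is well established: project backbone features onto discovered concept vectors, then apply a sparse linear map over the resulting concept activations. The sketch below illustrates that generic pattern only; all names are illustrative, and the paper's Equations 1-6 define the exact discovery, selection, and classification steps.

```python
import numpy as np

def concept_bottleneck_predict(features, concept_vectors, weights, bias):
    """Generic concept-bottleneck prediction (illustrative sketch).

    features:        (batch, d)   backbone embeddings
    concept_vectors: (|C|, d)     discovered concept directions
    weights, bias:   sparse linear classifier over concept activations
    """
    # Concept activations: similarity of each embedding to each concept.
    activations = features @ concept_vectors.T          # (batch, |C|)
    # Interpretable prediction: linear map over concept activations.
    logits = activations @ weights.T + bias             # (batch, classes)
    return logits
```

Because the classifier is linear in the concept activations, each prediction decomposes into per-concept contributions, which is what makes such models inspectable.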
Open Source Code Yes Code is available at https://github.com/lmb-freiburg/ucbm.
Open Datasets Yes The CBMs are evaluated on ImageNet (Deng et al., 2009) with a pretrained ResNet-50 v2 (He et al., 2016), CUB (Wah et al., 2011) with a ResNet-18 pretrained on CUB, and Places-365 (Zhou et al., 2017) with a ResNet-18 pretrained on Places-365.
Dataset Splits Yes We report top-1 accuracy on the standard holdout sets throughout our experiments.
Hardware Specification Yes All models were trained on a single NVIDIA RTX 2080 GPU, and a full training run took from a few minutes to a maximum of 1-2 days, depending on dataset size and number of concepts |C|.
Software Dependencies No We trained our UCBMs with Adam (Kingma & Ba, 2015) and cosine annealing learning rate scheduling (Loshchilov & Hutter, 2017) for 20 epochs. Models are provided at https://github.com/pytorch/vision (ImageNet), https://github.com/osmr/imgclsmob (CUB), and https://github.com/Trustworthy-ML-Lab/Label-free-CBM (Places-365). While PyTorch is mentioned for external models, its version specific to the authors' implementation is not provided, nor are versions for other libraries.
Experiment Setup Yes We trained our UCBMs with Adam (Kingma & Ba, 2015) and cosine annealing learning rate scheduling (Loshchilov & Hutter, 2017) for 20 epochs. We used a learning rate of 0.001 on ImageNet and Places-365, and 0.01 on CUB; except for the JumpReLU, for which we set it to 0.08 on CUB. We set α = 0.99 for the elastic net regularization for all variants. We tuned the other hyperparameters (λπ or k, λw, and dropout rate) to yield a good trade-off between performance, sparsity, and fair comparability. Refer to Appendix D for the hyperparameters and to Figure 6 and Appendix G for their effect.
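The two named ingredients of this setup have standard closed forms, sketched below: cosine annealing (Loshchilov & Hutter, 2017) decays the learning rate from its initial value to a minimum over the schedule, and the elastic net combines L1 and L2 penalties with a mixing weight α. This is a minimal sketch assuming the common parameterization α·||w||₁ + (1−α)·½·||w||₂²; the paper's exact weighting of the two terms may differ.

```python
import math
import numpy as np

def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine annealing: lr_max at step 0, decaying to lr_min at total_steps."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))

def elastic_net_penalty(w, alpha=0.99):
    """Elastic-net regularizer (assumed form): alpha * L1 + (1 - alpha) * 0.5 * L2^2.

    With alpha = 0.99, as in the paper, the penalty is dominated by the
    L1 term, pushing most concept weights to exactly zero (sparsity).
    """
    return alpha * np.abs(w).sum() + (1 - alpha) * 0.5 * (w ** 2).sum()
```

For example, with `lr_max = 0.001` and a 20-epoch schedule, the learning rate starts at 0.001 and reaches `lr_min` at the final epoch; in practice one would use PyTorch's built-in `torch.optim.lr_scheduler.CosineAnnealingLR` rather than this hand-rolled version.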