CLIP-QDA: An Explainable Concept Bottleneck Model

Authors: Rémi Kazmierczak, Eloïse Berthier, Goran Frehse, Gianni Franchi

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical findings show that in instances where the MoG assumption holds, CLIP-QDA achieves accuracy similar to state-of-the-art CBMs. Our explanations compete with existing XAI methods while being faster to compute.
Researcher Affiliation | Academia | Rémi Kazmierczak (EMAIL), Eloïse Berthier (EMAIL), Goran Frehse (EMAIL) and Gianni Franchi (EMAIL), all with the Unité d'Informatique et d'Ingénierie des Systèmes, ENSTA Paris, Institut Polytechnique de Paris.
Pseudocode | No | The paper describes methods and mathematical formulations, including Proposition 1 and its proof in Appendix A.5, but does not present any structured pseudocode or algorithm blocks with explicit steps.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. It refers to third-party tools such as pytorch-grad-cam (Gildenblat & contributors, 2021) and the official authors' repositories for LIME and SHAP, but not to its own implementation.
Open Datasets | Yes | We evaluate our methods on ImageNet (Deng et al., 2009), PASCAL-Part (Donadello & Serafini, 2016), the MIT Indoor Scenes dataset (Quattoni & Torralba, 2009), MonuMAI (Lamas et al., 2021) and Flowers102 (Nilsback & Zisserman, 2008). In addition to these well-established datasets, we introduce a custom dataset named the Cats/Dogs/Cars dataset (Section A.2). To construct this dataset, we concatenated two widely recognized datasets, namely the Kaggle Cats and Dogs Dataset (Cukierski, 2013) and the Stanford Cars Dataset (Krause et al., 2013).
Dataset Splits | No | The paper mentions evaluating methods on 'the test set' and describes various compositions of the Cats/Dogs/Cars dataset (complete dataset, biased setup, unbiased setup) in Table 1. However, it does not specify explicit training/validation/test split percentages, sample counts for each split, or references to standard splits for any of the datasets used (ImageNet, PASCAL-Part, MIT Indoor Scenes, MonuMAI, Flowers102, Cats/Dogs/Cars).
Hardware Specification | No | This work was performed using HPC resources from GENCI-IDRIS (Grant 2023 AD011014675).
Software Dependencies | No | To compute Grad-CAM explanations, we applied the method to the 21st block of the image encoder in CLIP-QDA, using the PyTorch package pytorch-grad-cam (Gildenblat & contributors, 2021). All of our samples are generated using the default parameters for image-level explanations, i.e. an exponential kernel of width 0.25 on an image segmented using quickshift clustering.
Experiment Setup | Yes | LaBo: For LaBo, we train a LaBo classifier using the Adam optimizer with a learning rate of 0.5, a weight decay of 0, and a batch size of 8192. The resulting explanations follow the resulting weight matrix. Yan et al.: For the Yan et al. method, we train a linear classifier using the Adam optimizer with a learning rate of 5×10^-3, a weight decay of 10^-4, and a batch size of 512. The sample-wise (concept-level) explanation results from the product between the concept score and its associated weight. ResNet: To train the ResNet classifier on PASCAL-Part, MIT scenes and MonuMAI, we initialized the network with ImageNet pretraining. Then, we trained the network using the Adam optimizer with a learning rate of 10^-3 for the probe, 10^-4 for the backbone, a batch size of 64, a momentum of 0.9 and a weight decay of 10^-4. ViT: To train the ViT classifier on PASCAL-Part, MIT scenes and MonuMAI, we initialized the network with ImageNet pretraining. Then, we trained the network using the Adam optimizer with a learning rate of 10^-2 for the probe, a frozen backbone, a batch size of 128, a momentum of 0.9, and no weight decay.
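As the table notes, CLIP-QDA rests on a mixture-of-Gaussians (MoG) assumption over CLIP concept scores and classifies with quadratic discriminant analysis. The sketch below shows only the generic, textbook QDA decision rule, not the authors' implementation; the function names are invented for illustration and NumPy is assumed to be available.

```python
import numpy as np

def fit_qda(X, y):
    """Fit per-class prior, mean and covariance (standard QDA, hypothetical helper)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (
            len(Xc) / len(X),                                      # prior p(y=c)
            Xc.mean(axis=0),                                       # class mean mu_c
            np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1]),  # regularized Sigma_c
        )
    return params

def qda_score(x, prior, mu, cov):
    """Log-posterior up to a constant: log p(c) - 1/2 log|Sigma_c| - 1/2 (x-mu)^T Sigma_c^-1 (x-mu)."""
    d = x - mu
    return (np.log(prior)
            - 0.5 * np.linalg.slogdet(cov)[1]
            - 0.5 * d @ np.linalg.solve(cov, d))

def qda_predict(x, params):
    """Pick the class with the highest quadratic discriminant score."""
    return max(params, key=lambda c: qda_score(x, *params[c]))
```

Because each class keeps its own covariance matrix, the decision boundaries are quadratic rather than linear, which is what distinguishes QDA from LDA.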
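The baseline training setups quoted in the Experiment Setup row can be collected into a single configuration table. The sketch below merely transcribes the stated hyperparameters into a Python dict for side-by-side comparison; the key names and structure are my own, not the authors'.

```python
# Transcription of the quoted baseline training setups (key names are hypothetical).
TRAIN_CONFIGS = {
    "LaBo": {
        "optimizer": "Adam", "lr": 0.5, "weight_decay": 0.0, "batch_size": 8192,
    },
    "Yan et al.": {
        "optimizer": "Adam", "lr": 5e-3, "weight_decay": 1e-4, "batch_size": 512,
    },
    "ResNet": {
        "optimizer": "Adam", "lr_probe": 1e-3, "lr_backbone": 1e-4,
        "batch_size": 64, "momentum": 0.9, "weight_decay": 1e-4,
        "init": "ImageNet pretraining",
    },
    "ViT": {
        "optimizer": "Adam", "lr_probe": 1e-2, "backbone": "frozen",
        "batch_size": 128, "momentum": 0.9, "weight_decay": 0.0,
        "init": "ImageNet pretraining",
    },
}
```

Note that all four setups reportedly use Adam, so the quoted "momentum" values for ResNet and ViT are transcribed verbatim even though Adam parametrizes its moments via betas rather than a single momentum term.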