Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models

Authors: Susmit Agrawal, Deepika Vemuri, Sri Siddarth Chakaravarthy P, Vineeth N. Balasubramanian

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Through extensive experimentation, we show that our approach obtains state-of-the-art classification performance compared to other concept-based models, achieving over 2× the classification performance in some cases. We also study the ability of our model to perform interventions on concepts, and show that it can localize visual concepts in input images, providing post-hoc interpretations.
Researcher Affiliation Academia Indian Institute of Technology Hyderabad EMAIL, EMAIL
Pseudocode No The paper describes the methodology using textual descriptions and mathematical formulas, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code Yes Code https://github.com/Susmit-A/MuCIL Appendix https://susmit-a.github.io/misc/appendix.pdf
Open Datasets Yes We perform a comprehensive suite of experiments to study the performance of MuCIL on well-known benchmarks: CIFAR-100, ImageNet-100 (INet-100), and Caltech-UCSD Birds 200 (CUB200).
Dataset Splits No The paper mentions evaluating performance on a 'validation split' and refers to 'dataset details' in the appendix, but it does not provide specific percentages, sample counts, or explicit splitting methodology for the training, validation, and test sets within the main text.
Hardware Specification No The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments.
Software Dependencies No The paper discusses the use of a transformer architecture and models like GPT-3.5, but it does not explicitly list specific software dependencies (e.g., programming languages, libraries, frameworks) along with their version numbers required to reproduce the experiments.
Experiment Setup Yes In the CL setting, we study our performance over 5 and 10 experiences using concept-based methods in conjunction with three well-known CL algorithms: Experience Replay (ER) (Rebuffi et al. 2017), AGEM (Chaudhry et al. 2019), and DER++ (Buzzega et al. 2020), with a replay buffer size of 500 (we study other variations of buffer size in the Appendix). ... We empirically found λ1 = 5 and λ2 = 10 to give the best performance overall in terms of FAA, LA and grounding similarity, with LA = 0.7722 and cosine similarity 0.998.
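The quoted setup pairs concept-based methods with Experience Replay (ER) and a replay buffer of size 500. As an illustration only (not the authors' code; the `ReplayBuffer` class and the reservoir-sampling policy are assumptions about a typical ER buffer), a minimal sketch of a fixed-capacity replay buffer looks like:

```python
import random

class ReplayBuffer:
    """Fixed-capacity buffer with reservoir sampling, a common choice
    for Experience Replay (ER) in continual learning. capacity=500
    mirrors the buffer size reported in the paper."""

    def __init__(self, capacity=500, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0  # total examples observed across all experiences
        self.rng = random.Random(seed)

    def add(self, example):
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Reservoir sampling: each seen example is retained
            # with probability capacity / seen.
            idx = self.rng.randint(0, self.seen)
            if idx < self.capacity:
                self.buffer[idx] = example
        self.seen += 1

    def sample(self, batch_size):
        # Draw a replay minibatch to mix with the current experience.
        return self.rng.sample(self.buffer, min(batch_size, len(self.buffer)))

buf = ReplayBuffer(capacity=500)
for i in range(10_000):
    buf.add(i)
print(len(buf.buffer))  # 500
```

The buffer never exceeds its capacity, and reservoir sampling keeps an approximately uniform sample over everything seen so far, which is why a small buffer (500 here) can still represent earlier experiences.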