Enhancing Performance of Explainable AI Models with Constrained Concept Refinement

Authors: Geyu Liang, Senne Michielssen, Salar Fattahi

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Additionally, we evaluate the practical performance of our proposed framework in generating explainable predictions for image classification tasks across various benchmarks. Compared to existing explainable methods, our approach not only improves prediction accuracy while preserving model interpretability across various large-scale benchmarks but also achieves this with significantly lower computational cost. Empirical evaluation. We conduct experiments on multiple benchmark datasets for image classification tasks to assess the practical effectiveness of our approach (Section 4)."
Researcher Affiliation | Academia | "(1) Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, US. (2) Department of Computer Science, Princeton University, Princeton, NJ, US. Correspondence to: Salar Fattahi <EMAIL>."
Pseudocode | Yes | "Our meta-algorithm for Problem (3), called constrained concept refinement (CCR), is presented in Algorithm 1. Algorithm 1 Constrained Concept Refinement... We formally introduce our algorithm in Algorithm 2. Algorithm 2 CCR for Interpretable Image Classification... In Algorithm 3, we present the pseudo-code for concept dispersion. Algorithm 3 Concept dispersion... Algorithm 4 presents the pseudo-code for the projection and normalization step in Algorithm 2. Algorithm 4 Embedding normalization and projection."
Open Source Code | Yes | "The Python implementation of the algorithm can be found here: github.com/lianggeyuleo/CCR.git."
Open Datasets | Yes | "We demonstrate the practical efficacy of CCR on multiple image classification benchmarks including CIFAR-10/100 (Krizhevsky et al., 2009), ImageNet (Deng et al., 2009), CUB-200 (Wah et al., 2011) and Places365 (Zhou et al., 2017)."
Dataset Splits | Yes | "The evaluation is conducted across five image classification benchmarks: CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), ImageNet (Deng et al., 2009), CUB-200 (Wah et al., 2011), and Places365 (Zhou et al., 2017). For the CIFAR-10/100 and CUB-200 datasets, we tune CLIP-IP-OMP to match the average sparsity level s, also referred to as the explanation length or k, used in CCR. For ImageNet and Places365, we report the best accuracy achieved by CLIP-IP-OMP across all explanation lengths."
Hardware Specification | Yes | "In our computational environment, using an NVIDIA Tesla V100 GPU, CLIP-IP-OMP remains comparably expensive, requiring 33 hours for k = 50. All experiments reported in this section were performed in Python 3.9 on a MacBook Pro (14-inch, 2021) equipped with an Apple M1 Pro chip."
Software Dependencies | No | "All experiments reported in this section were performed in Python 3.9 on a MacBook Pro (14-inch, 2021) equipped with an Apple M1 Pro chip." (Only a programming language version is provided, not specific libraries or solvers with versions.)
Experiment Setup | Yes | "The constraint parameter ρ for CCR is fixed at 0.1 for all experiments. For the results shown in Figure 4, we set d = 10, k = 5, ρ = 0.2, γ = 0.5, and Γ = 1. The first column of Figure 4 illustrates the scenario corresponding to Theorem 3.3, in which only a single input feature x is available. Here, we choose n = 8 and η = 10⁻². In the second column of Figure 4, we apply projected gradient descent to minimize L_m, as defined in Equation (6), under the assumption that D is rank-deficient. Specifically, we set n = 8 < d and η = 10⁻¹."
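The quoted algorithms themselves are not reproduced in this report. As a rough illustration of the constrained-refinement idea described in Algorithms 1 and 4 (a gradient step on the concept embeddings followed by a projection), here is a minimal sketch assuming the constraint keeps each refined embedding within a radius-ρ ball of its initialization; the function names and the exact constraint set are assumptions, not the paper's formulation.

```python
import numpy as np

def project_to_ball(W, W0, rho):
    """Project each row of W back into a ball of radius rho around the
    corresponding row of W0. (Hypothetical constraint set standing in for
    the paper's embedding normalization/projection step, Algorithm 4.)"""
    delta = W - W0
    norms = np.linalg.norm(delta, axis=1, keepdims=True)
    scale = np.minimum(1.0, rho / np.maximum(norms, 1e-12))
    return W0 + delta * scale

def ccr_step(W, W0, grad, eta, rho):
    """One refinement step: gradient descent on the concept embeddings,
    followed by projection back onto the constraint set."""
    return project_to_ball(W - eta * grad, W0, rho)
```

Keeping the refined embeddings close to their initialization is what lets the concepts retain their original interpretation while the fit improves; the projection enforces that trade-off explicitly via ρ.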
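For context on the quoted synthetic setup (n = 8 < d = 10, ρ = 0.2, η = 10⁻¹), the following sketch runs projected gradient descent with a rank-deficient design. The loss L_m of Equation (6) is not reproduced in this report, so a least-squares objective and a synthetic target serve as illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 5, 8            # dimensions from the quoted setup; n = 8 < d
rho, eta = 0.2, 1e-1          # constraint radius and step size from the setup

X = rng.normal(size=(n, d))   # only n < d samples, so the design is rank-deficient
A0 = rng.normal(size=(d, k))  # initial concept matrix (illustrative)
A_true = A0 + 0.5 * rho * rng.normal(size=(d, k))  # hidden target (illustrative)
Y = X @ A_true

A = A0.copy()
for _ in range(500):
    grad = X.T @ (X @ A - Y) / n   # gradient of the stand-in loss (1/2n)||XA - Y||^2
    A = A - eta * grad
    # project each column back into a rho-ball around its initialization
    delta = A - A0
    norms = np.linalg.norm(delta, axis=0, keepdims=True)
    A = A0 + delta * np.minimum(1.0, rho / np.maximum(norms, 1e-12))
```

Because rank(X) ≤ n < d, the unconstrained problem has infinitely many minimizers; the projection step resolves this ambiguity by anchoring the solution near A0, which mirrors the role the constraint plays in the rank-deficient experiment described above.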