Enhancing Pre-trained Representation Classifiability can Boost its Interpretability
Authors: Shufan Shen, Zhaobo Qi, Junshu Sun, Qingming Huang, Qi Tian, Shuhui Wang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on different datasets (Krizhevsky et al., 2009; Wah et al., 2011; Russakovsky et al., 2015) and pre-trained representations (He et al., 2016; Dosovitskiy et al., 2021; Liu et al., 2021; 2022) reveal a positive correlation between interpretability and classifiability, i.e., representations with higher classifiability provide more interpretable semantics that can be captured in the interpretations. |
| Researcher Affiliation | Collaboration | Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS; University of Chinese Academy of Sciences; Harbin Institute of Technology, Weihai; Peng Cheng Laboratory; Huawei Inc. |
| Pseudocode | No | The paper provides mathematical formulations and descriptions of procedures in text (e.g., Equation 3, Equation 4, Equation 5, Equation 7) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The paper states "Codes are available at here" (a hyperlink in the original) and notes that source code has been submitted as supplementary materials. |
| Open Datasets | Yes | We compute the IIS of representations on ImageNet-1K (Russakovsky et al., 2015), CUB-200 (Wah et al., 2011), CIFAR-10 (Krizhevsky et al., 2009) and CIFAR-100 (Krizhevsky et al., 2009). These datasets cover diverse tasks. We also conduct experiments on video representations (Carreira & Zisserman, 2017; Feichtenhofer et al., 2019; Tong et al., 2022; Wang et al., 2023; Li et al., 2023b;a) in Appendix A.4, using the Kinetics-400 dataset (Carreira & Zisserman, 2017), as well as UCF-101 (Soomro, 2012), DTD (Cimpoi et al., 2014), and HAM10000 (Tschandl et al., 2018). |
| Dataset Splits | No | The paper mentions training-set sizes for some datasets (e.g., CUB-200 with 5,900 training samples, the CIFAR datasets with 50,000 each, and ImageNet-1K with 1-2 million training images), but it does not specify explicit training/validation/test splits (e.g., percentages or exact counts per split) or state how these splits were handled in the experiments (e.g., standard splits with a citation). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or other computing resource specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions "Torchvision (Paszke et al., 2019)" as providing pre-trained models, but it does not specify any version numbers for Torchvision or other key software components used in the implementation. |
| Experiment Setup | Yes | For the fine-tuning process with IIS maximization as the objective, we utilize the sparsity ratio s = 0.1 and vector number M = 200 to compute the simplified IIS. The pre-trained models undergo fine-tuning for 200 epochs (with the first 20 epochs as warmup), using a batch size of 128. The fine-tuning process incorporates the AdamW optimizer with betas set to (0.9, 0.999), a momentum of 0.9, a cosine decay learning rate scheduler, an initial learning rate of 3e-4, and a weight decay of 0.3. Additional techniques such as label smoothing (0.1) and cutmix (0.2) are also employed. Table A8 lists the hyperparameters for CBM training. |
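The learning-rate schedule quoted in the Experiment Setup row (200 epochs, 20-epoch warmup, cosine decay from an initial rate of 3e-4) can be sketched as a standalone function. This is a minimal sketch: the linear warmup shape and the zero floor learning rate are assumptions, as the paper does not specify them.

```python
import math

# Hyperparameters quoted from the paper's fine-tuning setup.
TOTAL_EPOCHS = 200
WARMUP_EPOCHS = 20
BASE_LR = 3e-4

def lr_at_epoch(epoch: int) -> float:
    """Return the learning rate for a given epoch (0-indexed).

    Assumes linear warmup to BASE_LR over the first WARMUP_EPOCHS,
    then cosine decay from BASE_LR down to 0 over the remaining epochs.
    """
    if epoch < WARMUP_EPOCHS:
        # Linear warmup: ramp from BASE_LR / WARMUP_EPOCHS up to BASE_LR.
        return BASE_LR * (epoch + 1) / WARMUP_EPOCHS
    # Cosine decay over the remaining 180 epochs.
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

In practice the same schedule would typically be driven through a framework scheduler (e.g., a cosine-annealing scheduler attached to an AdamW optimizer), but the closed form above makes the shape of the curve explicit: the rate peaks at 3e-4 at epoch 20 and reaches half that value midway through the decay phase.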