Maintaining Fairness in Logit-based Knowledge Distillation for Class-Incremental Learning

Authors: Zijian Gao, Shanhao Han, Xingxing Zhang, Kele Xu, Dulan Zhou, Xinjun Mao, Yong Dou, Huaimin Wang

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We critically reassess the overlooked sub-optimality of vanilla KD through comprehensive empirical evaluations and analyses, revealing the conflict between learning and anti-forgetting caused by the neglected interplay and rigid exact match-based KD.
Researcher Affiliation Academia (1) College of Computer Science and Technology, National University of Defense Technology, Changsha, China. (2) State Key Laboratory of Complex & Critical Software Environment, Changsha, China. (3) School of Computer Science, Tsinghua University, Beijing, China.
Pseudocode Yes Figure 4 illustrates the schematic diagram of our method, and Algorithm 1 provides the pseudo-code of our method in a PyTorch-like style (Paszke et al. 2019).
Open Source Code Yes Code: https://github.com/Zi-Jian-Gao/Maintaining-Fairness-in-LKD-for-CIL
Open Datasets Yes We conducted experiments in both train-from-scratch and train-from-half scenarios on three widely used benchmarks: CIFAR-100 (Krizhevsky and Hinton 2009), ImageNet-Subset (Hou et al. 2019), and TinyImageNet (Le and Yang 2015).
Dataset Splits Yes Specifically, we equally divided the classes into five tasks and continuously evaluated the model's accuracy on tasks T0 and T1 at each iteration. In the train-from-scratch scenario, the model is trained on an equal number of classes in each incremental task, while in the train-from-half scenario, the model is trained on half of all classes in the first task and an equal number of classes in each subsequent task. Following standard CIL practices, we shuffled the class order with a random seed of 1993 (Rebuffi et al. 2017; Zhou et al. 2023).
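The split protocol quoted above (seeded class-order shuffle, then equal task partitions for train-from-scratch) can be sketched as follows; the function name and default sizes are illustrative, not taken from the paper's released code.

```python
import random

def split_classes_into_tasks(num_classes=100, num_tasks=5, seed=1993):
    """Shuffle the class order with a fixed seed (standard CIL practice,
    seed 1993 per the paper), then split it equally across incremental
    tasks, mirroring the train-from-scratch scenario."""
    order = list(range(num_classes))
    random.Random(seed).shuffle(order)  # deterministic class order
    per_task = num_classes // num_tasks
    return [order[i * per_task:(i + 1) * per_task] for i in range(num_tasks)]
```

For the train-from-half scenario, the first partition would instead contain `num_classes // 2` classes, with the remainder divided equally among the later tasks.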
Hardware Specification Yes Our implementation, based on PyTorch (Paszke et al. 2019) and PYCIL (Zhou et al. 2023), ran on an NVIDIA 4090 using ResNet-18 (He et al. 2016) as the model architecture.
Software Dependencies No Our implementation, based on PyTorch (Paszke et al. 2019) and PYCIL (Zhou et al. 2023), ran on an NVIDIA 4090 using ResNet-18 (He et al. 2016) as the model architecture.
Experiment Setup Yes The models were trained with a batch size of 128 using SGD with momentum. For the baseline methods, the KD weight is set to 1. When using only L_inter, the hyperparameters α and β are set to 1 and 0, respectively. To ensure a fair comparison without affecting learning, when both L_inter and L_intra are implemented, α and β are each set to 1/2.
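The weighting scheme described above (α = 1, β = 0 for the inter-term-only variant; α = β = 1/2 when both KD terms are used) can be sketched as a simple weighted sum. The function name and the inclusion of a task loss term are assumptions for illustration, not the paper's exact implementation.

```python
def total_distillation_loss(l_task, l_inter, l_intra, alpha=0.5, beta=0.5):
    """Combine the task loss with the two KD terms using the paper's
    weighting: alpha=1, beta=0 keeps only the inter-class term, while
    alpha=beta=0.5 weights both terms equally for a fair comparison."""
    return l_task + alpha * l_inter + beta * l_intra
```

With scalar loss values this reduces to plain arithmetic, e.g. `total_distillation_loss(1.0, 2.0, 4.0)` yields `4.0` under the default α = β = 1/2.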