MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning
Authors: Hai-Long Sun, Da-Wei Zhou, Hanbin Zhao, Le Gan, De-Chuan Zhan, Han-Jia Ye
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on seven benchmark datasets validate MOS's state-of-the-art performance. |
| Researcher Affiliation | Academia | ¹School of Artificial Intelligence, Nanjing University; ²National Key Laboratory for Novel Software Technology, Nanjing University; ³College of Computer Science and Technology, Zhejiang University. EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper states: "We summarize the inference function with pseudo-code in the last part." However, no explicit pseudocode block or algorithm figure is provided. The inference function is described narratively and with equations, but not in a structured pseudocode format. |
| Open Source Code | Yes | Code https://github.com/sun-hailong/AAAI25-MOS |
| Open Datasets | Yes | Dataset: Since PTMs possess extensive knowledge regarding upstream tasks, we follow (Zhou et al. 2024a) to evaluate the performance on CIFAR100 (Krizhevsky, Hinton et al. 2009), CUB200 (Wah et al. 2011), ImageNet-R (Hendrycks et al. 2021a), ImageNet-A (Hendrycks et al. 2021b), ObjectNet (Barbu et al. 2019), Omnibenchmark (Zhang et al. 2022), and VTAB (Zhai et al. 2019). |
| Dataset Splits | Yes | Dataset split: Following the benchmark setting (Rebuffi et al. 2017; Wang et al. 2022c), we utilize the notation B-m Inc-n to represent class splits, where m indicates the number of classes in the initial task, and n denotes the number of classes in each subsequent incremental task. m = 0 means the total classes are equally divided into each task. For a consistent and fair comparison, we randomly shuffle class orders using a random seed of 1993 before splitting the data. We ensure consistency in the training and testing sets across all methods, following (Zhou et al. 2024a). |
| Hardware Specification | Yes | We use PyTorch (Paszke et al. 2019) and PILOT (Sun et al. 2023) to implement all models on NVIDIA RTX 4090 with the same network backbone. |
| Software Dependencies | No | We use PyTorch (Paszke et al. 2019) and PILOT (Sun et al. 2023) to implement all models on NVIDIA RTX 4090 with the same network backbone. While PyTorch is mentioned, no specific version number for PyTorch or any other library is provided in the text. |
| Experiment Setup | Yes | In MOS, we set the batch size to 48 and train for 20 epochs using the SGD optimizer with momentum. The learning rate is initially set to 0.01 and follows a cosine annealing decay pattern. The projection dimension r in the adapter is set to 16, and the EMA factor parameter α is set to 0.1. |
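The B-m Inc-n split protocol quoted in the Dataset Splits row can be sketched as below. `split_classes` is a hypothetical helper written for illustration: the seed value 1993 comes from the paper, but the use of Python's `random.Random` to perform the shuffle is an assumption about the implementation.

```python
import random

def split_classes(num_classes, m, n, seed=1993):
    """Split a seeded shuffle of class IDs into B-m Inc-n incremental tasks.

    m is the size of the initial task; n is the size of each subsequent task.
    m = 0 means all tasks (including the first) contain n classes each.
    """
    order = list(range(num_classes))
    random.Random(seed).shuffle(order)  # fixed seed -> identical order across methods
    first = m if m > 0 else n
    tasks = [order[:first]]
    for start in range(first, num_classes, n):
        tasks.append(order[start:start + n])
    return tasks

# Example: CIFAR100 under B-0 Inc-10 -> 10 tasks of 10 classes each.
tasks = split_classes(100, m=0, n=10)
```

Fixing the class order with a shared seed before splitting is what makes the training/testing sets identical across all compared methods.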
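The cosine annealing decay named in the Experiment Setup row can be written out explicitly. This is a minimal sketch, assuming the rate anneals from the stated initial value 0.01 down to zero over the 20 training epochs; in PyTorch this corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR` attached to the SGD optimizer.

```python
import math

def cosine_lr(epoch, total_epochs=20, base_lr=0.01, min_lr=0.0):
    """Cosine annealing: base_lr at epoch 0, decaying smoothly to min_lr
    at total_epochs. Assumes min_lr = 0, which the paper does not state."""
    return min_lr + 0.5 * (base_lr - min_lr) * (
        1 + math.cos(math.pi * epoch / total_epochs)
    )
```

The schedule starts at the full rate, falls slowly at first, fastest mid-training, and flattens out near the end, which is why it is a common default for fine-tuning pre-trained backbones.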