MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning

Authors: Hai-Long Sun, Da-Wei Zhou, Hanbin Zhao, Le Gan, De-Chuan Zhan, Han-Jia Ye

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on seven benchmark datasets validate MOS's state-of-the-art performance.
Researcher Affiliation Academia 1School of Artificial Intelligence, Nanjing University; 2National Key Laboratory for Novel Software Technology, Nanjing University; 3College of Computer Science and Technology, Zhejiang University. EMAIL, EMAIL, EMAIL
Pseudocode No The paper states: "We summarize the inference function with pseudo-code in the last part." However, no explicit pseudocode block or algorithm figure is provided. The inference function is described narratively and with equations, but not in a structured pseudocode format.
Open Source Code Yes Code https://github.com/sun-hailong/AAAI25-MOS
Open Datasets Yes Dataset: Since PTMs possess extensive knowledge regarding upstream tasks, we follow (Zhou et al. 2024a) to evaluate the performance on CIFAR100 (Krizhevsky, Hinton et al. 2009), CUB200 (Wah et al. 2011), ImageNet-R (Hendrycks et al. 2021a), ImageNet-A (Hendrycks et al. 2021b), ObjectNet (Barbu et al. 2019), Omnibenchmark (Zhang et al. 2022), and VTAB (Zhai et al. 2019).
Dataset Splits Yes Dataset split: Following the benchmark setting (Rebuffi et al. 2017; Wang et al. 2022c), we utilize the notation B-m Inc-n to represent class splits, where m indicates the number of classes in the initial task, and n denotes the number of classes in each subsequent incremental task. m = 0 means the total classes are equally divided into each task. For a consistent and fair comparison, we randomly shuffle class orders using a random seed of 1993 before splitting the data. We ensure consistency in the training and testing sets across all methods, following (Zhou et al. 2024a).
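The B-m Inc-n splitting protocol described above can be sketched in a few lines. This is a minimal, library-free illustration, not the paper's actual code; the function name `make_splits` is an assumption, and only the seed (1993) and the split semantics come from the quoted text.

```python
import random

def make_splits(num_classes, m, n, seed=1993):
    """Split a randomly shuffled class order into B-m Inc-n tasks.

    m: number of classes in the initial task; m == 0 means all tasks
       (including the first) contain n classes each.
    n: number of classes in each subsequent incremental task.
    """
    order = list(range(num_classes))
    # Shuffle the class order with a fixed seed, as in the benchmark setting.
    random.Random(seed).shuffle(order)
    first = n if m == 0 else m
    tasks = [order[:first]]
    start = first
    while start < num_classes:
        tasks.append(order[start:start + n])
        start += n
    return tasks
```

For example, `make_splits(100, 0, 10)` yields ten tasks of ten classes each, while `make_splits(100, 50, 10)` yields a 50-class initial task followed by five 10-class increments.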
Hardware Specification Yes We use PyTorch (Paszke et al. 2019) and PILOT (Sun et al. 2023) to implement all models on NVIDIA RTX 4090 with the same network backbone.
Software Dependencies No We use PyTorch (Paszke et al. 2019) and PILOT (Sun et al. 2023) to implement all models on NVIDIA RTX 4090 with the same network backbone. While PyTorch and PILOT are named, no version number is given for either library or for any other dependency.
Experiment Setup Yes In MOS, we set the batch size to 48 and train for 20 epochs using the SGD optimizer with momentum. The learning rate is initially set to 0.01 and follows a cosine annealing decay pattern. The projection dimension r in the adapter is set to 16, and the EMA factor parameter α is set to 0.1.
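The reported schedule (initial learning rate 0.01, cosine annealing over 20 epochs) and the EMA factor α = 0.1 can be expressed compactly. This is a minimal, library-free sketch under stated assumptions: the function names are illustrative, and the direction of the EMA merge (α weighting the new parameters) is an assumption, since the quoted text does not specify it.

```python
import math

def cosine_lr(epoch, total_epochs=20, base_lr=0.01):
    """Cosine-annealed learning rate over the reported 20-epoch schedule."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

def ema_update(old_params, new_params, alpha=0.1):
    """Parameter-wise EMA merge with factor alpha (assumed to weight the
    new parameters). Dicts of name -> float stand in for tensors."""
    return {k: (1.0 - alpha) * old_params[k] + alpha * new_params[k]
            for k in old_params}
```

Here `cosine_lr(0)` returns the initial rate 0.01 and decays smoothly to 0 at epoch 20, matching the described cosine annealing pattern.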