MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning
Authors: Hai-Long Sun, Da-Wei Zhou, Hanbin Zhao, Le Gan, De-Chuan Zhan, Han-Jia Ye
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on seven benchmark datasets validate MOS's state-of-the-art performance. |
| Researcher Affiliation | Academia | ¹School of Artificial Intelligence, Nanjing University; ²National Key Laboratory for Novel Software Technology, Nanjing University; ³College of Computer Science and Technology, Zhejiang University. EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper states: "We summarize the inference function with pseudo-code in the last part." However, no explicit pseudocode block or algorithm figure is provided. The inference function is described narratively and with equations, but not in a structured pseudocode format. |
| Open Source Code | Yes | Code https://github.com/sun-hailong/AAAI25-MOS |
| Open Datasets | Yes | Dataset: Since PTMs possess extensive knowledge regarding upstream tasks, we follow (Zhou et al. 2024a) to evaluate the performance on CIFAR100 (Krizhevsky, Hinton et al. 2009), CUB200 (Wah et al. 2011), ImageNet-R (Hendrycks et al. 2021a), ImageNet-A (Hendrycks et al. 2021b), ObjectNet (Barbu et al. 2019), Omnibenchmark (Zhang et al. 2022), and VTAB (Zhai et al. 2019). |
| Dataset Splits | Yes | Dataset split: Following the benchmark setting (Rebuffi et al. 2017; Wang et al. 2022c), we utilize the notation B-m Inc-n to represent class splits, where m indicates the number of classes in the initial task, and n denotes the number of classes in each subsequent incremental task. m = 0 means the total classes are equally divided into each task. For a consistent and fair comparison, we randomly shuffle class orders using a random seed of 1993 before splitting the data. We ensure consistency in the training and testing sets across all methods, following (Zhou et al. 2024a). |
| Hardware Specification | Yes | We use PyTorch (Paszke et al. 2019) and PILOT (Sun et al. 2023) to implement all models on NVIDIA RTX 4090 with the same network backbone. |
| Software Dependencies | No | We use PyTorch (Paszke et al. 2019) and PILOT (Sun et al. 2023) to implement all models on NVIDIA RTX 4090 with the same network backbone. While PyTorch is mentioned, no specific version number for PyTorch or any other library is provided in the text. |
| Experiment Setup | Yes | In MOS, we set the batch size to 48 and train for 20 epochs using the SGD optimizer with momentum. The learning rate is initially set to 0.01 and follows a cosine annealing decay pattern. The projection dimension r in the adapter is set to 16, and the EMA factor parameter α is set to 0.1. |
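The B-m Inc-n split protocol quoted in the Dataset Splits row can be sketched as below. `split_classes` is a hypothetical helper written for illustration: the seed value 1993 comes from the paper, but the use of Python's `random.Random` to perform the shuffle is an assumption about the implementation.

```python
import random

def split_classes(num_classes, m, n, seed=1993):
    """Split a seeded shuffle of class IDs into B-m Inc-n incremental tasks.

    m is the size of the initial task; n is the size of each subsequent task.
    m = 0 means all tasks (including the first) contain n classes each.
    """
    order = list(range(num_classes))
    random.Random(seed).shuffle(order)  # fixed seed -> identical order across methods
    first = m if m > 0 else n
    tasks = [order[:first]]
    for start in range(first, num_classes, n):
        tasks.append(order[start:start + n])
    return tasks

# Example: CIFAR100 under B-0 Inc-10 -> 10 tasks of 10 classes each.
tasks = split_classes(100, m=0, n=10)
```

Fixing the class order with a shared seed before splitting is what makes the training/testing sets identical across all compared methods.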
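The cosine annealing decay named in the Experiment Setup row can be written out explicitly. This is a minimal sketch, assuming the rate anneals from the stated initial value 0.01 down to zero over the 20 training epochs; in PyTorch this corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR` attached to the SGD optimizer.

```python
import math

def cosine_lr(epoch, total_epochs=20, base_lr=0.01, min_lr=0.0):
    """Cosine annealing: base_lr at epoch 0, decaying smoothly to min_lr
    at total_epochs. Assumes min_lr = 0, which the paper does not state."""
    return min_lr + 0.5 * (base_lr - min_lr) * (
        1 + math.cos(math.pi * epoch / total_epochs)
    )
```

The schedule starts at the full rate, falls slowly at first, fastest mid-training, and flattens out near the end, which is why it is a common default for fine-tuning pre-trained backbones.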