Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning

Authors: Fangwen Wu, Lechao Cheng, Shengeng Tang, Xiaofeng Zhu, Chaowei Fang, Dingwen Zhang, Meng Wang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments on commonly used datasets demonstrate the effectiveness of our approach. The source code is available at https://github.com/fwu11/MACIL.git.
Researcher Affiliation | Academia | 1Zhejiang Lab, 2Hefei University of Technology, 3Xidian University, 4Northwestern Polytechnical University. Correspondence to: Lechao Cheng <EMAIL>, Dingwen Zhang <EMAIL>.
Pseudocode | Yes | Algorithm 1 Semantic Drift Calibration
Open Source Code | Yes | The source code is available at https://github.com/fwu11/MACIL.git.
Open Datasets | Yes | We train and validate our method using four popular CIL datasets: ImageNet-R (Hendrycks et al., 2021a)... CIFAR-100 (Krizhevsky, 2009)... CUB-200 (Wah et al., 2011)... ImageNet-A (Hendrycks et al., 2021b)...
Dataset Splits | Yes | We split ImageNet-R into 5, 10, and 20 tasks, with each task containing 40, 20, and 10 classes, respectively. CIFAR-100 (Krizhevsky, 2009) is a widely used dataset in CIL, containing 60,000 images across 100 categories. We also split CIFAR-100 into 5, 10, and 20 tasks, with each task containing 20, 10, and 5 classes, respectively. CUB-200 (Wah et al., 2011) is a fine-grained dataset containing approximately 11,788 images of 200 bird species with detailed class labels. ImageNet-A (Hendrycks et al., 2021b) is a real-world dataset consisting of 200 categories, notable for significant class imbalance, with some categories having very few training samples. We split CUB-200 and ImageNet-A into 10 tasks with 20 classes each. ... In line with other studies, our evaluation results are based on three trials with three different seeds.
Hardware Specification | No | The paper does not explicitly state the specific hardware (GPU, CPU models, etc.) used for running its experiments.
Software Dependencies | No | The paper mentions using an 'SGD optimizer' and a 'Cosine Annealing scheduler', but does not provide specific version numbers for any software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | We use the SGD optimizer with the initial learning rate set to 0.01, and we use the Cosine Annealing scheduler. We train the first session for 20 epochs and later sessions for 10 epochs. The batch size is set to 48 for all the experiments. ... The distillation loss weight λ is set to 0.4, the LoRA rank r is set to 32, and the scale s in the angular penalty loss is set to 20.
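The task splits quoted above (5/10/20 tasks over 200 or 100 classes) can be sketched as a simple partition of class IDs. This is an illustrative reconstruction, not the paper's code: the released MACIL repository may shuffle class order per seed, whereas this sketch assigns consecutive IDs.

```python
# Hedged sketch of the class-incremental task splits described in the
# "Dataset Splits" row. Assumes class IDs 0..N-1 are partitioned evenly
# into consecutive per-task groups; the actual split in the authors'
# code may permute classes depending on the random seed.

def split_classes(num_classes, num_tasks):
    """Partition class IDs into equal, consecutive per-task groups."""
    assert num_classes % num_tasks == 0, "splits in the paper divide evenly"
    per_task = num_classes // num_tasks
    return [list(range(t * per_task, (t + 1) * per_task))
            for t in range(num_tasks)]

# ImageNet-R (200 classes): 5/10/20 tasks -> 40/20/10 classes per task
print(len(split_classes(200, 5)[0]))   # -> 40
# CIFAR-100 (100 classes): 5/10/20 tasks -> 20/10/5 classes per task
print(len(split_classes(100, 20)[0]))  # -> 5
```

The same helper covers CUB-200 and ImageNet-A (10 tasks of 20 classes each), since both also divide evenly.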
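The optimization setup in the "Experiment Setup" row (SGD at initial learning rate 0.01 decayed by cosine annealing, 20 epochs for the first session and 10 thereafter, batch size 48) can be illustrated with the standard cosine-annealing rule. This is a minimal sketch under the assumption that the paper uses the usual schedule with a minimum learning rate of zero (as in PyTorch's `CosineAnnealingLR` default); the helper name is ours, not the authors'.

```python
import math

# Hedged sketch of the reported schedule: SGD starting at lr 0.01,
# decayed by cosine annealing over the session. Epoch counts and batch
# size follow the quoted setup; lr_min=0 is an assumption.

INITIAL_LR = 0.01
BATCH_SIZE = 48
EPOCHS = {"first_session": 20, "later_sessions": 10}

def cosine_annealed_lr(epoch, total_epochs, lr_max=INITIAL_LR, lr_min=0.0):
    """Standard cosine-annealed learning rate at a given epoch."""
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1 + math.cos(math.pi * epoch / total_epochs))

print(cosine_annealed_lr(0, EPOCHS["first_session"]))   # starts at 0.01
print(cosine_annealed_lr(20, EPOCHS["first_session"]))  # decays to ~0
```

The remaining hyperparameters quoted from the paper (distillation weight λ = 0.4, LoRA rank r = 32, angular-penalty scale s = 20) are model-side settings and do not enter this schedule.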