Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning
Authors: Fangwen Wu, Lechao Cheng, Shengeng Tang, Xiaofeng Zhu, Chaowei Fang, Dingwen Zhang, Meng Wang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on commonly used datasets demonstrate the effectiveness of our approach. The source code is available at https://github.com/fwu11/MACIL.git. |
| Researcher Affiliation | Academia | 1Zhejiang Lab, 2Hefei University of Technology, 3Xidian University, 4Northwestern Polytechnical University. Correspondence to: Lechao Cheng <EMAIL>, Dingwen Zhang <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Semantic Drift Calibration |
| Open Source Code | Yes | The source code is available at https://github.com/fwu11/MACIL.git. |
| Open Datasets | Yes | We train and validate our method using four popular CIL datasets. ImageNet-R (Hendrycks et al., 2021a)... CIFAR-100 (Krizhevsky, 2009)... CUB-200 (Wah et al., 2011)... ImageNet-A (Hendrycks et al., 2021b)... |
| Dataset Splits | Yes | We split ImageNet-R into 5, 10, and 20 tasks, with each task containing 40, 20, and 10 classes, respectively. CIFAR-100 (Krizhevsky, 2009) is a widely used dataset in CIL, containing 60,000 images across 100 categories. We also split CIFAR-100 into 5, 10, and 20 tasks with each task containing 20, 10, and 5 classes, respectively. CUB-200 (Wah et al., 2011) is a fine-grained dataset containing approximately 11,788 images of 200 bird species with detailed class labels. ImageNet-A (Hendrycks et al., 2021b) is a real-world dataset consisting of 200 categories, notable for significant class imbalance, with some categories having very few training samples. We split CUB-200 and ImageNet-A into 10 tasks with 20 classes each. ... In line with other studies, our evaluation results are based on three trials with three different seeds. |
| Hardware Specification | No | The paper does not explicitly state the specific hardware (GPU, CPU models, etc.) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'SGD optimizer' and 'Cosine Annealing scheduler', but does not provide specific version numbers for any software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We use the SGD optimizer with the initial learning rate set as 0.01 and we use the Cosine Annealing scheduler. We train the first session for 20 epochs and 10 epochs for later sessions. The batch size is set to 48 for all the experiments. ... The distillation loss weight λ is set to 0.4, the LoRA rank r is set to 32, and the scale s in the angular penalty loss is set to 20. |
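The task splits reported in the Dataset Splits row (e.g., ImageNet-R into 5/10/20 tasks, CUB-200 and ImageNet-A into 10 tasks of 20 classes) follow the standard class-incremental protocol of partitioning class IDs into equal, disjoint groups. A minimal sketch of that partitioning, assuming evenly divisible splits — `make_task_splits` is a hypothetical helper, not from the MACIL codebase:

```python
# Hypothetical helper sketching the class-incremental splits described
# in the reproducibility table (not from the MACIL repository).

def make_task_splits(num_classes: int, num_tasks: int) -> list[list[int]]:
    """Partition class IDs 0..num_classes-1 into equally sized, disjoint tasks."""
    assert num_classes % num_tasks == 0, "classes must divide evenly into tasks"
    per_task = num_classes // num_tasks
    return [list(range(t * per_task, (t + 1) * per_task))
            for t in range(num_tasks)]

# ImageNet-R (200 classes) split into 10 tasks of 20 classes each
imagenet_r_splits = make_task_splits(200, 10)
print(len(imagenet_r_splits), len(imagenet_r_splits[0]))  # 10 20

# CIFAR-100 split into 20 tasks of 5 classes each
cifar_splits = make_task_splits(100, 20)
print(len(cifar_splits), len(cifar_splits[0]))  # 20 5
```

In practice, CIL benchmarks usually shuffle the class order with a fixed seed before partitioning, which matches the paper's note about averaging over three trials with three different seeds.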
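The Experiment Setup row can be collected into a single configuration, with the learning-rate schedule made explicit. The sketch below uses the standard cosine-annealing formula; the hyperparameter values come from the table, but the formula itself is the textbook rule and not necessarily the paper's exact implementation:

```python
import math

# Hyperparameters quoted from the paper's experiment setup.
CONFIG = {
    "optimizer": "SGD",
    "initial_lr": 0.01,
    "epochs_first_session": 20,
    "epochs_later_sessions": 10,
    "batch_size": 48,
    "distill_weight_lambda": 0.4,
    "lora_rank": 32,
    "angular_penalty_scale": 20,
}

def cosine_annealed_lr(epoch: int, total_epochs: int,
                       lr_max: float = CONFIG["initial_lr"],
                       lr_min: float = 0.0) -> float:
    """Standard cosine annealing:
    lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T))."""
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1.0 + math.cos(math.pi * epoch / total_epochs))

# First session: lr starts at 0.01 and decays toward 0 over 20 epochs.
print(cosine_annealed_lr(0, 20))   # 0.01
print(cosine_annealed_lr(10, 20))  # 0.005
print(cosine_annealed_lr(20, 20))  # ~0.0
```

In a PyTorch pipeline this schedule corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR` wrapped around `torch.optim.SGD`; the paper does not report library versions, so exact behavior may differ slightly.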