Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
STAR: Stability-Inducing Weight Perturbation for Continual Learning
Authors: Masih Eskandar, Tooba Imtiaz, Davin Hill, Zifeng Wang, Jennifer Dy
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that STAR consistently improves performance of existing methods by up to 15% across varying baselines, and achieves superior or competitive accuracy to that of state-of-the-art methods aimed at improving rehearsal-based continual learning. Our implementation is available at https://github.com/Gnomy17/STAR_CL. |
| Researcher Affiliation | Collaboration | Masih Eskandar¹, Tooba Imtiaz¹, Davin Hill¹, Zifeng Wang², Jennifer Dy¹ — ¹Department of Electrical & Computer Engineering, Northeastern University; ²Google Cloud AI Research. Correspondence to EMAIL |
| Pseudocode | Yes | A detailed pseudocode of our training algorithm can be found in Algorithm 1. |
| Open Source Code | Yes | Our implementation is available at https://github.com/Gnomy17/STAR_CL. |
| Open Datasets | Yes | We evaluate STAR on three mainstream CL benchmark datasets. Split-CIFAR10 and Split-CIFAR100 are the CIFAR10/100 datasets (Krizhevsky et al., 2009), split into 5 disjoint tasks of 2 and 20 classes respectively. Split-mini Imagenet is a subsampled version of the Imagenet dataset (Deng et al., 2009), split into 20 disjoint tasks of 5 classes each. |
| Dataset Splits | Yes | Split-CIFAR10 and Split-CIFAR100 are the CIFAR10/100 datasets (Krizhevsky et al., 2009), split into 5 disjoint tasks of 2 and 20 classes respectively. Split-mini Imagenet is a subsampled version of the Imagenet dataset (Deng et al., 2009), split into 20 disjoint tasks of 5 classes each. [...] For each task i, we measure LFG(θ_{t_i}, θ_t) for t > t_i on the test set for task i. |
| Hardware Specification | Yes | All experiments were run on a single Nvidia RTX A6000 GPU. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for its dependencies. It mentions using model architectures like ResNet18 and EfficientNet b2, but no specific software environment details with versions. |
| Experiment Setup | Yes | For CIFAR10-100 we use a batch size of 32 and 50 epochs per task. For mini Imagenet we use a batch size of 128 and 80 epochs per task. For a full list of hyperparameters, see appendix F. [...] We present our hyperparameters in table 8. |