Overcoming the Stability Gap in Continual Learning

Authors: Md Yousuf Harun, Christopher Kanan

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In large-scale experiments for both easy and hard CL distributions (e.g., class incremental learning), we demonstrate that our method reduces the stability gap and greatly increases computational efficiency. Our main CIL results are given in Table 1. SGM with rehearsal shows the greatest reduction in the stability gap (S), plasticity gap (P), and continual knowledge gap (CK). It also performs best in other metrics.
Researcher Affiliation | Academia | Md Yousuf Harun (EMAIL), Rochester Institute of Technology; Christopher Kanan (EMAIL), University of Rochester
Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured code-like steps.
Open Source Code | Yes | Code is available at https://yousuf907.github.io/sgmsite
Open Datasets | Yes | For this purpose, we use ImageNet-1K pre-trained models (K = 1000). ImageNet-1K (Russakovsky et al., 2015) has 1.28 million images from 1000 categories... Places365-LT (Liu et al., 2019) is a long-tailed dataset... Places365-Standard (Zhou et al., 2017) has over 1.8 million training images... CUB-200 (Wah et al., 2011) has RGB images of 200 bird species...
Dataset Splits | Yes | ImageNet-1K (Russakovsky et al., 2015) has 1.28 million images from 1000 categories, each with 732-1300 training images and 50 validation images. Places365-LT has 365 classes and 62500 training images with 5 to 4980 images per class. For its test set, we use the Places365-LT validation set from (Liu et al., 2019), which consists of a total of 7300 images with a balanced distribution of 20 images per class. Places365-Standard (Zhou et al., 2017) has over 1.8 million training images from 365 classes... We use the validation set consisting of 100 images per class to test the models. CUB-200 (Wah et al., 2011) has RGB images of 200 bird species with 5994 training images and 5794 test images.
Hardware Specification | Yes | We ran all experiments on the same hardware with a single GPU (NVIDIA RTX A5000).
Software Dependencies | No | The paper mentions using the AdamW optimizer, DeepSpeed, and a OneCycle learning rate scheduler, but does not provide specific version numbers for any software libraries, frameworks, or environments.
Experiment Setup | Yes | For both CIL and IID experiments, we train SGM with rehearsal, vanilla rehearsal, and output layer only using cross-entropy loss for 600 iterations per rehearsal session. During each iteration, the model is updated on 128 samples. All methods use the same ConvNeXt V2 backbone and the AdamW optimizer with a weight decay of 0.05 and initial learning rates of 10^-3 (SGM and vanilla) and 10^-2 (output layer only). The learning rate is reduced in earlier layers by a layer-wise decay factor of 0.9. The joint model (upper bound) is trained for 12500 iterations on all data, i.e., ImageNet-1K and Places365-LT combined, using an initial learning rate of 10^-4 without a scheduler. For all experiments, we set the rank of the LoRA weight matrices to 48. In all cases, all metrics are based on Top-1 accuracy (%).
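The layer-wise decay scheme quoted above can be made concrete with a short sketch. This is a hypothetical illustration (not the authors' released code): it only computes the per-layer learning rates implied by the setup, where each earlier layer's rate is the next layer's rate multiplied by the decay factor of 0.9. The function name and layer count are assumptions for the example.

```python
# Hypothetical sketch of layer-wise learning-rate decay (decay factor 0.9),
# as described in the experiment setup. Not the authors' implementation.

def layerwise_lrs(base_lr, num_layers, decay=0.9):
    """Return per-layer learning rates, output-side layer first.

    The layer closest to the output gets base_lr; each layer one step
    earlier in the network gets its successor's rate times `decay`,
    so earlier layers take smaller update steps.
    """
    return [base_lr * decay**i for i in range(num_layers)]

# Example: initial LR 10^-3 (SGM and vanilla rehearsal) across 4 layer groups.
lrs = layerwise_lrs(1e-3, 4)
# lrs[0] is 1e-3; lrs[1] is 9e-4; lrs[3] is 7.29e-4
```

In practice these rates would be assigned to per-layer parameter groups of an optimizer such as AdamW; the list above is just the schedule itself.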