Adaptive Retention & Correction: Test-Time Training for Continual Learning
Authors: Haoran Chen, Micah Goldblum, Zuxuan Wu, Yu-Gang Jiang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our proposed method can be plugged into virtually any existing continual learning approach without requiring any modifications to its training procedure. Specifically, when integrated with state-of-the-art approaches, ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively. |
| Researcher Affiliation | Academia | 1Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University 2Shanghai Collaborative Innovation Center of Intelligent Visual Computing 3New York University. All authors are affiliated with universities or public research centers, indicating an academic affiliation. |
| Pseudocode | Yes | Algorithm 1 Adaptive Retention & Correction (ARC) |
| Open Source Code | Yes | Code is available at Github Link. |
| Open Datasets | Yes | We conduct our main experiments on two popular benchmark datasets for continual learning: Split CIFAR-100 and Split ImageNet-R. CIFAR-100 is a relatively simple dataset comprising 100 classes. ImageNet-R, on the other hand, includes 200 classes... We also conduct experiments on another challenging benchmark, 5-dataset, which, unlike the first two, is a composite of five distinct image classification datasets: CIFAR-10, MNIST, Fashion-MNIST, SVHN, and notMNIST. |
| Dataset Splits | Yes | We use Inc-n to denote the data split setting, where n denotes the number of classes for each incremental stage. For example, Split CIFAR-100 Inc-5 means that there are 5 classes for each training step, resulting in a total of 20 tasks. For a fair comparison, we split the classes following the order of Sun et al. (2023). |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) were mentioned in the paper text. |
| Software Dependencies | No | The paper mentions using "the repository provided by Sun et al. (2023)" as a toolbox, but does not provide specific software names with version numbers. |
| Experiment Setup | Yes | Implementation details. We use the repository provided by Sun et al. (2023) to test the effectiveness of our method... Finetune, iCaRL, MEMO, and DER are equipped with a memory buffer of 2000 and 4000 for Split CIFAR-100 and Split ImageNet-R, respectively, whereas L2P, DualPrompt, CODA-Prompt and SLCA are memory-free. For all tested methods, we utilize a ViT-B/16 backbone pretrained on ImageNet-21K. We set the training configuration to be the same as provided. |
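The Inc-n split convention described in the table can be sketched in a few lines. This is a hypothetical illustration (the function name `make_inc_n_splits` and its signature are not from the paper); it simply partitions an ordered list of class ids into tasks of n classes each, as in "Split CIFAR-100 Inc-5 → 20 tasks".

```python
def make_inc_n_splits(num_classes, n, class_order=None):
    """Partition class ids into incremental tasks of n classes each (Inc-n).

    class_order: optional fixed ordering of class ids (e.g. the order of
    Sun et al. (2023)); defaults to 0..num_classes-1.
    """
    order = list(class_order) if class_order is not None else list(range(num_classes))
    assert num_classes % n == 0, "Inc-n assumes n divides the class count evenly"
    # Each task receives the next n classes in the chosen order.
    return [order[i:i + n] for i in range(0, num_classes, n)]

# Split CIFAR-100 Inc-5: 100 classes, 5 per stage -> 20 tasks
tasks = make_inc_n_splits(100, 5)
print(len(tasks))  # -> 20
```

The same helper covers the other settings reported in the paper (e.g. Inc-10 on CIFAR-100 yields 10 tasks) by varying `n`.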