Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling
Authors: Minhyuk Seo, Hyunseo Koh, Jonghyun Choi
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical validations on the CIFAR-10/100, CLEAR-10/100, and ImageNet-1K datasets demonstrate that the proposed approach outperforms the state-of-the-art methods within the same total budget. Furthermore, we validate its effectiveness in the Multi-modal Concept Incremental Learning setup with multimodal large language models, such as LLaVA-1.5-7B. |
| Researcher Affiliation | Academia | Minhyuk Seo, Hyunseo Koh, Jonghyun Choi; Seoul National University; EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | illustrating the gradient update procedure of the proposed aL-SAR in Fig. 2 and providing a pseudocode in Sec. A.3. ... Algorithm 1 provides a comprehensive pseudocode for the aL-SAR method. |
| Open Source Code | Yes | Code is available at https://github.com/snumprlab/budgeted-cl. ... To further facilitate the reproduction, we provide open-source implementations of our proposed method (Sec.3), along with data splits and baseline models used in our experiments (Sec.4), available at https://github.com/snumprlab/budgeted-cl. |
| Open Datasets | Yes | Empirical validations on the CIFAR-10/100, CLEAR-10/100, and ImageNet-1K datasets demonstrate that the proposed approach outperforms the state-of-the-art methods within the same total budget. ... For the dataset, we use CIFAR-10/100, CLEAR-10/100, and ImageNet-1K. ... We evaluate the Bongard-HOI (Jiang et al., 2022) and Bongard-OpenWorld (Wu et al., 2024) benchmarks. |
| Dataset Splits | Yes | We evaluate the methods in a conventional disjoint task setup and a newly proposed Gaussian task setup (Shanahan et al., 2021; Wang et al., 2022b; Koh et al., 2023). ... We split the concepts into 5 disjoint tasks for the MCIL setup. |
| Hardware Specification | Yes | We use the LLaVA-1.5-7B model and train it on NVIDIA A100 80GB GPUs. |
| Software Dependencies | No | To measure forward/backward FLOPs of the model, we use ptflops, which is a widely used Python library to calculate FLOPs. FLOPs from other operations were manually calculated. |
| Experiment Setup | Yes | We set the training hyperparameters as follows (Prabhu et al., 2020; Bang et al., 2021; Koh et al., 2022). For CIFAR-10, CIFAR-100, CLEAR-10, and ImageNet, we use batch sizes of 16, 16, 16, and 256, respectively, and the Adam optimizer with an LR of 0.0003 for all datasets and setups. To calculate AAUC, we use an evaluation period of 100 samples for CIFAR-10/100 and CLEAR-10/100, and 8000 samples for ImageNet-1K. For data augmentation, we apply RandAugment (Cubuk et al., 2020) to all CL methods. For hyperparameters, we set all the EMA ratios required for aL-SAR to 0.01 for all datasets. For the values of k and T used in memory retrieval, we use k = 4 and T = 0.125 for all experiments. |
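The Software Dependencies row notes that model FLOPs were measured with ptflops while FLOPs from other operations were computed by hand. As an illustration of such a manual count (not code from the paper; the function name and interface are hypothetical), a convolution's forward cost is conventionally tallied as one multiply and one add per kernel element per output element:

```python
def conv2d_flops(c_in, c_out, kernel, h_out, w_out):
    """Conventional FLOP count for one Conv2d forward pass (no bias).

    Each of the c_out * h_out * w_out output elements requires
    kernel * kernel * c_in multiply-accumulate (MAC) operations.
    """
    macs = c_in * kernel * kernel * c_out * h_out * w_out
    return 2 * macs  # 1 MAC = 1 multiply + 1 add = 2 FLOPs

# Example: a 3x3 conv from 3 to 64 channels producing a 32x32 feature map
print(conv2d_flops(3, 64, 3, 32, 32))  # 3538944
```

Libraries such as ptflops apply the same per-layer accounting automatically across a whole model; hand counts like this cover operations such tools miss.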
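The Experiment Setup row mentions memory retrieval governed by k = 4 and a temperature T = 0.125. A common way to realize such retrieval is to draw k samples with probabilities given by a temperature-scaled softmax over per-sample scores; the sketch below illustrates only that generic mechanism under assumed names (`sample_indices`, `scores`), not the paper's actual retrieval statistics:

```python
import math
import random

def sample_indices(scores, k=4, temperature=0.125, rng=random):
    """Draw k indices with probability softmax(scores / temperature).

    A low temperature (e.g. 0.125) sharpens the distribution, strongly
    favoring high-scoring memory samples; temperature -> infinity
    approaches uniform sampling.
    """
    m = max(s / temperature for s in scores)          # subtract max for
    exps = [math.exp(s / temperature - m) for s in scores]  # numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    return rng.choices(range(len(scores)), weights=weights, k=k)

# Example: 5 memory slots with illustrative scores; indices 1 and 3 dominate
batch = sample_indices([0.2, 0.9, 0.1, 0.7, 0.4], k=4, temperature=0.125)
print(batch)
```

Sampling with replacement via `random.choices` keeps the draw O(n) per batch; a real replay buffer would update the scores online as training statistics change.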