BECAME: Bayesian Continual Learning with Adaptive Model Merging

Authors: Mei Li, Yuxiang Lu, Qinyan Dai, Suizhi Huang, Yue Ding, Hongtao Lu

ICML 2025

Reproducibility assessment (each variable below is listed with its result and the supporting evidence from the paper):
Research Type: Experimental
  Evidence: "To validate our approach, we introduce a two-stage framework named BECAME, which synergizes the expertise of gradient projection and adaptive merging. Extensive experiments show that our approach outperforms state-of-the-art CL methods and existing merging strategies. Code is available at https://github.com/limei0818/BECAME. [...] Section 4. Experiments"
Researcher Affiliation: Academia
  Evidence: "Mei Li*¹, Yuxiang Lu*¹, Qinyan Dai¹, Suizhi Huang¹, Yue Ding¹, Hongtao Lu¹ [...] ¹Shanghai Jiao Tong University. Correspondence to: Yue Ding <EMAIL>."
Pseudocode: Yes
  Evidence: "Algorithm 1 Pseudo-codes for BECAME"
Open Source Code: Yes
  Evidence: "Code is available at https://github.com/limei0818/BECAME."
Open Datasets: Yes
  Evidence: "We conduct our experiments on four widely-used benchmarks: 20-Split CIFAR-100 (Krizhevsky et al., 2009), 10-Split CIFAR-100, 25-Split TinyImageNet (Wu et al., 2017), and 20-Split MiniImageNet (Vinyals et al., 2016)."
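The "N-Split" benchmark names refer to partitioning a dataset's label space into N disjoint class-incremental tasks (e.g., 20-Split CIFAR-100 divides 100 classes into 20 tasks of 5 classes each). A minimal sketch of that partition, assuming a simple sequential class ordering (the paper's actual class ordering may differ):

```python
def split_classes(num_classes: int, num_tasks: int) -> list:
    """Partition class ids 0..num_classes-1 into num_tasks disjoint tasks.

    Sequential ordering is an assumption for illustration only; the
    BECAME code may shuffle or otherwise reorder classes first.
    """
    if num_classes % num_tasks != 0:
        raise ValueError("num_tasks must divide num_classes evenly")
    per_task = num_classes // num_tasks
    return [list(range(t * per_task, (t + 1) * per_task))
            for t in range(num_tasks)]

# 20-Split CIFAR-100: 100 classes become 20 tasks of 5 classes each.
tasks = split_classes(100, 20)
print(len(tasks), len(tasks[0]))  # 20 5
```

The same helper covers the other benchmarks in the quote, e.g. `split_classes(100, 10)` for 10-Split CIFAR-100.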
Dataset Splits: Yes
  Evidence: "For GPM-based experiments, the dataset is split into 95% for training and 5% for validation, with no data augmentation applied across all three datasets. In NSCL-based experiments, the entire training dataset is utilized, with data augmentation applied via a random crop with 4-pixel padding and a random horizontal flip. [...] Table 5. Dataset statistics for GPM-based experiments. [...] Table 6. Dataset statistics for NSCL-based experiments."
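The 95%/5% train/validation split quoted for the GPM-based experiments can be sketched as below; the seeding and index bookkeeping are illustrative assumptions, not taken from the released code:

```python
import random

def train_val_split(n_samples: int, val_frac: float = 0.05, seed: int = 0):
    """Hold out val_frac of the sample indices for validation, mirroring
    the 95%/5% split used in the GPM-based experiments (which apply no
    data augmentation)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    n_val = int(n_samples * val_frac)
    return indices[n_val:], indices[:n_val]  # (train, val)

# CIFAR-100 has 50,000 training images.
train_idx, val_idx = train_val_split(50_000)
print(len(train_idx), len(val_idx))  # 47500 2500
```

For the NSCL-based runs, the quoted augmentation (random crop with 4-pixel padding plus random horizontal flip) would typically be expressed with torchvision's `RandomCrop(32, padding=4)` and `RandomHorizontalFlip()` transforms.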
Hardware Specification: Yes
  Evidence: "All experiments are performed on a single NVIDIA GeForce RTX 4080 GPU."
Software Dependencies: No
  Evidence: The paper mentions using the Adam optimizer and EWC for regularization, but does not specify version numbers for these or any other software dependencies (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup: Yes
  Evidence: "Implementation Details. For the first task, the training process is identical to that of the corresponding baselines. For subsequent tasks t ∈ {2, 3, …, T}, our method involves two stages as mentioned above. All experiments are repeated with 5 random seeds, and we report the mean and standard deviation of the results. The hyperparameter configurations in both stages are mostly consistent with those of the baselines, except for subtle adjustments for adapting our method, with details provided in Appendix B.4 to ensure reproducibility. [...] Hyperparameter settings for each GPM-based baseline. Marked values are sourced from the corresponding papers or supplementary materials, while other values are derived from the code."

  lr           0.01   0.01   0.05   0.05   0.1    0.1    0.1    0.1
  lr_min       1e-5   1e-5   5e-5   5e-5   1e-3   1e-3   1e-3   1e-3
  lr patience  6      6      6      6      5      5      5      5
  lr factor    2      2      2      2      3      3      3      3
  n_epochs     200    200    200    200    10     100    10     200
  batch size   64     64     64     64     10     64     10     64
  ϵ₀           0.97   0.97   0.97   0.97   0.985  0.985  0.985  0.98
  ϵ            3e-3   3e-3   3e-3   3e-3   3e-4   3e-4   3e-4   1e-3
  α            10 5 1 3
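The 5-seed protocol quoted under Experiment Setup reduces to a simple aggregation over independent runs. A minimal sketch, where `run_experiment` is a hypothetical stand-in for one full training run and the seed values are assumptions:

```python
import statistics

def aggregate_over_seeds(run_experiment, seeds=(0, 1, 2, 3, 4)):
    """Repeat an experiment across fixed seeds and report the mean and
    (sample) standard deviation, matching the 5-seed protocol above."""
    results = [run_experiment(seed) for seed in seeds]
    return statistics.mean(results), statistics.stdev(results)

def run_experiment(seed: int) -> float:
    # Hypothetical stand-in: a real run would train and evaluate the
    # model under this seed. The returned numbers are placeholders,
    # not results from the paper.
    return 70.0 + 0.1 * seed

mean_acc, std_acc = aggregate_over_seeds(run_experiment)
print(f"{mean_acc:.2f} ± {std_acc:.2f}")  # 70.20 ± 0.16
```

Reporting the sample standard deviation (ddof=1, as `statistics.stdev` does) is the common convention for small numbers of repeats like the five seeds used here.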