Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning

Authors: Riccardo Salami, Pietro Buzzega, Matteo Mosconi, Jacopo Bonato, Luigi Sabetta, Simone Calderara

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4 EXPERIMENTAL STUDY"
Researcher Affiliation | Collaboration | "1 AImage Lab, University of Modena and Reggio Emilia, Modena, Italy EMAIL; 2 Leonardo Labs, Roma, Italy EMAIL"
Pseudocode | Yes | "The pseudocode for LoRM, applied to a generic linear layer, is presented in Algorithm 1."
Open Source Code | Yes | "The code to reproduce our experiments is available at github.com/aimagelab/fed-mammoth."
Open Datasets | Yes | "For in-domain evaluation, we use CIFAR-100 (Krizhevsky et al., 2009), ImageNet-R (Hendrycks et al., 2021a), and ImageNet-A (Hendrycks et al., 2021b); to assess specialization within a single category, we utilize Cars-196 (Krause et al., 2013) and CUB-200 (Wah et al., 2011). Finally, for out-of-domain evaluation, we employ EuroSAT (Helber et al., 2018)... To validate our methodology in a completely different domain, we also include Out-Of-Scope (OOS) (Larson et al., 2019)..."
Dataset Splits | Yes | "CIFAR-100, ImageNet-R, ImageNet-A, CUB-200, and Cars-196 are divided into 10 incremental tasks, while EuroSAT is split into 5 due to its fewer classes. Each task includes an equal share of classes, except for Cars-196, where the final task contains 16 classes. The data is distributed across 10 clients using the commonly adopted distribution-based label imbalance setting (Li et al., 2022; Yurochkin et al., 2019), where partitioning is governed by a Dirichlet distribution parameterized by β. A smaller β value corresponds to a more challenging data distribution. We evaluate all methods across three scenarios per dataset, using β ∈ {0.5, 0.1, 0.05} for CIFAR-100 and ImageNet-R, and β ∈ {1, 0.5, 0.2} for the others to account for fewer examples or classes."
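The distribution-based label imbalance scheme described above (per-class client shares drawn from a Dirichlet(β) distribution, with smaller β yielding more skew) can be sketched as follows. This is a minimal illustration of the standard protocol from Li et al. (2022) and Yurochkin et al. (2019), not code from the fed-mammoth repository; the function name and signature are hypothetical.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, beta=0.5, seed=0):
    """Split sample indices across clients with distribution-based
    label imbalance: for each class, the fraction assigned to each
    client is drawn from a Dirichlet(beta) distribution."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Per-client proportions of class c; smaller beta -> more skew.
        proportions = rng.dirichlet(np.full(num_clients, beta))
        # Turn proportions into cut points over this class's samples.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in zip(client_indices, np.split(idx, cuts)):
            client.extend(part.tolist())
    return client_indices
```

Every sample is assigned to exactly one client; with β = 0.05 most clients end up seeing only a handful of classes, which is what makes the low-β scenarios challenging.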
Hardware Specification | No | No specific hardware details (GPU/CPU models, memory) are mentioned in the paper for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | "Implementation Details. As the backbone for LoRM and all competing approaches, we employ a ViT-B/16 model (Dosovitskiy et al., 2021) pre-trained on ImageNet-21K (Ridnik et al., 2021) on all vision datasets. We set the number of epochs per communication round to 5 for all datasets, and the total number of rounds to 5 for CIFAR-100, ImageNet-R, EuroSAT, and CUB-200. Given the increased difficulty of ImageNet-A and Cars-196, all methods are allowed 10 communication rounds on these datasets. For Out-Of-Scope, we use a pre-trained T5-small (Raffel et al., 2020). For a complete overview of the method-specific hyperparameters, refer to appendix D."
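The reported training protocol can be collected into a single configuration sketch. The values below are taken directly from the quoted implementation details; the dictionary layout and key names are illustrative and do not come from the fed-mammoth codebase.

```python
# Hypothetical config mirroring the paper's reported protocol.
FED_CL_SETUP = {
    "backbone": "ViT-B/16, pre-trained on ImageNet-21K",  # all vision datasets
    "text_backbone": "T5-small",  # used only for Out-Of-Scope (OOS)
    "num_clients": 10,
    "epochs_per_round": 5,
    "communication_rounds": {
        "CIFAR-100": 5,
        "ImageNet-R": 5,
        "EuroSAT": 5,
        "CUB-200": 5,
        # Harder datasets are allowed twice as many rounds.
        "ImageNet-A": 10,
        "Cars-196": 10,
    },
}
```

Method-specific hyperparameters (learning rates, LoRA ranks, etc.) are deferred to appendix D of the paper and are intentionally not guessed at here.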