Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning

Authors: Riccardo Salami, Pietro Buzzega, Matteo Mosconi, Jacopo Bonato, Luigi Sabetta, Simone Calderara

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4 EXPERIMENTAL STUDY"
Researcher Affiliation | Collaboration | "1 AImage Lab, University of Modena and Reggio Emilia, Modena, Italy EMAIL; 2 Leonardo Labs, Roma, Italy EMAIL"
Pseudocode | Yes | "The pseudocode for LoRM, applied to a generic linear layer, is presented in Algorithm 1."
Open Source Code | Yes | "The code to reproduce our experiments is available at github.com/aimagelab/fed-mammoth."
Open Datasets | Yes | "For in-domain evaluation, we use CIFAR-100 (Krizhevsky et al., 2009), ImageNet-R (Hendrycks et al., 2021a), and ImageNet-A (Hendrycks et al., 2021b); to assess specialization within a single category, we utilize Cars-196 (Krause et al., 2013) and CUB-200 (Wah et al., 2011). Finally, for out-of-domain evaluation, we employ EuroSAT (Helber et al., 2018)... To validate our methodology in a completely different domain, we also include Out-Of-Scope (OOS) (Larson et al., 2019)..."
Dataset Splits | Yes | "CIFAR-100, ImageNet-R, ImageNet-A, CUB-200, and Cars-196 are divided into 10 incremental tasks, while EuroSAT is split into 5 due to its fewer classes. Each task includes an equal share of classes, except for Cars-196, where the final task contains 16 classes. The data is distributed across 10 clients using the commonly adopted distribution-based label imbalance setting (Li et al., 2022; Yurochkin et al., 2019), where partitioning is governed by a Dirichlet distribution parameterized by β. A smaller β value corresponds to a more challenging data distribution. We evaluate all methods across three scenarios per dataset, using β ∈ {0.5, 0.1, 0.05} for CIFAR-100 and ImageNet-R, and β ∈ {1, 0.5, 0.2} for the others to account for fewer examples or classes."
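The distribution-based label imbalance scheme described above (per-class client shares drawn from a Dirichlet(β) distribution, with smaller β yielding more skew) can be sketched as follows. This is a minimal illustration of the standard protocol from Li et al. (2022) and Yurochkin et al. (2019), not code from the fed-mammoth repository; the function name and signature are hypothetical.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, beta=0.5, seed=0):
    """Split sample indices across clients with distribution-based
    label imbalance: for each class, the fraction assigned to each
    client is drawn from a Dirichlet(beta) distribution."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Per-client proportions of class c; smaller beta -> more skew.
        proportions = rng.dirichlet(np.full(num_clients, beta))
        # Turn proportions into cut points over this class's samples.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in zip(client_indices, np.split(idx, cuts)):
            client.extend(part.tolist())
    return client_indices
```

Every sample is assigned to exactly one client; with β = 0.05 most clients end up seeing only a handful of classes, which is what makes the low-β scenarios challenging.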
Hardware Specification | No | No specific hardware details (GPU/CPU models, memory) are mentioned in the paper for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | "Implementation Details. As the backbone for LoRM and all competing approaches, we employ a ViT-B/16 model (Dosovitskiy et al., 2021) pre-trained on ImageNet-21K (Ridnik et al., 2021) on all vision datasets. We set the number of epochs per communication round to 5 for all datasets, and the total number of rounds to 5 for CIFAR-100, ImageNet-R, EuroSAT, and CUB-200. Given the increased difficulty of ImageNet-A and Cars-196, all methods are allowed 10 communication rounds on these datasets. For Out-Of-Scope, we use a pre-trained T5-small (Raffel et al., 2020). For a complete overview of the method-specific hyperparameters, refer to appendix D."
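The reported training protocol can be collected into a single configuration sketch. The values below are taken directly from the quoted implementation details; the dictionary layout and key names are illustrative and do not come from the fed-mammoth codebase.

```python
# Hypothetical config mirroring the paper's reported protocol.
FED_CL_SETUP = {
    "backbone": "ViT-B/16, pre-trained on ImageNet-21K",  # all vision datasets
    "text_backbone": "T5-small",  # used only for Out-Of-Scope (OOS)
    "num_clients": 10,
    "epochs_per_round": 5,
    "communication_rounds": {
        "CIFAR-100": 5,
        "ImageNet-R": 5,
        "EuroSAT": 5,
        "CUB-200": 5,
        # Harder datasets are allowed twice as many rounds.
        "ImageNet-A": 10,
        "Cars-196": 10,
    },
}
```

Method-specific hyperparameters (learning rates, LoRA ranks, etc.) are deferred to appendix D of the paper and are intentionally not guessed at here.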