Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning
Authors: Riccardo Salami, Pietro Buzzega, Matteo Mosconi, Jacopo Bonato, Luigi Sabetta, Simone Calderara
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTAL STUDY |
| Researcher Affiliation | Collaboration | 1 AImageLab, University of Modena and Reggio Emilia, Modena, Italy; 2 Leonardo Labs, Roma, Italy |
| Pseudocode | Yes | The pseudocode for LoRM, applied to a generic linear layer, is presented in Algorithm 1. |
| Open Source Code | Yes | The code to reproduce our experiments is available at github.com/aimagelab/fed-mammoth. |
| Open Datasets | Yes | For in-domain evaluation, we use CIFAR-100 (Krizhevsky et al., 2009), ImageNet-R (Hendrycks et al., 2021a), and ImageNet-A (Hendrycks et al., 2021b); to assess specialization within a single category, we utilize Cars-196 (Krause et al., 2013) and CUB-200 (Wah et al., 2011). Finally, for out-of-domain evaluation, we employ EuroSAT (Helber et al., 2018)... To validate our methodology in a completely different domain, we also include Out-Of-Scope (OOS) (Larson et al., 2019)... |
| Dataset Splits | Yes | CIFAR-100, ImageNet-R, ImageNet-A, CUB-200, and Cars-196 are divided into 10 incremental tasks, while EuroSAT is split into 5 due to its fewer classes. Each task includes an equal share of classes, except for Cars-196, where the final task contains 16 classes. The data is distributed across 10 clients using the commonly adopted distribution-based label imbalance setting (Li et al., 2022; Yurochkin et al., 2019), where partitioning is governed by a Dirichlet distribution parameterized by β. A smaller β value corresponds to a more challenging data distribution. We evaluate all methods across three scenarios per dataset, using β ∈ {0.5, 0.1, 0.05} for CIFAR-100 and ImageNet-R, and β ∈ {1, 0.5, 0.2} for the others to account for fewer examples or classes. |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, memory) are mentioned in the paper for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Implementation Details. As the backbone for LoRM and all competing approaches, we employ a ViT-B/16 model (Dosovitskiy et al., 2021) pre-trained on ImageNet-21K (Ridnik et al., 2021) on all vision datasets. We set the number of epochs per communication round to 5 for all datasets, and the total number of rounds to 5 for CIFAR-100, ImageNet-R, EuroSAT, and CUB-200. Given the increased difficulty of ImageNet-A and Cars-196, all methods are allowed 10 communication rounds on these datasets. For Out-Of-Scope, we use a pre-trained T5-small (Raffel et al., 2020). For a complete overview of the method-specific hyperparameters, refer to Appendix D. |
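The distribution-based label-imbalance setting quoted above (Dirichlet partitioning per Li et al., 2022) is commonly implemented per class: sample client proportions from a Dirichlet(β) distribution and split that class's samples accordingly. The sketch below is a minimal, hypothetical illustration of that scheme (function name and seeding are our own, not from the paper); smaller β yields more skewed, harder splits.

```python
import numpy as np

def dirichlet_partition(labels, n_clients=10, beta=0.5, seed=0):
    """Split sample indices across clients with per-class Dirichlet(beta) proportions.

    Returns a list of n_clients index arrays covering all samples exactly once.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        # Shuffle this class's sample indices before splitting.
        idx = rng.permutation(np.where(labels == c)[0])
        # Smaller beta -> more concentrated proportions -> stronger label imbalance.
        proportions = rng.dirichlet(np.full(n_clients, beta))
        # Turn proportions into cumulative split points over this class.
        splits = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, splits)):
            client_indices[client].extend(part.tolist())
    return [np.array(ix, dtype=int) for ix in client_indices]
```

With β = 0.05 (the hardest CIFAR-100 scenario reported), most clients end up holding only a handful of the classes in each task, which is what makes the federation challenging.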