SparsyFed: Sparse Adaptive Federated Learning
Authors: Adriano Guastella, Lorenzo Sani, Alex Iacob, Alessio Mora, Paolo Bellavista, Nic Lane
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper presents SparsyFed, a practical federated sparse training method that critically addresses the problems above. ... We evaluate SparsyFed with ablation studies on typical cross-device FL datasets, including CIFAR-10/100 (Krizhevsky, 2012) and Speech Commands (Warden, 2018), under various data heterogeneity conditions. |
| Researcher Affiliation | Collaboration | 1 Dipartimento di Informatica - Scienza e Ingegneria, Università di Bologna 2 Department of Computer Science and Technology, University of Cambridge 3 Flower Labs, UK |
| Pseudocode | Yes | Algorithm 1 Sparse federated training pipeline of SparsyFed. ... Algorithm 2 Sparse Client Optimization of SparsyFed |
| Open Source Code | Yes | We make the developed code publicly available in this repository to facilitate result reproducibility and to serve the community of researchers in the field. |
| Open Datasets | Yes | We selected three datasets to assess SparsyFed's performance: CIFAR-10/100 (Krizhevsky, 2012), and Speech Commands (Warden, 2018). |
| Dataset Splits | Yes | The datasets above are distributed among 100 clients and partitioned using the method in Hsu et al. (2019), simulating various degrees of data heterogeneity. The distribution of labels across clients is controlled via a concentration parameter α governing a Latent Dirichlet Allocation (LDA), where a low α yields a non-IID label distribution and a high α yields an IID one. Specifically, we refer to data distributions as IID for α = 10³ and non-IID for α = 1.0 and α = 0.1. To ensure reproducibility, we fixed the seed to 1337 for the LDA partitioning process. The federated orchestrator randomly sampled 10 clients out of the 100 clients in the population every round. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models or cloud instance types) used for running the experiments are provided in the paper. The paper mentions 'edge devices' and 'constrained hardware' in a general context but not for their experimental setup. |
| Software Dependencies | No | The paper mentions using PyTorch (Paszke et al., 2019) and the Flower framework (Beutel et al., 2022) but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | Each round consisted of one local epoch with a local batch size of 16 samples. The initial learning rate was set to 0.5, gradually decreasing to a final value of 0.01 following Eq. (1). For SparsyFed, the exponent for re-parameterization was set to β = 1.25 for the CIFAR-10/100 experiments and β = 1.15 for the Speech Commands experiment. The CIFAR experiments were run for 700 rounds, while the Speech Commands experiment was run for 500 rounds. The federated orchestrator randomly sampled 10 clients out of the 100 clients in the population every round. We applied the same target sparsity for all devices in the federation. |
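The LDA partitioning described in the splits row (Hsu et al., 2019) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the exact shard-assignment mechanics, and the use of NumPy are assumptions; only the Dirichlet concentration α, the 100-client population, and the seed 1337 come from the report.

```python
import numpy as np

def lda_partition(labels, num_clients=100, alpha=1.0, seed=1337):
    """Partition sample indices across clients with a Dirichlet prior.

    Sketch of the Hsu et al. (2019) scheme: for each class, client
    shares are drawn from Dirichlet(alpha). Low alpha -> strongly
    non-IID label skew; high alpha (e.g. 10**3) -> near-IID.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    num_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(num_clients)]
    for c in range(num_classes):
        idx = rng.permutation(np.flatnonzero(labels == c))
        # Per-class client proportions drawn from Dirichlet(alpha).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, shard in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(shard.tolist())
    return client_indices
```

With α = 10³ each client receives a near-uniform slice of every class, while α = 0.1 concentrates most of a class's samples on a few clients, matching the IID/non-IID regimes quoted above.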
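The per-round orchestration in the setup row (sample 10 of 100 clients; decay the learning rate from 0.5 to 0.01 over the round budget) can be sketched as below. The paper's Eq. (1) is not reproduced in this report, so the exponential decay form here is an assumption, as are both function names; only the endpoint values and the 10-of-100 sampling are from the quoted text.

```python
import random

def sample_clients(population=100, per_round=10, round_seed=None):
    """Uniformly sample the participating clients for one round."""
    rng = random.Random(round_seed)
    return rng.sample(range(population), per_round)

def lr_schedule(round_idx, total_rounds, lr_init=0.5, lr_final=0.01):
    """Decay the learning rate from lr_init to lr_final across rounds.

    Exponential interpolation is an assumed stand-in for the paper's
    Eq. (1); it hits lr_init at round 0 and lr_final at the last round.
    """
    frac = round_idx / (total_rounds - 1)
    return lr_init * (lr_final / lr_init) ** frac
```

For the 700-round CIFAR runs this gives `lr_schedule(0, 700) == 0.5` at the start and `0.01` at round 699, with a smooth geometric decrease in between.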