Selective Aggregation for Low-Rank Adaptation in Federated Learning
Authors: Pengxin Guo, Shuang Zeng, Yanran Wang, Huijie Fan, Feifei Wang, Liangqiong Qu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on natural language understanding and generation tasks demonstrate the effectiveness of the proposed method. Our code is available at https://github.com/Pengxin-Guo/FedSA-LoRA. |
| Researcher Affiliation | Academia | 1 The University of Hong Kong 2 Stanford University 3 Shenyang Institute of Automation, Chinese Academy of Sciences 4 Materials Innovation Institute for Life Sciences and Energy (MILES), HKU-SIRI |
| Pseudocode | No | The paper includes mathematical derivations and proofs (e.g., Section A.1 Proof of Lemma 1, Section A.2 Proof of Theorem 1) but does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/Pengxin-Guo/FedSA-LoRA. |
| Open Datasets | Yes | For the natural language understanding tasks, we use the RoBERTa model (Liu et al., 2019) evaluated on the GLUE benchmark (Wang et al., 2018), including MNLI, SST2, QNLI, QQP, and RTE. For the natural language generation tasks, we employ the LLaMA model (Touvron et al., 2023) evaluated on the GSM8K dataset (Cobbe et al., 2021). |
| Dataset Splits | Yes | For the natural language understanding tasks, similar to FFA-LoRA (Sun et al., 2024a), we randomly split the data across three clients for federated learning. We model a non-IID data distribution using a Dirichlet distribution with α = 0.5, i.e., Dir(0.5). ... To investigate the effect of data heterogeneity on model performance, we model an IID partition (Split-1) and two non-IID partitions with Dir(1) and Dir(0.5). |
| Hardware Specification | Yes | The experiments for LoRA-based methods are conducted on NVIDIA GeForce RTX 4090 and 3090 GPUs, while the rsLoRA-based and VeRA-based methods are carried out on NVIDIA L40S GPUs. |
| Software Dependencies | No | Our implementation is based on the FederatedScope-LLM library (Kuang et al., 2023). The experiments for LoRA-based methods are conducted on NVIDIA GeForce RTX 4090 and 3090 GPUs, while the rsLoRA-based and VeRA-based methods are carried out on NVIDIA L40S GPUs. All experiments are performed with half-precision enabled for efficiency. The paper mentions software tools like FederatedScope-LLM and Hugging Face Transformers, but does not provide specific version numbers for these or other dependencies. |
| Experiment Setup | Yes | We adopt the SGD optimizer (Ruder, 2016) for all approaches. We set the batch size to 128, local update steps to 10, and total communication rounds to 1000, consistent across all experiments. Similar to Hu et al. (2022), we only apply LoRA to Wq and Wv in the attention layers in our experiments. The rank r = 8 and scaling factor α = 16 are fixed for all algorithms. We report the best result from experiments run with learning rates η ∈ {5E-3, 1E-2, 2E-2, 5E-2, 1E-1}. |
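The Dirichlet-based non-IID partition described in the Dataset Splits row can be sketched as follows. This is a minimal illustration only, not the paper's code; the helper name `dirichlet_split` and its structure are our own, and smaller α yields a more heterogeneous split (the paper uses Dir(0.5) across three clients).

```python
import numpy as np

def dirichlet_split(labels, num_clients=3, alpha=0.5, seed=0):
    """Partition sample indices across clients via a Dirichlet label distribution.

    For each class, per-client proportions are drawn from Dir(alpha * 1);
    smaller alpha concentrates a class on fewer clients (more non-IID).
    Hypothetical helper illustrating the Dir(0.5), three-client setup.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Sample per-client proportions for this class, then cut the index
        # array at the corresponding cumulative boundaries.
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in zip(client_indices, np.split(idx, cuts)):
            client.extend(part.tolist())
    return client_indices
```

Every sample lands on exactly one client, so the per-client index lists form a partition of the dataset regardless of α.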
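The LoRA hyperparameters in the Experiment Setup row (r = 8, α = 16, adapters on Wq and Wv only) map naturally onto a Hugging Face `peft` configuration. A sketch under the assumption of a RoBERTa-style model whose attention q/v projections are named `query` and `value`; module names and `task_type` are assumptions, not taken from the paper:

```python
from peft import LoraConfig

# Values from the reported setup: rank 8, scaling factor 16, and LoRA
# applied only to the attention query/value projections (Hu et al., 2022).
# target_modules assumes RoBERTa-style layer naming in HF Transformers.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query", "value"],
    lora_dropout=0.0,
    bias="none",
    task_type="SEQ_CLS",  # GLUE classification; generation runs would differ
)
```

Wrapping a base model with `get_peft_model(model, lora_config)` would then freeze the pretrained weights and train only the low-rank adapters.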