Selective Aggregation for Low-Rank Adaptation in Federated Learning

Authors: Pengxin Guo, Shuang Zeng, Yanran Wang, Huijie Fan, Feifei Wang, Liangqiong Qu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on natural language understanding and generation tasks demonstrate the effectiveness of the proposed method. Our code is available at https://github.com/Pengxin-Guo/FedSA-LoRA.
Researcher Affiliation | Academia | 1. The University of Hong Kong; 2. Stanford University; 3. Shenyang Institute of Automation, Chinese Academy of Sciences; 4. Materials Innovation Institute for Life Sciences and Energy (MILES), HKU-SIRI
Pseudocode | No | The paper includes mathematical derivations and proofs (e.g., Section A.1 Proof of Lemma 1, Section A.2 Proof of Theorem 1) but does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/Pengxin-Guo/FedSA-LoRA.
Open Datasets | Yes | For the natural language understanding tasks, we use the RoBERTa model (Liu et al., 2019) evaluated on the GLUE benchmark (Wang et al., 2018), including MNLI, SST-2, QNLI, QQP, and RTE. For the natural language generation tasks, we employ the LLaMA model (Touvron et al., 2023) evaluated on the GSM8K dataset (Cobbe et al., 2021).
Dataset Splits | Yes | For the natural language understanding tasks, similar to FFA-LoRA (Sun et al., 2024a), we randomly split the data across three clients for federated learning. We model a non-IID data distribution using a Dirichlet distribution with α = 0.5, i.e., Dir(0.5). ... To investigate the effect of data heterogeneity on model performance, we model an IID partition (Split-1) and two non-IID partitions with Dir(1) and Dir(0.5).
Hardware Specification | Yes | The experiments for LoRA-based methods are conducted on NVIDIA GeForce RTX 4090 and 3090 GPUs, while the rsLoRA-based and VeRA-based methods are carried out on NVIDIA L40S GPUs.
Software Dependencies | No | Our implementation is based on the FederatedScope-LLM library (Kuang et al., 2023). ... All experiments are performed with half-precision enabled for efficiency. The paper names software tools such as FederatedScope-LLM and Hugging Face Transformers, but does not provide specific version numbers for these or any other dependencies.
Experiment Setup | Yes | We adopt the SGD optimizer (Ruder, 2016) for all approaches. We set the batch size to 128, local update steps to 10, and total communication rounds to 1000, consistent across all experiments. Similar to Hu et al. (2022), we only apply LoRA to Wq and Wv in the attention layers in our experiments. The rank r = 8 and scaling factor α = 16 are fixed for all algorithms. We report the best result from experiments run with learning rates η ∈ {5e-3, 1e-2, 2e-2, 5e-2, 1e-1}.
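The Dirichlet-based non-IID split quoted under "Dataset Splits" can be sketched in a few lines of NumPy. This is an illustrative reconstruction of the standard Dir(α) partitioning scheme, not code from the paper's repository; the function name and `seed` handling are assumptions.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=3, alpha=0.5, seed=0):
    """Split sample indices across clients so each class's samples are
    allocated by Dirichlet(alpha) proportions; smaller alpha yields a
    more heterogeneous (non-IID) partition, alpha -> inf approaches IID."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Fraction of class-c samples that each client receives.
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, shard in zip(client_indices, np.split(idx, cuts)):
            client.extend(shard.tolist())
    return client_indices
```

With α = 0.5 and three clients, as in the quoted setup, each client ends up with a skewed label mix while every sample is assigned exactly once.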
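The LoRA hyperparameters in the last row (r = 8, α = 16, adapters on Wq and Wv only) follow the standard low-rank update y = xWᵀ + (α/r)·xAᵀBᵀ. Below is a minimal NumPy sketch of that update under its usual initialization (A small Gaussian, B zero); variable names and shapes are illustrative and not taken from the released code.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=8):
    """LoRA-adapted linear layer: frozen weight W plus the scaled
    low-rank update (alpha / r) * B @ A applied to the input."""
    scaling = alpha / r
    return x @ W.T + scaling * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 32, 32, 8
W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, small Gaussian init
B = np.zeros((d_out, r))                 # trainable, zero init
x = rng.standard_normal((4, d_in))
```

Because B starts at zero, the adapted layer is exactly the frozen layer before any local updates, which is what makes round-0 aggregation well defined in the federated setting.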