Selective Aggregation for Low-Rank Adaptation in Federated Learning

Authors: Pengxin Guo, Shuang Zeng, Yanran Wang, Huijie Fan, Feifei Wang, Liangqiong Qu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on natural language understanding and generation tasks demonstrate the effectiveness of the proposed method. Our code is available at https://github.com/Pengxin-Guo/FedSA-LoRA.
Researcher Affiliation | Academia | 1. The University of Hong Kong; 2. Stanford University; 3. Shenyang Institute of Automation, Chinese Academy of Sciences; 4. Materials Innovation Institute for Life Sciences and Energy (MILES), HKU-SIRI
Pseudocode | No | The paper includes mathematical derivations and proofs (e.g., Section A.1 Proof of Lemma 1, Section A.2 Proof of Theorem 1) but does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/Pengxin-Guo/FedSA-LoRA.
Open Datasets | Yes | For the natural language understanding tasks, we use the RoBERTa model (Liu et al., 2019) evaluated on the GLUE benchmark (Wang et al., 2018), including MNLI, SST-2, QNLI, QQP, and RTE. For the natural language generation tasks, we employ the LLaMA model (Touvron et al., 2023) evaluated on the GSM8K dataset (Cobbe et al., 2021).
Dataset Splits | Yes | For the natural language understanding tasks, similar to FFA-LoRA (Sun et al., 2024a), we randomly split the data across three clients for federated learning. We model a non-IID data distribution using a Dirichlet distribution with α = 0.5, i.e., Dir(0.5). ... To investigate the effect of data heterogeneity on model performance, we model an IID partition (Split-1) and two non-IID partitions with Dir(1) and Dir(0.5).
Hardware Specification | Yes | The experiments for LoRA-based methods are conducted on NVIDIA GeForce RTX 4090 and 3090 GPUs, while the rsLoRA-based and VeRA-based methods are carried out on NVIDIA L40S GPUs.
Software Dependencies | No | Our implementation is based on the FederatedScope-LLM library (Kuang et al., 2023). ... All experiments are performed with half-precision enabled for efficiency. The paper names software tools such as FederatedScope-LLM and Hugging Face Transformers, but does not provide specific version numbers for these or any other dependencies.
Experiment Setup | Yes | We adopt the SGD optimizer (Ruder, 2016) for all approaches. We set the batch size to 128, local update steps to 10, and total communication rounds to 1000, consistent across all experiments. Similar to Hu et al. (2022), we only apply LoRA to Wq and Wv in the attention layers in our experiments. The rank r = 8 and scaling factor α = 16 are fixed for all algorithms. We report the best result from experiments run with learning rates η ∈ {5e-3, 1e-2, 2e-2, 5e-2, 1e-1}.
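The Dirichlet-based non-IID split quoted under "Dataset Splits" can be sketched in a few lines of NumPy. This is an illustrative reconstruction of the standard Dir(α) partitioning scheme, not code from the paper's repository; the function name and `seed` handling are assumptions.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=3, alpha=0.5, seed=0):
    """Split sample indices across clients so each class's samples are
    allocated by Dirichlet(alpha) proportions; smaller alpha yields a
    more heterogeneous (non-IID) partition, alpha -> inf approaches IID."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Fraction of class-c samples that each client receives.
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, shard in zip(client_indices, np.split(idx, cuts)):
            client.extend(shard.tolist())
    return client_indices
```

With α = 0.5 and three clients, as in the quoted setup, each client ends up with a skewed label mix while every sample is assigned exactly once.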
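The LoRA hyperparameters in the last row (r = 8, α = 16, adapters on Wq and Wv only) follow the standard low-rank update y = xWᵀ + (α/r)·xAᵀBᵀ. Below is a minimal NumPy sketch of that update under its usual initialization (A small Gaussian, B zero); variable names and shapes are illustrative and not taken from the released code.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=8):
    """LoRA-adapted linear layer: frozen weight W plus the scaled
    low-rank update (alpha / r) * B @ A applied to the input."""
    scaling = alpha / r
    return x @ W.T + scaling * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 32, 32, 8
W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, small Gaussian init
B = np.zeros((d_out, r))                 # trainable, zero init
x = rng.standard_normal((4, d_in))
```

Because B starts at zero, the adapted layer is exactly the frozen layer before any local updates, which is what makes round-0 aggregation well defined in the federated setting.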