FedOne: Query-Efficient Federated Learning for Black-box Discrete Prompt Learning
Authors: Ganyu Wang, Jinjie Fang, Maxwell Juncheng Yin, Bin Gu, Xi Chen, Boyu Wang, Yi Chang, Charles Ling
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted numerical experiments on various aspects of our framework, demonstrating a significant improvement in query efficiency, which aligns with our theoretical results. |
| Researcher Affiliation | Academia | 1Western University, London, Ontario, Canada 2Jilin University, Changchun, Jilin, China 3McGill University, Montreal, Quebec, Canada 4Vector Institute, Toronto, Ontario, Canada. Correspondence to: Bin Gu <EMAIL>, Charles Ling <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 outlines the Fed-BDPL framework, which integrates federated averaging with local client training via Gumbel-Softmax-BDPL (GS-BDPL). |
| Open Source Code | Yes | The implementation is available at: https://github.com/GanyuWang/FedOne-BDPL. |
| Open Datasets | Yes | To illustrate the intuition behind FedOne, we began with a toy experiment examining the trade-off between query efficiency and the number of activated clients K in a federated learning setting using the MNIST dataset (LeCun et al., 2010). The dataset is evenly distributed across 100 clients. For our main experiments, we used the GLUE benchmark (Wang et al., 2018), which covers a wide range of tasks: MNLI (Williams et al., 2018), QQP (Iyer et al., 2017), SST-2 (Socher et al., 2013), MRPC (Dolan & Brockett, 2005), CoLA (Warstadt et al., 2019), QNLI (Wang et al., 2018), and RTE (Dagan et al., 2005; Haim et al., 2006; Giampiccolo et al., 2007; Bentivogli et al., 2009). |
| Dataset Splits | Yes | Table 5: statistics and metrics of the seven GLUE datasets (\|L\|: number of classes for classification tasks). MNLI: \|L\|=3, train 393K, dev 9.8K, test 9.8K, NLI, acc., fiction/reports. QQP: \|L\|=2, train 364K, dev 40K, test 391K, paraphrase, F1, Quora. SST-2: \|L\|=2, train 6.7K, dev 872, test 1.8K, sentiment, acc., movie reviews. MRPC: \|L\|=2, train 3.7K, dev 408, test 1.7K, paraphrase, F1, news. CoLA: \|L\|=2, train 8.6K, dev 1K, test 1K, acceptability, Matthews corr., books/articles. QNLI: \|L\|=2, train 105K, dev 5.5K, test 5.5K, NLI, acc., Wikipedia. RTE: \|L\|=2, train 2.5K, dev 277, test 3K, NLI, acc., news/Wikipedia. |
| Hardware Specification | No | The paper does not explicitly mention specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. It only refers to 'GPUs' in a general discussion about computational resources. |
| Software Dependencies | No | The paper mentions 'RoBERTa-large' as the model architecture, 'AdamW' as the optimization algorithm, and the 'OpenAI API' for GPT-3.5 Turbo. However, it does not provide specific version numbers for any programming languages, libraries, or other software components. |
| Experiment Setup | Yes | We train for 2 epochs with a learning rate of 0.01 and a batch size of 32, varying the number of active clients K ∈ {1, 5, 10, 20, 40}. The model for each client is a Multilayer Perceptron (MLP): a flattening input layer, a fully connected layer with 512 neurons and ReLU activation, a dropout layer with a 0.2 dropout rate, and a final fully connected layer that outputs to 10 classes via a Softmax function. For the training procedure, we conducted a hyperparameter tuning phase using grid search over learning rates of [3e-4, 1e-4, 3e-5, 1e-5]. The batch size was set at 32, and the optimization algorithm employed was AdamW (Loshchilov & Hutter, 2017). For every client, the population size of CMA-ES is set to 20, and the dimension of the low-dimensional vector is set to 500, as recommended by (Sun et al., 2022). |
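The two building blocks named in the Pseudocode row, federated averaging and Gumbel-Softmax sampling, can be sketched as below. This is a minimal illustration, not the paper's implementation: the function names, the uniform aggregation weights, and the single-vector parameter layout are assumptions made for the sketch.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, rng=None):
    """Relaxed (differentiable) categorical sample via the Gumbel-Softmax trick."""
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + g) / tau
    # Softmax of the perturbed logits; lower tau -> closer to one-hot
    e = np.exp(y - y.max())
    return e / e.sum()

def fedavg(client_params, weights=None):
    """FedAvg aggregation: weighted average of client parameter vectors."""
    params = np.stack(client_params)
    if weights is None:
        weights = np.full(len(client_params), 1.0 / len(client_params))
    return weights @ params
```

In a Fed-BDPL-style round, each activated client would update its local prompt-token logits with GS-BDPL and the server would aggregate them with `fedavg`.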
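The toy-experiment client model in the Experiment Setup row (flatten, FC-512 with ReLU, dropout 0.2, FC-10 with softmax) admits a short numpy sketch; the weight initialization and the inverted-dropout scaling are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Weights for a 784 -> 512 -> 10 MLP (MNIST images are 28x28 = 784 pixels)
w1 = rng.normal(0.0, 0.01, size=(784, 512)); b1 = np.zeros(512)
w2 = rng.normal(0.0, 0.01, size=(512, 10)); b2 = np.zeros(10)

def mlp_forward(x, train=False, dropout_rate=0.2):
    """Forward pass: flatten -> FC(512) + ReLU -> dropout -> FC(10) + softmax."""
    h = np.maximum(x.reshape(x.shape[0], -1) @ w1 + b1, 0.0)
    if train:
        # Inverted dropout: zero 20% of activations, rescale the survivors
        mask = rng.uniform(size=h.shape) >= dropout_rate
        h = h * mask / (1.0 - dropout_rate)
    z = h @ w2 + b2
    # Row-wise softmax over the 10 classes
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```

Each of the 100 clients would train a copy of this model locally for 2 epochs at batch size 32 before aggregation.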