Debiasing Federated Learning with Correlated Client Participation
Authors: Zhenyu Sun, Ziyang Zhang, Zheng Xu, Gauri Joshi, Pranay Sharma, Ermin Wei
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide numerical experiments to illustrate our theoretical results. In particular, we compare vanilla FedAvg with our proposed algorithm (Algorithm 1) under the non-uniform and correlated client participation described in Section 2. For simplicity, we partition the N clients into M groups, and exactly one group of clients is selected at each round to fully participate in the system. Here we choose N = 100, M = 20. Synthetic dataset. We test Vanilla FedAvg and Debiasing FedAvg (Algorithm 1) on a synthetic dataset constructed following (Sun & Wei, 2022). MNIST dataset. We also test our proposed algorithm on the MNIST dataset. In Figure 3c, we compare Debiasing FedAvg with Vanilla FedAvg and FedVARP (Jhunjhunwala et al., 2022). In Figure 4, Debiasing FedAvg achieves the highest training accuracy due to its debiasing nature as shown in Theorem 3, while Vanilla FedAvg and FedVARP suffer from bias. Moreover, in Table 1, the training and test accuracies for different R are presented. |
| Researcher Affiliation | Collaboration | Northwestern University, Google Research, Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1 Debiasing FedAvg for correlated client participation. 1: Input: initial point x_0, stepsizes {α}, some τ > 0, λ_0 = 0_N, t_i = 0 for each client i ∈ [N]. 2: for t = 0, 1, ..., T do 3: A batch of clients S_t with size \|S_t\| = B is selected. The server sends the current round index t and model x_t to the clients in S_t. 4: for i ∈ S_t in parallel do 5: Each client sets t_i ← t_i + 1 and calculates λ_t^i = t_i / ((t+1)B) and ν_t^i = 1 / (λ_t^i N). 6: for k = 0, 1, ..., K−1 do 7: Client i updates its local model by x_{t,k+1}^i = x_{t,k}^i − α ν_t^i ∇f_i(x_{t,k}^i). (9) 8: end for 9: end for 10: The server updates its model x_{t+1} = (1/B) Σ_{i ∈ S_t} x_{t,K}^i. 11: end for 12: Output: x̄_T sampled uniformly from {x_t}_{t=0}^{T−1} |
| Open Source Code | Yes | The code for all experiments can be found through https://github.com/Starrskyy/debias_fl. |
| Open Datasets | Yes | MNIST dataset. We also test our proposed algorithm on the MNIST dataset. ... In this section, we compare Vanilla FedAvg, Debiasing FedAvg (ours) and FedVARP on the CIFAR10 dataset given the same participation pattern as in Section 6. |
| Dataset Splits | No | The paper mentions partitioning N clients into M groups (N=100, M=20) for the simulation setup. It also uses standard datasets like MNIST and CIFAR10, but it does not explicitly state the training/test/validation splits used for these datasets or the synthetic dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not list specific versions for key software dependencies or libraries used in the implementation of the experiments. |
| Experiment Setup | Yes | In this section, we provide numerical experiments to illustrate our theoretical results. In particular, we compare vanilla FedAvg with our proposed algorithm (Algorithm 1) under the non-uniform and correlated client participation described in Section 2. For simplicity, we partition the N clients into M groups, and exactly one group of clients is selected at each round to fully participate in the system. Here we choose N = 100, M = 20. ... All learning rates are chosen to be of the order O(10^−3). ... Each client maintains a three-layer fully-connected neural network for training. ... Each client maintains a CNN with three convolution layers. |
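The grouped-participation setup (N = 100 clients in M = 20 groups, one full group per round) and the debiasing update of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes quadratic per-client losses f_i(x) = ½‖x − c_i‖² as a stand-in for the paper's neural-network models, uses cyclic group selection as one simple instance of correlated participation, and the names `debiasing_fedavg` and `client_opts` are hypothetical.

```python
import numpy as np

def debiasing_fedavg(client_opts, N=100, M=20, T=200, K=5, alpha=1e-2):
    """Sketch of Algorithm 1 (Debiasing FedAvg) under cyclic group participation.

    client_opts: (N, d) array; client i holds the quadratic loss
    f_i(x) = 0.5 * ||x - c_i||^2 with minimizer c_i = client_opts[i].
    Returns the list of server iterates {x_t}.
    """
    d = client_opts.shape[1]
    B = N // M                               # clients per round (one full group)
    groups = np.arange(N).reshape(M, B)      # static partition into M groups
    x = np.zeros(d)                          # x_0
    counts = np.zeros(N)                     # t_i: per-client participation counters
    iterates = []
    for t in range(T):
        group = groups[t % M]                # cyclic, hence correlated, selection
        local_models = []
        for i in group:
            counts[i] += 1                   # t_i <- t_i + 1
            lam = counts[i] / ((t + 1) * B)  # empirical participation frequency
            nu = 1.0 / (lam * N)             # debiasing weight nu_t^i
            xi = x.copy()
            for _ in range(K):               # K local steps with scaled stepsize
                grad = xi - client_opts[i]   # exact gradient of the quadratic f_i
                xi -= alpha * nu * grad
            local_models.append(xi)
        x = np.mean(local_models, axis=0)    # server averages the B local models
        iterates.append(x.copy())
    return iterates
```

The paper's Algorithm 1 outputs an iterate sampled uniformly from {x_t}; here the whole trajectory is returned so the caller can do that sampling. Note that ν_t^i ≈ 1 once a client's empirical frequency matches the uniform rate 1/N, while under-sampled clients get ν_t^i > 1 and thus larger effective local stepsizes, which is the debiasing mechanism.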