FedCBO: Reaching Group Consensus in Clustered Federated Learning through Consensus-based Optimization
Authors: José A. Carrillo, Nicolás García Trillos, Sixu Li, Yuhua Zhu
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the efficacy of our FedCBO algorithm compared to other state-of-the-art methods and help validate our methodological and theoretical work. (3) We discretize our continuous dynamics in a reasonable way and fit it into the conventional federated training protocol to obtain a new federated learning algorithm that we call FedCBO. We conduct extensive numerical experiments to verify the effectiveness and efficacy of our algorithm and demonstrate that it outperforms other state-of-the-art methods in the non-i.i.d. data setting. |
| Researcher Affiliation | Academia | José A. Carrillo EMAIL Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK; Nicolás García Trillos EMAIL Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, Wisconsin 53706, USA; Sixu Li EMAIL Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, Wisconsin 53706, USA; Yuhua Zhu EMAIL Department of Statistics and Data Science, University of California, Los Angeles, Los Angeles, California 90095-1554, USA |
| Pseudocode | Yes | Algorithm 1 FedCBO; Algorithm 2 Local Aggregation(agent j) |
| Open Source Code | Yes | Implementation of our experiments is open sourced at https://github.com/SixuLi/FedCBO. |
| Open Datasets | Yes | We follow the approach used in (Ghosh et al., 2020) to create a clustered FL setting that is based on the standard MNIST dataset (Lecun et al., 1998). |
| Dataset Splits | Yes | For training, we randomly partition the total number of training images 60000k into N agent machines so that each agent holds n = 60000k / N images, all coming from the same rotation angle. For inference, we do not split the test data. |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory amounts) are mentioned in the paper for running experiments. |
| Software Dependencies | No | The paper mentions using 'fully connected neural networks' and 'ReLU activation' but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We set the total number of agents N = 1200 and the number of communication rounds T = 100. In each communication round, all agents participate in the training, i.e., |Gn| = N for all n. When an agent trains its own model on its local dataset (local update step in each round), we run τ = 10 epochs of stochastic gradient descent (SGD) with a learning rate of γ = 0.1 and momentum of 0.9. ... FedCBO: We set the model download budget M = 200. We choose the hyperparameters λ1 = 10, λ2 = 1 and α = 10. For the ε-greedy sampling, we use the decay scheme ε(n) = max{0.5 - 0.01n, 0.1}, i.e., the initial random sample proportion ε = 0.5. |
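The clustered data setup quoted in the Dataset Splits row (rotated-MNIST clusters, each agent holding images from a single rotation angle) can be sketched as below. This is a minimal illustration, not the paper's code: the cluster count k = 4, the tiny agent count N = 8, the synthetic stand-in images, and the restriction to 90-degree `np.rot90` rotations are all assumptions made here for self-containment (the paper uses N = 1200 and the real MNIST images).

```python
import numpy as np

def partition_rotated_data(images, k=4, N=8, seed=0):
    """Sketch of the clustered-FL split described in the paper: each of
    the k clusters corresponds to one rotation angle, and the rotated
    copies of the dataset are divided evenly among N agents so that
    every agent holds images from exactly one angle.
    k=4 and N=8 are illustrative values, not the paper's settings."""
    rng = np.random.default_rng(seed)
    angles = [i * 360 // k for i in range(k)]  # e.g. 0, 90, 180, 270 degrees
    agents_per_cluster = N // k
    agents = []
    for angle in angles:
        # np.rot90 only handles multiples of 90 degrees; arbitrary-angle
        # rotated-MNIST setups would need e.g. scipy.ndimage.rotate.
        rotated = np.rot90(images, k=angle // 90, axes=(1, 2))
        idx = rng.permutation(len(rotated))
        for chunk in np.array_split(idx, agents_per_cluster):
            agents.append({"angle": angle, "data": rotated[chunk]})
    return agents

# Tiny synthetic stand-in for MNIST (real images are 28x28 grayscale).
fake_mnist = np.zeros((100, 28, 28))
agents = partition_rotated_data(fake_mnist)
print(len(agents), agents[0]["data"].shape)  # 8 agents, 50 images each
```

Each agent's shard is i.i.d. within its cluster but non-i.i.d. across clusters, which is exactly the regime the Research Type row says the experiments target.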
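The Pseudocode row lists an "Algorithm 2 Local Aggregation(agent j)" step; the following is a minimal sketch of the Gibbs-type weighted average that consensus-based optimization methods use for such aggregation, where models scoring a lower loss on the evaluating agent's local data receive exponentially larger weight. The function name, the flat-vector model representation, and the precomputed-loss interface are assumptions of this sketch; only the value α = 10 comes from the paper's reported setup.

```python
import numpy as np

def cbo_aggregate(local_models, losses, alpha=10.0):
    """Gibbs-type weighted average in the style of consensus-based
    optimization: weight model i by exp(-alpha * L_j(theta_i)), where
    L_j is agent j's local loss, then normalize. alpha=10 matches the
    paper's hyperparameter setting; the losses are assumed precomputed
    by the agent on its own data."""
    losses = np.asarray(losses, dtype=float)
    # Subtract the minimum loss before exponentiating for numerical
    # stability; the normalized weights are unchanged by this shift.
    w = np.exp(-alpha * (losses - losses.min()))
    w /= w.sum()
    models = np.stack(local_models)            # shape (M, dim)
    return np.tensordot(w, models, axes=1)     # weighted average, shape (dim,)

# Two candidate models; the second has a much lower local loss, so the
# consensus point is pulled almost entirely toward it.
theta = cbo_aggregate([np.array([0.0, 0.0]), np.array([1.0, 1.0])],
                      losses=[2.0, 0.1], alpha=10.0)
print(theta)
```

With a large α this average concentrates sharply on the best-performing downloaded models, which is what lets agents in the same cluster drift toward a shared consensus model.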
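The ε-greedy decay scheme in the Experiment Setup row is fully specified, so it translates directly into code. The surrounding `sample_models` helper is hypothetical: the paper's quoted text gives only the decay schedule, and the "exploit" criterion of ranking candidates by a recorded score is an assumption made here to show where ε(n) plugs in.

```python
import random

def epsilon(n):
    # Decay scheme quoted from the paper: eps(n) = max{0.5 - 0.01 n, 0.1},
    # i.e. start fully exploratory at eps = 0.5, floor at eps = 0.1.
    return max(0.5 - 0.01 * n, 0.1)

def sample_models(round_n, candidates, scores, M):
    """Hypothetical eps-greedy selection of M models to download:
    with probability eps(round_n), sample uniformly at random (explore);
    otherwise take the M candidates with the lowest recorded scores
    (exploit). The exploit rule is an assumption, not the paper's."""
    if random.random() < epsilon(round_n):
        return random.sample(candidates, M)
    ranked = sorted(candidates, key=lambda c: scores[c])
    return ranked[:M]

print([epsilon(n) for n in (0, 20, 40, 100)])  # [0.5, 0.3, 0.1, 0.1]
```

Note the floor of 0.1 means some random exploration persists for all T = 100 rounds; under this schedule the floor is reached at round n = 40.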