FedCBO: Reaching Group Consensus in Clustered Federated Learning through Consensus-based Optimization

Authors: José A. Carrillo, Nicolás García Trillos, Sixu Li, Yuhua Zhu

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate the efficacy of our FedCBO algorithm compared to other state-of-the-art methods and help validate our methodological and theoretical work. (3) We discretize our continuous dynamics in a reasonable way and fit it into the conventional federated training protocol to obtain a new federated learning algorithm that we call FedCBO. We conduct extensive numerical experiments to verify the effectiveness and efficacy of our algorithm and demonstrate that it outperforms other state-of-the-art methods in the non-i.i.d. data setting."
Researcher Affiliation | Academia | José A. Carrillo EMAIL, Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK; Nicolás García Trillos EMAIL, Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, Wisconsin 53706, USA; Sixu Li EMAIL, Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, Wisconsin 53706, USA; Yuhua Zhu EMAIL, Department of Statistics and Data Science, University of California, Los Angeles, Los Angeles, California 90095-1554, USA
Pseudocode | Yes | Algorithm 1 FedCBO; Algorithm 2 Local Aggregation(agent j)
Open Source Code | Yes | "Implementation of our experiments is open sourced at https://github.com/SixuLi/FedCBO."
Open Datasets | Yes | "We follow the approach used in (Ghosh et al., 2020) to create a clustered FL setting that is based on the standard MNIST dataset (LeCun et al., 1998)."
Dataset Splits | Yes | "For training, we randomly partition the total number of training images 60000k into N agent machines so that each agent holds n = 60000k / N images, all coming from the same rotation angle. For inference, we do not split the test data."
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory amounts) are mentioned in the paper for running the experiments.
Software Dependencies | No | The paper mentions using "fully connected neural networks" and "ReLU activation" but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "We set the total number of agents N = 1200 and the number of communication rounds T = 100. In each communication round, all agents participate in the training, i.e., |Gn| = N for all n. When an agent trains its own model on its local dataset (local update step in each round), we run τ = 10 epochs of stochastic gradient descent (SGD) with a learning rate of γ = 0.1 and momentum of 0.9. ... FedCBO: We set the model download budget M = 200. We choose the hyperparameters λ1 = 10, λ2 = 1 and α = 10. For the ε-greedy sampling, we use the decay scheme ε(n) = max{0.5 - 0.01n, 0.1}, i.e., the initial random sample proportion ε = 0.5."
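The dataset split row quotes the paper's rotated-MNIST construction: each of the k rotation angles contributes a rotated copy of the 60000 training images, and the resulting 60000k images are partitioned across N agents so that each agent holds n = 60000k / N images from a single angle. A minimal sketch of that index bookkeeping, assuming k = 4 rotations as in Ghosh et al. (2020) (the quoted text does not fix k, and the function name and return layout are illustrative, not from the paper):

```python
import numpy as np

def partition_rotated_mnist(k=4, N=1200, n_train=60000, seed=0):
    """Illustrative clustered-FL split: for each of k rotation angles,
    shuffle the 60000 image indices and deal them out to the N/k agents
    assigned to that angle, so every agent gets n = n_train*k/N indices
    from exactly one rotation. Returns {agent: (cluster, indices)}."""
    rng = np.random.default_rng(seed)
    n = n_train * k // N              # images held by each agent (200 here)
    agents_per_cluster = N // k       # agents sharing one rotation angle
    assignment = {}
    for cluster in range(k):
        idx = rng.permutation(n_train)  # shuffle this angle's copy of MNIST
        for j in range(agents_per_cluster):
            agent = cluster * agents_per_cluster + j
            assignment[agent] = (cluster, idx[j * n:(j + 1) * n])
    return assignment
```

With the paper's N = 1200 and k = 4 this yields 200 images per agent, and each cluster's 300 agents jointly cover that rotation's full 60000-image copy.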
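The experiment-setup row quotes the ε-greedy decay scheme ε(n) = max{0.5 - 0.01n, 0.1} used when each agent spends its model download budget M. A short sketch of that schedule, plus a hypothetical selection helper (the `select_models` signature and the `scores` affinity map are illustrative assumptions, not the paper's interface):

```python
import random

def epsilon(n):
    """Decay scheme quoted in the paper: eps(n) = max(0.5 - 0.01*n, 0.1),
    so exploration starts at 0.5 and floors at 0.1 from round 40 on."""
    return max(0.5 - 0.01 * n, 0.1)

def select_models(scores, M, n, rng=random):
    """Hypothetical eps-greedy choice of M models to download in round n:
    with probability eps(n) sample M agents uniformly (exploration),
    otherwise take the M agents with the highest affinity scores
    (exploitation). `scores` maps agent index -> estimated affinity."""
    agents = list(scores)
    if rng.random() < epsilon(n):
        return rng.sample(agents, M)
    return sorted(agents, key=scores.get, reverse=True)[:M]
```

With the paper's settings (M = 200, T = 100 rounds), roughly the first 40 rounds interpolate from half-random to mostly greedy sampling, after which a 10% exploration floor remains.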