FedCBO: Reaching Group Consensus in Clustered Federated Learning through Consensus-based Optimization

Authors: José A. Carrillo, Nicolás García Trillos, Sixu Li, Yuhua Zhu

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate the efficacy of our FedCBO algorithm compared to other state-of-the-art methods and help validate our methodological and theoretical work. (3) We discretize our continuous dynamics in a reasonable way and fit it into the conventional federated training protocol to obtain a new federated learning algorithm that we call FedCBO. We conduct extensive numerical experiments to verify the effectiveness and efficacy of our algorithm and demonstrate that it outperforms other state-of-the-art methods in the non-i.i.d. data setting."
Researcher Affiliation | Academia | José A. Carrillo EMAIL, Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK; Nicolás García Trillos EMAIL, Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, Wisconsin 53706, USA; Sixu Li EMAIL, Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, Wisconsin 53706, USA; Yuhua Zhu EMAIL, Department of Statistics and Data Science, University of California, Los Angeles, Los Angeles, California 90095-1554, USA
Pseudocode | Yes | Algorithm 1 FedCBO; Algorithm 2 Local Aggregation(agent j)
Open Source Code | Yes | "Implementation of our experiments is open sourced at https://github.com/SixuLi/FedCBO."
Open Datasets | Yes | "We follow the approach used in (Ghosh et al., 2020) to create a clustered FL setting that is based on the standard MNIST dataset (LeCun et al., 1998)."
Dataset Splits | Yes | "For training, we randomly partition the total number of training images 60000k into N agent machines so that each agent holds n = 60000k / N images, all coming from the same rotation angle. For inference, we do not split the test data."
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory amounts) are mentioned in the paper for running the experiments.
Software Dependencies | No | The paper mentions using "fully connected neural networks" and "ReLU activation" but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "We set the total number of agents N = 1200 and the number of communication rounds T = 100. In each communication round, all agents participate in the training, i.e., |Gn| = N for all n. When an agent trains its own model on its local dataset (local update step in each round), we run τ = 10 epochs of stochastic gradient descent (SGD) with a learning rate of γ = 0.1 and momentum of 0.9. ... FedCBO: We set the model download budget M = 200. We choose the hyperparameters λ1 = 10, λ2 = 1 and α = 10. For the ε-greedy sampling, we use the decay scheme ε(n) = max{0.5 - 0.01n, 0.1}, i.e., the initial random sample proportion ε = 0.5."
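The dataset split row quotes the paper's rotated-MNIST construction: each of the k rotation angles contributes a rotated copy of the 60000 training images, and the resulting 60000k images are partitioned across N agents so that each agent holds n = 60000k / N images from a single angle. A minimal sketch of that index bookkeeping, assuming k = 4 rotations as in Ghosh et al. (2020) (the quoted text does not fix k, and the function name and return layout are illustrative, not from the paper):

```python
import numpy as np

def partition_rotated_mnist(k=4, N=1200, n_train=60000, seed=0):
    """Illustrative clustered-FL split: for each of k rotation angles,
    shuffle the 60000 image indices and deal them out to the N/k agents
    assigned to that angle, so every agent gets n = n_train*k/N indices
    from exactly one rotation. Returns {agent: (cluster, indices)}."""
    rng = np.random.default_rng(seed)
    n = n_train * k // N              # images held by each agent (200 here)
    agents_per_cluster = N // k       # agents sharing one rotation angle
    assignment = {}
    for cluster in range(k):
        idx = rng.permutation(n_train)  # shuffle this angle's copy of MNIST
        for j in range(agents_per_cluster):
            agent = cluster * agents_per_cluster + j
            assignment[agent] = (cluster, idx[j * n:(j + 1) * n])
    return assignment
```

With the paper's N = 1200 and k = 4 this yields 200 images per agent, and each cluster's 300 agents jointly cover that rotation's full 60000-image copy.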
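The experiment-setup row quotes the ε-greedy decay scheme ε(n) = max{0.5 - 0.01n, 0.1} used when each agent spends its model download budget M. A short sketch of that schedule, plus a hypothetical selection helper (the `select_models` signature and the `scores` affinity map are illustrative assumptions, not the paper's interface):

```python
import random

def epsilon(n):
    """Decay scheme quoted in the paper: eps(n) = max(0.5 - 0.01*n, 0.1),
    so exploration starts at 0.5 and floors at 0.1 from round 40 on."""
    return max(0.5 - 0.01 * n, 0.1)

def select_models(scores, M, n, rng=random):
    """Hypothetical eps-greedy choice of M models to download in round n:
    with probability eps(n) sample M agents uniformly (exploration),
    otherwise take the M agents with the highest affinity scores
    (exploitation). `scores` maps agent index -> estimated affinity."""
    agents = list(scores)
    if rng.random() < epsilon(n):
        return rng.sample(agents, M)
    return sorted(agents, key=scores.get, reverse=True)[:M]
```

With the paper's settings (M = 200, T = 100 rounds), roughly the first 40 rounds interpolate from half-random to mostly greedy sampling, after which a 10% exploration floor remains.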