Locally Adaptive Federated Learning

Authors: Sohom Mukherjee, Nicolas Loizou, Sebastian U Stich

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our theoretical claims by performing illustrative experiments for both i.i.d. and non-i.i.d. cases. Our proposed algorithms match the optimization performance of tuned FedAvg in the convex setting, outperform FedAvg as well as state-of-the-art adaptive federated algorithms like FedAMS in non-convex experiments, and come with superior generalization performance.
Researcher Affiliation | Academia | Sohom Mukherjee (Julius-Maximilians-Universität Würzburg), Nicolas Loizou (Johns Hopkins University), Sebastian U. Stich (CISPA Helmholtz Center for Information Security)
Pseudocode | Yes | Algorithm 1: FedSPS, federated averaging with fully locally adaptive stepsizes. Algorithm 2: FedSPS-Normalized, fully locally adaptive FedSPS with normalization to account for heterogeneity, as suggested in Wang et al. (2021). Algorithm 3: FedSPS-Global, FedSPS with global stepsize aggregation.
Open Source Code | No | Our code is based on publicly available repositories for SPS and FedAMS, and will be made available upon acceptance.
Open Datasets | Yes | For non-i.i.d. experiments with the MNIST dataset (LeCun et al., 2010), we assign every client samples from exactly two classes of the dataset, the splits being non-overlapping and balanced, with each client having the same number of samples (Li et al., 2020b). For non-i.i.d. experiments with the CIFAR-10/CIFAR-100 datasets, we use the Dirichlet distribution over classes, following the proposal in Hsu et al. (2019). The convex comparison also uses LIBSVM datasets (Chang & Lin, 2011): w8a, mushrooms, ijcnn, phishing, and a9a.
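The Dirichlet-based non-i.i.d. split referenced above (Hsu et al., 2019) can be sketched as follows. This is an illustrative reconstruction, not the paper's released code; the function name, concentration parameter `alpha`, and seed handling are assumptions.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Assign sample indices to clients by drawing, for each class,
    per-client proportions from Dirichlet(alpha). Smaller alpha gives
    more heterogeneous (non-i.i.d.) client datasets."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # fraction of this class's samples given to each client
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices
```

With large `alpha` each client's class distribution approaches uniform (near-i.i.d.); with small `alpha` most clients hold only a few classes, mimicking the two-classes-per-client MNIST split in spirit.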
Dataset Splits | No | The paper describes how data is distributed among clients for i.i.d. and non-i.i.d. scenarios (e.g., equally splitting data, assigning samples from two classes for MNIST, Dirichlet distribution for CIFAR-10/100). However, it does not explicitly provide the global training, validation, or test split percentages or sample counts for any of these datasets. It only implies their existence through mentions of 'Train Loss' and 'Test Accuracy'.
Hardware Specification | No | The acknowledgments section mentions: 'This work was supported by the Helmholtz Association's Initiative and Networking Fund on the HAICORE@FZJ partition.' While this indicates the use of a computing resource, no specific hardware details such as GPU models, CPU models, or memory specifications are provided.
Software Dependencies | No | Our code is based on publicly available repositories for SPS and FedAMS, and will be made available upon acceptance.
Experiment Setup | Yes | For all federated training experiments we use 500 communication rounds (the number of communication rounds being T/τ in our notation), 5 local steps on each client (τ = 5, unless otherwise specified for some ablation experiments), and a batch size of 20 (|B| = 20). We fix c = 0.5 for all further experiments, and similarly fix γb = 1, following similar observations as before. Hyperparameter tuning for FedAvg and FedAMS covers the client learning rate ηl ∈ {0.0001, 0.001, 0.01, 0.1, 1.0}, the server learning rate η ∈ {0.001, 0.01, 0.1, 1.0}, β1 = 0.9, β2 = 0.99, and the max stabilization factor ϵ ∈ {10⁻⁸, 10⁻⁴, 10⁻³, 10⁻², 10⁻¹, 10⁰}.
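The constants c = 0.5 and γb = 1 quoted above are the parameters of the bounded stochastic Polyak stepsize (SPS_max, Loizou et al., 2021) that FedSPS applies at each local client step. A minimal sketch of that stepsize rule, assuming the per-sample optimal loss is taken as 0 and using an illustrative function name and epsilon guard:

```python
import numpy as np

def sps_max_stepsize(loss, grad, c=0.5, gamma_b=1.0, loss_star=0.0, eps=1e-10):
    """Bounded stochastic Polyak stepsize:
    gamma = min( (f_i(x) - f_i*) / (c * ||grad f_i(x)||^2), gamma_b ).
    loss_star = 0 is a common choice for over-parameterized models."""
    grad_sq = float(np.dot(grad, grad))
    return min((loss - loss_star) / (c * grad_sq + eps), gamma_b)
```

The stepsize adapts locally to each client's current loss and gradient norm, which is why no client learning-rate grid is needed for FedSPS, in contrast to the ηl and η grids tuned for FedAvg and FedAMS.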