Understanding the Statistical Accuracy-Communication Trade-off in Personalized Federated Learning with Minimax Guarantees
Authors: Xin Yu, Zelin He, Ying Sun, Lingzhou Xue, Runze Li
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The theoretical result is validated on both synthetic and real-world datasets and its generalizability is verified in a non-convex setting. |
| Researcher Affiliation | Academia | 1 Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA 2 School of Electrical Engineering and Computer Science, The Pennsylvania State University, University Park, PA 16802, USA. Correspondence to: Ying Sun <EMAIL>, Lingzhou Xue <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Federated Gradient Descent with K-Step Local Optimization; Algorithm 2: FedCLUP: Federated Learning with Constant Local Update Personalization |
| Open Source Code | Yes | Details about the experimental setup are available in Section 6.1 and Appendix D, and the complete anonymized codebase is accessible at https://github.com/ZLHe0/fedclup. |
| Open Datasets | Yes | We use the MNIST, EMNIST, CIFAR10, Sent140, and CelebA datasets for real data analysis. |
| Dataset Splits | No | The paper mentions distributing data across clients and using training and testing sets, but does not provide specific details on the percentage splits or sample counts for training, validation, and test sets. For example, Section D.1 states: "For the real datasets, we report training loss and testing accuracy." and "To impose statistical heterogeneity, we distribute the data across clients in a way that each client only has access to a fixed number of classes." These statements indicate the use of splits but lack the concrete, specific information required for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper mentions software like "PyTorch" for implementation but does not specify any version numbers for PyTorch or other libraries. For instance, in Section D.1: "For the MNIST dataset, all models are implemented in PyTorch, and optimization is performed using the stochastic gradient descent (SGD) solver." |
| Experiment Setup | Yes | For the synthetic dataset, aligned with Theorem 2, we set the global step size γ = (λ+L)/(λL) and the local step size η = (L+λ)^{-1}. For the real dataset, the global learning rate was set to 1/λ, while the local learning rate was set to 0.01, chosen via grid search. For the MNIST dataset, we implemented logistic regression using the SGD solver with 5 epochs, a batch size of 32, and 20 total runs. |
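The setup described above pairs K-step local optimization (Algorithm 1) with constant global and local step sizes. The following is a minimal sketch of that pattern on synthetic quadratic client losses; the losses, client count, number of local steps, and step-size values are illustrative assumptions, not the paper's exact configuration or code.

```python
import numpy as np

def local_grad(w, A, b):
    # Gradient of the quadratic client loss 0.5 * w^T A w - b^T w.
    return A @ w - b

def fed_gd_k_local(client_data, w0, K=5, rounds=200, eta=0.05, gamma=1.0):
    """Federated GD sketch: each round, every client runs K local gradient
    steps with local step size eta; the server averages the resulting
    model deltas and applies them with global step size gamma."""
    w = w0.copy()
    for _ in range(rounds):
        deltas = []
        for A, b in client_data:
            w_local = w.copy()
            for _ in range(K):
                w_local -= eta * local_grad(w_local, A, b)
            deltas.append(w_local - w)
        w += gamma * np.mean(deltas, axis=0)
    return w

# Synthetic heterogeneous clients: random positive-definite quadratics.
rng = np.random.default_rng(0)
d, m = 3, 4
clients = []
for _ in range(m):
    M = rng.normal(size=(d, d))
    clients.append((M @ M.T + np.eye(d), rng.normal(size=d)))

w_hat = fed_gd_k_local(clients, np.zeros(d))
# Averaged K-step update at the returned iterate (near zero at a fixed point).
residual = np.linalg.norm(np.mean(
    [eta_delta for eta_delta in
     (fed_gd_k_local([c], w_hat, K=5, rounds=1) - w_hat for c in clients)],
    axis=0))
print(residual)
```

Note that with heterogeneous clients and K > 1 local steps, the fixed point of this averaged map generally differs from the minimizer of the average loss; quantifying that gap against communication cost is precisely the kind of trade-off the paper studies.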