Scaffold with Stochastic Gradients: New Analysis with Linear Speed-Up

Authors: Paul Mangold, Alain Oliviero Durmus, Aymeric Dieuleveut, Eric Moulines

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate our theoretical findings on ℓ2-regularized linear and logistic regression. For linear regression, we use the make_regression function from scikit-learn (Pedregosa et al., 2011) to generate two different datasets... For each value of N, we run both SCAFFOLD and FEDAVG and report the results in Figure 1.
Researcher Affiliation | Academia | Ecole Polytechnique, CMAP, UMR 7641, France. Correspondence to: Paul Mangold <EMAIL>.
Pseudocode | Yes | We give the pseudo-code of this algorithm in Algorithm 1.
Open Source Code | Yes | The code is available online at https://github.com/pmangold/scaffold-speed-up.
Open Datasets | No | For linear regression, we use the make_regression function from scikit-learn (Pedregosa et al., 2011) to generate two different datasets... For logistic regression, we repeat the same procedure with the make_classification function with two different seeds. While the scikit-learn functions are publicly available tools, the specific datasets generated for this paper's experiments are neither provided nor linked, and they are not existing, fixed public benchmarks.
Dataset Splits | Yes | The first dataset is split evenly among the first N/2 clients, while the second one is split evenly across the other half of clients. For logistic regression, we repeat the same procedure with the make_classification function with two different seeds. Using this procedure, we generate a regression and a classification task, where each client has 200 records, and where the distribution is heterogeneous.
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU or CPU models. It only mentions running experiments without specifying the computational resources.
Software Dependencies | No | The paper mentions using scikit-learn functions for data generation (make_regression, make_classification) and refers to Pedregosa et al. (2011). However, it does not specify version numbers for scikit-learn or any other software dependency.
Experiment Setup | Yes | In both settings, we run SCAFFOLD with γ = 0.05, H = 100, T = 100, and N ∈ {10, 100, 1000, 10000}. We estimate the gradients using batches of size 10, and compare the results with FEDAVG run with the same parameters.
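The data-generation procedure quoted in the table (two scikit-learn datasets, each split evenly across half of the clients, 200 records per client) could be sketched as follows. This is only an illustrative reconstruction: the feature count, seeds, and helper name `make_federated_regression` are assumptions, not taken from the paper or its code.

```python
# Sketch of the heterogeneous federated data generation described above.
# Feature count and seeds are assumed; only "200 records per client" and
# the two-dataset, half-and-half split come from the quoted description.
import numpy as np
from sklearn.datasets import make_regression


def make_federated_regression(num_clients, records_per_client=200, seed=0):
    """Generate two make_regression datasets and split each evenly across
    half of the clients, yielding a heterogeneous federation."""
    half = num_clients // 2
    clients = []
    # Two different datasets (different random_state), one per half.
    for s, n_clients in zip((seed, seed + 1), (half, num_clients - half)):
        X, y = make_regression(
            n_samples=n_clients * records_per_client,
            n_features=10,  # assumed feature count
            random_state=s,
        )
        for i in range(n_clients):
            lo, hi = i * records_per_client, (i + 1) * records_per_client
            clients.append((X[lo:hi], y[lo:hi]))
    return clients
```

The classification task would follow the same pattern with `make_classification` in place of `make_regression`.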
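To make the quoted hyperparameters concrete (γ = 0.05, H = 100 local steps, T = 100 rounds, batches of size 10), here is a minimal NumPy sketch of SCAFFOLD with full client participation on an ℓ2-regularized linear regression loss. The loss, regularization strength, and control-variate bookkeeping details are assumptions for illustration, not the paper's Algorithm 1 verbatim.

```python
# Minimal SCAFFOLD sketch (full participation, quadratic local losses).
# gamma, H, T, and batch defaults mirror the setup quoted above; the
# regularization strength reg is an assumed placeholder.
import numpy as np


def scaffold(clients, gamma=0.05, H=100, T=100, batch=10, reg=0.1, seed=0):
    rng = np.random.default_rng(seed)
    d = clients[0][0].shape[1]
    x = np.zeros(d)                      # global model
    c = np.zeros(d)                      # server control variate
    ci = [np.zeros(d) for _ in clients]  # client control variates
    for _ in range(T):
        new_x, new_ci = [], []
        for i, (X, y) in enumerate(clients):
            xi = x.copy()
            for _ in range(H):
                idx = rng.choice(len(X), size=batch, replace=False)
                # Stochastic gradient of the l2-regularized least squares loss.
                g = X[idx].T @ (X[idx] @ xi - y[idx]) / batch + reg * xi
                xi -= gamma * (g - ci[i] + c)  # control-variate correction
            # Control-variate update from the local progress.
            new_ci.append(ci[i] - c + (x - xi) / (gamma * H))
            new_x.append(xi)
        x = np.mean(new_x, axis=0)
        # With full participation, the server control variate is the
        # average of the client control variates.
        c = np.mean(new_ci, axis=0)
        ci = new_ci
    return x
```

Dropping the `- ci[i] + c` correction (and the control-variate updates) recovers plain FEDAVG with the same parameters, which is the comparison the experiments report.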