Robust Distributed Accelerated Stochastic Gradient Methods for Multi-Agent Networks

Authors: Alireza Fallah, Mert Gürbüzbalaban, Asuman Ozdaglar, Umut Şimşekli, Lingjiong Zhu

JMLR 2022

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we conduct several experiments to validate our theory and assess the performance of D-SG and D-ASG. We consider a (regularized) logistic regression problem, a common formulation for binary classification tasks: min_x Σ_{i=1}^n log(1 + exp(−y_i X_i⊤ x)) + (λ/2)‖x‖_2^2, where (X_i, y_i) denotes a data pair: X_i ∈ R^d is the feature vector, y_i ∈ {−1, +1} denotes the label, and n denotes the number of data pairs. In all our experiments, we assume that each computation node has access to a subset of all data points, and the noisy gradient in (6) and (13) becomes the stochastic gradient computed on a random sub-sample of the data. More precisely, at each iteration, each computation node draws a random sub-sample from the data points it has access to, and computes the stochastic gradient using this sub-sample.
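The quoted objective and sub-sampled gradient can be sketched as follows. This is a minimal single-machine illustration, not the paper's MATLAB/C++ implementation; the function names and the `batch_prop` argument (the paper's batch proportion b) are hypothetical.

```python
import numpy as np

def logistic_loss(x, X, y, lam):
    """Regularized logistic loss: sum_i log(1 + exp(-y_i <X_i, x>)) + (lam/2)||x||^2."""
    margins = -y * (X @ x)
    # logaddexp(0, m) = log(1 + exp(m)), computed stably
    return np.sum(np.logaddexp(0.0, margins)) + 0.5 * lam * np.dot(x, x)

def stochastic_gradient(x, X, y, lam, batch_prop, rng):
    """Gradient estimate on a random sub-sample of proportion batch_prop (sketch)."""
    n = X.shape[0]
    b = max(1, int(batch_prop * n))
    idx = rng.choice(n, size=b, replace=False)
    Xb, yb = X[idx], y[idx]
    # sigmoid(-y <X, x>) weights; the n/b factor keeps the estimator unbiased
    # with respect to the full-sum objective above
    s = 1.0 / (1.0 + np.exp(yb * (Xb @ x)))
    return -(n / b) * (Xb.T @ (yb * s)) + lam * x
```

With `batch_prop = 1.0` the estimate reduces to the exact gradient of the loss, which makes the sketch easy to check against finite differences.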
Researcher Affiliation Academia Alireza Fallah* (EMAIL), Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America; Mert Gürbüzbalaban* (EMAIL), Department of Management Science and Information Systems, Departments of Electrical Engineering and Statistics, Rutgers University, Piscataway, NJ 08854, United States of America; Asuman Ozdaglar* (EMAIL), Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America; Umut Şimşekli* (EMAIL), INRIA, Département d'Informatique de l'École Normale Supérieure, PSL Research University, Paris, France; Lingjiong Zhu* (EMAIL), Department of Mathematics, Florida State University, Tallahassee, FL 32306, United States of America
Pseudocode Yes Algorithm 1: Distributed Multistage Accelerated Stochastic Gradient Algorithm (D-MASG). Input: initial iterate x(0), the sequence {α_t}_{t=1}^T of stepsizes, the sequence {k_t}_{t=1}^T of stage lengths. Set x(0, k_0) = x(0); for t = 1; t ≤ T; t = t + 1 do: Set x(t, −1) = x(t, 0) = x(t−1, k_{t−1}); for m = 0; m ≤ k_t − 1; m = m + 1 do: Set β_t = (1 − √(µα_t)) / (1 + √(µα_t)); Set y(t, m) = (1 + β_t) x(t, m) − β_t x(t, m−1); Set x(t, m+1) = W y(t, m) − α_t ∇̃f(y(t, m))
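The multistage update above can be sketched in a few lines. This is a single-machine simulation under stated assumptions: the N nodes are stacked as rows of a matrix, the mixing step W y is an ordinary matrix product, and `grad_f` returns one (possibly noisy) gradient row per node; the function name `d_masg` and this stacked-iterate form are illustrative, not the paper's implementation.

```python
import numpy as np

def d_masg(x0, grad_f, W, mu, alphas, stage_lens):
    """Sketch of D-MASG (Algorithm 1).
    x0: (N, d) matrix of per-node iterates; W: (N, N) doubly stochastic mixing
    matrix; alphas[t], stage_lens[t]: stepsize and length of stage t."""
    x_prev = x_curr = x0.copy()
    for t in range(len(alphas)):
        a = alphas[t]
        # beta_t depends only on the stage, so it is hoisted out of the inner loop
        beta = (1.0 - np.sqrt(mu * a)) / (1.0 + np.sqrt(mu * a))
        x_prev = x_curr  # x(t, -1) = x(t, 0) = last iterate of the previous stage
        for m in range(stage_lens[t]):
            y = (1.0 + beta) * x_curr - beta * x_prev        # momentum extrapolation
            x_prev, x_curr = x_curr, W @ y - a * grad_f(y)   # mix, then gradient step
    return x_curr
```

On a strongly convex quadratic with exact gradients and an averaging matrix W, the iterates of all nodes contract to the common minimizer, which gives a quick sanity check of the recursion.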
Open Source Code Yes In this set of experiments, we simulate the distributed environment in MATLAB and we provide our implementation in the supplementary material.
Open Datasets Yes The MNIST data set contains 70K binary images (of size d = 20 × 20) corresponding to 10 different digits.7 To obtain a binary classification problem, we extract the images corresponding to the digits 0 and 8, where we end up with n = 11774 images in total. On the other hand, the Epsilon data set is one of the standard binary classification data sets,8 and contains n = 400K samples with d = 2000. 7. http://yann.lecun.com/exdb/mnist. 8. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html.
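The digit-extraction step the excerpt describes (keep classes 0 and 8, relabel as ±1) is a simple filter. A sketch, where `images` and `labels` stand in for MNIST arrays loaded elsewhere and the function name is hypothetical:

```python
import numpy as np

def extract_binary_task(images, labels, pos_digit=0, neg_digit=8):
    """Keep two digit classes, flatten the images, and relabel as +1 / -1."""
    mask = (labels == pos_digit) | (labels == neg_digit)
    X = images[mask].reshape(mask.sum(), -1).astype(np.float64)  # one row per image
    y = np.where(labels[mask] == pos_digit, 1.0, -1.0)
    return X, y
```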
Dataset Splits No The paper mentions a "batch proportion b = 0.1" used for computing stochastic gradients, but this refers to sub-sampling for gradient estimation, not explicit training/validation/test dataset splits. There is no mention of specific percentages, counts, or methodologies for partitioning the datasets into training, validation, or test sets.
Hardware Specification No We have implemented all the algorithms in C++ by using a low-level message passing protocol for parallel processing, namely the Open MPI library.9 In order to have a realistic experimental environment, we have conducted these experiments on a cluster of interconnected computers, each of which is equipped with CPUs and memory of varying quality.
Software Dependencies No In this set of experiments, we simulate the distributed environment in MATLAB and we provide our implementation in the supplementary material. We have implemented all the algorithms in C++ by using a low-level message passing protocol for parallel processing, namely the Open MPI library (https://www.open-mpi.org).9 The paper mentions MATLAB, C++, and the Open MPI library, but it does not specify version numbers for any of these software components.
Experiment Setup Yes Unless stated otherwise, we first generate n = 1000 data points, set the dimension d = 100, data variance σ²_X = 5, λ = 0.05, the number of nodes N = 10, the batch proportion b = 0.1, α = 0.1, and we consider the circular network architecture. In this problem, we first compute an estimate for L and µ from the data, where we obtain L ≈ 50 and µ ≈ 0.1. Then, we vary L from 5 to 500 by fixing µ = 0.1, and we vary µ from 0.01 to 1 by fixing L = 50. Accordingly, we run the algorithms with hyperparameters that are computed with these values for L and µ. For D-SG we set the stepsize α = (1 + λ_N^W)/L, whereas for D-ASG we set α = λ_N^W/L and β = (1 − √(αµ))/(1 + √(αµ)). Finally, for D-MASG we set κ = (L + µ)/(µλ_N^W) as in (38) and use the setting reported in Proposition 14.
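The hyperparameter formulas in the excerpt can be computed directly from W, L, and µ. In this sketch, λ_N^W is read as the smallest eigenvalue of the mixing matrix W (an assumption based on the excerpt's notation, which works only when that eigenvalue is positive); the function name is hypothetical.

```python
import numpy as np

def hyperparameters(L, mu, W):
    """Stepsizes per the excerpt's formulas; lam_W stands in for lambda_N^W,
    assumed here to be the smallest eigenvalue of the mixing matrix W."""
    lam_W = np.linalg.eigvalsh(W).min()
    alpha_dsg = (1.0 + lam_W) / L                    # D-SG stepsize
    alpha_dasg = lam_W / L                           # D-ASG stepsize (needs lam_W > 0)
    beta_dasg = (1.0 - np.sqrt(alpha_dasg * mu)) / (1.0 + np.sqrt(alpha_dasg * mu))
    kappa = (L + mu) / (mu * lam_W)                  # condition-type number for D-MASG
    return alpha_dsg, alpha_dasg, beta_dasg, kappa
```

For example, with the excerpt's estimates L ≈ 50 and µ ≈ 0.1 on a 3-node ring whose mixing matrix has smallest eigenvalue 0.25, this gives α = 0.025 for D-SG and α = 0.005 for D-ASG.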