reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

DSA: Decentralized Double Stochastic Averaging Gradient Algorithm

Authors: Aryan Mokhtari, Alejandro Ribeiro

JMLR 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The advantages of DSA relative to a group of stochastic and deterministic alternatives in solving a logistic regression problem are then studied in numerical experiments (Section 4). These results demonstrate that DSA is the only decentralized stochastic algorithm that reaches the optimal solution with a linear convergence rate. We further show that DSA outperforms deterministic algorithms when the metric is the number of times that elements of the training set are evaluated. The behavior of DSA for different network topologies is also evaluated.
Researcher Affiliation	Academia	Aryan Mokhtari EMAIL Alejandro Ribeiro EMAIL Department of Electrical and Systems Engineering University of Pennsylvania Philadelphia, PA 19104, USA
Pseudocode	Yes	Algorithm 1 DSA algorithm at node n
Open Source Code	No	The paper does not contain any explicit statement about open-source code availability, nor does it provide a link to a code repository.
Open Datasets	Yes	In Section 4.5, we consider a large-scale real dataset for training the classifier. In this section we solve the logistic regression problem in (48) for the protein homology dataset provided in KDD Cup 2004.
Dataset Splits	No	The paper discusses the distribution of sample points across nodes and describes the generation of a synthetic dataset but does not specify explicit training, test, or validation splits for the datasets used.
Hardware Specification	No	The paper mentions 'network of computing agents' for parallel processing but does not specify any particular hardware details (e.g., GPU/CPU models, memory) used for the experiments.
Software Dependencies	No	The paper does not specify any software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers used for implementing the algorithms or conducting experiments.
Experiment Setup	Yes	In our experiments in Sections 4.1-4.4, we use a synthetic dataset where the components of the feature vectors s_{n,i} with label l_{n,i} = 1 are generated from a normal distribution with mean µ and standard deviation σ+, while sample points with label l_{n,i} = -1 are generated from a normal distribution with mean µ and standard deviation σ-. ... regularization parameter λ = 10^-4... For EXTRA and DSA different stepsizes are chosen and the best performance for EXTRA and DSA are achieved by α = 5 × 10^-2 and α = 5 × 10^-3, respectively... The results are reported for α = 10^-4, α = 10^-3, α = 5 × 10^-3, and α = 10^-1 when the total number of samples are Q = 5000, Q = 1000, Q = 500, Q = 100, respectively.
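
The synthetic-data description in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code (the paper releases none). The function name `make_synthetic`, the feature dimension `p`, the equal class probabilities, and every numeric value in the usage line are hypothetical, since the quoted excerpt gives no values for µ, σ+, or σ-. The class means are separate parameters because the excerpt does not make clear how they differ between the two classes.

```python
import random


def make_synthetic(q, p, mean_pos, std_pos, mean_neg, std_neg, seed=0):
    """Draw q labeled sample points with p features each.

    Each feature component is sampled independently from a normal
    distribution whose mean and standard deviation depend on the label.
    Labels are +1 or -1 with equal probability (an assumption).
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    features, labels = [], []
    for _ in range(q):
        label = rng.choice((1, -1))
        if label == 1:
            mean, std = mean_pos, std_pos
        else:
            mean, std = mean_neg, std_neg
        features.append([rng.gauss(mean, std) for _ in range(p)])
        labels.append(label)
    return features, labels


# Hypothetical values; only Q = 500 appears in the quoted setup.
features, labels = make_synthetic(
    q=500, p=4, mean_pos=2.0, std_pos=1.0, mean_neg=-2.0, std_neg=1.0
)
print(len(features), len(features[0]))
```

Fixing the seed is a deliberate choice here: the scorecard marks hardware, software, and code as unspecified, so a deterministic generator is the least a re-implementation can pin down.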