Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent

Authors: Dominic Richards, Patrick Rebeschini

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present numerical experiments to show that the qualitative nature of the upper bounds we derive can be representative of real behaviours."
Researcher Affiliation | Academia | Dominic Richards, Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB; Patrick Rebeschini, Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB
Pseudocode | No | The paper describes the Distributed SGD algorithm using a mathematical formula (1) but does not present it as a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
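The update in formula (1) is a standard consensus-plus-subgradient step for Distributed SGD. A minimal sketch of one such step is given below; the gossip weights, sizes, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def distributed_sgd_step(X, P, subgrads, step):
    """One round of Distributed SGD: each node averages its neighbours'
    iterates through the gossip matrix P, then takes a local stochastic
    subgradient step (the general structure of update (1))."""
    # Consensus step: mix iterates according to the doubly stochastic matrix P.
    mixed = P @ X
    # Local step: each node v moves along its own stochastic subgradient.
    return mixed - step * subgrads

# Tiny illustration on a 3-node graph with uniform mixing weights.
P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
X = np.zeros((3, 2))   # all nodes initialised at zero, as in the experiments
g = np.ones((3, 2))    # placeholder subgradients, one row per node
X_next = distributed_sgd_step(X, P, g, step=0.1)
```

Since all nodes start at zero, the consensus step is a no-op here and each node simply moves along its own subgradient.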
Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide links to a code repository.
Open Datasets | No | The paper mentions a simulated dataset: "a simulated data set with a total of N = mn observations {Z_i}, i ∈ [N], are sampled following the experiment within Duchi et al. (2012)." This indicates a simulation procedure based on prior work, not a publicly available dataset provided by the authors or a standard benchmark with specific access information.
Dataset Splits | No | The paper states, "The data set is then randomly spread across the graph with each node getting m samples." This describes how data is distributed among nodes but does not specify training, validation, or test splits needed for reproduction.
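The data-distribution step quoted above (N = mn samples randomly spread across n nodes, m per node) can be sketched as follows; the sizes and the Gaussian stand-in data are illustrative assumptions, not the paper's simulated distribution.

```python
import numpy as np

# Hypothetical sketch: shuffle the N = m * n samples and hand each
# of the n nodes a disjoint shard of m samples.
rng = np.random.default_rng(0)
n, m, d = 4, 5, 3                      # illustrative sizes, not the paper's
Z = rng.standard_normal((n * m, d))    # stand-in for the simulated samples
perm = rng.permutation(n * m)          # random spread across the graph
shards = [Z[perm[v * m:(v + 1) * m]] for v in range(n)]
```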
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments. It only mentions using a Python library for calculations.
Software Dependencies | No | The paper mentions using "the lbfgs solver within the Logistic Regression function of the python library scikit (Pedregosa et al., 2011)" but does not provide specific version numbers for scikit-learn or Python.
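The reported setup corresponds to scikit-learn's `LogisticRegression` with `solver="lbfgs"`. A minimal usage sketch on toy data (the synthetic data here is an assumption; the paper uses its own simulated set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary classification data, nearly separable along the first feature.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = (X[:, 0] + 0.1 * rng.standard_normal(200) > 0).astype(int)

# The solver the paper reports: lbfgs within scikit-learn's LogisticRegression.
clf = LogisticRegression(solver="lbfgs").fit(X, y)
```

Note that without pinned scikit-learn and Python versions, defaults such as the regularisation strength `C` may differ across releases, which is exactly the reproducibility gap flagged above.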
Experiment Setup | Yes | Dimension and Monte Carlo estimate size are set to d = 100 and N̂ = 1000, respectively. We investigate the performance of Distributed SGD in two sample size regimes, m = 2 and m = 100. Distributed SGD is run for 15 different time horizons t, between 10^2 and either 10^7 or 10^6.5 for graph sizes n = 32 or n = 10^2, respectively. All runs are initialised from X_v^1 = 0 for all v ∈ V. Comparisons are made for three choices of the step size, as prescribed in Corollary 6, and for three choices of the graph topology: complete graph (α = 0), grid (α = 1/2), and cycle (α = 1). Specifically, the two fixed step size choices considered are: ρ_Const = O(1/√(nm)), to align with serial single-machine SGD; and ρ_Const-Net = O((1 − σ₂(P))/√(nm)).
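The two fixed step sizes depend on the graph only through the second-largest eigenvalue σ₂(P) of the gossip matrix. A sketch of computing them for the cycle topology follows; the lazy-random-walk gossip weights and the choice of setting all hidden O(·) constants to 1 are assumptions for illustration.

```python
import numpy as np

def cycle_gossip_matrix(n):
    """Doubly stochastic gossip matrix for the n-cycle: each node keeps
    weight 1/2 and gives 1/4 to each of its two neighbours (an assumed
    lazy-random-walk choice, not specified here by the paper)."""
    P = np.eye(n) * 0.5
    for v in range(n):
        P[v, (v - 1) % n] += 0.25
        P[v, (v + 1) % n] += 0.25
    return P

n, m = 32, 100                                # graph size, samples per node
P = cycle_gossip_matrix(n)
sigma2 = np.sort(np.linalg.eigvalsh(P))[-2]   # second-largest eigenvalue of P

# The two fixed step sizes above, with all O(.) constants set to 1:
rho_const = 1.0 / np.sqrt(n * m)                  # serial-SGD-aligned step
rho_const_net = (1.0 - sigma2) / np.sqrt(n * m)   # network-dependent step
```

Because σ₂(P) approaches 1 as the cycle grows, the network-dependent step size shrinks relative to the serial one, which is the graph dependence the experiments probe.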