Optimal Rates for Multi-pass Stochastic Gradient Methods

Authors: Junhong Lin, Lorenzo Rosasco

JMLR 2017

Reproducibility Variable Result LLM Response
Research Type Experimental "9. Numerical Simulations: In order to illustrate our theoretical results and the error decomposition, we first performed some simulations on a simple problem. We constructed m = 100 i.i.d. training examples of the form yi = fρ(xi) + ωi. ... We perform three experiments. ... For mini-batch SGM and SGM, the total errors, averaged over 50 trials, are depicted in Figures 1a and 1b. ... For batch GM, the total errors, averaged over 50 trials, are depicted in Figure 1c. ... Finally, we tested the simple SGM, mini-batch SGM, and batch GM, using similar step-sizes as those in the first simulation, on the Breast Cancer dataset. The classification errors on the training set and the testing set of these three algorithms are depicted in Figure 2."
Researcher Affiliation Academia Junhong Lin (EMAIL), Laboratory for Computational and Statistical Learning, Istituto Italiano di Tecnologia and Massachusetts Institute of Technology, Bldg. 46-5155, 77 Massachusetts Avenue, Cambridge, MA 02139, USA. Lorenzo Rosasco (EMAIL), DIBRIS, Università di Genova, Via Dodecaneso 35, 16146 Genova, Italy, and Laboratory for Computational and Statistical Learning, Istituto Italiano di Tecnologia and Massachusetts Institute of Technology, Bldg. 46-5155, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
Pseudocode Yes Algorithm 1: Let b ∈ [m]. Given any sample z, the b-minibatch stochastic gradient method is defined by ω_1 = 0 and ω_{t+1} = ω_t − (η_t/b) ∑_{i=b(t−1)+1}^{bt} (⟨ω_t, x_{j_i}⟩_H − y_{j_i}) x_{j_i}, t = 1, ..., T, where {η_t > 0} is a step-size sequence.
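The quoted update rule can be sketched in a few lines of Python. This is a minimal sketch assuming a linear kernel on R^d (so iterates are vectors; the paper's version lives in a general RKHS H); the data, batch size, and constant step-size below are illustrative, not taken from the paper.

```python
import numpy as np

def minibatch_sgm(X, y, b, eta, T, rng=None):
    """b-minibatch stochastic gradient method (sketch of Algorithm 1).

    Each step samples b indices j_i uniformly at random and moves
    against the averaged pointwise gradient of the squared loss:
        w_{t+1} = w_t - (eta_t / b) * sum_i (<w_t, x_{j_i}> - y_{j_i}) x_{j_i}
    """
    rng = np.random.default_rng(rng)
    m, d = X.shape
    w = np.zeros(d)                       # omega_1 = 0
    for t in range(1, T + 1):
        idx = rng.integers(0, m, size=b)  # minibatch sampled with replacement
        grad = (X[idx] @ w - y[idx]) @ X[idx] / b
        w = w - eta(t) * grad             # step-size sequence eta_t
    return w

# usage: recover a linear target from noisy data with a constant step-size
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
w_true = np.ones(5)
y = X @ w_true + 0.01 * rng.standard_normal(100)
w_hat = minibatch_sgm(X, y, b=10, eta=lambda t: 0.1, T=2000, rng=1)
```

With b = 1 this reduces to simple SGM, and with b = m (and sampling replaced by the full data pass) it approximates batch GM, which is how the paper interpolates between the two regimes.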
Open Source Code No The paper includes a license for the document (CC-BY 4.0) but does not contain an explicit statement or link for the release of source code related to the methodology described.
Open Datasets Yes "Finally, we tested the simple SGM, mini-batch SGM, and batch GM, using similar step-sizes as those in the first simulation, on the Breast Cancer dataset (footnote 6: https://archive.ics.uci.edu/ml/datasets/)."
Dataset Splits No The paper mentions "training set" and "testing set" for the Breast Cancer dataset experiments, but does not provide specific details on how these splits were generated (e.g., percentages, random seed, or specific predefined splits). For the synthetic data, it only states "m = 100 i.i.d. training examples".
Hardware Specification No The paper describes numerical simulations and experiments but does not provide any specific details about the hardware used, such as GPU or CPU models, memory, or specific machine configurations.
Software Dependencies No The paper describes the algorithms and their performance but does not mention any specific software, libraries, or frameworks along with their version numbers that were used for implementation.
Experiment Setup Yes "In the first experiment, we run mini-batch SGM, where the mini-batch size b = √m, and the step-size ηt = 1/(8√m). In the second experiment, we run simple SGM where the step-size is fixed as ηt = 1/(8m), while in the third experiment, we run batch GM using the fixed step-size ηt = 1/8. ... We perform three experiments with the same H, a RKHS associated with a Gaussian kernel K(x, x′) = exp(−(x − x′)²/(2σ²)) where σ = 0.2."
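A rough reconstruction of the batch-GM arm of this setup is sketched below. The Gaussian kernel with σ = 0.2 and the step-size η = 1/8 come from the quoted text; the target function fρ and the noise level are illustrative assumptions, since the excerpt does not restate them.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=0.2):
    # K(x, x') = exp(-(x - x')^2 / (2 sigma^2)) for scalar inputs
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def batch_gm(X, y, eta=1 / 8, T=2000, sigma=0.2):
    """Batch gradient method in the RKHS, written in coefficient form:
    f_t = sum_i alpha_i K(x_i, .),  alpha <- alpha - (eta/m) (K alpha - y).
    """
    m = len(X)
    K = gaussian_kernel(X, X, sigma)
    alpha = np.zeros(m)
    for _ in range(T):
        alpha -= (eta / m) * (K @ alpha - y)
    return alpha, K

# usage: m = 100 i.i.d. examples y_i = f_rho(x_i) + omega_i
# (f_rho and the noise scale here are placeholders, not the paper's choice)
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 100)
y = np.abs(X - 0.5) + 0.02 * rng.standard_normal(100)
alpha, K = batch_gm(X, y)
train_err = float(np.mean((K @ alpha - y) ** 2))
```

The mini-batch and simple SGM arms would reuse the same kernel while replacing the full-gradient update with the sampled update of Algorithm 1, at step-sizes 1/(8√m) and 1/(8m) respectively.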