Rapid Overfitting of Multi-Pass SGD in Stochastic Convex Optimization

Authors: Shira Vansover-Hager, Tomer Koren, Roi Livni

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Theoretical We establish a tight bound of Θ(1/(ηT) + η√T) on the population excess risk of multi-pass (without-replacement) SGD with stepsize η over T steps (see Theorems 3.1 and 3.3 in Section 3). This result applies to the single-shuffle and multi-shuffle variants, and more generally to any multi-pass scheme that processes examples according to an arbitrary sequence of permutations. We also prove a similar tight Θ(1/(ηT) + η√T) population excess risk bound for with-replacement SGD, which holds after O(n log n) steps (see Theorems 3.2 and 3.4).
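As a quick sanity check (our own back-of-the-envelope calculation, not a claim from the paper), balancing the two terms of a bound of the form Θ(1/(ηT) + η√T) gives the stepsize that minimizes it:

```latex
\frac{1}{\eta T} = \eta\sqrt{T}
\;\Longrightarrow\;
\eta^{\star} = T^{-3/4},
\qquad
\Theta\!\left(\frac{1}{\eta^{\star} T} + \eta^{\star}\sqrt{T}\right)
= \Theta\!\left(T^{-1/4}\right).
```

In particular, any stepsize substantially larger than T^{-3/4} makes the η√T term dominate, which is the regime where the overfitting behavior in the paper's title kicks in.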
Researcher Affiliation Collaboration ¹Blavatnik School of Computer Science, Tel Aviv University; ²Google Research; ³School of Electrical and Computer Engineering, Tel Aviv University.
Pseudocode Yes One-pass Stochastic Gradient Descent (SGD). One-pass SGD receives a training set S = {z_1, ..., z_n}, a step size η > 0, a suffix-averaging parameter τ ∈ [n+1], and a first-order oracle O_z(w) ∈ ∂f(w, z), and proceeds as: Initialize: w_1 = 0. For t = 1, ..., n: w_{t+1} = Π_W(w_t − η O_{z_t}(w_t)). Output: ŵ_{n,τ} = (1/τ) Σ_{t=n−τ+2}^{n+1} w_t.
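The procedure above can be sketched in Python. This is a minimal illustration, not the authors' code: the absolute loss, the synthetic data, and the choice of W as a Euclidean ball are all assumptions made for the example.

```python
import numpy as np

def project_ball(w, radius):
    """Euclidean projection Pi_W onto a ball of the given radius
    (we take W to be a ball here; the paper allows any compact convex W)."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def one_pass_sgd(samples, subgrad, eta, tau, radius, dim):
    """One-pass projected SGD, returning the average of the last tau iterates."""
    w = np.zeros(dim)
    iterates = []
    for z in samples:
        w = project_ball(w - eta * subgrad(w, z), radius)
        iterates.append(w)
    return np.mean(iterates[-tau:], axis=0)

# Example problem: absolute-loss regression f(w, z) = |w @ x - y|,
# which is convex and G-Lipschitz in w with G = ||x||.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -0.5, 0.25]) + 0.1 * rng.normal(size=100)
samples = list(zip(X, y))

def subgrad(w, z):
    x, yi = z
    return np.sign(w @ x - yi) * x  # a subgradient of |w @ x - yi|

w_hat = one_pass_sgd(samples, subgrad, eta=0.1, tau=25, radius=2.0, dim=3)
```

Note that the suffix average is taken over the stored iterates, so τ must satisfy τ ≤ n in this sketch; the paper's τ ∈ [n+1] additionally allows averaging back to the initial point.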
Open Source Code No The paper does not contain any explicit statements about open-sourcing code, links to code repositories, or mention of code in supplementary materials.
Open Datasets No We study out-of-sample performance of multi-pass SGD in the context of the fundamental stochastic convex optimization (SCO) framework. A learning problem in SCO is defined by a convex and G-Lipschitz loss function f : W × Z → ℝ, where Z is a sample space endowed with a population distribution 𝒵, and W is a convex, compact domain with diameter bounded by D.
Dataset Splits No The paper focuses on theoretical analysis and does not present empirical experiments using specific datasets, thus no dataset splits are mentioned.
Hardware Specification No The paper is theoretical and does not describe any experimental setup, therefore no hardware specifications are mentioned.
Software Dependencies No The paper is theoretical and does not mention any software libraries, frameworks, or specific version numbers used for implementation or experiments.
Experiment Setup No The paper focuses on theoretical contributions to stochastic convex optimization and does not include details on experimental setup, hyperparameters, or training configurations.