Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval
Authors: Yan Shuo Tan, Roman Vershynin
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we prove that, in fact, constant step size online stochastic gradient descent (SGD) converges from arbitrary initializations for the non-smooth, non-convex amplitude squared loss objective. In this setting, online SGD is also equivalent to the randomized Kaczmarz algorithm from numerical analysis. Our analysis can easily be generalized to other single index models. It also makes use of new ideas from stochastic process theory, including the notion of a summary state space, which we believe will be of use for the broader field of non-convex optimization. |
| Researcher Affiliation | Academia | Yan Shuo Tan (EMAIL), Department of Statistics and Data Science, National University of Singapore, Singapore; Roman Vershynin (EMAIL), Department of Mathematics, University of California, Irvine, CA 92697-3875, USA |
| Pseudocode | No | More formally, we form a sequence of signal estimates x^(0), x^(1), x^(2), . . . with the update rule: x^(k+1) := x^(k) + η (sign(⟨a^(k), x^(k)⟩) b^(k) − ⟨a^(k), x^(k)⟩) a^(k). (3) We typically choose η = 1/d. The paper presents the algorithm's update rule as a mathematical equation (3) within a paragraph, not as a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the methodology described is publicly available. |
| Open Datasets | No | We make the following assumptions for the rest of the paper. Assumption 1 Assume the following for each positive integer k: (C) (Gaussian measurements) We have a^(k) ∼ Unif(√d · S^(d−1)), where S^(d−1) is the unit sphere in R^d. (D) (No noise) We have b^(k) = abs(⟨a^(k), x⟩). The paper defines its own theoretical "measurements" and "signal vectors" based on specific assumptions (e.g., Gaussian measurements) as part of its mathematical model, rather than utilizing external, publicly available datasets for empirical evaluation. |
| Dataset Splits | No | The paper does not describe any experimental evaluation using a dataset; therefore, no dataset splits are provided. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical proofs and analysis of an algorithm's convergence properties. It does not describe any experiments that would necessitate specifying hardware. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical proofs and analysis of an algorithm. It does not describe any software dependencies or specific version numbers for implementation. |
| Experiment Setup | No | The paper is theoretical, presenting mathematical proofs and analysis of stochastic gradient descent. It does not include an empirical experimental setup, thus no hyperparameters or training configurations are provided. |
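Although the paper gives the method only as equation (3), the quoted update rule is concrete enough to simulate. The sketch below is an illustrative NumPy implementation under the paper's stated assumptions (unit-norm signal, measurement vectors uniform on the sphere of radius √d, noiseless amplitude measurements, step size η = 1/d); the variable names and the random initialization are our own choices, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_steps = 50, 20_000

# Ground-truth unit-norm signal x*.
x_star = rng.standard_normal(d)
x_star /= np.linalg.norm(x_star)

# Arbitrary (here: random Gaussian) initialization, as the title allows.
x = rng.standard_normal(d)

eta = 1.0 / d  # the step size the paper says is typically chosen
for _ in range(n_steps):
    # Measurement vector uniform on the sphere of radius sqrt(d)  (Assumption (C)).
    a = rng.standard_normal(d)
    a *= np.sqrt(d) / np.linalg.norm(a)
    b = abs(a @ x_star)  # noiseless amplitude measurement (Assumption (D))
    inner = a @ x
    # Online SGD / randomized Kaczmarz update, equation (3).
    x = x + eta * (np.sign(inner) * b - inner) * a

# Phase retrieval recovers x* only up to a global sign.
err = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
print(f"recovery error (up to sign): {err:.2e}")
```

With η = 1/d and ‖a^(k)‖² = d, each step is exactly the randomized Kaczmarz projection onto the hyperplane consistent with the current sign guess, which is the equivalence the abstract mentions; in simulation the error typically decays to near zero from a random start.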