reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Path Length Bounds for Gradient Descent and Flow

Authors: Chirag Gupta, Sivaraman Balakrishnan, Aaditya Ramdas

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Appendix G.1 we simulate the lower bound constructed in the proof of Theorem 18 with GD. This simulation allows us to verify that the dependence of the path length of the lower bound construction is indeed Ω(κ1/4/ log κ) (and not something larger like Ω( κ)). We observed that the dependence is 3κ1/4/ log κ. Thus the dependence on κ is correct, but the constants are loose. Observe that the function constructed above is not strongly convex. Thus proving a poly(κ) lower bound for strongly convex functions remains an open problem. In Appendix G.2 we simulate the quadratic lower bound constructed in the proof of Theorem 18 and compare the lower bound and upper bound (Theorem 10) with the empirically observed path length.
Researcher Affiliation	Academia	Chirag Gupta EMAIL Machine Learning Department Carnegie Mellon University Pittsburgh, PA 15217, USA Sivaraman Balakrishnan EMAIL Department of Statistics and Data Science Carnegie Mellon University Pittsburgh, PA 15217, USA Aaditya Ramdas EMAIL Department of Statistics and Data Science, Machine Learning Department Carnegie Mellon University Pittsburgh, PA 15217, USA
Pseudocode	No	The paper describes algorithms like Gradient flow (GF): xt = f(xt) and Gradient descent (GD): xk+1 = xk η f(xk) through mathematical equations. However, it does not present these or any other procedures in a structured pseudocode block or a clearly labeled algorithm section.
Open Source Code	No	The paper does not contain an explicit statement about the release of source code for the methodology described, nor does it provide a link to a code repository. It mentions simulations in the appendix but not code availability.
Open Datasets	No	The paper is theoretical, focusing on mathematical bounds for optimization algorithms. It does not use any external datasets for its analysis or simulations; instead, it constructs specific functions for lower bound proofs and simulations. Therefore, no public or open dataset information is provided.
Dataset Splits	No	The paper does not utilize any datasets; therefore, there is no information provided regarding training, test, or validation splits.
Hardware Specification	No	The paper describes simulations conducted in Appendix G (
Software Dependencies	No	The paper mentions "We use MATLAB’s numerical integration function integral, with the Abs Tol parameter set to 10-50" in Appendix G.2. While MATLAB is named, a specific version number is not provided, making it not fully reproducible according to the criteria.
Experiment Setup	Yes	In Appendix G.1: "For each of these functions with different κ values, the step-size for GD and the initialization point are set as described in Section E.2. The path length is computed by adding the lengths of all the updates, until xk 2 10-6. At this point the remaining distance to the origin xk 2 is added to the path length computation. Finally, µ is computed as the effective PKL constant from the iterates actually seen while running the simulation: µ = max t {0,1,...k} f(xt) 2 / 2(f(xt) f )." In Appendix G.2: "The GF path length is computed by performing the path length integral numerically (since we know xt at each point in the case of quadratics). We use MATLAB’s numerical integration function integral, with the Abs Tol parameter set to 10-50. The GD path length is computed by simulating GD with step size η = 1/2a1 and adding the lengths of all the updates, until every component except the last one is smaller than 10-2, ie maxi =d(xk)(i) < 10-2."