Non-Uniform Smoothness for Gradient Descent

Authors: Albert S. Berahas, Lindon Roberts, Fred Roosta

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we provide some brief numerical experiments confirming the global linear rate for Algorithm 1 (with an appropriate choice of R_k and no tuning of the fixed stepsize η = 1) for objectives of the form f(x) = ||x||_2^{2p} as in Section 4 and f(x) = ||x||_{2p}^{2p} as in Section 5. In all cases, we use d = 10 dimensional problems, starting point x_0 = (1, ..., 1)^T, and p = 1, ..., 5.
Researcher Affiliation Academia Albert S. Berahas (EMAIL), Department of Industrial and Operations Engineering, University of Michigan; Lindon Roberts (EMAIL), School of Mathematics and Statistics, University of Sydney; Fred Roosta (EMAIL), School of Mathematics and Physics, University of Queensland, and ARC Training Centre for Information Resilience
Pseudocode Yes Algorithm 1: Gradient Descent with LFSO. Input: starting point x_0 ∈ R^d, stepsize factor η > 0. 1: for k = 0, 1, 2, ... do 2: Choose any R_k > 0 3: Set R̃_k := max{R_k, (η / L(x_k, R_k)) ||∇f(x_k)||} (2.3) 4: x_{k+1} = x_k - (η / L(x_k, R̃_k)) ∇f(x_k) (2.4)
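As a minimal sketch (not the authors' implementation, which is not released), Algorithm 1 can be written down for the first test objective f(x) = ||x||_2^{2p}. The oracle L(x, R) = 2p(2p-1)(||x||_2 + R)^{2p-2} used below is an assumed LFSO, chosen because it bounds the Hessian norm of f over the radius-R ball around x; the paper's exact oracle may differ.

```python
import numpy as np

def lfso(x, R, p):
    # Assumed LFSO for f(x) = ||x||_2^(2p): an upper bound on
    # sup ||grad^2 f(y)|| over the ball of radius R centered at x.
    return 2 * p * (2 * p - 1) * (np.linalg.norm(x) + R) ** (2 * p - 2)

def gd_lfso(x0, p, eta=1.0, iters=200):
    """Algorithm 1 (gradient descent with an LFSO) for f(x) = ||x||_2^(2p)."""
    x = x0.astype(float)
    for _ in range(iters):
        g = 2 * p * np.linalg.norm(x) ** (2 * p - 2) * x    # gradient of ||x||_2^(2p)
        Rk = 2 * np.linalg.norm(x)                          # R_k choice from the experiments
        Rk_t = max(Rk, eta / lfso(x, Rk, p) * np.linalg.norm(g))   # step (2.3)
        x = x - eta / lfso(x, Rk_t, p) * g                  # step (2.4)
    return x
```

With this oracle the stepsize adapts to the local curvature, so the iterates contract by a constant factor per iteration even for p > 1, consistent with the global linear rate claimed for Algorithm 1.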
Open Source Code No No explicit statement or link for open-source code was found in the paper.
Open Datasets No The numerical experiments use synthetic objective functions (e.g., f(x) = ||x||_2^{2p} and f(x) = ||x||_{2p}^{2p}) rather than actual datasets. Therefore, no information on publicly available datasets is provided.
Dataset Splits No The paper uses synthetic objective functions for numerical experiments, not traditional datasets that would require train/test/validation splits. Therefore, no dataset split information is provided.
Hardware Specification No The paper does not provide specific details about the hardware used for running the experiments. It only mentions 'CPU time (s)' in the runtime performance graphs without specifying the CPU model or other hardware components.
Software Dependencies No The paper does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks used for implementation.
Experiment Setup Yes In all cases, we use d = 10 dimensional problems, starting point x_0 = (1, ..., 1)^T, and p = 1, ..., 5. For standard gradient descent with fixed stepsize, we get the results shown in Figure 1. To see sufficiently fast convergence, some mild tuning of the stepsize η was required. For both objectives, we see that gradient descent achieves a global linear rate for p = 1 as expected, but a clearly sublinear rate for all p > 1, again in line with expectations. When running Algorithm 1, we use R_k = ||∇g(x_k)||_2 = 2||x_k||_2 for f(x) = ||x||_2^{2p} (based on the framework of Section 4) and R_k = ||x_k|| for f(x) = ||x||_{2p}^{2p} (based on the framework of Section 5).
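The second experiment (Section 5 framework) can be sketched in the same way for the separable objective f(x) = sum_i x_i^{2p}. Two details below are assumptions, not from the paper: the LFSO L(x, R) = 2p(2p-1)(||x||_inf + R)^{2p-2}, which bounds the Hessian norm of f over a radius-R ball around x, and the choice of the infinity norm in R_k (the norm subscript is not legible in the extraction).

```python
import numpy as np

def run_experiment(p, d=10, eta=1.0, iters=300):
    """Sketch of Algorithm 1 on f(x) = sum_i x_i^(2p) with the stated setup
    (d = 10, x_0 = (1, ..., 1)^T). Returns the objective value per iteration."""
    # Assumed LFSO: |x_i| <= ||x||_inf + R on the ball, so the diagonal Hessian
    # entries 2p(2p-1) x_i^(2p-2) are bounded by the value below.
    L = lambda x, R: 2 * p * (2 * p - 1) * (np.linalg.norm(x, np.inf) + R) ** (2 * p - 2)
    x = np.ones(d)                                        # starting point x_0
    fvals = []
    for _ in range(iters):
        g = 2 * p * x ** (2 * p - 1)                      # gradient of sum_i x_i^(2p)
        Rk = np.linalg.norm(x, np.inf)                    # assumed norm for R_k = ||x_k||
        Rk_t = max(Rk, eta / L(x, Rk) * np.linalg.norm(g))   # step (2.3)
        x = x - eta / L(x, Rk_t) * g                      # step (2.4)
        fvals.append(np.sum(x ** (2 * p)))
    return np.array(fvals)
```

Under these assumptions the iterates again shrink by a fixed factor per step for every p, so a plot of `fvals` on a log scale should be a straight line, matching the global linear rate the experiments report for Algorithm 1.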