Armijo Line-search Can Make (Stochastic) Gradient Descent Provably Faster
Authors: Sharan Vaswani, Reza Babanezhad Harikandeh
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We strengthen these results and show that if the objective function satisfies a certain nonuniform smoothness condition, GD-LS can result in a faster convergence rate than GD(1/L). In particular, we prove that for convex objectives corresponding to logistic regression and multi-class classification, GD-LS can converge to the optimum at a linear rate, and hence improves over the sublinear convergence of GD(1/L). Furthermore, for non-convex objectives satisfying gradient domination (e.g. those corresponding to the softmax policy gradient in RL or generalized linear models with a logistic link function), GD-LS can match the fast convergence of algorithms tailored for these specific settings. |
| Researcher Affiliation | Collaboration | ¹Simon Fraser University, ²Samsung AI Montreal. Correspondence to: Sharan Vaswani <EMAIL>, Reza Babanezhad <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 GD with Armijo Line-search (GD-LS). 1: Input: θ₀, ηmax, c ∈ (0, 1), β ∈ (0, 1) 2: for t = 0, . . . , T − 1 do 3: ηt ← ηmax 4: while f(θt − ηt ∇f(θt)) > f(θt) − c ηt ‖∇f(θt)‖₂² do 5: ηt ← β ηt 6: end while 7: θt+1 = θt − ηt ∇f(θt) 8: end for |
| Open Source Code | No | The paper does not contain any explicit statement about providing source code or a link to a code repository. |
| Open Datasets | Yes | Comparing GD-LS with c = 1/2 and ηmax = 10⁸ and GD(1/L) for unregularized logistic regression on the ijcnn dataset (Chang & Lin, 2011). |
| Dataset Splits | No | The paper mentions using the 'ijcnn dataset' and a 'synthetic separable dataset' but does not provide specific details on how these datasets were split into training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, used to replicate the experiments. |
| Experiment Setup | Yes | Figure 1. Comparing GD-LS with c = 1/2 and ηmax = 10⁸ and GD(1/L) for unregularized logistic regression on the ijcnn dataset (Chang & Lin, 2011). Figure 2. Comparing GD-LS with c = 1/2, ηmax = 10⁸ and GD(1/L) for unregularized logistic regression on a synthetic separable dataset with γ = 0.1, n = 10⁴ and d = 200. Figure 3. Comparing GD-LS with c = 1/2, ηmax = 10⁴ and GD(1/L) for GLM on a synthetic dataset with n = 10⁴, d = 200, θ = 1. |
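The backtracking rule quoted in the Pseudocode row can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code (the paper releases none): the tiny synthetic dataset, β = 0.8, and T = 200 steps are hypothetical choices made here; only c = 1/2 and a very large ηmax mirror the reported setup.

```python
import numpy as np

def gd_armijo(f, grad, theta0, eta_max=1e8, c=0.5, beta=0.8, T=200):
    """GD with Armijo backtracking line-search (GD-LS), per Algorithm 1."""
    theta = theta0.copy()
    for _ in range(T):
        g = grad(theta)
        eta = eta_max
        # Shrink eta until the Armijo condition holds:
        #   f(theta - eta * g) <= f(theta) - c * eta * ||g||^2
        while f(theta - eta * g) > f(theta) - c * eta * (g @ g):
            eta *= beta
        theta = theta - eta * g
    return theta

# Hypothetical separable data for unregularized logistic regression.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = (X @ w_true > 0).astype(float)

def loss(w):
    # Numerically stable logistic loss: softplus(z) - y*z, averaged.
    z = X @ w
    return np.mean(np.maximum(z, 0) + np.log1p(np.exp(-np.abs(z))) - y * z)

def grad(w):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

w = gd_armijo(loss, grad, np.zeros(3))
```

Starting each iteration from a huge ηmax lets the backtracking accept large steps whenever the local smoothness allows it, which is the mechanism behind the paper's faster rates for logistic-type losses.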