Efficiently Escaping Saddle Points in Bilevel Optimization
Authors: Minhui Huang, Xuxing Chen, Kaiyi Ji, Shiqian Ma, Lifeng Lai
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present the experimental results to demonstrate the efficiency of our algorithm. We reformulate the problem in Du et al. (2017) as a bilevel optimization problem (1.1) and then compare our Algorithm 1 with AID-BiO in Ji et al. (2021). More precisely, we consider the following bilevel optimization problem: min_{x ∈ ℝ^d} Φ(x) := f_1(x, y*(x)), s.t. y*(x) = argmin_{y ∈ ℝ} f_2(x, y). (5.1) Motivated by Du et al. (2017), we construct the following functions. For the upper-level function we have [...] In Figure 1 we plot the learning curves of Φ(x) − min Φ(x) vs. iteration number. Our algorithm is denoted as PBO, and AID-BiO is denoted as BO in Figure 1. Note that each learning curve is nearly a step function consisting of vertical and horizontal line segments. A horizontal segment indicates that the function value does not change, from which we may deduce that the iterates are stuck at a saddle point. Each vertical segment indicates that a perturbation successfully helps the iterate escape the saddle point. We observe that under different parameter choices our Algorithm 1 escapes saddle points more efficiently than the standard bilevel optimization algorithm. |
| Researcher Affiliation | Academia | Minhui Huang, Department of Electrical and Computer Engineering, University of California, Davis, CA 95616, USA; Xuxing Chen, Department of Mathematics, University of California, Davis, CA 95616, USA; Kaiyi Ji, Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY 14260, USA; Shiqian Ma, Department of Computational Applied Mathematics and Operations Research, Rice University, Houston, TX 77005, USA; Lifeng Lai, Department of Electrical and Computer Engineering, University of California, Davis, CA 95616, USA |
| Pseudocode | Yes | Algorithm 1 Perturbed Algorithms for Minimax and Bilevel Optimization Problems; Algorithm 2 Inexact NEgative-curvature-Originated-from-Noise Algorithm (iNEON); Algorithm 3 StocBiO with iNEON |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide links to a code repository. |
| Open Datasets | No | In the deterministic case (Section 5.1), the authors construct functions based on Du et al. (2017) rather than using a pre-existing dataset. For the stochastic case (Section 5.2), the paper states: "We first generate U in which every entry is sampled from the normal distribution N(0, 1/d), and then generate m matrices {Ai : i = 1, ..., m} where each entry is sampled from standard normal distribution N(0, 1)." This describes synthetic data generation, not the use of an open, pre-existing dataset. |
| Dataset Splits | No | The paper does not specify training/test/validation splits. For the deterministic case, it constructs functions. For the stochastic case, it describes generating synthetic data and parameters for that generation, but not how that generated data is partitioned into splits for experimentation. |
| Hardware Specification | No | The paper conducts numerical experiments but does not provide any specific details about the hardware used, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper does not mention any specific software dependencies or their version numbers. |
| Experiment Setup | Yes | In our experiments, we choose the total number of iterations to be 1000 and all stepsizes to be 0.05 in both Algorithm 1 and AID-BiO. Following Du et al. (2017), we conduct the comparison using different choices of problem parameters. [...] For both algorithms, we set d = 50, r = 5, m = 2000, K = 200, D = 5, α = 0.1, β = 0.03 and the batch size to be 200. For Algorithm 3, we set T = 5, r = 0.5, ϵ = 0.03, ρ_Φ = 0.001, τ = 0.03, η = 0.1, F = 0.001. |
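The escape mechanism described in the Research Type row (flat learning-curve segments at saddle points, vertical drops after a perturbation) can be sketched with generic perturbed gradient descent. This is not the paper's Algorithm 1 (which handles the full bilevel structure with an inexact hypergradient); it is a minimal single-level illustration, and the test function `phi` below is a hypothetical example with a strict saddle at the origin, not one of the constructions from Du et al. (2017).

```python
import numpy as np

def perturbed_gd(grad, x0, eta=0.05, noise_radius=0.1, g_thresh=1e-3,
                 n_iters=1000, seed=0):
    """Generic perturbed gradient descent: when the gradient is small
    (a candidate saddle point), inject a small random ball perturbation
    so the iterate can pick up a negative-curvature direction and escape;
    otherwise take an ordinary gradient step."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        g = grad(x)
        if np.linalg.norm(g) < g_thresh:
            u = rng.standard_normal(x.shape)
            x = x + noise_radius * u / np.linalg.norm(u)
        else:
            x = x - eta * g
    return x

# Hypothetical test function: Phi(x) = 0.5 * x^T diag(1, -1) x + 0.25 * ||x||^4.
# The origin is a strict saddle (negative curvature along x2); the local
# minima sit at x = (0, +1) and (0, -1).
H = np.diag([1.0, -1.0])

def grad_phi(x):
    return H @ x + np.dot(x, x) * x

# Plain gradient descent started at the saddle (where grad_phi = 0) would
# never move; the perturbed iterate drifts off and converges near a minimum.
x_final = perturbed_gd(grad_phi, x0=np.zeros(2))
```

The perturbation-then-descent cycles of this loop are exactly what produce the step-shaped learning curves the quoted passage describes: the iterate sits at the saddle (horizontal segment) until a perturbation triggers a rapid decrease (vertical segment).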