Unlocking Global Optimality in Bilevel Optimization: A Pilot Study

Authors: Quan Xiao, Tianyi Chen

ICLR 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments corroborate the theoretical findings, demonstrating convergence to the global minimum in both cases. |
| Researcher Affiliation | Academia | Quan Xiao, Rensselaer Polytechnic Institute, Troy, NY 12180, United States; Tianyi Chen, Rensselaer Polytechnic Institute, Troy, NY 12180, United States. |
| Pseudocode | Yes | Algorithm 1: PBGD in Jacobi fashion; Algorithm 2: PBGD in Gauss-Seidel fashion. |
| Open Source Code | No | The paper contains no explicit statement about releasing source code and no link to a code repository. |
| Open Datasets | No | The paper describes generating synthetic datasets for its numerical experiments (e.g., "generate data matrices X_trn, X_val ∈ R^(N×m) from Gaussian distribution N(5, 0.01)" in Sections H.1 and H.2), but provides no concrete access information (links, DOIs, citations) for a publicly available or open dataset. The generated data is not stated to be made public. |
| Dataset Splits | Yes | H.1 Representation Learning: for the overparameterized, wide neural-network case, the paper chooses N = 30, N = 20, m = 40, n = 10, h = 300. H.2 Data Hyper-cleaning: for overparameterized linear regression with a small clean validation dataset and a large dirty training dataset, it chooses N = 100, N = 10, m = 200, n = 10. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for its experiments, in the "Numerical Experiments" section or elsewhere. |
| Software Dependencies | No | The paper does not list specific software dependencies (e.g., library names with version numbers such as Python 3.8 or PyTorch 1.9) needed to replicate the experiments. |
| Experiment Setup | Yes | H.1 Representation Learning: … we choose N = 30, N = 20, m = 40, n = 10, h = 300. First, we respectively generate data matrices X_trn ∈ R^(N×m), X_val ∈ R^(N×m) from the Gaussian distributions N(5, 0.01) and N(−3, 0.01) … We select the best stepsizes α, β and the number of inner-loop iterations T_k = T by grid search. H.2 Data Hyper-cleaning: … we choose N = 100, N = 10, m = 200, n = 10. First, we respectively generate data matrices X_trn ∈ R^(N×m), X_val ∈ R^(N×m) from the Gaussian distributions N(5, 0.01) and N(−3, 0.01) … |
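The quoted synthetic-data setup can be reproduced in outline with NumPy. The dimensions below are the H.2 (data hyper-cleaning) values quoted above; reading N(5, 0.01) and N(−3, 0.01) as (mean, variance) pairs is an assumption, as is the random seed, which the quoted text does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is arbitrary; none is quoted

# Dimensions quoted for H.2: a large "dirty" training set and a small
# clean validation set, with feature dimension m.
N_trn, N_val, m = 100, 10, 200

# Entry-wise Gaussian data matrices, per the quoted text: N(5, 0.01)
# for training, N(-3, 0.01) for validation. Treating 0.01 as the
# variance (so std = 0.1) is an assumption.
X_trn = rng.normal(loc=5.0, scale=np.sqrt(0.01), size=(N_trn, m))
X_val = rng.normal(loc=-3.0, scale=np.sqrt(0.01), size=(N_val, m))

print(X_trn.shape, X_val.shape)  # (100, 200) (10, 200)
```

The H.1 (representation learning) variant uses the same recipe with N = 30, N = 20, m = 40 in place of the dimensions above.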
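The pseudocode row cites PBGD in Jacobi fashion (Algorithm 1) and Gauss-Seidel fashion (Algorithm 2), but the report does not quote the algorithms themselves. The sketch below therefore illustrates only the generic distinction between the two update orders on a hypothetical quadratic bilevel problem: Jacobi steps both variable blocks from the same old iterate, while Gauss-Seidel steps them sequentially. The toy objective, penalty constant `gamma`, and stepsize `alpha` are all invented for illustration and are not taken from the paper.

```python
# Hypothetical toy bilevel problem (not from the paper):
#   upper level: F(x, y) = (x - 1)**2 + (y - 1)**2
#   lower level: g(x, y) = (y - x)**2, minimized at y*(x) = x.
# Penalty-based reformulation: minimize
#   L(x, y) = F(x, y) + gamma * (g(x, y) - min_y' g(x, y')),
# and here min_y' g(x, y') = 0, so L = F + gamma * g.

def grad_x(x, y, gamma):
    return 2.0 * (x - 1.0) - 2.0 * gamma * (y - x)

def grad_y(x, y, gamma):
    return 2.0 * (y - 1.0) + 2.0 * gamma * (y - x)

def pbgd(jacobi, gamma=10.0, alpha=0.01, iters=5000):
    x, y = 0.0, 0.0
    for _ in range(iters):
        if jacobi:
            # Jacobi: both blocks step from the same old iterate (x, y).
            x, y = (x - alpha * grad_x(x, y, gamma),
                    y - alpha * grad_y(x, y, gamma))
        else:
            # Gauss-Seidel: update x first, then update y at the fresh x.
            x = x - alpha * grad_x(x, y, gamma)
            y = y - alpha * grad_y(x, y, gamma)
    return x, y

# Both schemes drive (x, y) toward the bilevel solution (1, 1), where
# the lower-level constraint y = x holds and F is minimized over it.
print(pbgd(jacobi=True), pbgd(jacobi=False))
```

The quoted setup selects the stepsizes α, β and the inner-loop count by grid search; this sketch fixes a single stepsize and omits the inner loop entirely, since neither is quoted in the report.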