LancBiO: Dynamic Lanczos-aided Bilevel Optimization via Krylov Subspace
Authors: Yan Yang, Bin Gao, Ya-xiang Yuan
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This successful trial not only enjoys an O(ϵ⁻¹) convergence rate but also demonstrates efficiency in a synthetic problem and two deep learning tasks. ... 4 NUMERICAL EXPERIMENTS: In this section, we conduct experiments in the deterministic setting to empirically validate the performance of the proposed algorithms. We test on a synthetic problem and two deep learning tasks. |
| Researcher Affiliation | Academia | Yan Yang, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; University of Chinese Academy of Sciences. Bin Gao & Ya-xiang Yuan, Academy of Mathematics and Systems Science, Chinese Academy of Sciences. |
| Pseudocode | Yes | Algorithm 1 SubBiO, Algorithm 2 LancBiO, Algorithm 3 Lanczos process, Algorithm 4 Dynamic Lanczos subroutine for LancBiO (DLanczos) |
| Open Source Code | Yes | For wider accessibility and application, we have made the code available on https://github.com/UCAS-YanYang/LancBiO. |
| Open Datasets | Yes | The data hyper-cleaning task (Shaban et al., 2019), conducted on the MNIST dataset (LeCun et al., 1998)... Additionally, we also evaluate the performance of bilevel algorithms on Fashion-MNIST (Xiao et al., 2017) and Kuzushiji-MNIST (Clanuwat et al., 2018) datasets... Consider the hyper-parameter selection task on the 20Newsgroup dataset (Grazzi et al., 2020) |
| Dataset Splits | Yes | In the deterministic setting, where we implement all compared methods with full-batch, the training set, the validation set and the test set contain 5000, 5000 and 10000 samples, respectively. In this setting, where we implement all compared methods with mini-batch, the training set, the validation set and the test set contain 20000, 5000 and 10000 samples, respectively. The training set, the validation set and the test set contain 5657, 5657 and 7532 samples, respectively. |
| Hardware Specification | Yes | The experiments are produced on a workstation that consists of two Intel Xeon Gold 6330 CPUs (total 2 × 28 cores), 512GB RAM, and one NVIDIA A800 (80GB memory) GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as programming languages, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | For algorithms that incorporate inner iterations to approximate y or v, we select the inner iteration number from the set {5i \| i = 1, 2, 3, 4}. The step size of inner iteration is selected from the set {0.01, 0.1, 1, 10} and the step size of outer iteration is chosen from {5 × 10^i \| i = −3, −2, −1, 0, 1, 2, 3}. Regarding the Hessian/Jacobian-free algorithm HJFBiO, we set the step size δ = 1 × 10⁻⁵ to implement finite difference methods. |
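The hyper-parameter grid quoted in the Experiment Setup row can be enumerated directly. A minimal sketch (the variable names are illustrative, not from the paper):

```python
from itertools import product

# Grids as quoted from the paper's experiment setup (names are ours):
inner_iters = [5 * i for i in (1, 2, 3, 4)]          # {5, 10, 15, 20}
inner_steps = [0.01, 0.1, 1, 10]                     # inner-iteration step sizes
outer_steps = [5 * 10.0 ** i for i in range(-3, 4)]  # 5e-3, 5e-2, ..., 5e3

# Full Cartesian product of the three grids: 4 * 4 * 7 = 112 configurations.
grid = list(product(inner_iters, inner_steps, outer_steps))
print(len(grid))  # 112
```

Whether the authors searched the full Cartesian product or tuned each grid independently is not stated; the sketch only shows the size of the search space implied by the quoted sets.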
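For context on the method itself: LancBiO uses a Lanczos process to build a low-dimensional Krylov subspace in which the lower-level Hessian system (needed for the hypergradient) is solved cheaply. The paper's dynamic subroutine (Algorithm 4) is more involved, but the plain Lanczos tridiagonalization it builds on can be sketched as below. This is a minimal NumPy sketch under standard assumptions (symmetric positive-definite operator, no reorthogonalization), not the authors' implementation; `lanczos`, `solve_spd`, and `matvec` are our names.

```python
import numpy as np

def lanczos(matvec, v0, m):
    """Lanczos process: build an orthonormal basis Q of the Krylov subspace
    span{v0, A v0, ..., A^{m-1} v0} and the tridiagonal T = Q^T A Q,
    using only matrix-vector products `matvec`."""
    n = v0.shape[0]
    Q = np.zeros((n, m))
    alphas, betas = [], []
    q, q_prev, beta = v0 / np.linalg.norm(v0), np.zeros(n), 0.0
    for j in range(m):
        Q[:, j] = q
        w = matvec(q) - beta * q_prev      # three-term recurrence
        alpha = q @ w
        w -= alpha * q
        alphas.append(alpha)
        beta = np.linalg.norm(w)
        if j < m - 1:
            betas.append(beta)
            if beta < 1e-12:               # breakdown: subspace is invariant
                Q = Q[:, : j + 1]
                break
            q_prev, q = q, w / beta
    k = Q.shape[1]
    T = (np.diag(alphas[:k])
         + np.diag(betas[: k - 1], 1)
         + np.diag(betas[: k - 1], -1))
    return Q, T

def solve_spd(matvec, b, m):
    """Approximate A^{-1} b by solving the small projected system
    T y = ||b|| e1 and mapping back: x ~ Q y."""
    Q, T = lanczos(matvec, b, m)
    e1 = np.zeros(T.shape[0])
    e1[0] = np.linalg.norm(b)
    return Q @ np.linalg.solve(T, e1)
```

With m equal to the full dimension this recovers the exact solve (up to floating-point orthogonality loss); the point of the Krylov approach is that a small m already gives a good approximation for well-conditioned Hessians.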