On ℓp-Hyperparameter Learning via Bilevel Nonsmooth Optimization
Authors: Takayuki Okuno, Akiko Takeda, Akihiro Kawana, Motokazu Watanabe
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we examine the efficiency of the proposed algorithm by means of numerical experiments using real data sets. The proposed algorithm is simple and scalable as our numerical comparison to Bayesian optimization and grid search indicates. |
| Researcher Affiliation | Academia | Takayuki Okuno EMAIL Center for Advanced Intelligence Project, RIKEN Tokyo 103-0027, Japan Akiko Takeda EMAIL Graduate School of Information Science and Technology, The University of Tokyo Tokyo 113-8656, Japan; Center for Advanced Intelligence Project, RIKEN Tokyo 103-0027, Japan Akihiro Kawana EMAIL Department of Industrial Engineering and Economics, Tokyo Institute of Technology Tokyo 152-8550, Japan Motokazu Watanabe EMAIL Department of Mathematical Informatics, The University of Tokyo Tokyo 113-8656, Japan; Present Address: Tokio Marine & Nichido Fire Insurance Co., Ltd., Tokyo, Japan (This research was conducted when he was a student at The University of Tokyo, and is completely irrelevant to the present company.) |
| Pseudocode | Yes | Algorithm 1: Smoothing Method for Nonsmooth Bilevel Program; Algorithm B.1: Implicit-function-based quasi-Newton method for the smoothed subproblem; Algorithm B.2: Modified Newton-type method for min_w ψ_µ(w) |
| Open Source Code | No | Algorithm 1 and the other competitor algorithms are implemented with MATLAB R2020a. We use bayesopt in MATLAB with Max Objective Evaluations = 30 for Bayesian optimization. In gridsearch, we search for the best value of ‖A_val w − b_val‖₂² among 30 grid points λ = 10^(−4), 10^(−4+8/29), …, 10^(4−8/29), 10^4 for problem (33). At each iteration of bayesopt and gridsearch, we make use of the MATLAB built-in solver fmincon so as to solve the lower-level problem of (33) with a given λ. |
| Open Datasets | Yes | The data matrices and vectors A_{val,tr,te}, b_{val,tr,te} are taken from the UCI Machine Learning Repository (Lichman et al., 2013): Facebook Comment Volume (m = 40949, n = 53), Insurance Company Benchmark (m = 9000, n = 85), Student Performance for a math exam (m = 395, n = 272), Body Fat (m = 336, n = 14), and Cpu Small (m = 8192, n = 12). |
| Dataset Splits | Yes | The m samples are divided into 3 groups (training, validation, and test samples) with the same sample size m/3. |
| Hardware Specification | Yes | All the experiments are conducted on a personal computer with Intel Core i7-8559U CPU @ 2.70GHz, 16.00 GB memory. |
| Software Dependencies | Yes | Algorithm 1 and the other competitor algorithms are implemented with MATLAB R2020a. We use bayesopt in MATLAB with Max Objective Evaluations = 30 for Bayesian optimization. In gridsearch, we search for the best value of ‖A_val w − b_val‖₂² among 30 grid points λ = 10^(−4), 10^(−4+8/29), …, 10^(4−8/29), 10^4 for problem (33). At each iteration of bayesopt and gridsearch, we make use of the MATLAB built-in solver fmincon so as to solve the lower-level problem of (33) with a given λ. |
| Experiment Setup | Yes | The smoothing parameter in Algorithm 1 is initialized as µ_0 = 1 and updated by µ_(k+1) = min(0.9 µ_k, 10 µ_k^(1.3)). The smoothed subproblem (3) is solved as exactly as possible by fixing (ε̂_0, β_0) to (10^(−6), 1). As for the termination criteria of Algorithm 1, writing a resulting solution as w*, we stop it if the SB-KKT conditions (9), (10) and (11) are within the error of ϵ := 10^(−3). We also check whether the other SB-KKT conditions (12)-(14) are satisfied. The default setting of bayesopt is employed. Time limits of all the algorithms are set to 600 seconds. In gridsearch, we search for the best value of ‖A_val w − b_val‖₂² among 30 grid points λ = 10^(−4), 10^(−4+8/29), …, 10^(4−8/29), 10^4 for problem (33). |
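The 30-point grid used by gridsearch is log-uniform in the exponent of λ, stepping by 8/29 from −4 to 4. A minimal sketch of generating that grid (in Python with NumPy, as an illustration; the paper's experiments use MATLAB):

```python
import numpy as np

# 30 lambda values log-spaced between 10^-4 and 10^4:
# exponents -4, -4 + 8/29, ..., 4 - 8/29, 4.
lambdas = np.logspace(-4, 4, num=30)
```

`np.logspace(-4, 4, num=30)` reproduces exactly the exponent spacing 8/29, since 30 evenly spaced exponents over [−4, 4] are 8/29 apart.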
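The smoothing-parameter schedule quoted above starts at µ_0 = 1 and shrinks geometrically (factor 0.9) until µ becomes small enough that the superlinear term 10 µ^1.3 takes over. A minimal Python illustration of the update rule (the function name is ours, not from the paper):

```python
def next_mu(mu: float) -> float:
    # Update rule from the experiment setup:
    # mu_{k+1} = min(0.9 * mu_k, 10 * mu_k^{1.3})
    return min(0.9 * mu, 10.0 * mu ** 1.3)

mu = 1.0  # mu_0 = 1
schedule = [mu]
for _ in range(10):
    mu = next_mu(mu)
    schedule.append(mu)
```

For µ close to 1 the first argument of `min` is smaller, so the decrease is linear at rate 0.9; once µ drops below roughly 3×10⁻⁴ the second argument wins and convergence to 0 becomes superlinear.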