Optimum-statistical Collaboration Towards General and Efficient Black-box Optimization

Authors: Wenjie Li, Chi-Hua Wang, Guang Cheng, Qifan Song

TMLR 2023

Reproducibility: Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically compare the proposed VHCT algorithm with the existing anytime black-box optimization algorithms, including T-HOO (the truncated version of HOO), HCT, POO, and PCT (POO + HCT, Shang et al., 2019), and the Bayesian Optimization algorithm BO (Frazier, 2018) to validate that the proposed variance-adaptive uncertainty quantifier makes the convergence of VHCT faster than that of non-adaptive algorithms. We run every algorithm for 20 independent trials in each experiment and plot the average cumulative regret with 1-standard-deviation error bounds. The experimental details and additional numerical results on other objectives are provided in Appendix E.
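The evaluation protocol quoted above (20 independent trials, average cumulative regret with a 1-standard-deviation band) can be sketched as follows. This is a minimal illustration using placeholder per-round regret values, not data or code from the paper:

```python
import numpy as np

# Placeholder data: 20 independent trials of 1000 rounds of nonnegative
# per-round (instantaneous) regret. A real run would record the regret
# produced by each optimizer at every round.
rng = np.random.default_rng(0)
per_round_regret = rng.uniform(0.0, 1.0, size=(20, 1000))

# Cumulative regret per trial, then mean and 1-std band across the 20 trials.
cumulative = np.cumsum(per_round_regret, axis=1)  # shape (n_trials, n_rounds)
mean_regret = cumulative.mean(axis=0)             # curve to plot
std_regret = cumulative.std(axis=0)               # width of the error band
lower, upper = mean_regret - std_regret, mean_regret + std_regret
```

The `(lower, upper)` band is what a plotting call such as matplotlib's `fill_between` would shade around the mean curve.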
Researcher Affiliation | Academia | Wenjie Li, Department of Statistics, Purdue University; Chi-Hua Wang, Department of Statistics, University of California, Los Angeles; Guang Cheng, Department of Statistics, University of California, Los Angeles; Qifan Song, Department of Statistics, Purdue University
Pseudocode | Yes | Algorithm 1: Optimum-Statistical Collaboration (OSC); Algorithm 2: VHCT Algorithm (Short Version); Algorithm 3: VHCT Algorithm (Complete); Algorithm 4: Pull Update; Algorithm 5: Update Backward
Open Source Code | No | For the implementation of all the algorithms, we utilize the publicly available code of POO and HOO at the link https://rdrr.io/cran/OOR/man/POO.html and the PyXAB library (Li et al., 2023). While the PyXAB library is co-authored by some of the authors of this paper, the paper does not explicitly state that the specific implementation of VHCT (the novel algorithm proposed in this paper) is available as part of this library or elsewhere.
Open Datasets | Yes | We tune the RBF kernel and the L2 regularization parameters when training a Support Vector Machine (SVM) on the Landmine dataset (Liu et al., 2007), and the batch size, the learning rate, and the weight decay when training neural networks on the MNIST dataset (Deng, 2012). The Landmine dataset is available at http://www.ee.duke.edu/~lcarin/Landmine_Data.zip, and the MNIST dataset can be downloaded from http://yann.lecun.com/exdb/mnist/
Dataset Splits | No | The paper mentions using a "training set" and "testing set" for both the Landmine and MNIST datasets, but it does not provide specific details on how these splits were created, such as percentages, sample counts, or the partitioning methodology. For example, for the Landmine dataset: "The model is trained on the training set with the selected hyper-parameter and then evaluated on the testing set."
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU specifications, or memory amounts used for running the experiments. It only describes the experimental setup in terms of software and parameters.
Software Dependencies | No | The paper mentions using "the publicly available code of POO and HOO" and "the PyXAB library (Li et al., 2023)" but does not specify version numbers for these software components. For example: "For the implementation of all the algorithms, we utilize the publicly available code of POO and HOO at the link https://rdrr.io/cran/OOR/man/POO.html and the PyXAB library (Li et al., 2023)."
Experiment Setup | Yes | We run every algorithm for 20 independent trials in each experiment and plot the average cumulative regret with 1-standard-deviation error bounds. For all the experiments in Section 5 and Appendix E.2, we have used a low-noise setting where ϵ_t ∼ Uniform(−0.05, 0.05) to verify the advantage of VHCT. In general, ρ = 0.75 or ρ = 0.5 are good choices for VHCT and HCT, and ρ = 0.25 is a good choice for T-HOO. Therefore, we use these parameter settings in the real-life experiments and the additional experiments in the next subsection. For POO and PCT, we follow Grill et al. (2015) and use ρmax = 0.9. The unknown bound b is set to b = 1 for all the algorithms used in the experiments. We tune two hyper-parameters when training the SVM: the RBF kernel parameter from [0.01, 10] and the L2 regularization from [1e-4, 10]... We tune three different hyper-parameters of SGD, specifically the mini-batch size from [1, 100], the learning rate from [1e-6, 1], and the weight decay from [1e-6, 5e-1].
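The hyper-parameter search domains quoted in this row can be written down as simple box-constrained search spaces. The dictionary keys and the `sample_config` helper below are illustrative names of our own, not identifiers from the paper or from PyXAB:

```python
import numpy as np

# Search domains quoted from the paper's experiment setup, as (low, high) intervals.
svm_domain = {
    "rbf_kernel_param": (0.01, 10.0),  # RBF kernel parameter
    "l2_regularization": (1e-4, 10.0), # L2 regularization strength
}
sgd_domain = {
    "batch_size": (1.0, 100.0),        # a real run would round this to an int
    "learning_rate": (1e-6, 1.0),
    "weight_decay": (1e-6, 5e-1),
}

def sample_config(domain, rng):
    """Draw one candidate uniformly from each (low, high) interval.

    A black-box optimizer such as VHCT would propose points adaptively
    instead of sampling uniformly; this only illustrates the domains.
    """
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in domain.items()}

rng = np.random.default_rng(0)
cfg = sample_config(sgd_domain, rng)
```

Each sampled configuration would then be scored by training the model and evaluating it on the held-out test set, with that score fed back to the optimizer.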