An Empirical Study of Bayesian Optimization: Acquisition Versus Partition

Authors: Erich Merrill, Alan Fern, Xiaoli Fern, Nima Dolatnia

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The primary contribution of this work is to conduct a comprehensive, repeatable evaluation within a common software framework, which we provide as an open-source package. Our results give strong evidence about the relative performance of these methods and reveal a consistent top performer, even when accounting for overall computation time.
Researcher Affiliation | Academia | Erich Merrill (EMAIL), Alan Fern (EMAIL), Xiaoli Fern (EMAIL), Nima Dolatnia (EMAIL); School of EECS, Oregon State University
Pseudocode | Yes | Pseudocode for the ABO algorithm is provided in Algorithm 1. In this paper, we consider two of the most popular acquisition functions, namely Expected Improvement (EI) (Mockus et al., 1978) and Gaussian Process Upper Confidence Bound (GP-UCB) (Srinivas et al., 2010). The paper provides: Algorithm 1 (Bayesian Optimization Process), Algorithm 2 (Simultaneous Optimistic Optimization), Algorithm 3 (Simultaneous Optimistic Optimization, simplified), Algorithm 4 (Locally Oriented Global Optimization), Algorithm 5 (Bayesian Multi-Scale Optimistic Optimization), and Algorithm 6 (Infinite-Metric GP Optimization).
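As a concrete illustration of the ABO loop sketched in Algorithm 1, the following Python snippet pairs a Gaussian process surrogate (squared exponential kernel) with the EI acquisition function. This is a minimal sketch, not the paper's C++ implementation: the 1-D objective, fixed kernel hyperparameters, and grid-based acquisition maximization are all illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt, pi

# Hypothetical 1-D black-box objective (stands in for a benchmark function).
def objective(x):
    return -np.sin(3 * x) - x**2 + 0.7 * x

def sq_exp_kernel(A, B, length=0.3, signal=1.0):
    # Squared exponential kernel matrix between 1-D point sets A and B.
    d = A[:, None] - B[None, :]
    return signal**2 * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xq, noise=1e-5):
    # GP posterior mean and standard deviation at query points Xq.
    K = sq_exp_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = sq_exp_kernel(X, Xq)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(np.diag(sq_exp_kernel(Xq, Xq)) - np.sum(v**2, axis=0),
                  1e-10, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    # EI for maximization: E[max(f(x) - best, 0)] under the GP posterior.
    z = (mu - best) / sigma
    cdf = np.array([0.5 * (1.0 + erf(zi / sqrt(2.0))) for zi in z])
    pdf = np.exp(-0.5 * z**2) / sqrt(2.0 * pi)
    return (mu - best) * cdf + sigma * pdf

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 2.0, size=3)      # three random initial points
y = objective(X)
for _ in range(20):
    cand = np.linspace(-1.0, 2.0, 200)  # grid stand-in for acquisition maximization
    mu, sigma = gp_posterior(X, y, cand)
    x_next = cand[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

print(f"best observed value: {y.max():.3f}")
```

Swapping `expected_improvement` for a GP-UCB score (mean plus a β-weighted standard deviation) recovers the other acquisition function the paper studies.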
Open Source Code | Yes | To perform these experiments, we built a custom, extendable C++ framework that executes easily repeatable evaluations of arbitrary black-box optimization algorithms. The source code is publicly available at https://github.com/Eiii/opt_cmp.
Open Datasets | Yes | We chose to exclusively use synthetic benchmark functions as objectives for this evaluation to ensure that each algorithm could be sufficiently evaluated on each objective with the time and resources available. The benchmark functions were selected with the goals of including functions with a wide range of quantitative and qualitative properties and including functions that the authors of the partitioning-based methods used to evaluate their algorithms. The LOGO paper evaluates its algorithm on the sin 2, Branin, Rosenbrock, Hartmann3, Hartmann6, and Shekel (with m = 5, 7, 10) functions (Kawaguchi et al., 2016) with varying dimensionality and parameterizations when possible. IMGPO and BaMSOO are also both evaluated on a subset of these objectives (Kawaguchi et al., 2015; Wang et al., 2014), so we chose to include all of them in our experiments to allow for a more direct comparison between results. Additionally, we chose to include the Rastrigin, Schwefel, and Ackley test functions (Molga and Smutnicki, 2005) in our evaluation to provide more variety among the objectives.
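Several of the named benchmarks are simple closed-form functions. The sketch below gives conventional parameterizations of Branin, Rastrigin, and Ackley (these follow standard references and may differ in domain or scaling from the exact variants used in the paper):

```python
import numpy as np

def branin(x1, x2):
    # Branin test function; global minimum ~0.397887, e.g. at (-pi, 12.275).
    a, b, c = 1.0, 5.1 / (4 * np.pi**2), 5.0 / np.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * np.pi)
    return a * (x2 - b * x1**2 + c * x1 - r) ** 2 + s * (1 - t) * np.cos(x1) + s

def rastrigin(x):
    # Rastrigin: highly multimodal; global minimum 0 at the origin.
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2 * np.pi * x))

def ackley(x):
    # Ackley: nearly flat outer region with a deep well; minimum 0 at the origin.
    x = np.asarray(x, dtype=float)
    n = x.size
    return (-20.0 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / n))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / n) + 20.0 + np.e)

print(round(branin(-np.pi, 12.275), 4))  # ~0.3979
print(rastrigin([0.0, 0.0]))             # 0.0
print(abs(ackley([0.0, 0.0])) < 1e-12)   # True
```

The mix is deliberate: Branin is smooth with a few basins, while Rastrigin and Ackley are highly multimodal, which stresses the exploration behavior of each optimizer differently.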
Dataset Splits | No | The paper uses synthetic benchmark functions and evaluates algorithms on these functions by running them for a certain number of 'samples' or 'function evaluations' (e.g., 'to a horizon of at least 500 samples'). It does not describe traditional training/test/validation dataset splits, which are typically used in supervised learning contexts.
Hardware Specification | Yes | Each individual evaluation of one algorithm on one objective function was run on one core of a c4.4xlarge Amazon EC2 instance.
Software Dependencies | No | To perform these experiments, we built a custom, extendable C++ framework that executes easily repeatable evaluations of arbitrary black-box optimization algorithms. The source code is publicly available at https://github.com/Eiii/opt_cmp. Our framework includes C++ implementations of all partitioning-based algorithms (SOO, LOGO, BaMSOO, IMGPO), while a modified version of the bayesopt library (Martinez-Cantin, 2014) provides the implementation of BO that we use in our evaluation.
Experiment Setup | Yes | To evaluate the algorithms in a true black-box setting, we use one set of hyperparameters for each algorithm across every objective function. To avoid manually tuning the algorithms with the goal of maximizing their performance on our specific problem set, when possible we use the same hyperparameter settings the authors of each method used during their evaluation process. We use the implementation of BO from the bayesopt (Martinez-Cantin, 2014) library. For our evaluations, we use the squared exponential kernel with automatic relevance determination, in which the parameters are estimated using maximum total likelihood. Each ABO run is started with three randomly chosen initial points (which count against the algorithm's budget), and the GP's parameters are re-estimated from the observed data every second ABO iteration. The EI criterion is parameter-free, but UCB requires us to define a value for β. This value is frequently tuned per-objective, but we instead set it to β_t = 2 log(|D| t² π² / 6δ), which Srinivas et al. (2010) found to be effective. Here, t is the number of function observations made so far and δ is a constant set to 0.5. SOO has no parameters to set. LOGO only requires a list of integers to use as the adaptive schedule W for its local bias parameter. In this work, we set W = {3, 4, 5, 6, 8, 30} to duplicate the value chosen by Kawaguchi et al. (2016) in the work that introduced the LOGO algorithm. BaMSOO and IMGPO both require the use of a GP prior to estimate the upper bound on the objective function's value at certain node locations. Using the GP's estimated mean function μ and standard deviation function σ, we define the LCB and UCB as μ(x) − B_N σ(x) and μ(x) + B_N σ(x), respectively. For BaMSOO we define B_N = √(2 log(π²N²/6η)) as suggested by Wang et al. (2014), and for IMGPO we define B_N = √(2 log(π²N²/12η)) as suggested by Kawaguchi et al. (2015). In both cases, N is the total number of times the GP was used to evaluate a node, and the constant η is set to 0.05 to duplicate the value used in the experimental results of Kawaguchi et al. (2015) and Wang et al. (2014).
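The exploration coefficients quoted above are direct closed-form expressions, so they can be sketched in a few lines of Python. The function names, and the example values |D| = 1000 and N = 50, are our own illustrative choices:

```python
import math

def beta_t(t, domain_size, delta=0.5):
    # GP-UCB coefficient beta_t = 2 log(|D| t^2 pi^2 / (6 delta)),
    # the schedule from Srinivas et al. (2010) used in the paper.
    return 2.0 * math.log(domain_size * t**2 * math.pi**2 / (6.0 * delta))

def b_n(N, eta=0.05, denom=6.0):
    # Confidence-bound width B_N = sqrt(2 log(pi^2 N^2 / (denom * eta))):
    # denom=6 matches the BaMSOO setting, denom=12 the IMGPO setting.
    return math.sqrt(2.0 * math.log(math.pi**2 * N**2 / (denom * eta)))

# A node at location x is then scored with mu(x) - B_N*sigma(x) (LCB)
# or mu(x) + B_N*sigma(x) (UCB).
print(f"beta_10 = {beta_t(t=10, domain_size=1000):.3f}")
print(f"B_50 (BaMSOO) = {b_n(50):.3f}")
print(f"B_50 (IMGPO)  = {b_n(50, denom=12.0):.3f}")
```

Note that both widths grow only logarithmically in t and N, so exploration pressure increases slowly as more observations accumulate, and the larger denominator makes the IMGPO bound slightly tighter than BaMSOO's for the same N.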