A Portfolio Approach to Massively Parallel Bayesian Optimization

Authors: Mickaël Binois, Nicholson Collier, Jonathan Ozik

JAIR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare the approach with related methods on noisy functions, for mono and multi-objective optimization tasks. These experiments show orders of magnitude speed improvements over existing methods with similar or better performance.
Researcher Affiliation | Academia | Mickaël Binois EMAIL Inria, Université Côte d'Azur, CNRS, LJAD, Sophia Antipolis, France; Nicholson Collier EMAIL, Jonathan Ozik EMAIL Argonne National Laboratory, Lemont, IL, USA; Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
Pseudocode | Yes | Algorithm 1 Pseudo-code for batch BO; Algorithm 2 Pseudo-code for batch BO with qHSRI
Open Source Code | Yes | The R code (R Core Team, 2023) of the approach is available as supplementary material.
Open Datasets | Yes | We consider the training of a convolutional neural network (CNN) used for the classification of digits based on the MNIST data (LeCun et al., 1998); The R code (R Core Team, 2023) of the approach and the CityCOVID data are available as supplementary material.
Dataset Splits | Yes | CNN used for the classification of digits based on the MNIST data (LeCun et al., 1998), with 70,000 handwritten digits (including 10,000 for testing).
Hardware Specification | Yes | Results have been obtained in parallel on dual-Xeon Skylake SP Silver 4114 @ 2.20GHz (20 cores) and 192 GB RAM (or similar nodes). Lunar lander tests have been run on a laptop. The CNN training is performed on GeForce GTX 1080 Ti GPUs.
Software Dependencies | Yes | The R package hetGP (Binois & Gramacy, 2021) is used for noisy GP modeling; DiceOptim (Picheny et al., 2021), or the approximated one from Binois (2015), qAEI; pso (Bendtsen, 2012) (population of size 200) is conducted too; NSGA-II (Deb et al., 2002) from mco (Mersmann, 2020) is used to find P; The R package DiceKriging (Roustant et al., 2012) is used for deterministic GP modeling.
Experiment Setup | Yes | The six variables of the CNN are given in Table 3. The architecture is composed of two 2D convolutional + max pooling layers, before two dense layers with dropout. The reference point used for hypervolume computations is [0, 150]. The CNN training is performed on GeForce GTX 1080 Ti GPUs. The nine variables of the simulator are given in Table 4.
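
The Experiment Setup entry reports a hypervolume reference point of [0, 150] but does not show how the indicator is computed. The sketch below is one way to compute a two-objective hypervolume against that reference point. It is written in Python for illustration and is not the authors' R code. It assumes both objectives are minimized, so that a point counts only if it is strictly below the reference in both coordinates; this sign convention is not stated in the summary. The function names and the sample objective values are hypothetical.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def pareto_front(points: Iterable[Point]) -> List[Point]:
    """Return the non-dominated points, assuming both objectives are minimized."""
    front: List[Point] = []
    best_f2 = float("inf")
    # Sorting by the first objective lets one pass over the second objective
    # decide dominance: a point is kept only if it improves on the best f2 so far.
    for f1, f2 in sorted(set(points)):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (minimization assumed)."""
    # Points that are not strictly better than the reference contribute nothing.
    inside = [(f1, f2) for f1, f2 in points if f1 < ref[0] and f2 < ref[1]]
    hv = 0.0
    prev_f2 = ref[1]
    # Along the front, f1 increases while f2 decreases, so the dominated
    # region splits into horizontal strips, one per front point.
    for f1, f2 in pareto_front(inside):
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv


if __name__ == "__main__":
    reference = (0.0, 150.0)  # reference point quoted in the summary
    # Hypothetical (objective 1, objective 2) pairs, made up for this example.
    sample = [(-0.90, 60.0), (-0.95, 100.0), (-0.80, 120.0)]
    print(hypervolume_2d(sample, reference))
```

With the made-up sample above, the third point is dominated by the first, and the remaining two strips contribute 0.95 x 50 and 0.90 x 40, for a total of about 83.5.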