GIBBON: General-purpose Information-Based Bayesian Optimisation
Authors: Henry B. Moss, David S. Leslie, Javier Gonzalez, Paul Rayson
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analyse GIBBON across a suite of synthetic benchmark tasks, a molecular search loop, and as part of a challenging batch multi-fidelity framework for problems with controllable experimental noise. Keywords: Bayesian optimisation, entropy search, experimental design, multi-fidelity, batch. |
| Researcher Affiliation | Collaboration | Henry B. Moss (EMAIL), Secondmind.ai, Cambridge, UK; David S. Leslie (EMAIL), Lancaster University, Lancaster, UK; Javier González (EMAIL), Microsoft Research, Cambridge, UK; Paul Rayson (EMAIL), Lancaster University, Lancaster, UK |
| Pseudocode | Yes | Algorithm 1: GIBBON for general-purpose BO tasks. Input: resource budget R, batch size B, Gumbel sample size N. (1) Initialise n ← 0 and spent resource counter r ← 0; (2) propose initial design I; (3) while r ≤ R: (4) begin new iteration n ← n + 1; (5) fit GP model to collected evaluations D_n; (6) simulate N samples from g* \| D_n; (7) compute α_n^GIBBON as given by Definition 4; (8) find B locations {z_i}_{i=1}^B maximising α_n^GIBBON({z_i}_{i=1}^B) / c({z_i}_{i=1}^B); (9) evaluate the new locations and collect evaluations D_{n+1} ← D_n ∪ {(z_i, y_{z_i})}_{i=1}^B; (10) update spent budget r ← r + c({z_i}_{i=1}^B). Output: believed maximiser argmax_{x ∈ D_n} g(x). |
| Open Source Code | Yes | Implementations of GIBBON are available in three popular Python libraries for BO: Emukit (Paleyes et al., 2019), BoTorch (Balandat et al., 2020) and Trieste (Berkeley et al., 2021). |
| Open Datasets | Yes | In particular, we recreate two of the experiments of Balandat et al. (2020) by maximising the Hartmann (d = 6) and Ackley functions (d = 4), each with observations perturbed by centred Gaussian noise with a variance of 0.25. In addition, we also consider the Shekel function (d = 4) under exact observations. ... We now recreate the Zinc example (also considered by Kusner et al. (2017) and Griffiths and Hernández-Lobato (2020)), where we seek to explore a large collection of 250,000 molecules. ... tuning a sentiment classification model on the collection of 25,000 positive and 25,000 negative IMDB movie reviews used by Maas et al. (2011). |
| Dataset Splits | Yes | We tune the flexibility of the decision boundary (C) and the RBF kernel coefficient (gamma) for an SVM... with test sets of 10%. ... for 5-fold and 10-fold cross-validation (Figures 11b and 11c). |
| Hardware Specification | Yes | All experiments reporting optimisation overheads were performed on a quad core Intel Xeon 2.30GHz processor. |
| Software Dependencies | No | Implementations of GIBBON are available in three popular Python libraries for BO: Emukit (Paleyes et al., 2019), BoTorch (Balandat et al., 2020) and Trieste (Berkeley et al., 2021). No specific version numbers were provided for these libraries. |
| Experiment Setup | Yes | Following the setup of Balandat et al. (2020), we initialise all routines by evaluating 2d + 2 random locations, refit our GP's kernel parameters after each BO step, and choose the current believed optimum x* by maximising the posterior mean of the GP surrogate model. ... All MES-based acquisition functions (including GIBBON) use 5 max-values sampled from a Gumbel distribution fit to surrogate model predictions at 10,000 × d random locations and are re-sampled for each BO step. All other implementation parameters follow the BoTorch defaults. ... for all approaches (including our GIBBON acquisition function) except KG, batches are constructed in this greedy manner with a maximisation budget of 10 × d random restarts for each element of the batch. Although KG is able to jointly allocate batches, its large computational cost restricted us to 20 restarts (the amount recommended by the BoTorch authors). |
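The Gumbel-based max-value sampling quoted in the Experiment Setup row (5 samples of the global maximum g*, fit to surrogate predictions at 10,000 × d random locations) can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the function name `sample_max_values` and the quantile-matching details are assumptions, following the standard MES recipe of fitting a Gumbel distribution to the CDF of the grid maximum via its 25/50/75% quantiles.

```python
import numpy as np
from scipy.stats import norm

def sample_max_values(mean, std, n_samples=5, rng=None):
    """Draw approximate samples of the global maximum g* via a Gumbel fit.

    mean, std: surrogate posterior mean / std evaluated at a dense set of
    random locations (the paper's setup uses 10,000 * d points).
    """
    rng = np.random.default_rng() if rng is None else rng

    # CDF of the maximum over the grid, assuming independent marginals:
    # P(max f <= y) ~= prod_i Phi((y - mean_i) / std_i).
    def cdf_max(y):
        return np.exp(np.sum(norm.logcdf((y - mean) / std)))

    # Bracket: below lo the CDF is ~0, above hi it is ~1.
    lo = np.max(mean - 5.0 * std)
    hi = np.max(mean + 5.0 * std)

    def quantile(p):
        a, b = lo, hi
        for _ in range(60):  # bisection on the monotone CDF
            m = 0.5 * (a + b)
            if cdf_max(m) < p:
                a = m
            else:
                b = m
        return 0.5 * (a + b)

    # Match a Gumbel(a, b) distribution at the 25/50/75% quantiles,
    # using the Gumbel quantile function y_p = a - b * log(-log p).
    y25, y50, y75 = quantile(0.25), quantile(0.5), quantile(0.75)
    b = (y75 - y25) / (np.log(np.log(4.0)) - np.log(np.log(4.0 / 3.0)))
    a = y50 + b * np.log(np.log(2.0))

    # Inverse-CDF sampling from the fitted Gumbel.
    r = rng.uniform(size=n_samples)
    return a - b * np.log(-np.log(r))
```

In a full BO loop these max-value samples would be re-drawn after each surrogate refit (as the paper states) and fed into the GIBBON acquisition function before the greedy batch construction step.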