BOSS: Bayesian Optimization over String Spaces
Authors: Henry Moss, David Leslie, Daniel Beck, Javier González, Paul Rayson
NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now evaluate our proposed BO framework on tasks from a range of fields and syntactical constraints. Our code is available at github.com/henrymoss/BOSS and is built upon the Emukit Python package [Paleyes et al., 2019]. All results are based on runs across 15 random seeds, showing the mean and a single standard error of the best objective value found as we increase the optimization budget. |
| Researcher Affiliation | Collaboration | Henry B. Moss (STOR-i Centre for Doctoral Training, Lancaster University, UK); Daniel Beck (Computing and Information Systems, University of Melbourne, Australia); Javier González (Microsoft Research Cambridge, UK); David S. Leslie (Dept. of Mathematics and Statistics, Lancaster University, UK); Paul Rayson (School of Computing and Communications, Lancaster University, UK) |
| Pseudocode | No | The paper describes algorithms in text and through figures but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at github.com/henrymoss/BOSS and is built upon the Emukit Python package [Paleyes et al., 2019]. |
| Open Datasets | Yes | We replicate the symbolic regression example of Kusner et al. [2017], using their provided VAEs pre-trained for this exact problem. ... large collection of 250,000 candidate molecules used by Kusner et al. [2017] ... |
| Dataset Splits | No | The paper discusses training and testing for different models but does not provide explicit details on train/validation/test dataset splits (percentages or counts) for its own experiments. |
| Hardware Specification | Yes | Although acquisition function calculations could be parallelized across the populations of our GA at each BO step, we use a single-core Intel Xeon 2.30GHz processor to paint a clear picture of computational cost. |
| Software Dependencies | No | The paper mentions building upon the 'Emukit Python package' but does not provide specific version numbers for Emukit or Python, which are necessary for full reproducibility of software dependencies. |
| Experiment Setup | Yes | All results are based on runs across 15 random seeds, showing the mean and a single standard error of the best objective value found as we increase the optimization budget. ... After a random initialization of min(5, \|Σ\|) evaluations, kernel parameters are re-estimated to maximize model likelihood before each BO step. ... Our genetic algorithms (GA) are limited to 100 evolutions of a population of size 100. |
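
The reporting protocol quoted in the Experiment Setup row (mean and a single standard error of the best objective value found, across 15 random seeds, as the budget grows) can be illustrated with a minimal sketch. This is not taken from the BOSS repository: the function names `best_so_far` and `mean_and_standard_error`, the synthetic data, the budget of 20 evaluations, and the choice to treat the objective as a minimization problem are all assumptions made for illustration.

```python
import numpy as np


def best_so_far(values, maximize=False):
    """Running best objective value after each evaluation, one row per seed.

    `values` has shape (n_seeds, budget). Whether the objective is minimized
    or maximized depends on the task; the default here is an assumption.
    """
    values = np.asarray(values, dtype=float)
    accumulate = np.maximum.accumulate if maximize else np.minimum.accumulate
    return accumulate(values, axis=1)


def mean_and_standard_error(curves):
    """Mean and one standard error across seeds at each budget step."""
    curves = np.asarray(curves, dtype=float)
    n_seeds = curves.shape[0]
    mean = curves.mean(axis=0)
    # Sample standard deviation (ddof=1) divided by sqrt(number of seeds).
    sem = curves.std(axis=0, ddof=1) / np.sqrt(n_seeds)
    return mean, sem


# Synthetic stand-in for real optimization traces (hypothetical data):
# 15 seeds, as in the paper, with an assumed budget of 20 evaluations.
rng = np.random.default_rng(0)
n_seeds, budget = 15, 20
raw_evaluations = rng.normal(size=(n_seeds, budget))

curves = best_so_far(raw_evaluations, maximize=False)
mean, sem = mean_and_standard_error(curves)

for step in (0, budget // 2, budget - 1):
    print(f"after {step + 1:2d} evaluations: {mean[step]:.3f} +/- {sem[step]:.3f}")
```

Plotting `mean` with a band of `mean - sem` to `mean + sem` against the evaluation count would give a curve of the kind the quoted sentence describes. For a maximization task, pass `maximize=True`.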