Deep Learning for Bayesian Optimization of Scientific Problems with High-Dimensional Structure

Authors: Samuel Kim, Peter Y. Lu, Charlotte Loh, Jamie Smith, Jasper Snoek, Marin Soljačić

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type Experimental We demonstrate BO on a number of realistic problems in physics and chemistry, including topology optimization of photonic crystal materials using convolutional neural networks, and chemical property optimization of molecules using graph neural networks. On these complex tasks, we show that neural networks often outperform GPs as surrogate models for BO in terms of both sampling efficiency and computational cost.
Researcher Affiliation Collaboration Samuel Kim (EMAIL), Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology; Peter Y. Lu, Department of Physics, Massachusetts Institute of Technology; Charlotte Loh, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology; Jamie Smith, Google Research; Jasper Snoek, Google Research; Marin Soljačić (EMAIL), Department of Physics, Massachusetts Institute of Technology
Pseudocode Yes Algorithm 1: Bayesian optimization with auxiliary information
1: Input: labelled dataset D_train = {(x_n, z_n, y_n)}_{n=1}^{N_start=5}
2: for N = 5 to 1000 do
3:   Train M: X → Z on D_train
4:   Form an unlabelled dataset X_pool
5:   Find x_{N+1} = arg max_{x ∈ X_pool} α(x; M, D_train)
6:   Label the data z_{N+1} = g(x_{N+1}), y_{N+1} = h(z_{N+1})
7:   D_train = D_train ∪ {(x_{N+1}, z_{N+1}, y_{N+1})}
8: end for
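To make the loop in Algorithm 1 concrete, the following is a minimal runnable sketch of BO with auxiliary information. The 1-nearest-neighbour surrogate and the distance-based exploration bonus are illustrative placeholders only, not the paper's actual neural-network surrogates or acquisition functions; `bo_with_aux` and its arguments are hypothetical names.

```python
import numpy as np

def bo_with_aux(g, h, x_pool, n_start=5, n_iter=20, rng=None):
    """Toy sketch of Algorithm 1: BO with auxiliary information z = g(x),
    objective y = h(z). A 1-nearest-neighbour predictor stands in for the
    paper's model M: X -> Z, and the acquisition is predicted objective
    plus a distance-based exploration bonus (placeholders, not the
    paper's choices)."""
    rng = rng or np.random.default_rng(0)
    x_pool = np.asarray(x_pool, dtype=float)
    # Initial labelled dataset D_train of size n_start
    idx = list(rng.choice(len(x_pool), n_start, replace=False))
    Z = [g(x_pool[i]) for i in idx]
    Y = [h(z) for z in Z]
    for _ in range(n_iter):
        X_train = x_pool[idx]

        def predict_y(x):  # surrogate M followed by the cheap map h
            j = int(np.argmin(np.abs(X_train - x)))
            return h(Z[j])

        def acq(x):  # exploit predicted y, explore far from labelled data
            return predict_y(x) + np.min(np.abs(X_train - x))

        # Maximize the acquisition over the unlabelled pool
        scores = [acq(x) if i not in idx else -np.inf
                  for i, x in enumerate(x_pool)]
        i_next = int(np.argmax(scores))
        idx.append(i_next)
        Z.append(g(x_pool[i_next]))  # label z_{N+1} = g(x_{N+1})
        Y.append(h(Z[-1]))           # label y_{N+1} = h(z_{N+1})
    return x_pool[idx], np.array(Y)
```

For example, with g the identity and h(z) = -(z - 0.7)^2 over a grid on [0, 1], the loop concentrates its samples near the optimum at 0.7.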
Open Source Code Yes We make our datasets and code publicly available at https://github.com/samuelkim314/DeepBO
Open Datasets Yes Here we focus on the QM9 dataset (Ruddigkeit et al., 2012; Ramakrishnan et al., 2014), which consists of 133,885 small organic molecules along with their geometric, electronic, and thermodynamic quantities calculated with DFT.
Dataset Splits No The paper uses an iterative Bayesian optimization process in which the training set is grown incrementally: D_train = D_train ∪ {(x_{N+1}, z_{N+1}, y_{N+1})}. For validation, Appendix A.5.3 states: "we track various metrics of the model during BO on a validation dataset with 1000 randomly sampled data points." However, the paper does not describe the specific splits for the main training datasets (nanoparticle, photonic crystal, or the initial QM9 pool) or how they are partitioned for reproducibility.
Hardware Specification Yes All experiments were carried out on systems with NVIDIA Volta V100 GPUs and Intel Xeon Gold 6248 CPUs.
Software Dependencies No The paper mentions several software components, including "TensorFlow v1", the "Adam optimizer", the "GPyOpt library", the "dlib library", the "NLopt library", the "pycma library", and the "Neural Tangents library". However, specific version numbers are not consistently provided for these tools or for the programming language (e.g., Python) itself.
Experiment Setup Yes Unless otherwise stated, we set N_MC = 30. All BNNs other than the infinitely-wide networks are implemented in TensorFlow v1. Models are trained using the Adam optimizer with a cosine annealing learning rate schedule and a base learning rate of 10^-3 (Loshchilov & Hutter, 2016). All hidden layers use ReLU as the activation function, and no activation function is applied to the output layer. In particular, we re-train the BNN using 10 epochs.
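The cosine annealing schedule cited above (Loshchilov & Hutter, 2016) can be sketched as a simple function of the training step. This is a minimal sketch of the schedule shape only, assuming a single decay cycle without warm restarts; the function name and `min_lr` parameter are illustrative, not from the paper.

```python
import math

def cosine_annealing_lr(step, total_steps, base_lr=1e-3, min_lr=0.0):
    """Cosine-annealed learning rate: decays smoothly from base_lr
    (the paper's 10^-3) at step 0 to min_lr at total_steps."""
    frac = min(step / total_steps, 1.0)  # progress through the schedule
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * frac))
```

At the halfway point the schedule is exactly midway between the two rates, e.g. `cosine_annealing_lr(50, 100)` gives 5e-4 with the defaults.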