Latent Bayesian Optimization via Autoregressive Normalizing Flows

Authors: Seunghun Lee, Jinyoung Park, Jaewon Chu, Minseo Yoon, Hyunwoo Kim

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, our NF-BO method demonstrates superior performance in molecule generation tasks, significantly outperforming both traditional and recent LBO approaches. ... We validate our NF-BO across various benchmarks focusing on de novo molecular design tasks. Initially, we conduct experiments on the Guacamol benchmarks (Brown et al., 2019), specifically targeting seven challenging tasks where optimal solutions are not readily found. ... Subsequently, we evaluate our method on the PMO benchmarks (Gao et al., 2022), which consist of 23 tasks, including albuterol similarity and amlodipine MPO.
Researcher Affiliation | Academia | 1. Department of Computer Science and Engineering, Korea University; 2. School of Computing, KAIST
Pseudocode | Yes | For better understanding, the pseudocode for NF-BO is provided in Appendix F. ... F PSEUDOCODE OF NF-BO: This section provides the pseudocode of the NF-BO framework in Algorithm 1.
Open Source Code | Yes | REPRODUCIBILITY STATEMENT: For reproducibility, we elaborate on the overall pipeline of our work in Section 4. In our main paper and appendix, we also illustrate our overall pipeline and pseudocode for NF-BO, respectively. Code is available at https://github.com/mlvlab/NFBO.
Open Datasets | Yes | We validate our NF-BO across various benchmarks focusing on de novo molecular design tasks. Initially, we conduct experiments on the Guacamol benchmarks (Brown et al., 2019)... Subsequently, we evaluate our method on the PMO benchmarks (Gao et al., 2022)... For the Guacamol and PMO benchmarks, we pretrain using the 1.27M unlabeled Guacamol and 250K ZINC datasets, respectively.
Dataset Splits | Yes | For these benchmarks, we evaluate NF-BO and the baselines under three different settings, each varying the number of initial data points and the additional oracle budget: (100, 500), (10,000, 10,000), and (10,000, 70,000). ... We employ 1,000 initial data points and an additional 9,000 oracle calls following the PMO benchmarks.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper does not explicitly mention any specific software dependencies or library versions (e.g., Python, PyTorch, TensorFlow versions) used for the implementation.
Experiment Setup | Yes | We employ Thompson sampling (Eriksson et al., 2019) as the acquisition function, and our surrogate model is a sparse variational Gaussian process (Snelson & Ghahramani, 2005; Hensman et al., 2015; Matthews, 2017) enhanced with a deep kernel (Wilson et al., 2016). ... For (the batch size of trust regions, the number of query points N_q per trust region), we set these parameters to (5, 10) for the Guacamol benchmark with an additional oracle call setting of 500. For other settings, these parameters were adjusted to (10, 100). ... Table 3: Fixed parameters for all tasks and settings. Scaling factor κ in TACS: 0.1. Standard deviation σ of variational distribution q: 0.1. # of top-k data for training: 1000. Coefficient of similarity loss L_sim: 1. ... This probability is defined as: p(x^(i)) = exp(y^(i)/τ') / Σ_j exp(y^(j)/τ'), where y^(i) is the objective value of point i, and τ' is the temperature parameter set to 0.1...
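The temperature-weighted probability quoted above is a softmax over objective values, so higher-scoring points are sampled more often as the temperature shrinks. A minimal sketch of that formula follows; the function name and inputs are illustrative and not taken from the paper's released code, with the temperature set to the paper's reported value of 0.1.

```python
import numpy as np

def sampling_probabilities(y, tau=0.1):
    """Softmax over objective values y with temperature tau:
    p(x^(i)) = exp(y^(i)/tau) / sum_j exp(y^(j)/tau)."""
    y = np.asarray(y, dtype=float)
    z = (y - y.max()) / tau          # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# With tau = 0.1, modest gaps in objective value translate into
# strongly skewed sampling probabilities toward the best points.
probs = sampling_probabilities([0.2, 0.5, 0.9])
```

Subtracting the maximum before exponentiating leaves the probabilities unchanged but avoids overflow when objective values are large relative to the temperature.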