High-Dimensional Bayesian Optimisation with Gaussian Process Prior Variational Autoencoders
Authors: Siddharth Ramchandran, Manuel Haussmann, Harri Lähdesmäki
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our method improves upon existing latent space BO methods on simulated datasets as well as on commonly used benchmarks. ... We demonstrate the efficacy of our method described in Algorithm 1 on simulated datasets as well as on a molecular discovery benchmark dataset. |
| Researcher Affiliation | Academia | Siddharth Ramchandran, Department of Computer Science, Aalto University, Espoo, Finland; Manuel Haussmann, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark; Harri Lähdesmäki, Department of Computer Science, Aalto University, Espoo, Finland |
| Pseudocode | Yes | Algorithm 1: An overview of our proposed algorithm |
| Open Source Code | Yes | The source code is available at https://github.com/SidRama/GP-prior-VAE-BO. |
| Open Datasets | Yes | We demonstrate our model's ability to perform effective high-dimensional BO by modifying digits from the MNIST dataset. We consider the common task of generating single-variable mathematical expressions from a formal grammar (Kusner et al., 2017; Tripp et al., 2020; Grosnit et al., 2021; Maus et al., 2022). ... We followed the data preparation proposed by Grosnit et al. (2021) to obtain 40000 data points. We use the ZINC-250K molecular dataset used in Gómez-Bombarelli et al. (2018), which consists of 250000 drug-like commercially available molecules extracted from the ZINC database (Irwin et al., 2012). |
| Dataset Splits | Yes | In all our experiments, 10% of the training data is used as a held-out validation set for early-stopping to ensure that the generative model does not overfit. ... Specifically, we use 80% of the encoded latent points (from the respective training splits) to train a sparse GP with 500 inducing points and compute the predictive log-likelihood on the remaining 20% of held-out data. |
| Hardware Specification | Yes | Table 3: Average run time / wall clock time. Synthetic data (5000 obs., kernel Θ1): AMD MI250x GPU, AMD EPYC Trento CPU. Expression reconstruction: Nvidia Tesla V100 GPU, Intel Xeon Gold 6134 CPU. Molecular discovery (kernel Θ1): Nvidia Tesla V100 GPU, Intel Xeon Gold 6134 CPU. |
| Software Dependencies | No | The paper mentions software tools like PyTorch and BoTorch but does not provide specific version numbers for these dependencies within the text. It cites the papers where these tools were introduced, but explicit version numbers are not stated. |
| Experiment Setup | Yes | We set the number of latent dimensions to 8 for the synthetic dataset experiment, 25 for the expression reconstruction experiment, and 56 for the molecular discovery experiment. ... We set the frequency of retraining ν = 10 and the stopping criterion η = 0.1. ... We set ξ to be 0.01 in all our experiments (a recommended default value). ... In the synthetic data experiment, we use the neural network architecture described in Table 1. |
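The Experiment Setup row quotes a retraining frequency ν = 10 and a stopping criterion η = 0.1. A minimal sketch of how such a schedule could work is below, assuming ν counts BO iterations between generative-model retrainings and η is a minimum-improvement threshold on the best objective value; the greedy candidate loop and the `objective` callable are toy stand-ins, not the paper's Algorithm 1.

```python
def bo_with_periodic_retraining(objective, candidates, nu=10, eta=0.1, max_iters=100):
    """Toy loop: evaluate candidates in order, mark a retraining event every
    nu iterations, and stop once the improvement in the best objective value
    over the last retraining window drops below eta."""
    best = float("-inf")
    history = []          # running best objective value after each evaluation
    retrain_events = []   # iterations at which the generative model would be retrained
    window_start_best = best
    for t, x in enumerate(candidates[:max_iters], start=1):
        best = max(best, objective(x))
        history.append(best)
        if t % nu == 0:
            retrain_events.append(t)              # retrain the VAE here
            if best - window_start_best < eta:    # stopping criterion η
                break
            window_start_best = best
    return history, retrain_events
```

With a flat objective the loop retrains once, sees no improvement over the next window, and stops at the second retraining event.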