Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization

Authors: Zhitong Xu, Haitao Wang, Jeff Phillips, Shandian Zhe

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | First, through a comprehensive evaluation across twelve benchmarks, we found that while the popular Square Exponential (SE) kernel often leads to poor performance, using Matérn kernels enables standard BO to consistently achieve top-tier results, frequently surpassing methods specifically designed for high-dimensional optimization. Second, our theoretical analysis reveals that the SE kernel's failure primarily stems from improper initialization of the length-scale parameters, which are commonly used in practice but can cause gradient vanishing in training. ... Empirical Results. We investigated BO with standard GP across eleven widely used benchmarks and one novel benchmark, encompassing six synthetic and six real-world high-dimensional optimization tasks. The number of variables ranged from 30 to 1,003. We compared standard BO with nine state-of-the-art high-dimensional BO methods, and performed extensive evaluations.
Researcher Affiliation | Academia | Zhitong Xu, Haitao Wang, Jeff M. Phillips, Shandian Zhe, Kahlert School of Computing, University of Utah, Salt Lake City, UT 84112, USA, EMAIL, EMAIL
Pseudocode | No | The paper describes methods and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is released at https://github.com/XZT008/Standard-GP-is-all-you-need-for-HDBO.
Open Datasets | Yes | We investigated BO with standard GP across eleven widely used benchmarks and one novel benchmark, encompassing six synthetic and six real-world high-dimensional optimization tasks. ... Synthetic Benchmarks. We considered four popular synthetic functions: Ackley, Rosenbrock, Hartmann6, and Stybtang. ... Real-World Benchmarks. We employed the following real-world benchmark problems: Mopta08 (124) (Jones, 2008), ... SVM (388) (Eriksson & Jankowiak, 2021), ... Rover (60) (Wang et al., 2018). ... DNA (180) (Šehić et al., 2022), ... NAS201 (30) (Dong & Yang, 2020), ... Humanoid Standup (1003): A novel trajectory optimization benchmark based on a humanoid simulator that uses the MuJoCo physics engine (Todorov et al., 2012).
Dataset Splits | Yes | For each value of d, we uniformly sampled 500 training inputs and 100 test inputs. ... For each optimization task, we randomly queried the target function 20 times to collect the initial data, except for Humanoid-Standup, where we collected 50 initial data points.
Hardware Specification | Yes | We conducted the experimental investigations on a large computer cluster equipped with Intel Cascade Lake Platinum 8268 chips.
Software Dependencies | Yes | ALEBO. We used the ALEBO implementation shared by the Adaptive Experimentation (AX) Platform (version 0.2.2). The source code is at https://github.com/facebook/Ax/blob/main/ax/models/torch/alebo.py.
Experiment Setup | Yes | We used UCB as the acquisition function where the exploration level λ was set to 1.5. ... For Adam and RMSProp, we set the initial learning rate to 0.1 and 0.01, respectively. The maximum number of epochs was set to 1,500 for both methods. ... To achieve numerical stability, we employed the prior Uniform(0.001, 30) over each length-scale, a diffused Gamma prior over the noise variance Gamma(1.1, 0.05), and the prior Gamma(2.0, 0.15) over the amplitude. The maximum number of iterations was set to 15K and tolerance level 2E-9.
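The Research Type row above quotes the paper's diagnosis that the SE kernel's failure stems from length-scale initialization that causes vanishing gradients in high dimension. A minimal NumPy sketch of that effect (our illustration, not the authors' code; the kernel helpers and the distance heuristic are assumptions for the example):

```python
import numpy as np

# Hypothetical sketch of the quoted failure mode: for inputs drawn uniformly
# from [0, 1]^d, the expected squared distance between two points is d / 6,
# so with a unit length-scale the SE kernel exp(-r^2 / (2 l^2)) underflows
# toward zero as d grows -- and its gradients vanish with it -- while the
# Matern-5/2 kernel decays only like exp(-sqrt(5) r / l) at the same distances.

def k_se(r, lengthscale=1.0):
    """Squared-exponential (SE) kernel value at distance r."""
    return np.exp(-0.5 * (r / lengthscale) ** 2)

def k_matern52(r, lengthscale=1.0):
    """Matern-5/2 kernel value at distance r."""
    s = np.sqrt(5.0) * r / lengthscale
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

d = 300                   # within the paper's 30-to-1,003 dimension range
r = np.sqrt(d / 6.0)      # typical distance between Uniform[0,1]^d points
print(k_se(r))            # ~1e-11: SE correlations (and gradients) vanish
print(k_matern52(r))      # ~1e-5: Matern-5/2 stays numerically usable
```

At d = 300 the Matérn-5/2 value exceeds the SE value by roughly six orders of magnitude, which is consistent with the paper's observation that Matérn kernels keep standard GP-based BO trainable where SE does not.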
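The Experiment Setup row reports UCB as the acquisition function with exploration level λ = 1.5. A minimal sketch of that acquisition rule over candidate points (assuming posterior mean and standard deviation come from some fitted GP; the function name and the toy numbers are ours, not the paper's):

```python
import numpy as np

LAMBDA = 1.5  # exploration level reported in the experiment setup

def ucb(mean, std, lam=LAMBDA):
    """Upper confidence bound mu(x) + lambda * sigma(x), for maximization."""
    return mean + lam * std

# Toy GP posterior over four candidate points (values invented for illustration):
mean = np.array([0.2, 0.5, 0.4, 0.1])
std = np.array([0.3, 0.05, 0.2, 0.6])
scores = ucb(mean, std)        # [0.65, 0.575, 0.7, 1.0]
best = int(np.argmax(scores))  # index 3: high posterior uncertainty wins out
```

With λ = 1.5 the rule trades off exploitation (high mean) against exploration (high uncertainty); here the most uncertain candidate is selected even though its mean is lowest.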