Regret Analysis of Posterior Sampling-Based Expected Improvement for Bayesian Optimization
Authors: Shion Takeno, Yu Inatsu, Masayuki Karasuyama, Ichiro Takeuchi
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we demonstrate the effectiveness of the proposed method through numerical experiments. |
| Researcher Affiliation | Academia | Shion Takeno (Department of Mechanical Systems Engineering, Nagoya University; RIKEN Center for Advanced Intelligence Project), Yu Inatsu (Department of Computer Science, Nagoya Institute of Technology), Masayuki Karasuyama (Department of Computer Science, Nagoya Institute of Technology), Ichiro Takeuchi (Department of Mechanical Systems Engineering, Nagoya University; RIKEN Center for Advanced Intelligence Project) |
| Pseudocode | Yes | Algorithm 1 GP-EIMS. Require: input space X, GP prior µ = 0 and k, and initial dataset D_0. 1: for t = 1, . . . do; 2: fit GP to D_{t−1}; 3: generate a sample path g_t ∼ p(f | D_{t−1}); 4: g*_t ← max_{x∈X} g_t(x); 5: x_t ← arg max_{x∈X} EI(µ_{t−1}(x), σ_{t−1}(x), g*_t); 6: observe y_t = f(x_t) + ε_t and D_t ← D_{t−1} ∪ {(x_t, y_t)}; 7: end for |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is released or provide a link to a code repository. It only mentions using existing approximations like "random Fourier feature approximations". |
| Open Datasets | Yes | We performed numerical experiments using synthetic functions generated from GPs, which match the assumptions for our analysis. ... Figures 1 and 2 show the simple regret and cumulative regret for the SE kernels. ... We performed the experiments for the Matérn kernels ... and several benchmark functions from https://www.sfu.ca/~ssurjano/optimization.html. |
| Dataset Splits | No | The paper states, "We set X = {0.0, 0.1, . . . , 0.9}^d and d = 4. Therefore, |X| = 10^4." and "We set an initial dataset to data that is closest to 2d data generated randomly as a Sobol sequence." It does not explicitly mention training/test/validation splits or reference standard splits with citations for reproducibility. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions "random Fourier feature approximations" as a technique used but does not specify any software libraries, frameworks, or their version numbers. For example, it does not mention Python, PyTorch, TensorFlow, or specific library versions. |
| Experiment Setup | Yes | We set X = {0.0, 0.1, . . . , 0.9}^d and d = 4. ... We employed the SE kernel k_SE(x, x′) = exp(−‖x − x′‖²₂ / (2ℓ²)). ... We report results changing ℓ ∈ {0.1, 0.2} and σ ∈ {0.01, 0.1, 1}. We fixed the hyperparameters of the GP model, that is, ℓ and σ, to the parameters used to generate the synthetic functions and observation noise. ... We set the hyperparameters of GP-UCB, IRGP-UCB, and GP-EI-µmax to the theoretically derived parameters for the BCR analyses. Therefore, we set β_t = 2 log(|X|t²/√(2π) + 1) for GP-UCB and ζ_t = 2 log(|X|/2) + Z_t with Z_t ∼ Exp(λ = 1/2) for IRGP-UCB (Takeno et al., 2023). Furthermore, we set ν_t = 2 log(|X|t²/√(2π) + 1) for GP-EI-µmax (see Theorem B.4). The number of MC samples in MES, JES, and E3I was set to 10. |
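The GP-EIMS loop quoted in the Pseudocode row can be sketched in plain NumPy. This is a hypothetical illustration on a small 1-D grid: the paper's experiments use d = 4 with |X| = 10^4, and sample paths are drawn via random Fourier feature approximations, whereas this sketch samples a path exactly from the posterior over the finite grid. All function names here are our own, not the authors'.

```python
import numpy as np
from math import erf

def se_kernel(A, B, ell=0.2):
    # Squared-exponential kernel: k(x, x') = exp(-||x - x'||^2 / (2 ell^2)).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ell ** 2))

def gp_posterior(X_grid, X_obs, y_obs, ell=0.2, noise=0.1):
    # Posterior mean and covariance of a zero-mean GP on a finite grid.
    K = se_kernel(X_obs, X_obs, ell) + noise ** 2 * np.eye(len(X_obs))
    Ks = se_kernel(X_grid, X_obs, ell)
    Kss = se_kernel(X_grid, X_grid, ell)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mu = Ks @ alpha
    V = np.linalg.solve(L, Ks.T)
    cov = Kss - V.T @ V
    return mu, cov

def expected_improvement(mu, sigma, incumbent):
    # EI(mu, sigma, g*) = (mu - g*) Phi(z) + sigma phi(z), z = (mu - g*) / sigma.
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - incumbent) / sigma
    Phi = 0.5 * (1 + np.vectorize(erf)(z / np.sqrt(2)))
    phi = np.exp(-z ** 2 / 2) / np.sqrt(2 * np.pi)
    return (mu - incumbent) * Phi + sigma * phi

def gp_eims_step(X_grid, X_obs, y_obs, rng, ell=0.2, noise=0.1):
    # One iteration of Algorithm 1: draw a sample path g_t from the posterior,
    # take its maximum g*_t as the EI incumbent, then maximize EI over the grid.
    mu, cov = gp_posterior(X_grid, X_obs, y_obs, ell, noise)
    sigma = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    g = rng.multivariate_normal(mu, cov + 1e-10 * np.eye(len(mu)))
    g_star = g.max()
    ei = expected_improvement(mu, sigma, g_star)
    return X_grid[np.argmax(ei)]
```

The sampled maximum g*_t replaces the usual best-observed-value incumbent of standard EI, which is the distinguishing step of GP-EIMS relative to GP-EI-µmax.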