BARK: A Fully Bayesian Tree Kernel for Black-box Optimization

Authors: Toby Boyne, Jose Pablo Folch, Robert Matthew Lee, Behrang Shafei, Ruth Misener

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show the strong performance of BARK on both synthetic and applied benchmarks, due to the combination of our fully Bayesian surrogate and the optimization procedure.
Researcher Affiliation | Collaboration | Imperial College London (London, UK) and BASF SE (Ludwigshafen, Germany). Correspondence to: Toby Boyne <EMAIL>.
Pseudocode | No | The paper describes algorithms and formulations but does not present them in a clearly labeled pseudocode or algorithm block format. For example, Section 5.3 describes computational considerations and Section J.2 details the optimization formulation, but these are prose or mathematical equations rather than structured pseudocode.
Open Source Code | Yes | The code to run these experiments is available at https://github.com/TobyBoyne/bark.
Open Datasets | Yes | We use real datasets from the UCI Repository (Cortez & Silva, 2008; Nash et al., 1994; Quinlan, 1993; Yeh, 1998; Dua & Graff, 2017). ... Tree Function and Tree Function Cat are functions sampled from the BART prior... Discrete Ackley and Discrete Rosenbrock are partially discretized functions from Dreczkowski et al. (2023); Bliek et al. (2021). ... We use the hyperparameter optimization benchmarks SVRBench and XGBoost MNIST (Dreczkowski et al., 2023). We further evaluate using CCOBench (Dreifuerst et al., 2021), which optimizes the configuration of antennas to maximize network coverage, and Pest Control (Dreczkowski et al., 2023)... Hartmann and Styblinski-Tang problems from the SFU library (Surjanovic & Bingham, 2013).
Dataset Splits | No | The paper mentions
Hardware Specification | Yes | The experiments were run on a High Performance Computing cluster, equipped with AMD EPYC 7742 processors, with each core allocated 16GB of RAM. For models capable of multithreading, we use 8 CPUs; otherwise, only 1.
Software Dependencies | Yes | We use the PyMC-BART v0.8.2 implementation (Quiroga et al., 2022). ... We use the BoFire v0.0.16 implementation (Dürholt et al., 2024), which in turn uses BoTorch v0.11.3 (Balandat et al., 2020). ... We use the SMAC3 v2.0 implementation (Lindauer et al., 2022)... We use Gurobi 11 to optimize the acquisition function.
Experiment Setup | Yes | For both model fitting and BO, we use 16 samples for the BARK kernel. ... BARK uses 1000 burn-in samples and 400 kernel samples, running in parallel with 4 chains. We use a thinning rate of 100 to obtain the 16 samples. ... For BARK (as well as BART), we use the default number of trees m = 50, as suggested by Kapelner & Bleich (2016), and the default parametrization of the noise, (ν, q) = (3, 0.9). ... For model fitting and BO, BART takes 1000 samples to burn-in the MCMC chain, and 1000 posterior samples. This is run in parallel for 4 chains. ... The MIP gap is set to 10%... Time Limit (maximum time to find the optimum point): 100s.
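The quoted sampling settings (4 chains, 400 retained kernel samples per chain, thinning rate 100) account for the 16 BARK kernel samples. A minimal arithmetic sketch of that bookkeeping; the variable names are illustrative and not taken from the BARK codebase:

```python
# Reproducibility check: do the reported MCMC settings yield 16 samples?
# Numbers are quoted from the paper; names below are illustrative only.
n_chains = 4            # parallel MCMC chains
burn_in = 1000          # warm-up samples per chain (discarded)
kernel_samples = 400    # retained kernel samples per chain after burn-in
thinning = 100          # keep every 100th retained sample

samples_per_chain = kernel_samples // thinning   # 400 / 100 = 4
total_samples = n_chains * samples_per_chain     # 4 chains x 4 = 16
print(total_samples)  # -> 16, matching the 16 BARK kernel samples
```

This confirms the reported figures are internally consistent: the thinning rate and chain count jointly determine the 16 samples used for both model fitting and BO.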