Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds
Authors: Aya Kayal, Sattar Vakili, Laura Toni, Da-Shan Shiu, Alberto Bernacchia
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results on the performance of MR-LPF on synthetic functions that closely align with the analytical assumptions, as well as on a dataset of Yelp reviews, demonstrating the utility of the proposed algorithm in real-world applications (Section 5). |
| Researcher Affiliation | Collaboration | Aya Kayal's work was part of her research placement at MediaTek Research. 1 University College London, UK; 2 MediaTek Research. Correspondence to: Aya Kayal <EMAIL>, Sattar Vakili <EMAIL>. |
| Pseudocode | Yes | Pseudocode is provided in Algorithm 1. |
| Open Source Code | Yes | Our implementation is publicly available at https://github.com/ayakayal/BOHF_code_submission |
| Open Datasets | Yes | To showcase the utility of our approach in real-world applications, we experimented using the Yelp Open Dataset of restaurant reviews. |
| Dataset Splits | No | The paper describes the processing of the Yelp Dataset, including concatenating reviews, generating vector embeddings, scaling user ratings, and handling missing ratings via collaborative filtering. It also mentions sampling a random user for each experimental run. However, it does not explicitly provide training, validation, or test splits (e.g., percentages or sample counts) for the datasets used in the experiments. |
| Hardware Specification | Yes | The code is executed on a cluster with 376.2 GiB of RAM and an Intel(R) Xeon(R) Gold 5118 CPU running at 2.30 GHz. In the case of the Yelp Dataset experiments, ... The simulations are carried out on a computing node equipped with an NVIDIA GeForce RTX 2080 Ti GPU featuring 11 GB of VRAM, an Intel(R) Xeon(R) Gold 5118 CPU running at 2.40 GHz with 24 cores, and 92 GB of RAM. |
| Software Dependencies | No | For the experiments with the synthetic RKHS and Ackley functions, we utilize the scikit-learn library (Pedregosa et al., 2011) for implementing Gaussian Process (GP) regression. ... we use the BoTorch library (Balandat et al., 2020) and its dependencies, including GPyTorch (Gardner et al., 2018), which offer efficient GP regression tools with GPU support. ... OpenAI's text-embedding-3-large model... While software libraries like scikit-learn, BoTorch, and GPyTorch, and the OpenAI model are mentioned, specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | We choose l = 0.1 as the length scale and λ = 0.05 as the kernel-based learning parameter across all cases. The horizon T is set to 300 for RKHS test functions and 2000 for the Ackley function and the Yelp Dataset. For the RKHS and Ackley functions, the confidence interval width β is fixed at 1 for both MR-LPF and MaxMinLCB. For the Yelp dataset, we conduct a grid search to tune β over {0.01, 0.1, 0.5, 1, 2} for both MR-LPF and MaxMinLCB. We determine β = 2 as optimal for MaxMinLCB and β = 0.1 for MR-LPF. |
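The experiment-setup row can be made concrete with a minimal sketch: a scikit-learn GP surrogate using the reported length scale l = 0.1 and regularization λ = 0.05, and a loop over the reported β grid to form a confidence-based acquisition. The objective function, sample sizes, and the UCB-style acquisition below are illustrative assumptions, not the paper's MR-LPF algorithm or its RKHS test functions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Stand-in objective on [0, 2]; the paper's actual test functions
# are RKHS samples, the Ackley function, and Yelp-derived rewards.
def objective(x):
    return np.sin(3 * x) * np.exp(-x)

# Small noisy training set (sizes are illustrative).
X_train = rng.uniform(0, 2, size=(20, 1))
y_train = objective(X_train).ravel() + 0.05 * rng.standard_normal(20)

# GP with the reported length scale l = 0.1; alpha plays the role of
# the kernel-based learning parameter lambda = 0.05.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1), alpha=0.05)
gp.fit(X_train, y_train)

X_test = np.linspace(0, 2, 200).reshape(-1, 1)
mu, sigma = gp.predict(X_test, return_std=True)

# Grid over the confidence-interval width beta, as tuned for the Yelp
# experiments; each beta yields a different acquisition maximizer.
candidates = {}
for beta in [0.01, 0.1, 0.5, 1, 2]:
    ucb = mu + beta * sigma  # simple UCB stand-in for the acquisition
    candidates[beta] = float(X_test[np.argmax(ucb)])
```

In a full tuning run, each β value would drive an entire optimization loop and the β with the best cumulative reward would be kept (β = 0.1 for MR-LPF, β = 2 for MaxMinLCB per the paper); here the loop only illustrates how β widens the confidence band.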