Relation-Augmented Dueling Bayesian Optimization via Preference Propagation
Authors: Xiang Xia, Xiang Shu, Shuo Liu, Yiyi Zhu, Yijie Zhou, Weiye Wang, Bingdong Li, Hong Qian
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on both synthetic functions and real-world tasks such as motion control, car cab design and spacecraft trajectory optimization. The experimental results disclose the satisfactory accuracy of augmented preferences in RADBO, and show the superiority of RADBO compared with existing dueling optimization methods. |
| Researcher Affiliation | Collaboration | 1Shanghai Institute of AI Education, and School of Computer Science and Technology, East China Normal University, Shanghai 200062, China 2Game AI Center, Tencent Inc, Shenzhen 518057, China |
| Pseudocode | Yes | The RADBO method is proposed, with pseudo-code shown in Algorithm 1. Algorithm 1 Relation-Augmented Dueling Bayesian Optimization (RADBO) |
| Open Source Code | Yes | The experimental code is publicly available at https://github.com/X-Xia0828/RADBO. |
| Open Datasets | Yes | To evaluate the performance of RADBO, experiments are first conducted on synthetic functions. In this paper, we construct objective functions for evaluation in a standard setting based on different synthetic functions1. ... The first dataset is Robot Push problem [Eriksson et al., 2019], which is a noisy 14-dimensional motion control problem... The second dataset is Sagas [Schlueter et al., 2021]... The third dataset is a 10-dimensional problem, Cassini1-MINLP [Schlueter and Munetomo, 2019]... The remaining two tasks are a 5-dimensional animation optimization problem (Animation) and a 7-dimensional car cab design problem (Carcab). |
| Dataset Splits | No | The paper does not provide specific training/test/validation dataset splits. It mentions initialization with '5 initial duels' or '30 solutions' and '95 iterations' or 'budget of 100', which refers to experimental setup and budget, not explicit data partitioning into splits. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments (e.g., GPU models, CPU models, memory details). |
| Software Dependencies | No | RADBO is implemented with BoTorch [Balandat et al., 2020]. RADBO uses a Gaussian process with default parameters from the BoTorch library as the surrogate model, employs CMA-ES [Hansen et al., 2003] as the optimizer of the acquisition function, and implements the Gaussian mixture model using the default parameters from the scikit-learn library. Although software libraries are mentioned (BoTorch, scikit-learn), specific version numbers for these dependencies are not provided. |
| Experiment Setup | Yes | I = 500 samples are employed to estimate the integral of the soft-Copeland score, and the GP model is initialized using M = 5 duels, followed by N = 95 duels for optimization. For RADBO, we use k = 3 to execute the preference propagation technique. ... All methods are evaluated with 5 initial duels, 95 iterations, and each experiment is repeated 20 times. ... Each method is allocated a budget of 100, with function value based methods initialized with 30 solutions and preference based methods with 30 duels. |
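The setup above mentions using I = 500 samples to estimate the integral underlying the soft-Copeland score. As a rough illustration of that kind of Monte Carlo estimate (not the paper's actual implementation), the sketch below scores a candidate by its average win probability against points drawn uniformly from the search space, assuming a hypothetical latent utility and a Bradley-Terry-style logistic link:

```python
import numpy as np

def soft_copeland(x, latent_utility, bounds, n_samples=500, rng=None):
    """Monte Carlo estimate of a soft-Copeland-style score for x:
    the average probability that x is preferred over an opponent
    drawn uniformly from the search space."""
    rng = np.random.default_rng(rng)
    lo, hi = bounds
    # Draw n_samples uniform opponents from the box [lo, hi]
    opponents = rng.uniform(lo, hi, size=(n_samples, len(lo)))
    # Logistic (Bradley-Terry) link on the latent utility gap
    gap = latent_utility(x) - latent_utility(opponents)
    return float(np.mean(1.0 / (1.0 + np.exp(-gap))))

# Example with a toy latent utility (negative sphere function);
# the global optimizer at the origin should score well above 0.5.
f = lambda x: -np.sum(np.square(x), axis=-1)
bounds = (np.array([-1.0, -1.0]), np.array([1.0, 1.0]))
score = soft_copeland(np.zeros(2), f, bounds, n_samples=500, rng=0)
```

In RADBO itself the win probability would come from the preference-based GP surrogate rather than a known utility; this sketch only shows the sampling-based integral estimation step.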