Global Continuous Optimization with Error Bound and Fast Convergence
Authors: Kenji Kawaguchi, Yu Maruyama, Xiaoyu Zheng
JAIR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The advantage and usage of the new algorithm are illustrated via theoretical analysis and an experiment conducted with 11 benchmark test functions. Further, we modify the LOGO algorithm to specifically solve a planning problem via policy search with continuous state/action space and long time horizon while maintaining its finite-time error bound. We apply the proposed planning method to accident management of a nuclear power plant. The result of the application study demonstrates the practical utility of our method. |
| Researcher Affiliation | Collaboration | Kenji Kawaguchi (EMAIL), Massachusetts Institute of Technology, Cambridge, MA, USA; Yu Maruyama (EMAIL), Nuclear Safety Research Center, Japan Atomic Energy Agency, Tokai, Japan |
| Pseudocode | Yes | The pseudocode for the LOGO algorithm is provided in Algorithm 1. The pseudocode for the LOGO-OP algorithm is provided in Algorithm 2. |
| Open Source Code | No | We implemented BaMSOO ourselves in order to use the empirical Bayes method, which was not done in the original implementation. The original implementation of BaMSOO was not available to us either. |
| Open Datasets | Yes | In the experiments, we compared the LOGO algorithm with its direct predecessor, the SOO algorithm (Munos, 2011), and its latest powerful variant, the Bayesian Multi-Scale Optimistic Optimization (BaMSOO) algorithm (Wang, Shakibi, Jin, & de Freitas, 2014). ...The rest of the test functions are common benchmarks in the global optimization literature; Surjanovic and Bingham (2013) present detailed information about the functions. Retrieved July 2, 2014, from http://www.sfu.ca/~ssurjano. |
| Dataset Splits | No | The paper mentions rescaling domains to a hypercube and using a simulator with a single initial condition for the application study, but does not provide specific train/test/validation dataset splits for reproducibility. |
| Hardware Specification | No | The paper frequently refers to "CPU time" for performance measurement but does not provide any specific details about the hardware used (e.g., CPU model, GPU, memory, or computing cluster specifications). |
| Software Dependencies | No | For SA and GA, we used the same settings as those of the Matlab standard subroutines simulannealbnd and ga, except that we specified the domain bounds. The simulator that we adopt in this paper is THALES2 (Thermal Hydraulics and radionuclide behavior Analysis of Light water reactor to Estimate Source terms under severe accident conditions) (Ishikawa, Muramatsu, & Sakamoto, 2002). These mentions identify software tools but do not provide specific version numbers, which are required for reproducibility. |
| Experiment Setup | Yes | For the SOO and LOGO algorithms, we set hmax(n) = √n. ...For the LOGO algorithm, we used a simple adaptive procedure to set the parameter w. Let ... W = {3, 4, 5, 6, 8, 30}. ...For the BaMSOO algorithm, ... We selected the isotropic Matérn kernel with ν = 5/2, ... The hyperparameters were initialized to σ = 1 and l = 0.25. We updated the hyperparameters every iteration until 1,000 function evaluations were executed, and then every 1,000 iterations thereafter... For SA and GA, we used the same settings as those of the Matlab standard subroutines simulannealbnd and ga... For the LOGO-OP algorithm and the pLOGO-OP algorithm, we blindly set L = 1000... We used only eight parallel workers for the pLOGO-OP algorithm... We consider x1 = [0, 3000] and x2 = [10810, 100490]. |
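The experiment-setup cell reports that BaMSOO's Gaussian-process surrogate used the isotropic Matérn kernel with ν = 5/2, σ = 1, and l = 0.25. That kernel has a standard closed form, σ²(1 + √5·r/l + 5r²/(3l²))·exp(−√5·r/l), so a reproducer can check their implementation against it directly. A minimal sketch (function and parameter names are ours, not from the paper):

```python
import math

def matern52(r, sigma=1.0, length=0.25):
    """Isotropic Matern kernel with nu = 5/2, evaluated at distance r >= 0.

    Closed form: sigma^2 * (1 + s + s^2/3) * exp(-s), where s = sqrt(5)*r/length.
    Defaults match the initialization reported in the table (sigma = 1, l = 0.25).
    """
    s = math.sqrt(5.0) * r / length
    return sigma ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)
```

At r = 0 the kernel equals σ² (here 1), and it decays monotonically toward 0 as the distance grows, which is a quick sanity check on any reimplementation.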