Global Continuous Optimization with Error Bound and Fast Convergence
Authors: Kenji Kawaguchi, Yu Maruyama, Xiaoyu Zheng
JAIR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The advantage and usage of the new algorithm are illustrated via theoretical analysis and an experiment conducted with 11 benchmark test functions. Further, we modify the LOGO algorithm to specifically solve a planning problem via policy search with continuous state/action space and long time horizon while maintaining its finite-time error bound. We apply the proposed planning method to accident management of a nuclear power plant. The result of the application study demonstrates the practical utility of our method. |
| Researcher Affiliation | Collaboration | Kenji Kawaguchi (EMAIL), Massachusetts Institute of Technology, Cambridge, MA, USA; Yu Maruyama (EMAIL), Nuclear Safety Research Center, Japan Atomic Energy Agency, Tokai, Japan |
| Pseudocode | Yes | The pseudocode for the LOGO algorithm is provided in Algorithm 1. The pseudocode for the LOGO-OP algorithm is provided in Algorithm 2. |
| Open Source Code | No | We implemented BaMSOO ourselves in order to use the empirical Bayes method, which was not done in the original implementation. The original implementation of BaMSOO was not available to us either. |
| Open Datasets | Yes | In the experiments, we compared the LOGO algorithm with its direct predecessor, the SOO algorithm (Munos, 2011), and its latest powerful variant, the Bayesian Multi-Scale Optimistic Optimization (BaMSOO) algorithm (Wang, Shakibi, Jin, & de Freitas, 2014). ...The rest of the test functions are common benchmarks in the global optimization literature; Surjanovic and Bingham (2013) present detailed information about the functions. Retrieved July 2, 2014, from http://www.sfu.ca/~ssurjano. |
| Dataset Splits | No | The paper mentions rescaling domains to a hypercube and using a simulator with a single initial condition for the application study, but does not provide specific train/test/validation dataset splits for reproducibility. |
| Hardware Specification | No | The paper frequently refers to "CPU time" for performance measurement but does not provide any specific details about the hardware used (e.g., CPU model, GPU, memory, or computing cluster specifications). |
| Software Dependencies | No | For SA and GA, we used the same settings as those of the Matlab standard subroutines simulannealbnd and ga, except that we specified the domain bounds. The simulator that we adopt in this paper is THALES2 (Thermal Hydraulics and radionuclide behavior Analysis of Light water reactor to Estimate Source terms under severe accident conditions) (Ishikawa, Muramatsu, & Sakamoto, 2002). These mentions identify software tools but do not provide specific version numbers, which are required for reproducibility. |
| Experiment Setup | Yes | For the SOO and LOGO algorithms, we set hmax(n) = √n. ...For the LOGO algorithm, we used a simple adaptive procedure to set the parameter w. Let ... W = {3, 4, 5, 6, 8, 30}. ...For the BaMSOO algorithm, ... We selected the isotropic Matérn kernel with ν = 5/2, ... The hyperparameters were initialized to σ = 1 and l = 0.25. We updated the hyperparameters every iteration until 1,000 function evaluations were executed, and then every 1,000 iterations thereafter... For SA and GA, we used the same settings as those of the Matlab standard subroutines simulannealbnd and ga... For the LOGO-OP algorithm and the pLOGO-OP algorithm, we blindly set L = 1000... We used only eight parallel workers for the pLOGO-OP algorithm... We consider x1 = [0, 3000] and x2 = [10810, 100490]. |
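The experiment-setup cell reports that BaMSOO's Gaussian-process surrogate used the isotropic Matérn kernel with ν = 5/2, σ = 1, and l = 0.25. That kernel has a standard closed form, σ²(1 + √5·r/l + 5r²/(3l²))·exp(−√5·r/l), so a reproducer can check their implementation against it directly. A minimal sketch (function and parameter names are ours, not from the paper):

```python
import math

def matern52(r, sigma=1.0, length=0.25):
    """Isotropic Matern kernel with nu = 5/2, evaluated at distance r >= 0.

    Closed form: sigma^2 * (1 + s + s^2/3) * exp(-s), where s = sqrt(5)*r/length.
    Defaults match the initialization reported in the table (sigma = 1, l = 0.25).
    """
    s = math.sqrt(5.0) * r / length
    return sigma ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)
```

At r = 0 the kernel equals σ² (here 1), and it decays monotonically toward 0 as the distance grows, which is a quick sanity check on any reimplementation.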