Materials Discovery using Max K-Armed Bandit
Authors: Nobuaki Kikkawa, Hiroshi Ohno
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We applied the proposed algorithm to synthetic problems and molecular-design demonstrations using a Monte Carlo tree search. According to the results, the proposed algorithm stably outperformed other bandit algorithms in the late stage of the search process, unless the optimal arm coincides in the MKB and conventional bandit settings. We conduct two types of numerical experiments to compare our algorithms with other algorithms. One is the synthetic bandit problems with the Gaussian reward distributions, and the other is SMILES optimization using MCTS (Yang et al., 2017; Kajita et al., 2020; Kikkawa et al., 2020) as the demonstrations for materials discovery. We employed a single set of recommended or reasonable hyperparameters for all the experiments because the tuning of hyperparameters for the actual applications in materials discovery is extremely expensive. |
| Researcher Affiliation | Industry | Nobuaki Kikkawa EMAIL Toyota Central R&D Labs., Inc. 41-1, Yokomichi, Nagakute, Aichi 480-1192, Japan. Hiroshi Ohno EMAIL Toyota Central R&D Labs., Inc. 41-1, Yokomichi, Nagakute, Aichi 480-1192, Japan. |
| Pseudocode | Yes | Algorithm 1 Max Search Input: number of arms K, current time τ, and previous records R(τ − 1). Output: selected arm index k̂. ... Algorithm 2 Selection index under Gaussian settings ... Algorithm 3 Selection index under sub-gaussian settings ... Algorithm 4 MCTS ... |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. It mentions using 'python thermo module' and 'RDKit library', which are third-party tools, but not their own implementation code. |
| Open Datasets | No | The paper does not provide concrete access information for a publicly available or open dataset. It describes generating molecular structures using context-free grammar and empirical equations for |
| Dataset Splits | No | The paper does not provide specific dataset split information. The experiments are based on 'synthetic bandit problems' and 'molecular-design demonstrations' where candidate molecular structures are generated on-the-fly using context-free grammar, implying no predefined splits are used or provided. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. It does not mention any GPU/CPU models, processor types, or memory amounts. |
| Software Dependencies | No | The paper mentions 'python thermo module (Bell and Contributors, 2016)' and 'RDKit library (Landrum, 2016)' but does not provide specific version numbers for these software dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | We set T = 10,000 considering the realistic applications (Kajita et al., 2020; Kikkawa et al., 2020) unless the observed maximum reward clearly does not converge. We present the details of other algorithms in Appendix B. ... In the present experiments, we set c = 1. ... In Threshold Ascent, the hyperparameters were set to s = 100 and δ = 2 ln ν. ... In Robust UCBMax, we set s = 100, u = r_{s-th}, v = (r_max − u)^{1+ϵ}/ν, and ϵ = 0.4, according to the original paper (Achab et al., 2017). ... In sp UCB, c = 0.1 and D = 32 are used as the hyperparameters. ... In UCBE (Audibert et al., 2010) and the conventional UCB (Auer et al., 2002), we used c = 1 as the hyperparameter. |
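
The Research Type row notes that the proposed algorithm outperforms other bandit algorithms unless the optimal arm coincides in the MKB and conventional bandit settings. The following is a minimal sketch of why the two settings can disagree. It is not the authors' implementation. The two Gaussian arms, the seed, and the function names are hypothetical, and the index `mean + c * sqrt(2 ln t / n)` is one common UCB form, assumed here rather than taken from the paper. Only T = 10,000 and c = 1 come from the Experiment Setup row.

```python
import math
import random


def ucb_max_tracking(means, stds, horizon, c=1.0, seed=0):
    """Run a UCB-style bandit on Gaussian arms and track the best single reward.

    Assumption: the index is mean + c * sqrt(2 ln t / n), a common UCB form;
    the exact index used for the paper's baselines is not reproduced here.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # sum of observed rewards per arm
    best_reward = -math.inf
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull every arm once before using the index
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + c * math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = rng.gauss(means[arm], stds[arm])
        counts[arm] += 1
        sums[arm] += reward
        best_reward = max(best_reward, reward)
    return counts, best_reward


def single_arm_max(mean, std, horizon, seed=0):
    """Maximum reward observed when one arm is pulled `horizon` times."""
    rng = random.Random(seed)
    return max(rng.gauss(mean, std) for _ in range(horizon))


if __name__ == "__main__":
    # Hypothetical arms: arm 0 has the higher mean, arm 1 the heavier upper tail.
    means, stds = [0.5, 0.0], [0.1, 1.0]
    T = 10_000  # horizon from the Experiment Setup row
    counts, best = ucb_max_tracking(means, stds, T, c=1.0)
    print("pulls per arm under UCB:", counts)
    print("max reward found by UCB:", best)
    print("max reward, arm 0 only:", single_arm_max(means[0], stds[0], T))
    print("max reward, arm 1 only:", single_arm_max(means[1], stds[1], T))
```

In this toy case a mean-seeking index concentrates its pulls on arm 0, while the largest single reward is far more likely to come from arm 1. That mismatch is the situation the max K-armed bandit formulation targets.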