Automated Reinforcement Learning (AutoRL): A Survey and Open Problems
Authors: Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, Marius Lindauer
JAIR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this survey, we seek to unify the field of AutoRL, provide a common taxonomy, discuss each area in detail and pose open problems of interest to researchers going forward. |
| Researcher Affiliation | Collaboration | Jack Parker-Holder EMAIL University of Oxford Raghu Rajan EMAIL University of Freiburg Xingyou Song EMAIL Google Research, Brain Team André Biedenkapp EMAIL University of Freiburg Yingjie Miao EMAIL Google Research, Brain Team Theresa Eimer EMAIL Leibniz University Hannover Baohe Zhang EMAIL University of Freiburg Vu Nguyen EMAIL Amazon Australia Roberto Calandra EMAIL Meta AI Aleksandra Faust EMAIL Google Research, Brain Team Frank Hutter EMAIL University of Freiburg & Bosch Center for Artificial Intelligence Marius Lindauer EMAIL Leibniz University Hannover |
| Pseudocode | No | The paper describes various AutoRL methods and algorithms conceptually, and uses diagrams like Figure 1 and Figure 2 to illustrate components and loops, but it does not contain any structured pseudocode or algorithm blocks with step-by-step instructions. |
| Open Source Code | No | The paper is a survey and does not present original methodology requiring code release. It mentions third-party libraries like Jax, TensorFlow, and PyTorch in the context of autodifferentiation, but does not claim to release any code for the survey itself. |
| Open Datasets | Yes | The paper mentions and cites several well-known public benchmarks and environments used in Reinforcement Learning research, including "OpenAI Gym (Brockman et al., 2016)", "Arcade Learning Environment (Bellemare et al., 2012)", "OpenAI Procgen (Cobbe et al., 2020)", "CoinRun (Cobbe et al., 2019a)", "MiniGrid (Chevalier-Boisvert et al., 2018)", "NetHack (Küttler et al., 2020)", "MineRL (Guss et al., 2019)", and "Meta-World (Yu et al., 2019)". |
| Dataset Splits | No | The paper is a survey discussing concepts related to training and validation rewards and references how other works use dataset distributions. For instance, it states: "f(ζ, θ) can be defined as the validation reward, i.e. the reward in the outer loop, whereas J(θ; ζ) can be considered the training reward, i.e. the reward in the inner loop." However, it does not specify any particular dataset splits for its own experimental results, because as a survey it does not conduct original experiments. |
| Hardware Specification | No | The paper is a survey and does not describe experiments conducted by the authors. While it mentions resource requirements for certain methods (e.g., "thousands of CPU cores" for evolutionary approaches or "massively parallel simulation with a single GPU" in the context of the Brax physics engine for future work), it does not specify any particular hardware used by the authors for their own work. |
| Software Dependencies | No | The paper mentions popular machine learning frameworks such as "Jax (Bradbury et al., 2018), TensorFlow (Abadi et al., 2015), PyTorch (Paszke et al., 2019)" in Section 4.6. However, these are mentioned as examples of readily available autodifferentiation libraries in the context of gradient-based meta-learning, not as specific versioned software dependencies for the paper's own methodology. |
| Experiment Setup | No | The paper is a survey that discusses the importance of various hyperparameters (e.g., discount factor γ, batch size B) and methods for their optimization in AutoRL. For example, Section 3.4 is titled "Last but not Least: What about Hyperparameters?". However, as a survey, it does not present specific experimental results or provide concrete hyperparameter values or training configurations used in its own empirical work. |