Automated Reinforcement Learning (AutoRL): A Survey and Open Problems

Authors: Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, Marius Lindauer

JAIR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this survey, we seek to unify the field of AutoRL, provide a common taxonomy, discuss each area in detail and pose open problems of interest to researchers going forward.
Researcher Affiliation | Collaboration | Jack Parker-Holder (EMAIL), University of Oxford; Raghu Rajan (EMAIL), University of Freiburg; Xingyou Song (EMAIL), Google Research, Brain Team; André Biedenkapp (EMAIL), University of Freiburg; Yingjie Miao (EMAIL), Google Research, Brain Team; Theresa Eimer (EMAIL), Leibniz University Hannover; Baohe Zhang (EMAIL), University of Freiburg; Vu Nguyen (EMAIL), Amazon Australia; Roberto Calandra (EMAIL), Meta AI; Aleksandra Faust (EMAIL), Google Research, Brain Team; Frank Hutter (EMAIL), University of Freiburg & Bosch Center for Artificial Intelligence; Marius Lindauer (EMAIL), Leibniz University Hannover
Pseudocode | No | The paper describes various AutoRL methods and algorithms conceptually, and uses diagrams such as Figure 1 and Figure 2 to illustrate components and loops, but it does not contain any structured pseudocode or algorithm blocks with step-by-step instructions.
Open Source Code | No | The paper is a survey and does not present original methodology requiring a code release. It mentions third-party libraries such as JAX, TensorFlow, and PyTorch in the context of autodifferentiation, but does not claim to release any code for the survey itself.
Open Datasets | Yes | The paper mentions and cites several well-known public benchmarks and environments used in reinforcement learning research, including "OpenAI Gym (Brockman et al., 2016)", "Arcade Learning Environment (Bellemare et al., 2012)", "OpenAI Procgen (Cobbe et al., 2020)", "CoinRun (Cobbe et al., 2019a)", "MiniGrid (Chevalier-Boisvert et al., 2018)", "NetHack (Küttler et al., 2020)", "MineRL (Guss et al., 2019)", and "Meta-World (Yu et al., 2019)".
Dataset Splits | No | The paper is a survey that discusses concepts related to training and validation rewards and references how other works use dataset distributions. For instance, it states: "f(ζ, θ) can be defined as the validation reward, i.e. the reward in the outer loop, whereas J(θ; ζ) can be considered the training reward, i.e. the reward in the inner loop." However, it does not specify any dataset splits of its own, because as a survey it conducts no original experiments.
Hardware Specification | No | The paper is a survey and does not describe experiments conducted by the authors. While it mentions resource requirements for certain methods (e.g., "thousands of CPU cores" for evolutionary approaches, or "massively parallel simulation with a single GPU" in the context of the Brax physics engine for future work), it does not specify any particular hardware used by the authors for their own work.
Software Dependencies | No | The paper mentions popular machine learning frameworks such as "Jax (Bradbury et al., 2018), Tensorflow (Abadi et al., 2015), Pytorch (Paszke et al., 2019)" in Section 4.6. However, these are cited as examples of readily available autodifferentiation libraries in the context of gradient-based meta-learning, not as specific versioned software dependencies for the paper's own methodology.
Experiment Setup | No | The paper is a survey that discusses the importance of various hyperparameters (e.g., the discount factor γ and batch size B) and methods for their optimization in AutoRL. For example, Section 3.4 is titled "Last but not Least: What about Hyperparameters?". However, as a survey, it does not present specific experimental results or provide concrete hyperparameter values or training configurations used in its own empirical work.
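The inner/outer-loop objective quoted in the Dataset Splits row, with training reward J(θ; ζ) in the inner loop and validation reward f(ζ, θ) in the outer loop, can be sketched as a minimal bilevel search. This is an illustrative toy only, not the survey's method: the two-armed bandit environment, the ε-greedy inner loop, and the use of the learning rate as the hyperparameter ζ are all assumptions made for the sake of a runnable example.

```python
import random

def inner_train(zeta, episodes=200, seed=0):
    """Inner loop: maximize training reward J(theta; zeta) on a toy 2-armed bandit.

    theta is a pair of action-value estimates; the hyperparameter zeta
    plays the role of the inner-loop learning rate.
    """
    rng = random.Random(seed)
    theta = [0.0, 0.0]           # value estimate per arm
    payout = (0.3, 0.7)          # true success probability per arm (assumed)
    for _ in range(episodes):
        # epsilon-greedy action selection during training
        if rng.random() < 0.1:
            a = rng.randrange(2)
        else:
            a = max(range(2), key=lambda i: theta[i])
        r = 1.0 if rng.random() < payout[a] else 0.0
        theta[a] += zeta * (r - theta[a])   # incremental update, step size zeta
    return theta

def validation_reward(theta, episodes=500, seed=1):
    """Outer objective f(zeta, theta): greedy policy scored on held-out rollouts."""
    rng = random.Random(seed)
    payout = (0.3, 0.7)
    a = max(range(2), key=lambda i: theta[i])
    return sum(rng.random() < payout[a] for _ in range(episodes)) / episodes

# Outer loop: random/grid search over the hyperparameter zeta,
# selecting the candidate with the highest validation reward.
best = max(
    ((z, validation_reward(inner_train(z))) for z in (0.01, 0.1, 0.5)),
    key=lambda zr: zr[1],
)
print(best)  # (chosen zeta, its validation reward)
```

Any outer-loop optimizer discussed in the survey (Bayesian optimization, evolution, population-based training) could replace the grid search here; the point is only the separation of the two reward signals.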