Rollout Total Correlation for Deep Reinforcement Learning
Authors: Bang You, Huaping Liu, Jan Peters, Oleg Arenz
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluations on a set of challenging image-based simulated control tasks show that our method achieves better sample efficiency, and robustness to both white noise and natural video backgrounds compared to leading baselines. |
| Researcher Affiliation | Academia | Bang You (EMAIL), School of Information Engineering, Wuhan University of Technology; Huaping Liu (EMAIL), Department of Computer Science and Technology, Tsinghua University; Jan Peters (EMAIL), Intelligent Autonomous Systems, Technische Universität Darmstadt; German Research Center for AI (DFKI); Hessian Centre for Artificial Intelligence (Hessian.AI); Centre for Cognitive Science (Cog Sci); Oleg Arenz (EMAIL), Intelligent Autonomous Systems, Technische Universität Darmstadt |
| Pseudocode | Yes | B.10 Algorithm The training procedure of MTC is presented in Algorithm 1. Algorithm 1: Training Algorithm for ROTOC |
| Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the ROTOC methodology described. It mentions using a "publicly released standard Pytorch implementation (Yarats et al., 2021b) of SAC" for baselines, but not for their own work. |
| Open Datasets | Yes | We evaluate the ROTOC on a set of challenging standard Mujoco tasks from the Deepmind control suite (Tassa et al., 2018)...the background of the Mujoco tasks is replaced by natural videos (Zhang et al., 2020) sampled from the Kinetics dataset (Kay et al., 2017). |
| Dataset Splits | No | The paper describes using tasks from the Deepmind control suite in standard, noisy, and natural video settings for training and evaluation. It does not specify explicit train/test/validation splits for a static dataset in the traditional sense, nor does it detail how the Kinetics dataset videos are split for background usage in experiments beyond being 'sampled'. |
| Hardware Specification | No | The paper mentions "a hardware donation by NVIDIA through the Academic Grant Program" in the acknowledgments, but does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using a "publicly released standard Pytorch implementation" but does not specify the version of PyTorch or of any other software libraries used (e.g., Python, CUDA, NumPy). |
| Experiment Setup | Yes | All hyperparameters of SAC are fixed across tasks and shown in Table B.1. Table B.1 (shared hyperparameters across tasks): replay buffer capacity 100,000; optimizer Adam; critic learning rate 10⁻³; critic Q-function EMA 0.01; critic target update frequency 2; actor learning rate 10⁻³; actor update frequency 2; actor log-stddev bounds [-10, 2]; temperature learning rate 10⁻³; initial steps 1000; discount 0.99; initial temperature 0.1; learning rate for ϕo, go, qψ, dυ and fo 10⁻⁴; encoder and projection model EMA τ 0.05; coefficient α 0.1; coefficient λ 0.001. Table B.2 (task-specific hyperparameters) lists the action repeat and batch size for each task. |
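
The shared hyperparameters reported in Table B.1 can be collected as a plain config dict for reference; this is a minimal sketch, and the key names are illustrative assumptions rather than the paper's actual configuration schema:

```python
# Shared SAC/ROTOC hyperparameters as reported in Table B.1 of the paper.
# Key names are illustrative, not taken from the authors' code (which is
# not publicly released).
shared_hparams = {
    "replay_buffer_capacity": 100_000,
    "optimizer": "Adam",
    "critic_lr": 1e-3,
    "critic_q_function_ema": 0.01,
    "critic_target_update_freq": 2,
    "actor_lr": 1e-3,
    "actor_update_freq": 2,
    "actor_log_stddev_bounds": (-10, 2),
    "temperature_lr": 1e-3,
    "initial_steps": 1000,
    "discount": 0.99,
    "initial_temperature": 0.1,
    # Learning rate shared by the auxiliary models phi_o, g_o, q_psi,
    # d_upsilon, and f_o.
    "aux_model_lr": 1e-4,
    "encoder_projection_ema_tau": 0.05,
    "coefficient_alpha": 0.1,
    "coefficient_lambda": 0.001,
}
```

Per Table B.2, the action repeat and batch size would be added on top of this dict per task.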