Learning Representations for Pixel-based Control: What Matters and Why?

Authors: Manan Tomar, Utkarsh Aashu Mishra, Amy Zhang, Matthew E. Taylor

TMLR 2023

Reproducibility checklist (Variable: Result — LLM response)
Research Type: Experimental — "We conduct experiments across multiple settings, including the MuJoCo domains from DMC Suite (Tassa et al., 2018) with natural distractors (Zhang et al., 2018; Kay et al., 2017; Stone et al., 2021), and Atari100K (Kaiser et al., 2019) from ALE (Bellemare et al., 2013)."
Researcher Affiliation: Collaboration — Manan Tomar (University of Alberta; Amii), Utkarsh A. Mishra (University of Alberta; Amii), Amy Zhang (FAIR, Menlo Park; University of California, Berkeley), Matthew E. Taylor (University of Alberta; Amii)
Pseudocode: No — The paper describes methods and losses using textual descriptions and mathematical equations, such as L_Baseline and L_DREAMER, but does not include any structured pseudocode or algorithm blocks.
Open Source Code: Yes — Code available: https://github.com/UtkarshMishra04/pixel-representations-RL
Open Datasets: Yes — "We conduct experiments across multiple settings, including the MuJoCo domains from DMC Suite (Tassa et al., 2018) with natural distractors (Zhang et al., 2018; Kay et al., 2017; Stone et al., 2021), and Atari100K (Kaiser et al., 2019) from ALE (Bellemare et al., 2013)." Kinetics dataset: https://github.com/Showmax/kinetics-downloader
Dataset Splits: No — The paper describes training and evaluating agents within reinforcement learning environments (DMC Suite and Atari100K) over environment steps or episodes, which is standard for RL. However, it does not specify explicit static train/validation/test splits with percentages or sample counts, as such splits are not applicable in the same way as for supervised learning on fixed datasets.
Hardware Specification: Yes — "All experiments were conducted on either system configuration of: 1. 6 CPU cores of Intel Gold 6148 Skylake @ 2.4 GHz, one NVIDIA V100 SXM2 (16 GB memory) GPU, and 84 GB RAM. 2. 6 CPU cores of Intel Xeon Gold 5120 Skylake @ 2.2 GHz, one NVIDIA V100 Volta (16 GB HBM2 memory) GPU, and 84 GB RAM."
Software Dependencies: No — The paper mentions using the SAC algorithm and references existing open-source implementations for certain architectures, but it does not specify versions for core software dependencies such as Python, deep learning frameworks (e.g., PyTorch, TensorFlow), or CUDA libraries.
Experiment Setup: Yes — "The full set of hyperparameters used for the baseline experiments are provided in Table 3 below." Table 3 ("Hyperparameters for Baseline and related ablations") lists: observation shape, latent dimension, replay buffer size, initial steps, stacked frames, action repeat, SAC hidden units, transition network details, reward network details, evaluation episodes, optimizer, beta values, learning rates, batch size, Q-function EMA, critic target update frequency, convolutional layers, number of filters, non-linearity, encoder EMA, and discount gamma.
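For a sense of how such a hyperparameter table translates into a reproducible experiment configuration, the sketch below collects the categories named above into a single config object. All concrete values are illustrative placeholders, not the paper's reported settings (those are in the paper's Table 3), and the class and field names are this sketch's own.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class BaselineConfig:
    """Hyperparameter categories from the paper's Table 3.

    Every default below is a hypothetical placeholder chosen for
    illustration only; consult the paper for the actual values.
    """
    observation_shape: Tuple[int, int, int] = (3, 84, 84)  # placeholder
    latent_dimension: int = 50                             # placeholder
    replay_buffer_size: int = 100_000                      # placeholder
    initial_steps: int = 1_000                             # placeholder
    stacked_frames: int = 3                                # placeholder
    action_repeat: int = 2                                 # placeholder
    sac_hidden_units: int = 1024                           # placeholder
    evaluation_episodes: int = 10                          # placeholder
    optimizer: str = "Adam"                                # placeholder
    adam_betas: Tuple[float, float] = (0.9, 0.999)         # placeholder
    learning_rate: float = 1e-3                            # placeholder
    batch_size: int = 128                                  # placeholder
    q_function_ema: float = 0.01                           # placeholder
    critic_target_update_freq: int = 2                     # placeholder
    num_conv_layers: int = 4                               # placeholder
    num_filters: int = 32                                  # placeholder
    non_linearity: str = "ReLU"                            # placeholder
    encoder_ema: float = 0.05                              # placeholder
    discount_gamma: float = 0.99                           # placeholder

# Instantiating with defaults; any field can be overridden per ablation.
cfg = BaselineConfig(latent_dimension=64)
print(cfg.latent_dimension, cfg.discount_gamma)
```

Keeping every hyperparameter in one typed dataclass makes it easy to log the full configuration alongside results, which is exactly the kind of record the reproducibility checklist is probing for.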