reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Latent Dynamic Robust Representations for World Models

Authors: Ruixiang Sun, Hongyu Zang, Xin Li, Riashat Islam

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our empirical evaluation demonstrates significant performance improvements over existing methods in a range of visually complex control tasks such as Maniskill (Gu et al., 2023) with exogenous distractors from the Matterport environment.
Researcher Affiliation	Collaboration	1 Beijing Institute of Technology, China; 2 DreamFold AI, Canada.
Pseudocode	Yes	Algorithm 1 HRSSM
Open Source Code	Yes	Our code is available at https://github.com/bit1029public/HRSSM.
Open Datasets	Yes	We perform our experiments in three distinct settings: i) a set of MuJoCo tasks (Todorov et al., 2012) provided by DeepMind Control (DMC) suite (Tassa et al., 2018), ii) a variant of DeepMind Control Suite where the background is replaced with grayscale natural videos from Kinetics dataset (Kay et al., 2017), termed as Distracted DeepMind Control Suite (Zhang et al., 2018), and iii) a benchmark based on the Maniskill2 (Gu et al., 2023), enhanced with realistic images of human homes (Chang et al., 2017) as backgrounds and was introduced in (Zhu et al., 2023).
Dataset Splits	No	The paper describes experiments conducted in various environments but does not provide explicit training, validation, or test dataset splits in terms of percentages or sample counts. Data is typically generated through interaction with the environment in RL.
Hardware Specification	Yes	We compare the wall-clock training time of our method and DreamerV3 in the Realistic Maniskill environment, with the use of a server with NVIDIA A100-SXM4 (40 GB memory) GPU.
Software Dependencies	No	The paper mentions using an 'unofficial open-sourced pytorch version of DreamerV3 (NM512, 2023)', but it does not specify the version numbers for PyTorch or other key software components used for reproducibility.
Experiment Setup	Yes	Table 3. Our model’s hyperparameters, which are the same across all tasks in DMControl and Realistic Maniskill. The table lists hyperparameters including replay capacity (FIFO) 10^6, batch size B = 16, batch length T = 64, learning rate 10^-4, mask ratio 50%, and cube spatial size h × w = 10 × 10.
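
For readers who want to carry the Experiment Setup values into their own runs, the hyperparameters quoted from Table 3 can be collected in a small config object. This is only a sketch: the class name and field names below are hypothetical and are not taken from the HRSSM repository, and only the values quoted in the row above are included. The remaining Table 3 entries are omitted rather than guessed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class QuotedHyperparams:
    """Hypothetical container for the Table 3 values quoted above.

    Field names are illustrative assumptions, not the authors' identifiers.
    """
    replay_capacity: int = 10**6                   # Replay capacity (FIFO)
    batch_size: int = 16                           # B
    batch_length: int = 64                         # T
    learning_rate: float = 1e-4
    mask_ratio: float = 0.5                        # 50%
    cube_spatial_size: Tuple[int, int] = (10, 10)  # (h, w)

    def timesteps_per_batch(self) -> int:
        # Each batch holds B sequences of T steps.
        return self.batch_size * self.batch_length


config = QuotedHyperparams()
print(config)
print("timesteps per batch:", config.timesteps_per_batch())
```

Freezing the dataclass keeps the quoted values from being changed by accident during an experiment; any deviation then has to be an explicit new instance.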