Self-supervised Color Generalization in Reinforcement Learning
Authors: Matthias Weissenbacher, Evangelos Routis, Yoshinobu Kawahara
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate our method in the MiniGrid, Procgen, and DeepMind Control suites and find improved color sensitivity and generalisation. |
| Researcher Affiliation | Collaboration | Matthias Weissenbacher (RIKEN Center for Advanced Intelligence Project; Pyr-SAI Labs, Japan); Evangelos Routis (Causaly, London, United Kingdom); Yoshinobu Kawahara (RIKEN Center for Advanced Intelligence Project; Osaka University, Japan) |
| Pseudocode | No | The paper describes algorithms like rDMD and CiL mathematically and in narrative text, but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | Yes | In section 4.2 we perform our main experiments on the Procgen environment. The code is made public on GitHub. |
| Open Datasets | Yes | We empirically evaluate our method in the MiniGrid, Procgen, and DeepMind Control suites... The Lava Crossing environment, a standard in the MiniGrid toolkit (Chevalier-Boisvert et al., 2019)... The Procgen benchmark consists of sixteen procedurally generated games... Procgen generalization benchmark (Cobbe et al., 2020)... DeepMind Control suite (DMControl) (Tassa et al., 2018). |
| Dataset Splits | Yes | Following the setup from (Cobbe et al., 2020), agents are trained on a fixed set of n = 200 levels (generated using seeds from 1 to 200) and tested on the full distribution of levels (generated by sampling seeds uniformly at random from all computer integers). |
| Hardware Specification | Yes | All experiments were performed on NVIDIA A100 or V100 GPUs. |
| Software Dependencies | No | The paper mentions 'torch.svd' and algorithms 'PPO/DrAC' and 'SAC' but does not provide specific version numbers for these or other software libraries. |
| Experiment Setup | Yes | We summarize the hyperparameter choices in Table 5. Table 5: Architecture and hyper-parameter choices for CiL on Procgen, DMControl, MiniGrid based on (Raileanu et al., 2020), (Hansen & Wang, 2021), and (Jiang et al., 2021), respectively. Channels refer to the category channels. We use the algorithms in the code-base without any hyper-parameter changes except for reduction of hidden-dim of the actor-critic networks to 64. The patch size follows the convention in Vision Transformers; for a 64x64 pixel input, we use 8x8 patches. |
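The train/test split protocol quoted in the Dataset Splits row (a fixed set of 200 training levels from seeds 1 to 200, with test levels sampled uniformly from the full integer seed range) can be sketched in plain Python. This is an illustrative sketch only; the names `TRAIN_SEEDS` and `sample_test_seed` are hypothetical and do not come from the paper or the Procgen codebase.

```python
import random

# Fixed training set: n = 200 levels generated from seeds 1..200
# (following the setup of Cobbe et al., 2020, as quoted above).
TRAIN_SEEDS = list(range(1, 201))

def sample_test_seed(rng: random.Random) -> int:
    # Test levels: a seed drawn uniformly at random; here we assume the
    # non-negative 31-bit integer range as a stand-in for "all computer integers".
    return rng.randrange(0, 2**31)

rng = random.Random(0)
test_seeds = [sample_test_seed(rng) for _ in range(5)]
```

The key property of the protocol is that the training distribution is a small fixed subset of the seed space, while evaluation samples from the whole space, so most test levels are unseen during training.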
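The patch convention mentioned in the Experiment Setup row (8x8 patches on a 64x64 pixel input, following Vision Transformers) determines a fixed number of patches per image. A minimal check, where `num_patches` is a hypothetical helper, not a function from the paper's code:

```python
def num_patches(image_size: int = 64, patch_size: int = 8) -> int:
    # Non-overlapping tiling of a square image: (image_size // patch_size) ** 2
    # patches, e.g. a 64x64 input with 8x8 patches gives 8 * 8 = 64 patches.
    assert image_size % patch_size == 0, "image must tile evenly into patches"
    return (image_size // patch_size) ** 2
```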