Granger Causal Interaction Skill Chains
Authors: Caleb Chuck, Kevin Black, Aditya Arjun, Yuke Zhu, Scott Niekum
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate COInS on a robotic pushing task with obstacles, a challenging domain where other RL and HRL methods fall short. We also demonstrate the transferability of skills learned by COInS using variants of Breakout, a common RL benchmark, and show a 2-3x improvement in both sample efficiency and final performance compared to standard RL baselines. |
| Researcher Affiliation | Academia | Caleb Chuck (University of Texas at Austin), Kevin Black (University of California, Berkeley), Aditya Arjun (University of Texas at Austin), Yuke Zhu (University of Texas at Austin), Scott Niekum (University of Massachusetts Amherst) |
| Pseudocode | Yes | Algorithm box 9 describes the algorithm. |
| Open Source Code | No | The paper does not contain any explicit statement about providing access to their source code, nor does it include a link to a code repository. It mentions third-party tools like 'robosuite' but not their own implementation code. |
| Open Datasets | Yes | We systematically evaluate COInS in two domains: 1) an adapted version of the common Atari baseline Breakout (Bellemare et al., 2013) (Figure 1 and Appendix A.1) and 2) a simulated robot pushing domain in robosuite (Zhu et al., 2020). |
| Dataset Splits | No | The paper describes using a 'dataset D of state, action, next state tuples' collected during skill learning and mentions training and evaluation. However, it does not specify explicit train/test/validation splits for a fixed dataset, as it operates within continuous simulation environments where data is generated during training. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. It mentions 'robotic pushing task' and 'simulated', but no specific hardware. |
| Software Dependencies | No | The paper mentions several software components and algorithms used (e.g., 'Rainbow', 'Soft Actor-Critic', 'Point Net style architecture', 'robosuite'). However, it does not provide specific version numbers for these components as they were used in the authors' implementation. |
| Experiment Setup | Yes | In Sections A.1 and A.2 we briefly describe some of the hyperparameter details related to the COIn S training process, but in this section, we provide additional context for the sensitivity of hyperparameters described in Table 6 and throughout the work. In general, the volume of hyperparameters comes from the reality that a hierarchical causal algorithm is a complex system. RL, causal learning, and hierarchy all contribute hyperparameters, and though the overall algorithm may not be sensitive to most of them, some value must be assigned to each. |
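The method assessed above builds skill chains from Granger-causal interactions between entities, learned from a dataset of state, action, next-state tuples. As background for that idea, the following is a minimal illustrative sketch of a pairwise Granger-style test on time-series data; it is not the authors' implementation, and the function names (`granger_residuals`, `granger_score`) and the linear one-lag model are assumptions chosen for brevity.

```python
import numpy as np

def granger_residuals(y, x=None, lag=1):
    """One-step least-squares prediction residuals of y from its own
    past (and, optionally, the past of x)."""
    rows, targets = [], []
    for t in range(lag, len(y)):
        feats = list(y[t - lag:t])
        if x is not None:
            feats += list(x[t - lag:t])
        rows.append(feats + [1.0])  # bias term
        targets.append(y[t])
    A, b = np.array(rows), np.array(targets)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return b - A @ coef

def granger_score(x, y, lag=1):
    """Log ratio of residual variances with vs. without x's past.
    Scores well above 0 suggest x Granger-causes y."""
    r_restricted = granger_residuals(y, None, lag)
    r_full = granger_residuals(y, x, lag)
    return np.log(np.var(r_restricted) / np.var(r_full))

# Toy demo: x drives y with a one-step delay; z is unrelated noise.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = 0.9 * np.roll(x, 1) + rng.normal(scale=0.1, size=2000)
z = rng.normal(size=2000)
print(granger_score(x, y))  # large positive: x predicts y
print(granger_score(z, y))  # near zero: z adds no predictive power
```

In the paper's setting, the analogous comparison is made between transition models that do and do not condition on a candidate parent entity's state, rather than between scalar time series as here.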