Granger Causal Interaction Skill Chains

Authors: Caleb Chuck, Kevin Black, Aditya Arjun, Yuke Zhu, Scott Niekum

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate COInS on a robotic pushing task with obstacles, a challenging domain where other RL and HRL methods fall short. We also demonstrate the transferability of skills learned by COInS, using variants of Breakout, a common RL benchmark, and show 2-3x improvement in both sample efficiency and final performance compared to standard RL baselines.
Researcher Affiliation | Academia | Caleb Chuck (EMAIL), University of Texas at Austin; Kevin Black, University of California Berkeley; Aditya Arjun, University of Texas at Austin; Yuke Zhu, University of Texas at Austin; Scott Niekum, University of Massachusetts Amherst
Pseudocode | Yes | Algorithm box 9 describes the algorithm.
Open Source Code | No | The paper does not contain any explicit statement about providing access to their source code, nor does it include a link to a code repository. It mentions third-party tools like 'robosuite' but not their own implementation code.
Open Datasets | Yes | We systematically evaluate COInS in two domains: 1) an adapted version of the common Atari baseline Breakout (Bellemare et al., 2013) (Figure 1 and Appendix A.1) and 2) a simulated Robot pushing domain in robosuite (Zhu et al., 2020).
Dataset Splits | No | The paper describes using a 'dataset D of state, action, next state tuples' collected during skill learning and mentions training and evaluation. However, it does not specify explicit train/test/validation splits for a fixed dataset, as it operates within continuous simulation environments where data is generated during training.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. It mentions 'robotic pushing task' and 'simulated', but no specific hardware.
Software Dependencies | No | The paper mentions several software components and algorithms used (e.g., 'Rainbow', 'Soft Actor-Critic', 'PointNet style architecture', 'robosuite'). However, it does not provide specific version numbers for these components as they were used in the authors' implementation.
Experiment Setup | Yes | In Sections A.1 and A.2 we briefly describe some of the hyperparameter details related to the COInS training process, but in this section, we provide additional context for the sensitivity of hyperparameters described in Table 6 and throughout the work. In general, the volume of hyperparameters comes from the reality that a hierarchical causal algorithm is a complex system. RL, causal learning, and hierarchy all contribute hyperparameters, and though the overall algorithm may not be sensitive to most of them, some value must be assigned to each.