Hierarchical Reinforcement Learning with Targeted Causal Interventions
Authors: Mohammadsadegh Khorasani, Saber Salehkaleybar, Negar Kiyavash, Matthias Grossglauser
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results on HRL tasks also illustrate that our proposed framework outperforms existing work in terms of training cost. ... In this section, we compare our proposed methods with previous work. First, we compare our proposed ranking rules with a random strategy using synthetic data. ... To evaluate the effectiveness of our proposed methods in a realistic setting, we conducted experiments using a complex long-horizon game called 2D-Minecraft (Sohn et al., 2018). |
| Researcher Affiliation | Academia | 1 School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland; 2 Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Leiden, The Netherlands; 3 College of Management of Technology, EPFL, Lausanne, Switzerland. |
| Pseudocode | Yes | Algorithm 1 Hierarchical Reinforcement Learning via Causality (HRC) |
| Open Source Code | Yes | The code for all experiments is available at https://github.com/sadegh16/HRC. |
| Open Datasets | Yes | To evaluate the effectiveness of our proposed methods in a realistic setting, we conducted experiments using a complex long-horizon game called 2D-Minecraft (Sohn et al., 2018). ... For more empirical comparison, we conducted experiments on the Craft World environment (non-binary) provided by (Wang et al., 2024) (SkiLD). |
| Dataset Splits | No | The paper describes generating synthetic graphs for experiments and using the 2D-Minecraft and Craft World environments, but does not explicitly provide details about training/test/validation dataset splits, percentages, or specific split methodologies for any of these. |
| Hardware Specification | Yes | We utilized a Linux server with Intel Xeon CPU E5-2680 v3 (24 cores) operating at 2.50GHz with 377 GB DDR4 of memory and Nvidia Titan X Pascal GPU. |
| Software Dependencies | No | The paper mentions 'LASSO L1-RATIO' in Table 3 for hyperparameters, which implies the use of a software library for LASSO, but it does not specify any software names with version numbers for any libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | Table 3 shows the key hyperparameters used in our setup. Additional hyperparameter values can be found in the supplementary materials. |