Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
Authors: Xinyue Wang, Biwei Huang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on numerical simulations and real-world robotic manipulation tasks demonstrate that WM3C significantly outperforms existing methods in identifying latent processes, improving policy learning, and generalizing to unseen tasks. |
| Researcher Affiliation | Academia | Xinyue Wang, Biwei Huang (University of California San Diego) |
| Pseudocode | No | The paper includes mathematical formulations and descriptions of its algorithm, but does not contain any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | Project website: https://www.charonwangg.com/project/wm3c |
| Open Datasets | Yes | We conduct experiments in the robotic simulation environment, Meta-world (Yu et al., 2019). |
| Dataset Splits | Yes | We split all possible combinations into a training set for the i.i.d component identification test and a test set for the o.o.d component prediction test. ... We take 18 training tasks that follow this structure and 9 test tasks that either are new combinations of known language components or mixtures of known and unknown language components. |
| Hardware Specification | Yes | All experiments are conducted on 4 Nvidia 3090 GPUs. |
| Software Dependencies | No | We utilize the state-of-the-art model-based reinforcement learning algorithm, DreamerV3 (Hafner et al., 2023)... For WM3C CNN, we build upon the JAX implementation of DreamerV3 small... For DreamerV3, we take the official JAX implementation of DreamerV3 (Hafner et al., 2023) from https://github.com/danijar/dreamerv3... For visual-based multi-task SAC (MT-SAC), we take the visual-based SAC implementation from https://github.com/KarlXing/RL-Visual-Continuous-Control... The paper mentions specific software frameworks and implementations but does not provide version numbers for key dependencies like JAX or other libraries. |
| Experiment Setup | Yes | Table 2: Hyperparameters for WM3C CNN implementation in Meta-world. ... Table 3: MAE and ViT hyperparameters for WM3C MAE in Meta-world. |