Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning

Authors: Xinyue Wang, Biwei Huang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on numerical simulations and real-world robotic manipulation tasks demonstrate that WM3C significantly outperforms existing methods in identifying latent processes, improving policy learning, and generalizing to unseen tasks.
Researcher Affiliation | Academia | Xinyue Wang, Biwei Huang (University of California San Diego)
Pseudocode | No | The paper includes mathematical formulations and descriptions of its algorithm, but does not contain any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code | No | Project website: https://www.charonwangg.com/project/wm3c
Open Datasets | Yes | We conduct experiments in the robotic simulation environment Meta-world (Yu et al., 2019).
Dataset Splits | Yes | We split all possible combinations into a training set for the i.i.d. component identification test and a test set for the o.o.d. component prediction test. ... We take 18 training tasks that follow this structure and 9 test tasks that are either new combinations of known language components or mixtures of known and unknown language components.
Hardware Specification | Yes | All experiments are conducted on 4 Nvidia 3090 GPUs.
Software Dependencies | No | We utilize the state-of-the-art model-based reinforcement learning algorithm, DreamerV3 (Hafner et al., 2023)... For WM3C CNN, we build upon the JAX implementation of DreamerV3 small... For DreamerV3, we take the official JAX implementation of DreamerV3 (Hafner et al., 2023) from https://github.com/danijar/dreamerv3... For visual-based multi-task SAC (MT-SAC), we take the visual-based SAC implementation from https://github.com/KarlXing/RL-Visual-Continuous-Control... The paper mentions specific software frameworks and implementations but does not provide version numbers for key dependencies like JAX or other libraries.
Experiment Setup | Yes | Table 2: Hyperparameters for WM3C CNN implementation in Meta-world. ... Table 3: MAE and ViT hyperparameters for WM3C MAE in Meta-world.