Variational Causal Dynamics: Discovering Modular World Models from Interventions

Authors: Anson Lei, Bernhard Schölkopf, Ingmar Posner

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In evaluations on simulated environments with state and image observations, we show that VCD is able to successfully identify causal variables, and to discover consistent causal structures across different environments. Moreover, given a small number of observations in a previously unseen, intervened environment, VCD is able to identify the sparse changes in the dynamics and to adapt efficiently. In doing so, VCD significantly extends the capabilities of the current state-of-the-art in latent world models while also comparing favourably in terms of prediction accuracy.
Researcher Affiliation | Academia | Anson Lei (EMAIL), Applied AI Lab, University of Oxford; Bernhard Schölkopf (EMAIL), MPI for Intelligent Systems, Tübingen; Ingmar Posner (EMAIL), Applied AI Lab, University of Oxford
Pseudocode | No | The paper describes the methodology and training process in detail using equations and textual explanations (e.g., Section 4.3 Training, Section 4.4 Adaptation) but does not include a formal pseudocode or algorithm block.
Open Source Code | Yes | The code for the experiments is available at https://github.com/applied-ai-lab/VCD.
Open Datasets | No | We evaluate VCD on a simulated dataset of a 2-D multibody system which contains four particles that affect each other via a spring or an electrostatic-like force. The configuration of the environment is shown in Fig. 3. The action a ∈ R^2 is an external force applied to particle 4. The environment is designed such that there is an unambiguous ground-truth causal graph between the causal variables and well-defined changes in the dynamics.
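To make the described environment concrete, the following is a minimal illustrative sketch (not the authors' simulator) of four particles under spring and electrostatic-like pairwise forces, with the external action applied to particle 4. All constants, the Euler integrator, and the exact force mix are assumptions for illustration:

```python
import numpy as np

def pairwise_forces(pos, k_spring=1.0, rest_len=1.0, k_elec=0.5):
    """Net force on each particle: Hooke spring toward a rest length
    plus an inverse-square electrostatic-like repulsion (hypothetical mix)."""
    n = pos.shape[0]
    forces = np.zeros_like(pos)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = pos[j] - pos[i]
            r = np.linalg.norm(d) + 1e-8   # avoid division by zero
            unit = d / r
            forces[i] += k_spring * (r - rest_len) * unit - (k_elec / r**2) * unit
    return forces

def step(pos, vel, action, dt=0.01):
    """One Euler integration step; `action` acts only on particle 4 (index 3)."""
    f = pairwise_forces(pos)
    f[3] += action
    vel = vel + dt * f
    pos = pos + dt * vel
    return pos, vel

# Four particles on a unit square, one external force on particle 4.
pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
vel = np.zeros_like(pos)
pos, vel = step(pos, vel, action=np.array([0.1, 0.0]))
```

Because the internal pairwise forces cancel in equal-and-opposite pairs, the net momentum change after one step comes only from the external action, which matches the environment's description of the action as the sole external force.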
Dataset Splits | Yes | VCD and the baselines are trained on a dataset composed of 2000 trajectories from each of the undisturbed environment and five intervened environments. The models are evaluated on trajectories from a validation set drawn from the training environments. In both experiments, the models are trained on a training set of 2000 trajectories from each of the six environments and evaluated on a validation set of 400 unseen trajectories.
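The quoted counts can be sketched as an index-building helper. The function name and the way validation trajectories are distributed across environments are assumptions; the paper states only the totals (2000 training trajectories per environment, 400 unseen validation trajectories):

```python
import random

def make_splits(n_envs=6, n_train_per_env=2000, n_val_total=400, seed=0):
    """Build (env, trajectory_id) index lists for training and validation.
    Trajectory IDs stand in for actual simulated rollouts."""
    rng = random.Random(seed)
    train, val = [], []
    for env in range(n_envs):
        train.extend((env, i) for i in range(n_train_per_env))
    # Validation trajectories are freshly generated (unseen); here they are
    # drawn uniformly across environments, which is an assumption.
    for i in range(n_val_total):
        val.append((rng.randrange(n_envs), n_train_per_env + i))
    return train, val

train, val = make_splits()   # 6 * 2000 = 12000 training, 400 validation
```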
Hardware Specification | Yes | All models are trained on a single Nvidia Tesla V100 GPU.
Software Dependencies | No | The paper mentions using GRUs and the ADAM optimizer but does not specify software versions for these or other libraries (e.g., PyTorch, TensorFlow, Python version).
Experiment Setup | Yes | In both experiments, the training objective is maximised using the ADAM optimiser (Kingma & Ba, 2014) with learning rate 10^-3 for mixed-state and 10^-4 for images. In both environments, we clip the log variance to 3, with a batch size of two trajectories from each of the six environments with T = 50. In VCD, the hyperparameters λ_G and λ_I are both set to 0.01.
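The reported settings can be collected into a configuration sketch. The dictionary keys and helper names are invented for illustration, and "clip the log variance to 3" is interpreted here as clamping to [-3, 3], which is an assumption:

```python
def make_config(obs_type: str) -> dict:
    """Collect the reported VCD training hyperparameters (key names assumed)."""
    assert obs_type in ("mixed-state", "images")
    return {
        "optimizer": "adam",                          # Kingma & Ba, 2014
        "lr": 1e-3 if obs_type == "mixed-state" else 1e-4,
        "log_var_clip": 3.0,                          # clamp magnitude (assumed)
        "batch_trajectories_per_env": 2,              # batch = 2 trajectories x 6 envs
        "n_envs": 6,
        "horizon_T": 50,
        "lambda_G": 0.01,                             # graph sparsity weight
        "lambda_I": 0.01,                             # intervention sparsity weight
    }

def clip_log_var(log_var: float, limit: float = 3.0) -> float:
    """Clamp a predicted log-variance to [-limit, limit] for numerical stability."""
    return max(-limit, min(limit, log_var))
```

Usage: `make_config("images")["lr"]` yields the image-observation learning rate of 10^-4, and `clip_log_var(-10.0)` returns -3.0.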