Variational Causal Dynamics: Discovering Modular World Models from Interventions
Authors: Anson Lei, Bernhard Schölkopf, Ingmar Posner
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In evaluations on simulated environments with state and image observations, we show that VCD is able to successfully identify causal variables, and to discover consistent causal structures across different environments. Moreover, given a small number of observations in a previously unseen, intervened environment, VCD is able to identify the sparse changes in the dynamics and to adapt efficiently. In doing so, VCD significantly extends the capabilities of the current state-of-the-art in latent world models while also comparing favourably in terms of prediction accuracy. |
| Researcher Affiliation | Academia | Anson Lei (EMAIL), Applied AI Lab, University of Oxford; Bernhard Schölkopf (EMAIL), MPI for Intelligent Systems, Tübingen; Ingmar Posner (EMAIL), Applied AI Lab, University of Oxford |
| Pseudocode | No | The paper describes the methodology and training process in detail using equations and textual explanations (e.g., Section 4.3 Training, Section 4.4 Adaptation) but does not include a formal pseudocode or algorithm block. |
| Open Source Code | Yes | The code for the experiments is available at https://github.com/applied-ai-lab/VCD. |
| Open Datasets | No | Dataset: We evaluate VCD on a simulated dataset of a 2-D multibody system which contains four particles that affect each other via a spring or an electrostatic-like force. The configuration of the environment is shown in Fig. 3. The action a ∈ ℝ² is an external force that applies to particle 4. The environment is designed such that there is an unambiguous ground-truth causal graph between the causal variables and well-defined changes in the dynamics. |
| Dataset Splits | Yes | VCD and the baselines are trained on a dataset composed of 2000 trajectories from each of the undisturbed environment and the five intervened environments. The models are evaluated on trajectories from a validation set drawn from the training environments. In both experiments, the models are trained on a training set of 2000 trajectories from each of the six environments and evaluated on a validation set of 400 unseen trajectories. |
| Hardware Specification | Yes | All models are trained on a single Nvidia Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions using GRUs and the ADAM optimizer but does not specify software versions for these or other libraries (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | In both experiments, the training objective is maximised using the ADAM optimiser (Kingma & Ba, 2014) with learning rate 10⁻³ for mixed-state, and 10⁻⁴ for images. In both environments, we clip the log variance to 3, with a batch size of two trajectories from each of the six environments with T = 50. In VCD, the hyperparameters λ_G, λ_I are both set to 0.01. |
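
The Experiment Setup row packs the training hyperparameters into one sentence. A minimal sketch of that configuration, assuming hypothetical identifiers throughout (only the numeric values come from the quoted text, and the symmetric clamp is one interpretation of "clip the log variance to 3"), might look like:

```python
# Hedged sketch of the quoted training configuration for VCD.
# All names are hypothetical; learning rates, batch composition,
# trajectory length, and lambda weights are taken from the paper's text.

CONFIG = {
    "optimiser": "Adam",        # ADAM optimiser (Kingma & Ba, 2014)
    "lr_mixed_state": 1e-3,     # learning rate for mixed-state observations
    "lr_images": 1e-4,          # learning rate for image observations
    "trajectories_per_env": 2,  # two trajectories from each environment per batch
    "num_envs": 6,              # undisturbed + five intervened environments
    "trajectory_length": 50,    # T = 50
    "lambda_G": 0.01,           # sparsity weight on the causal graph
    "lambda_I": 0.01,           # sparsity weight on the intervention targets
}

def clip_log_var(log_var: float, limit: float = 3.0) -> float:
    """Clamp a predicted log-variance into [-limit, limit] for numerical
    stability (interpreting the paper's "clip the log variance to 3")."""
    return max(-limit, min(limit, log_var))

def batch_size(cfg: dict) -> int:
    """Total trajectories per batch: two from each of the six environments."""
    return cfg["trajectories_per_env"] * cfg["num_envs"]
```

With this reading, each batch contains 12 trajectories of length 50, and the two regularisation weights are tied at 0.01.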