Neural Causal Structure Discovery from Interventions

Authors: Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Bernhard Schölkopf, Michael Curtis Mozer, Christopher Pal, Yoshua Bengio

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository. Our model outperforms previous methods on both synthetic and naturalistic Causal Bayesian Networks (CBNs). We evaluate the performance of SDI across a range of experiments of increasing difficulty. We then evaluate the proposed method on real-world datasets from the bnlearn dataset repository.
Researcher Affiliation Collaboration Nan Rosemary Ke (Google DeepMind); Olexa Bilaniuk (Mila); Anirudh Goyal (Google DeepMind); Stefan Bauer (Technical University of Munich); Hugo Larochelle (Google DeepMind); Bernhard Schölkopf (Max Planck Institute for Intelligent Systems); Michael C. Mozer (Google DeepMind); Chris Pal (Mila, Polytechnique Montréal); Yoshua Bengio (Mila, University of Montreal, CIFAR Senior Fellow)
Pseudocode No The paper describes the proposed method, SDI, in Section 4, detailing Phase 1 (Distribution Fitting) and Phase 2 (Graph Learning) through prose and a diagram (Figure 1), but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code No No explicit statement or link for open-source code was found in the paper.
Open Datasets Yes The Bayesian Network Repository (www.bnlearn.com/bnrepository) is a collection of commonly-used causal Bayesian networks from the literature, suitable for Bayesian and causal learning benchmarks. We evaluate the proposed method on the Earthquake (Korb & Nicholson, 2010), Cancer (Korb & Nicholson, 2010), Asia (Lauritzen & Spiegelhalter, 1988) and Sachs (Sachs et al., 2005) datasets (M = 5, 5, 8 and 11 variables respectively, maximum in-degree 3) in the bnlearn dataset repository. We evaluate SDI on Barley (Kristensen & Rasmussen, 2002) (M = 48) and Alarm (Beinlich et al., 1989) (M = 37) from the bnlearn repository.
Dataset Splits No The paper mentions using "observational and interventional data" and generating "synthetic data" and using "real-world datasets." It also refers to "a small set of holdout interventional data samples" in Section 4.2.1 and "fresh interventional data" in Section 6.3. However, it does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages, sample counts, or references to standard predefined splits).
Hardware Specification No No specific hardware details (such as GPU/CPU models, processor types, or cloud instance specifications) were mentioned in the paper for running the experiments.
Software Dependencies No The paper describes algorithms and methods (e.g., MLPs, SGD, Adam optimizer) but does not list specific software libraries or their version numbers used for implementation, which would be necessary for reproducibility.
Experiment Setup Yes Unless specified otherwise, we maintained identical hyperparameters for all experiments. For all experiments, we set the DAG penalty to 0.5 and the sparsity penalty to 0.1. The experiments involving SDI were executed for a total of 50,000 steps, as illustrated in Figure 10, and most experiments reached convergence within that timeframe. The parameters of the neural network are initialized orthogonally within the range of (-2.5, 2.5). The biases are initialized uniformly between (-1.1, 1.1).
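The initialization scheme quoted above can be sketched as follows. This is a hypothetical reading, not the authors' released code: the names `init_weights` and `init_biases` are illustrative, and interpreting "initialized orthogonally within the range of (-2.5, 2.5)" as orthogonalize-then-scale-and-clip is an assumption.

```python
import numpy as np

def init_weights(rng, fan_in, fan_out, limit=2.5):
    """Orthogonal weight matrix with entries kept within (-limit, limit).

    Hypothetical reading of the paper's description: orthogonalize a
    Gaussian draw via QR, then scale into the stated range and clip.
    """
    a = rng.standard_normal((fan_in, fan_out))
    q, r = np.linalg.qr(a)
    q = q * np.sign(np.diag(r))  # resolve QR sign ambiguity per column
    return np.clip(q * limit, -limit, limit)

def init_biases(rng, size, limit=1.1):
    """Biases drawn uniformly from (-limit, limit), as stated in the paper."""
    return rng.uniform(-limit, limit, size)

rng = np.random.default_rng(0)
W = init_weights(rng, 8, 8)  # one hidden layer's weights
b = init_biases(rng, 8)
```

Because the columns of `W` are orthonormal up to the scale factor, the clip is a no-op for square matrices; it only guards the general case.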