Neural Causal Structure Discovery from Interventions

Authors: Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Bernhard Schölkopf, Michael Curtis Mozer, Christopher Pal, Yoshua Bengio

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository. Our model outperforms previous methods on both synthetic and naturalistic Causal Bayesian Networks (CBNs). We evaluate the performance of SDI across a range of experiments of increasing difficulty. We then evaluate the proposed method on real-world datasets from the bnlearn dataset repository.
Researcher Affiliation Collaboration Nan Rosemary Ke (Google DeepMind); Olexa Bilaniuk (Mila); Anirudh Goyal (Google DeepMind); Stefan Bauer (Technical University of Munich); Hugo Larochelle (Google DeepMind); Bernhard Schölkopf (Max Planck Institute for Intelligent Systems); Michael C. Mozer (Google DeepMind); Chris Pal (Mila, Polytechnique Montréal); Yoshua Bengio (Mila, University of Montreal, CIFAR Senior Fellow)
Pseudocode No The paper describes the proposed method, SDI, in Section 4, detailing Phase 1 (Distribution Fitting) and Phase 2 (Graph Learning) through prose and a diagram (Figure 1), but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code No No explicit statement or link for open-source code was found in the paper.
Open Datasets Yes The Bayesian Network Repository (www.bnlearn.com/bnrepository) is a collection of commonly-used causal Bayesian networks from the literature, suitable for Bayesian and causal learning benchmarks. We evaluate the proposed method on the Earthquake (Korb & Nicholson, 2010), Cancer (Korb & Nicholson, 2010), Asia (Lauritzen & Spiegelhalter, 1988) and Sachs (Sachs et al., 2005) datasets (M = 5, 5, 8 and 11 variables respectively, maximum in-degree 3) in the bnlearn dataset repository. We evaluate SDI on Barley (Kristensen & Rasmussen, 2002) (M = 48) and Alarm (Beinlich et al., 1989) (M = 37) from the bnlearn repository.
Dataset Splits No The paper mentions using "observational and interventional data" and generating "synthetic data" and using "real-world datasets." It also refers to "a small set of holdout interventional data samples" in Section 4.2.1 and "fresh interventional data" in Section 6.3. However, it does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages, sample counts, or references to standard predefined splits).
Hardware Specification No No specific hardware details (such as GPU/CPU models, processor types, or cloud instance specifications) were mentioned in the paper for running the experiments.
Software Dependencies No The paper describes algorithms and methods (e.g., MLPs, SGD, Adam optimizer) but does not list specific software libraries or their version numbers used for implementation, which would be necessary for reproducibility.
Experiment Setup Yes Unless specified otherwise, we maintained identical hyperparameters for all experiments. For all experiments, we set the DAG penalty to 0.5 and the sparsity penalty to 0.1. The experiments involving SDI were executed for a total of 50,000 steps, as illustrated in Figure 10, and most experiments reached convergence within that timeframe. The parameters of the neural network are initialized orthogonally within the range of (-2.5, 2.5). The biases are initialized uniformly between (-1.1, 1.1).
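The initialization scheme quoted above can be sketched as follows. This is a hypothetical reading, not the authors' released code: the names `init_weights` and `init_biases` are illustrative, and interpreting "initialized orthogonally within the range of (-2.5, 2.5)" as orthogonalize-then-scale-and-clip is an assumption.

```python
import numpy as np

def init_weights(rng, fan_in, fan_out, limit=2.5):
    """Orthogonal weight matrix with entries kept within (-limit, limit).

    Hypothetical reading of the paper's description: orthogonalize a
    Gaussian draw via QR, then scale into the stated range and clip.
    """
    a = rng.standard_normal((fan_in, fan_out))
    q, r = np.linalg.qr(a)
    q = q * np.sign(np.diag(r))  # resolve QR sign ambiguity per column
    return np.clip(q * limit, -limit, limit)

def init_biases(rng, size, limit=1.1):
    """Biases drawn uniformly from (-limit, limit), as stated in the paper."""
    return rng.uniform(-limit, limit, size)

rng = np.random.default_rng(0)
W = init_weights(rng, 8, 8)  # one hidden layer's weights
b = init_biases(rng, 8)
```

Because the columns of `W` are orthonormal up to the scale factor, the clip is a no-op for square matrices; it only guards the general case.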