On Efficient Adjustment in Causal Graphs

Authors: Janine Witte, Leonard Henckel, Marloes H. Maathuis, Vanessa Didelez

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For further illustration, we carried out a small simulation study in which we generated data according to a causal linear model compatible with Figure 4(b). 1 000 datasets with 40 observations each were generated and given as input to local IDA and optimal IDA, together with the CPDAG in Figure 4(a). In order to compare the performance of optimal versus local IDA in finite sample settings, we carried out a more extensive simulation study.
Researcher Affiliation | Academia | Janine Witte (EMAIL), Leibniz Institute for Prevention Research and Epidemiology BIPS, Bremen, Germany, and Faculty of Mathematics and Computer Science, University of Bremen, Germany; Leonard Henckel (EMAIL), Seminar for Statistics, ETH Zurich, Switzerland; Marloes H. Maathuis (EMAIL), Seminar for Statistics, ETH Zurich, Switzerland; Vanessa Didelez (EMAIL), Leibniz Institute for Prevention Research and Epidemiology BIPS, Bremen, Germany, and Faculty of Mathematics and Computer Science, University of Bremen, Germany
Pseudocode | Yes | Algorithm 1: Local or semi-local IDA (Maathuis et al., 2009; Perković et al., 2017). Algorithm 2: Optimal IDA.
Open Source Code | Yes | It is implemented in the R-package pcalg. Optimal IDA has been implemented in the R package pcalg (Kalisch et al., 2012, 2019). The R-code (R Core Team, 2019) for reproducing Figure 5 is available in the Online Supplement. R code for reproducing the simulation study is available in the Online Supplement.
Open Datasets | No | In each scenario, the following was repeated 1 000 times (R code for reproducing the simulation study is available in the Online Supplement): A DAG D, with CPDAG G, with p nodes and d expected neighbours per node was randomly chosen such that G was non-amenable relative to two randomly chosen nodes (X, Y) and such that min(abs((τxy(D))_{D ∈ [G]})) was non-zero. (Note that the DAG with its unique true causal effect was simulated for convenience only. Conceptually, we drew directly from the space of CPDAGs, which is why we consider the whole multiset of possible effects to be the truth.) The following was then repeated 100 times: A dataset with n observations was generated from a linear causal model on D where the non-zero coefficients were randomly chosen from a uniform distribution on [−1, −0.1] ∪ [0.1, 1].
Dataset Splits | No | 1 000 datasets with 40 observations each were generated and given as input to local IDA and optimal IDA, together with the CPDAG in Figure 4(a). ... A dataset with n observations was generated from a linear causal model on D where the non-zero coefficients were randomly chosen from a uniform distribution on [−1, −0.1] ∪ [0.1, 1].
Hardware Specification | No | The paper does not provide specific hardware details used for running the experiments.
Software Dependencies | Yes | It is implemented in the R-package pcalg (Kalisch et al., 2012, 2019). The R-code (R Core Team, 2019) for reproducing Figure 5 is available in the Online Supplement.
Experiment Setup | Yes | We investigated 24 scenarios by considering all combinations of the following parameters: number of nodes p ∈ {10, 20, 50, 100}, expected number of neighbours per node d ∈ {2, 3, 4}, and sample size n ∈ {100, 1 000}. In each scenario, the following was repeated 1 000 times... A dataset with n observations was generated from a linear causal model on D where the non-zero coefficients were randomly chosen from a uniform distribution on [−1, −0.1] ∪ [0.1, 1].
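
The data-generating part of the experiment setup quoted above can be sketched roughly as follows. This is a minimal Python illustration, not the authors' R code from the Online Supplement. All function names (`scenario_grid`, `draw_coefficient`, `random_dag`, `simulate`) are hypothetical. The standard Gaussian error terms and the edge-sampling scheme (each possible edge included independently with probability d / (p − 1)) are assumptions made for this sketch. The filtering steps described in the paper (non-amenability of the CPDAG, non-zero minimum absolute effect) are omitted.

```python
import itertools
import random


def scenario_grid():
    """All (p, d, n) combinations listed in the experiment setup."""
    nodes = [10, 20, 50, 100]      # number of nodes p
    neighbours = [2, 3, 4]         # expected neighbours per node d
    sample_sizes = [100, 1000]     # sample size n
    return list(itertools.product(nodes, neighbours, sample_sizes))


def draw_coefficient(rng):
    """Draw uniformly from [-1, -0.1] U [0.1, 1] via a magnitude and a random sign."""
    magnitude = rng.uniform(0.1, 1.0)
    sign = rng.choice([-1.0, 1.0])
    return sign * magnitude


def random_dag(p, d, rng):
    """Random weighted DAG on nodes 0..p-1 (assumed sampling scheme).

    An edge i -> j with i < j is included with probability d / (p - 1),
    so each node has d neighbours in expectation.
    """
    prob = d / (p - 1)
    weights = {}
    for i in range(p):
        for j in range(i + 1, p):
            if rng.random() < prob:
                weights[(i, j)] = draw_coefficient(rng)
    return weights


def simulate(weights, p, n, rng):
    """Generate n observations from the linear causal model on the DAG.

    Errors are standard Gaussian (an assumption of this sketch).
    """
    parents = {j: [] for j in range(p)}
    for (i, j), w in weights.items():
        parents[j].append((i, w))
    data = []
    for _ in range(n):
        row = [0.0] * p
        for j in range(p):  # 0..p-1 is a topological order by construction
            row[j] = rng.gauss(0.0, 1.0) + sum(w * row[i] for i, w in parents[j])
        data.append(row)
    return data


if __name__ == "__main__":
    rng = random.Random(1)
    for p, d, n in scenario_grid()[:2]:
        dag = random_dag(p, d, rng)
        data = simulate(dag, p, n, rng)
        print(p, d, n, len(dag), len(data))
```

Sampling a magnitude and a sign separately keeps the draw uniform on the union of the two intervals, since both intervals have equal length.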