reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Efficient Adjustment in Causal Graphs

Authors: Janine Witte, Leonard Henckel, Marloes H. Maathuis, Vanessa Didelez

JMLR 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	For further illustration, we carried out a small simulation study in which we generated data according to a causal linear model compatible with Figure 4(b). 1 000 datasets with 40 observations each were generated and given as input to local IDA and optimal IDA, together with the CPDAG in Figure 4(a). In order to compare the performance of optimal versus local IDA in finite sample settings, we carried out a more extensive simulation study.
Researcher Affiliation	Academia	Janine Witte EMAIL Leibniz Institute for Prevention Research and Epidemiology BIPS, Bremen, Germany and Faculty of Mathematics and Computer Science, University of Bremen, Germany Leonard Henckel EMAIL Seminar for Statistics, ETH Zurich, Switzerland Marloes H. Maathuis EMAIL Seminar for Statistics, ETH Zurich, Switzerland Vanessa Didelez EMAIL Leibniz Institute for Prevention Research and Epidemiology BIPS, Bremen, Germany and Faculty of Mathematics and Computer Science, University of Bremen, Germany
Pseudocode	Yes	Algorithm 1 Local or semi-local IDA (Maathuis et al., 2009; Perković et al., 2017). Algorithm 2 Optimal IDA.
Open Source Code	Yes	It is implemented in the R-package pcalg. Optimal IDA has been implemented in the R package pcalg (Kalisch et al., 2012, 2019). The R-code (R Core Team, 2019) for reproducing Figure 5 is available in the Online Supplement. R code for reproducing the simulation study is available in the Online Supplement.
Open Datasets	No	In each scenario, the following was repeated 1 000 times (R code for reproducing the simulation study is available in the Online Supplement): A DAG D, with CPDAG G, with p nodes and d expected neighbours per node was randomly chosen such that G was non-amenable relative to two randomly chosen nodes (X, Y) and such that min(abs((τ_xy(D))_{D ∈ [G]})) was non-zero. (Note that the DAG with its unique true causal effect was simulated for convenience only. Conceptually, we drew directly from the space of CPDAGs, which is why we consider the whole multiset of possible effects to be the “truth”.) The following was then repeated 100 times: A dataset with n observations was generated from a linear causal model on D where the non-zero coefficients were randomly chosen from a uniform distribution on [−1, −0.1] ∪ [0.1, 1].
Dataset Splits	No	1 000 datasets with 40 observations each were generated and given as input to local IDA and optimal IDA, together with the CPDAG in Figure 4(a). ... A dataset with n observations was generated from a linear causal model on D where the non-zero coefficients were randomly chosen from a uniform distribution on [−1, −0.1] ∪ [0.1, 1].
Hardware Specification	No	The paper does not provide specific hardware details used for running the experiments.
Software Dependencies	Yes	It is implemented in the R-package pcalg (Kalisch et al., 2012, 2019). The R-code (R Core Team, 2019) for reproducing Figure 5 is available in the Online Supplement.
Experiment Setup	Yes	We investigated 24 scenarios by considering all combinations of the following parameters: number of nodes p ∈ {10, 20, 50, 100}, expected number of neighbours per node d ∈ {2, 3, 4}, and sample size n ∈ {100, 1 000}. In each scenario, the following was repeated 1 000 times... A dataset with n observations was generated from a linear causal model on D where the non-zero coefficients were randomly chosen from a uniform distribution on [−1, −0.1] ∪ [0.1, 1].
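
The scenario grid quoted in the Experiment Setup row can be checked with a short sketch. The Python below is illustrative only and is not the authors' code (their R code is in the Online Supplement). It enumerates the (p, d, n) combinations and draws edge coefficients, assuming the coefficient range is the union of [−1, −0.1] and [0.1, 1], which is the reading suggested by the phrase "non-zero coefficients". The names `scenario_grid` and `draw_coefficient` and the seed are hypothetical.

```python
import itertools
import random

# Parameter values quoted in the Experiment Setup row.
NODE_COUNTS = (10, 20, 50, 100)      # p: number of nodes
EXPECTED_NEIGHBOURS = (2, 3, 4)      # d: expected neighbours per node
SAMPLE_SIZES = (100, 1000)           # n: observations per dataset


def scenario_grid():
    """Return every (p, d, n) combination as a list of tuples."""
    return list(itertools.product(NODE_COUNTS, EXPECTED_NEIGHBOURS, SAMPLE_SIZES))


def draw_coefficient(rng):
    """Draw one coefficient uniformly from [-1, -0.1] U [0.1, 1].

    Both intervals have the same length, so picking the sign with
    probability 1/2 and the magnitude uniformly on [0.1, 1] yields a
    uniform draw on the union (assumed range, see the note above).
    """
    magnitude = rng.uniform(0.1, 1.0)
    sign = rng.choice((-1.0, 1.0))
    return sign * magnitude


rng = random.Random(0)  # arbitrary seed chosen for this sketch
grid = scenario_grid()
coefficients = [draw_coefficient(rng) for _ in range(1000)]
print(len(grid))
```

The grid has 4 × 3 × 2 = 24 entries, matching the 24 scenarios stated in the row. Generating the random DAGs, CPDAGs and datasets themselves is outside the scope of this sketch.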