reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Joint Causal Inference from Multiple Contexts

Authors: Joris M. Mooij, Sara Magliacane, Tom Claassen

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate different JCI implementations on synthetic data and on flow cytometry protein expression data and conclude that JCI implementations can considerably outperform state-of-the-art causal discovery algorithms.
Researcher Affiliation	Collaboration	Joris M. Mooij EMAIL Korteweg-De Vries Institute, University of Amsterdam Postbox 94248, 1090 GE Amsterdam, The Netherlands Sara Magliacane EMAIL MIT-IBM Watson AI Lab, IBM Research 75 Binney St, Cambridge, MA 02142, USA Tom Claassen EMAIL Institute for Computing and Information Sciences, Radboud University Nijmegen Postbox 9010, 6500 GL Nijmegen, The Netherlands
Pseudocode	No	The paper describes algorithms such as ASD, FCI, LCD, and ICP, but it does not provide any explicit pseudocode blocks or algorithm listings within its content. The methodology is explained through descriptive text.
Open Source Code	Yes	The source code that we used for producing the results and plots in this section is provided under a free and open source license as Online Appendix 1.
Open Datasets	Yes	We evaluate different JCI implementations on synthetic data and on flow cytometry protein expression data and conclude that JCI implementations can considerably outperform state-of-the-art causal discovery algorithms. We experimented both with simulated data with perfectly known ground truth and with real-world data where the ground truth is only known approximately. In this subsection, we present an application of the Joint Causal Inference framework on real-world data: the flow cytometry data of Sachs et al. (2005).
Dataset Splits	No	The paper describes the generation of synthetic data (
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. It mentions 'runtimes' but not the underlying hardware.
Software Dependencies	No	We implemented different variants of the FCI algorithm by adapting the implementation in the R package pcalg (Kalisch et al., 2012). We also compare with the ICP function in the R package Invariant Causal Prediction (Peters et al., 2016). The paper mentions specific R packages used but does not provide version numbers for these packages or for R itself.
Experiment Setup	Yes	We simulated random linear-Gaussian SCMs with p system variables and q context variables. We considered both the acyclic setting and the cyclic one. We simulated stochastic interventions of two different intervention types: mechanism changes, and perfect interventions. Random causal graphs were simulated by drawing directed edges independently between system variables with probability ϵ. For the acyclic models, we only allowed directed edges i1 → i2 for i1 < i2 with i1, i2 ∈ I. For cyclic models, we allowed directed edges i1 → i2 for i1 ≠ i2 with i1, i2 ∈ I, and subsequently selected only the graphs in which at least one cycle exists. We drew bidirected edges independently between all unordered pairs of system variables with probability η, and associated each bidirected edge with a separate latent confounding variable. For each context variable, we randomly selected a single system variable as its target, while ensuring that each system variable has at most one context variable as its direct cause. We sampled all linear coefficients between system variables, context variables and confounders from the uniform distribution on [−1.5, −0.5] ∪ [0.5, 1.5]. The exogenous variables (error terms) were sampled independently from the standard-normal distribution. To ensure that system variables have comparable scales, we rescaled the weight matrix such that each system variable would have variance 1 if all its direct causes would be i.i.d. standard-normal. We used binary context variables in a diagonal design. This means that for each random SCM, we simulated q + 1 contexts, with the first context being purely observational (i.e., Ck = 0 for all k ∈ {1, ..., q}), and the other q contexts corresponding with one of the context variables turned on (say Ck = 1 for some k ∈ {1, ..., q}) and the others turned off (Ck′ = 0 for the other k′ ∈ {1, ..., q} \ {k}). We either took all interventions to be mechanism changes, or all interventions to be perfect.
For mechanism changes, we simply add the value of the parent context variable to the structural equation (i.e., this corresponds with adding a constant offset of 1 to the intervention target variable when the intervention is turned on). For perfect interventions, we additionally set the linear coefficients of incoming edges on the intervention target to zero. Finally, we sampled Nc = 500 samples for each context, i.e., N = 500(q + 1) samples in total. We used fixed sample size and simply used a global threshold α = 0.01.
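
The acyclic part of the setup above can be sketched in a few lines of standard-library Python. This is an illustrative reading of the description, not the authors' code (their implementation is in R and ships as Online Appendix 1). The function name `simulate_acyclic_jci`, its parameters, and the default values for p, q, ϵ and η are assumptions made for the example. The weight-matrix rescaling step and the cyclic setting are left out, and for perfect interventions the sketch assumes that "incoming edges" covers both system-variable parents and latent confounders.

```python
import random

def simulate_acyclic_jci(p=4, q=2, eps=0.5, eta=0.2, n_per_context=500,
                         perfect=False, seed=0):
    """Hypothetical sketch: sample an acyclic linear-Gaussian SCM with a
    diagonal design. Returns (directed edge weights, targets, samples)."""
    rng = random.Random(seed)

    def coef():
        # Uniform on [-1.5, -0.5] U [0.5, 1.5]: random sign times magnitude.
        return rng.choice((-1.0, 1.0)) * rng.uniform(0.5, 1.5)

    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
    # Directed edges i -> j only for i < j, each present with probability eps.
    directed = {pair: coef() for pair in pairs if rng.random() < eps}
    # Bidirected edges with probability eta; one latent confounder per edge,
    # stored as (i, j, coefficient into i, coefficient into j).
    confounders = [(i, j, coef(), coef()) for i, j in pairs
                   if rng.random() < eta]
    # One target per context variable; no system variable is targeted twice.
    targets = rng.sample(range(p), q)

    samples = []
    for ctx in range(q + 1):
        # Diagonal design: context 0 is observational, context k+1 turns on
        # context variable k only.
        c = [1 if ctx == k + 1 else 0 for k in range(q)]
        for _ in range(n_per_context):
            latent = [rng.gauss(0.0, 1.0) for _ in confounders]
            x = [0.0] * p
            for j in range(p):  # index order is a topological order here
                active = any(c[k] == 1 and targets[k] == j for k in range(q))
                value = rng.gauss(0.0, 1.0)  # exogenous error term
                if not (perfect and active):
                    # Contributions of system-variable parents.
                    value += sum(w * x[i]
                                 for (i, child), w in directed.items()
                                 if child == j)
                    # Contributions of latent confounders.
                    for (a, b, wa, wb), l in zip(confounders, latent):
                        if a == j:
                            value += wa * l
                        if b == j:
                            value += wb * l
                if active:
                    value += 1.0  # constant offset from the context variable
                x[j] = value
            samples.append((c, x))
    return directed, targets, samples
```

Calling `simulate_acyclic_jci()` with these defaults yields 500 × (q + 1) = 1500 rows, each a pair of context values and system-variable values. Passing `perfect=True` switches from mechanism changes to perfect interventions.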