Order-Independent Constraint-Based Causal Structure Learning

Authors: Diego Colombo, Marloes H. Maathuis

JMLR 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare the PC-, FCI-, and RFCI-algorithms and their modifications in simulation studies and on a yeast gene expression data set. We show that our modifications yield similar performance in low-dimensional settings and improved performance in high-dimensional settings.
Researcher Affiliation | Academia | Diego Colombo (EMAIL), Marloes H. Maathuis (EMAIL); Seminar for Statistics, ETH Zurich, 8092 Zurich, Switzerland
Pseudocode | Yes | Algorithm 3.1: The PC-algorithm (oracle version). Require: Conditional independence information among all variables in V, and an ordering order(V) on the variables. Algorithm 3.2: Adjacency search / Step 1 of the PC-algorithm (oracle version). Algorithm 4.1: Step 1 of the PC-stable algorithm (oracle version).
Open Source Code | Yes | All software is implemented in the R-package pcalg (Kalisch et al., 2012).
Open Datasets | Yes | In particular, we analyzed the yeast gene expression data set of Hughes et al. (2000).
Dataset Splits | No | The paper describes how data were generated for the simulation studies (e.g., 'We generated 250 random weighted DAGs with p = 1000 and E(N) = 2, and for each weighted DAG we generated an i.i.d. sample of size n = 50') and the characteristics of the yeast gene expression data ('The observational data consist of gene expression levels of 5361 genes for 63 wild-type yeast organisms, and the experimental data consist of gene expression levels of the same 5361 genes for 234 single-gene knockout strains'). However, it does not state how these data sets were split into training, validation, or test sets, nor does it cite standard splits.
Hardware Specification | Yes | Run time in seconds (computed on an AMD Opteron(tm) Processor 6174 using R 2.15.1.)
Software Dependencies | Yes | Run time in seconds (computed on an AMD Opteron(tm) Processor 6174 using R 2.15.1.) of PC and PC-stable for the high-dimensional setting with p = 1000 and n = 50. All software is implemented in the R-package pcalg (Kalisch et al., 2012).
Experiment Setup | Yes | We estimated each graph for 20 random variable orderings, using the sample versions of (L)PC(-stable), (L)CPC(-stable), and (L)MPC(-stable) in the setting without latents, and the sample versions of RFCI(-stable), CRFCI(-stable), and MRFCI(-stable) in the setting with latents, with tuning parameter α ∈ {0.000625, 0.00125, 0.0025, 0.005, 0.01, 0.02, 0.04}.
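
The summary above only names Algorithm 4.1, so the following is a minimal Python sketch of how an order-independent adjacency search can work. It is not the pcalg implementation and not a transcription of the paper's pseudocode. The key idea it illustrates is that the adjacency sets are frozen at the start of each level, so an edge removed mid-level cannot change which conditioning sets are tried later in that level. The names `stable_skeleton` and `chain_oracle` are hypothetical, and the conditional independence test is assumed to be supplied as a plain callable.

```python
from itertools import combinations
import random


def stable_skeleton(nodes, indep):
    """Order-independent adjacency search (illustrative sketch).

    nodes: sortable variable labels, in any order.
    indep: callable (x, y, cond_set) -> bool; a hypothetical CI oracle or test.
    Returns (set of frozenset edges, dict mapping removed pairs to a separating set).
    """
    nodes = list(nodes)
    adj = {v: set(nodes) - {v} for v in nodes}  # start from the complete graph
    sepset = {}
    level = 0
    # Continue while some vertex still has enough neighbours to form a
    # conditioning set of the current size.
    while any(len(adj[x]) - 1 >= level for x in nodes if adj[x]):
        # Freeze the adjacency sets for the whole level.
        frozen = {v: set(adj[v]) for v in nodes}
        for x in nodes:
            for y in sorted(frozen[x]):
                if y not in adj[x]:
                    continue  # edge was already removed earlier in this level
                for cond in combinations(sorted(frozen[x] - {y}), level):
                    if indep(x, y, set(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        level += 1
    edges = {frozenset((x, y)) for x in nodes for y in adj[x]}
    return edges, sepset


def chain_oracle(x, y, cond):
    """Oracle for the chain 0 -> 1 -> 2: only 0 and 2 are independent, given 1."""
    return {x, y} == {0, 2} and 1 in cond


# Mimic the "20 random variable orderings" protocol on the toy chain.
rng = random.Random(0)
skeletons = []
for _ in range(20):
    order = [0, 1, 2]
    rng.shuffle(order)
    skeletons.append(stable_skeleton(order, chain_oracle)[0])
```

With an exact oracle, as here, every ordering yields the same skeleton in any case; the frozen adjacency sets matter when `indep` is a sample-based test whose errors would otherwise make the result depend on the order in which edges are visited.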