Order-Independent Constraint-Based Causal Structure Learning
Authors: Diego Colombo, Marloes H. Maathuis
JMLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare the PC-, FCI-, and RFCI-algorithms and their modifications in simulation studies and on a yeast gene expression data set. We show that our modifications yield similar performance in low-dimensional settings and improved performance in high-dimensional settings. |
| Researcher Affiliation | Academia | Diego Colombo and Marloes H. Maathuis, Seminar for Statistics, ETH Zurich, 8092 Zurich, Switzerland |
| Pseudocode | Yes | Algorithm 3.1 The PC-algorithm (oracle version) Require: Conditional independence information among all variables in V, and an ordering order(V) on the variables; Algorithm 3.2 Adjacency search / Step 1 of the PC-algorithm (oracle version); Algorithm 4.1 Step 1 of the PC-stable algorithm (oracle version) |
| Open Source Code | Yes | All software is implemented in the R-package pcalg (Kalisch et al., 2012). |
| Open Datasets | Yes | In particular, we analyzed the yeast gene expression data set of Hughes et al. (2000). |
| Dataset Splits | No | The paper describes the generation of data for simulation studies (e.g., 'We generated 250 random weighted DAGs with p = 1000 and E(N) = 2, and for each weighted DAG we generated an i.i.d. sample of size n = 50') and the characteristics of the yeast gene expression data ('The observational data consist of gene expression levels of 5361 genes for 63 wild-type yeast organisms, and the experimental data consist of gene expression levels of the same 5361 genes for 234 single-gene knockout strains'). However, it does not explicitly provide information about how these datasets were split into training, testing, or validation sets for their experiments, or cite standard splits. |
| Hardware Specification | Yes | Run time in seconds (computed on an AMD Opteron(tm) Processor 6174 using R 2.15.1.) |
| Software Dependencies | Yes | Run time in seconds (computed on an AMD Opteron(tm) Processor 6174 using R 2.15.1.) of PC and PC-stable for the high-dimensional setting with p = 1000 and n = 50. All software is implemented in the R-package pcalg (Kalisch et al., 2012). |
| Experiment Setup | Yes | We estimated each graph for 20 random variable orderings, using the sample versions of (L)PC(-stable), (L)CPC(-stable), and (L)MPC(-stable) in the setting without latents, and the sample versions of RFCI(-stable), CRFCI(-stable), and MRFCI(-stable) in the setting with latents, with tuning parameter α ∈ {0.000625, 0.00125, 0.0025, 0.005, 0.01, 0.02, 0.04}. |
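
The PC-stable adjacency search named in the Pseudocode row can be illustrated with a short example. The following is a minimal Python sketch written from the general description of the algorithm, not the pcalg implementation or the paper's exact pseudocode. The function name `pc_stable_skeleton`, the oracle `chain_oracle`, and the three-variable chain A - B - C are hypothetical and chosen only for illustration.

```python
from itertools import combinations, permutations


def pc_stable_skeleton(variables, indep):
    """Sketch of an order-independent adjacency search.

    variables: list of node names, in any order.
    indep(x, y, S): hypothetical conditional independence oracle (or test),
        returning True if x and y are judged independent given the set S.
    Returns the adjacency sets and the separating sets that were found.
    """
    # Start from the complete undirected graph.
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}
    level = 0
    # Continue while some adjacent pair still has enough neighbours
    # to form a conditioning set of the current size.
    while any(len(adj[x] - {y}) >= level for x in variables for y in adj[x]):
        # Freeze the adjacency sets for this level, so that edge deletions
        # made below do not change the conditioning sets tried at this level.
        frozen = {v: frozenset(adj[v]) for v in variables}
        for x in variables:
            for y in sorted(frozen[x], key=variables.index):
                if y not in adj[x]:
                    continue  # edge was already removed at this level
                candidates = sorted(frozen[x] - {y}, key=variables.index)
                for S in combinations(candidates, level):
                    if indep(x, y, set(S)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(S)
                        break
        level += 1
    return adj, sepset


def chain_oracle(x, y, S):
    # Assumed toy oracle for the chain A - B - C:
    # the only independence is A and C given a set containing B.
    return {x, y} == {"A", "C"} and "B" in S


# Run the search under every ordering of the three variables.
skeletons = [
    pc_stable_skeleton(list(order), chain_oracle)[0]
    for order in permutations("ABC")
]
```

Because the adjacency sets are frozen at the start of each level, removing an edge cannot change which conditioning sets are tried for other pairs at that level. Under this sketch the returned skeleton therefore does not depend on the order of `variables`, although the stored separating sets still can when `indep` is an error-prone statistical test rather than an oracle.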