Causal Feature Selection via Orthogonal Search

Authors: Ashkan Soleymani, Anant Raj, Stefan Bauer, Bernhard Schölkopf, Michel Besserve

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To showcase the performance of our algorithm, we conducted two sets of experiments: i) Comparison with causal structure learning methods (causal and Markov blanket discovery) using data consisting of DAGs with a high observations-to-variables ratio (n ≫ d), which is the regime applicable to causal structure learning methods. ... ii) Comparison with inference-by-regression methods using data consisting of DAGs with a low observations-to-variables ratio (n ≈ d and n ≪ d) to illustrate performance in high-dimensional regimes. ... Evaluation: Recall, Fall-out, Critical Success Index, Accuracy, F1 Score, and Matthews correlation coefficient (Matthews, 1975) are considered as metrics for the evaluation. These metrics are described in Appendix C.2. Results: Our method performs better than the competing baselines in terms of accuracy and F1 score, especially for more connected structures... Figure 3: Overall performance for a single random DAG with 100 simulations for each setting...
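The evaluation metrics quoted above all derive from the confusion counts of selected versus true parent variables. A minimal sketch (the helper name and set-based interface are my own, not from the paper):

```python
import numpy as np

def selection_metrics(true_parents, selected, d):
    """Confusion-matrix metrics for a feature-selection result.

    true_parents / selected: sets of variable indices; d: total number of
    candidate variables. Hypothetical helper illustrating the metrics the
    report names: Recall, Fall-out, Critical Success Index, Accuracy,
    F1 Score, and Matthews correlation coefficient.
    """
    tp = len(true_parents & selected)
    fp = len(selected - true_parents)
    fn = len(true_parents - selected)
    tn = d - tp - fp - fn
    recall = tp / (tp + fn) if tp + fn else 0.0
    fallout = fp / (fp + tn) if fp + tn else 0.0        # false-positive rate
    csi = tp / (tp + fp + fn) if tp + fp + fn else 0.0  # critical success index
    accuracy = (tp + tn) / d
    f1 = 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0
    mcc_den = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return dict(recall=recall, fallout=fallout, csi=csi,
                accuracy=accuracy, f1=f1, mcc=mcc)
```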
Researcher Affiliation | Academia | Ashkan Soleymani* EMAIL, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, US; Anant Raj* EMAIL, Inria, École Normale Supérieure, PSL Research University, Paris, France; Stefan Bauer EMAIL, KTH, Stockholm, Sweden; Bernhard Schölkopf EMAIL, Max Planck Institute for Intelligent Systems, Tübingen, Germany; Michel Besserve EMAIL, Max Planck Institute for Intelligent Systems, Tübingen, Germany
Pseudocode | Yes | Pseudo-code for our proposed method (CORTH Features) is in Algorithm 1. ... Pseudo-code for the entire procedure is given below in Algorithm 1. Algorithm 1: Efficient Causal Orthogonal Structure Search (CORTH Features). 1: Input: response Y ∈ ℝ^N, covariates X ∈ ℝ^{N×d}, significance level α, number of partitions K. 2: Split the N observations into K random partitions, I_k for k = 1, ..., K, each having n = N/K samples. 3: for i = 1, ..., d do ...
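The cross-fitted orthogonal test in Algorithm 1 can be sketched as follows. This is my reading of the procedure, not the authors' code: for each covariate, Lasso regressions of Y and of X_i on the remaining covariates are fit on the complementary partitions, and a z-test checks whether the held-out residual products have zero mean. The function name and the use of scikit-learn's LassoCV are assumptions.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LassoCV

def corth_features(X, Y, K=2, alpha=0.1, seed=0):
    """Sketch of CORTH Features (Algorithm 1), under stated assumptions.

    For each covariate i, residualize Y and X_i on the remaining covariates
    via cross-fitted Lasso, then test whether the residuals are correlated.
    Covariates with a significant residual correlation are selected.
    """
    N, d = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(N), K)  # K random partitions
    selected = []
    for i in range(d):
        X_rest = np.delete(X, i, axis=1)
        products = np.empty(N)
        for k in range(K):
            test = folds[k]
            train = np.concatenate([folds[j] for j in range(K) if j != k])
            # Cross-fitted nuisance estimates: E[Y | X_{-i}] and E[X_i | X_{-i}]
            ry = Y[test] - LassoCV(cv=5).fit(X_rest[train], Y[train]).predict(X_rest[test])
            rx = X[test, i] - LassoCV(cv=5).fit(X_rest[train], X[train, i]).predict(X_rest[test])
            products[test] = ry * rx
        # z-test on the mean of the residual products
        z = products.mean() / (products.std(ddof=1) / np.sqrt(N))
        if 2 * (1 - stats.norm.cdf(abs(z))) < alpha:
            selected.append(i)
    return selected
```

The split into K partitions mirrors line 2 of the quoted pseudo-code; the per-covariate loop mirrors line 3.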
Open Source Code | Yes | Code: The code for the method and data generation is available in this GitHub repository.
Open Datasets | Yes | We also apply our algorithm to a recent COVID-19 Dataset (Einstein, 2020) where the task is to predict COVID-19 cases (confirmed using RT-PCR) amongst suspected ones. ... Hospital Israelita Albert Einstein. Diagnosis of COVID-19 and its clinical spectrum, 2020. https://www.kaggle.com/einsteindata4u/covid19.
Dataset Splits | Yes | We use 10-fold cross-validation to choose the parameters of all approaches.
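A minimal sketch of such a 10-fold cross-validated Lasso fit with scikit-learn; the synthetic data here is illustrative only, and the paper does not state which library was used:

```python
import numpy as np
from sklearn.linear_model import LassoCV

# 10-fold cross-validation to select the Lasso penalty, matching the
# report's "10-fold cross-validation to choose the parameters".
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X[:, 0] - 2 * X[:, 3] + 0.1 * rng.normal(size=200)
model = LassoCV(cv=10).fit(X, y)
print(model.alpha_)  # penalty chosen by 10-fold CV
```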
Hardware Specification | No | The paper discusses computational complexity and runtime, but does not specify any particular hardware used for experiments (e.g., specific CPU, GPU, or cloud instance models).
Software Dependencies | No | The "CompareCausalNetworks" and "bnlearn: Bayesian Network Structure Learning, Parameter Learning and Inference" R packages are used to run most of the baseline methods. ... The "selectiveInference: Tools for Post-Selection Inference" R package is leveraged to run most of these baselines. ... We use Lasso as the estimator of conditional expectation for our method... The paper mentions software packages (R packages for various methods, Lasso) but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | For every combination of number of nodes (#nodes), connectivity (p_s), noise level (σ²), number of observations (n), and non-linear probability (p_n) ... 100 examples (DAGs) are generated... The non-linear transformation used is a·tanh(b·x), with a = 0.5 and b = 1.5. If the set of parents of the node X is denoted PA_X as before, then the value assignment for a node X is as follows: X ← Σ_{X′ ∈ PA_X} [ι_ℓ(p_n)·θ·X′ + (1 − ι_ℓ(p_n))·a·tanh(b·X′)] + ε, where ε ∼ N(0, σ²)... The value of θ is set to 2 for the small DAGs (number of nodes equal to 5 or 10) and 0.5 for large DAGs... We use 10-fold cross-validation to choose the parameters of all approaches. ... We used cross-validation to choose hyperparameters, and the confidence level considered for hypothesis testing is 90%. We use Lasso as the estimator of conditional expectation...
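The data-generating process described above can be sketched as follows, under my reconstruction of the value-assignment formula: a random upper-triangular DAG with edge probability p_s, a per-edge choice between the linear mechanism θ·x and the non-linear mechanism a·tanh(b·x), plus Gaussian noise. The function name, RNG details, and per-edge (rather than per-node) mechanism choice are assumptions.

```python
import numpy as np

def simulate_dag_data(n, d, ps, sigma2, pn, theta=2.0, a=0.5, b=1.5, seed=0):
    """Sketch of the report's data-generating process (my reconstruction).

    Random upper-triangular DAG with edge probability ps; each node sums,
    over its parents, either a linear term theta * x (with prob. 1 - pn)
    or a non-linear term a * tanh(b * x) (with prob. pn), plus N(0, sigma2)
    noise. Returns the (n, d) data matrix and the adjacency matrix.
    """
    rng = np.random.default_rng(seed)
    adj = np.triu(rng.random((d, d)) < ps, k=1)  # edges j -> i only for j < i
    nonlinear = rng.random((d, d)) < pn          # per-edge mechanism choice
    X = np.zeros((n, d))
    for i in range(d):                           # topological order: 0, ..., d-1
        X[:, i] = rng.normal(0.0, np.sqrt(sigma2), size=n)
        for j in np.nonzero(adj[:, i])[0]:
            if nonlinear[j, i]:
                X[:, i] += a * np.tanh(b * X[:, j])
            else:
                X[:, i] += theta * X[:, j]
    return X, adj
```

Drawing 100 such DAGs per setting, as the report describes, is then a loop over seeds.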