Inference for a Large Directed Acyclic Graph with Unspecified Interventions

Authors: Chunlin Li, Xiaotong Shen, Wei Pan

JMLR 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Numerical examples demonstrate the utility and effectiveness of the proposed methods, including an application to infer gene regulatory networks. The numerical studies and real data analysis demonstrate the utility and effectiveness of the proposed methods. The paper includes sections titled '5. Simulations' and '6. ADNI Data Analysis' which describe empirical evaluation. |
| Researcher Affiliation | Academia | Chunlin Li EMAIL School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA; Xiaotong Shen EMAIL School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA; Wei Pan EMAIL Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA. All authors are affiliated with the University of Minnesota, which is an academic institution. |
| Pseudocode | Yes | The paper includes 'Algorithm 1: Constrained estimation via DC program + ℓ0 projection', 'Algorithm 2: Reconstruction of ARG by peeling', and 'Algorithm 3: DP likelihood ratio test'. |
| Open Source Code | Yes | The R implementation is available at https://github.com/chunlinli/intdag. The implementation of the proposed tests and structure learning method is available at https://github.com/chunlinli/intdag. |
| Open Datasets | Yes | The raw data are available in the ADNI database (https://adni.loni.usc.edu), including gene expression, whole-genome sequencing, and phenotypic data. From the KEGG database (Kanehisa and Goto, 2000), we extract the AD reference pathway (hsa05010, https://www.genome.jp/pathway/hsa05010). The AlzGene database (alzgene.org) and the AlzNet database (https://mips.helmholtz-muenchen.de/AlzNet-DB) are also referenced. |
| Dataset Splits | Yes | For our purpose, we treat 247 CN individuals as controls while the remaining 465 individuals as cases (AD-MCI). |
| Hardware Specification | No | No specific hardware details such as GPU/CPU models, processor types, or memory amounts are mentioned in the paper for running the experiments or analyses. |
| Software Dependencies | No | The paper mentions 'The R implementation is available at https://github.com/chunlinli/intdag.' and 'For 2SPLS, we use the R package BigSEM.' However, it does not specify version numbers for R or the BigSEM package. |
| Experiment Setup | Yes | In simulations, we consider two setups for generating $U \in \mathbb{R}^{p \times p}$, representing random and hub DAGs, respectively. For Algorithm 3, we fix the Monte Carlo sample size $M = 500$. For Algorithms 1 and 2, we choose $\tau_j \in \{0.05, 0.1, 0.15\}$ and $\gamma_j = \exp(\tilde\gamma_j)$ with $\tilde\gamma_j \in \{\log(\max_{l,j} \lvert X_l^\top Y_j \rvert), \ldots, 0.05 \log(\max_{l,j} \lvert X_l^\top Y_j \rvert)\}$ (100 equally spaced values). Then BIC is used to estimate tuning parameters $\kappa_j \in \{1, \ldots, 30\}$; $j = 1, \ldots, p$. For DP inference, we use $\alpha = 0.05$ and choose the tuning parameters by BIC as in previous experiments. |
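
The tuning grids quoted in the Experiment Setup entry can be written out concretely. The following is a minimal Python sketch, not the authors' R implementation: the function name `gamma_grid`, the toy matrices, and the reading of the garbled expression as the largest absolute inner product between a column of X and a column of Y are assumptions made for illustration. It builds 100 values equally spaced on the log scale, from the log of that maximum down to 0.05 times it, then exponentiates.

```python
import math

# Values quoted in the Experiment Setup entry.
TAU_GRID = (0.05, 0.1, 0.15)   # candidate values for tau_j
KAPPA_GRID = range(1, 31)      # candidate values for kappa_j: 1, ..., 30
MONTE_CARLO_M = 500            # Monte Carlo sample size for Algorithm 3
ALPHA = 0.05                   # significance level for DP inference


def gamma_grid(X, Y, n_values=100, low_fraction=0.05):
    """Hypothetical helper: return gamma = exp(g) for g equally spaced
    between log(m) and low_fraction * log(m), where m is the largest
    absolute inner product of a column of X with a column of Y
    (an assumed reading of the garbled formula).

    X and Y are lists of rows with the same number of rows.
    """
    n, q, p = len(X), len(X[0]), len(Y[0])
    m = max(
        abs(sum(X[i][l] * Y[i][j] for i in range(n)))
        for l in range(q)
        for j in range(p)
    )
    if m <= 0:
        raise ValueError("all inner products are zero; log is undefined")
    top = math.log(m)
    bottom = low_fraction * top
    step = (top - bottom) / (n_values - 1)
    return [math.exp(top - k * step) for k in range(n_values)]


# Toy data (hypothetical): 3 observations, 2 columns in X, 2 columns in Y.
X = [[1, 0], [0, 2], [1, 1]]
Y = [[2, 1], [0, 3], [1, 1]]
grid = gamma_grid(X, Y)

# Number of (tau, gamma, kappa) combinations per node under this sketch.
n_candidates = len(TAU_GRID) * len(grid) * len(KAPPA_GRID)
```

How BIC is evaluated over these candidates is not specified in the summary above, so the sketch stops at constructing the grids.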