Learning causal graphs via nonlinear sufficient dimension reduction

Authors: Eftychia Solea, Bing Li, Kyongwon Kim

JMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate the effectiveness of our methodology through simulations and a real data analysis. In this section, we evaluate the performance of our DAG estimator, referred to as the DAG-PC algorithm, through simulation comparisons with other methods and a data application.
Researcher Affiliation Academia Eftychia Solea, School of Mathematical Sciences, Queen Mary University of London, Mile End, E1 4NS, London, UK; Bing Li, Department of Statistics, Pennsylvania State University, 326 Thomas Building, University Park, PA 16802, US; Kyongwon Kim, Department of Applied Statistics / Department of Statistics and Data Science, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, South Korea
Pseudocode Yes We display the new version of the PC-algorithm in Algorithm 1 in the form of pseudocode. Algorithm 1 describes only the first part of the DAG-PC algorithm, which identifies the skeleton of the DAG.
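The paper's Algorithm 1 adapts the skeleton-search phase of the classical PC algorithm. As a generic sketch of that phase only (assuming an abstract conditional-independence oracle `indep_test`, a hypothetical name; this is not the authors' kernel-based test), the search starts from the complete undirected graph and deletes an edge whenever some conditioning set of neighbours renders its endpoints conditionally independent:

```python
from itertools import combinations

def pc_skeleton(nodes, indep_test):
    """Generic PC skeleton search. indep_test(i, j, S) should return True
    iff variables i and j are conditionally independent given the set S."""
    # Start from the complete undirected graph.
    adj = {v: set(nodes) - {v} for v in nodes}
    sepset = {}
    level = 0
    tested = True
    while tested:
        tested = False
        # Snapshot the ordered pairs before deleting edges at this level.
        pairs = [(i, j) for i in nodes for j in sorted(adj[i])]
        for i, j in pairs:
            if j not in adj[i]:
                continue  # edge already removed during this level
            neighbours = adj[i] - {j}
            if len(neighbours) < level:
                continue
            tested = True
            for S in combinations(sorted(neighbours), level):
                if indep_test(i, j, S):
                    adj[i].discard(j)
                    adj[j].discard(i)
                    sepset[frozenset((i, j))] = set(S)
                    break
        level += 1  # grow the size of the conditioning sets
    return adj, sepset
```

For instance, with an oracle for the chain X → Y → Z (where X ⫫ Z | Y), the search keeps the edges X–Y and Y–Z and removes X–Z with separating set {Y}.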
Open Source Code No Methods A and B were implemented using the pcalg package (Kalisch et al., 2012) in R, while for Method C, we used the kpcalg package (Verbyla et al., 2017) in R. (This refers to the *other* methods' code, not the authors' own.) There is no explicit statement about releasing their own code.
Open Datasets Yes We apply our method to the flow cytometry dataset (Sachs et al., 2005)... This dataset can be downloaded from https://github.com/fernandoPalluzzi/SEMgraph.
Dataset Splits No The paper generates data with specific sample sizes (n = 100, 150, 200) for the simulations and uses n = 90 observations for the real data application, with subsampling for repeated evaluations of the latter. It does not specify explicit training/validation/test splits (percentages, counts, or a partitioning methodology) in the conventional machine-learning sense; the PC algorithm infers the graph directly from the given dataset.
Hardware Specification No No specific hardware details (e.g., CPU/GPU models, memory) used for running experiments or simulations are mentioned in the paper.
Software Dependencies No The paper mentions specific R packages ('pcalg' and 'kpcalg') used to implement *comparison* methods (Method A, B, and C). However, it does not provide any specific, versioned software dependencies for the implementation of *their own* proposed methodology (Method D, DAG-PC algorithm).
Experiment Setup Yes In this subsection, we propose the tuning procedures involved in various steps in our method. Specifically, for step 1, the tuning parameters include the kernel parameters κ_{X_i}, i = 1, ..., p, the Tychonoff regularization tuning constants η_n and ε_n, and the dimension d_S^{ij} of the sufficient predictor Û_{ij,S}. For step 2, the tuning parameters include the number r_i of leading eigenvalues of G_{X_i}, the Tychonoff regularization parameter for the CCCO, δ_n, and the thresholding constant ρ_n in the estimation of the skeleton in (16)... For this reason, we present only the results for d_S^{ij} = 1.
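The constants η_n, ε_n, and δ_n quoted above play the familiar role of Tychonoff (Tikhonov, ridge-type) regularizers that keep kernel Gram-matrix inversions well-posed. As a generic numerical illustration only (not the authors' estimator; `regularized_inverse` is a hypothetical helper name):

```python
import numpy as np

def regularized_inverse(G, eta):
    """Tikhonov-regularized inverse (G + eta * I)^{-1}; a small eta > 0
    keeps a rank-deficient Gram matrix G numerically invertible."""
    return np.linalg.inv(G + eta * np.eye(G.shape[0]))

# A rank-1 Gram matrix: singular, so the plain inverse would fail.
X = np.array([[0.0], [1.0], [2.0]])
G = X @ X.T
G_inv = regularized_inverse(G, 1e-2)
```

In kernel methods, the regularization constant is typically sent to zero at a controlled rate as n grows, trading bias for stability of the inverse.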