Learning Linear Polytree Structural Equation Model
Authors: Xingmei Lou, Yu Hu, Xiaodong Li
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical findings are illustrated by comprehensive numerical simulations, and experiments on benchmark data also demonstrate the robustness of polytree learning when the true graphical structures can only be approximated by polytrees. |
| Researcher Affiliation | Academia | Xingmei Lou EMAIL Department of Statistics University of California, Davis Davis, CA 95616, USA Yu Hu EMAIL Department of Mathematics and Division of Life Science The Hong Kong University of Science and Technology Clear Water Bay, Hong Kong S.A.R. Xiaodong Li EMAIL Department of Statistics University of California, Davis Davis, CA 95616, USA |
| Pseudocode | Yes | Algorithm 1 Chow-Liu algorithm Algorithm 2 Extending the skeleton to a CPDAG Algorithm 3 Estimating the polytree skeleton by the simplified PC algorithm |
| Open Source Code | Yes | All codes are available at https://github.com/huyu00/linear-polytree-SEM. |
| Open Datasets | Yes | The ALARM dataset (Beinlich et al., 1989) is a widely used benchmark data. ... Another benchmark we test is the ASIA dataset (Lauritzen & Spiegelhalter, 1988), ... Lastly, we study a benchmark simulated dataset, EARTHQUAKE (Korb & Nicholson, 2010) |
| Dataset Splits | Yes | For each combination of SEM parameters, we randomly generate a polytree; the detailed generation of the β_ij's and ω_ii's is described in Section 7.3. Then we draw iid samples from the SEM of different sizes (the x-axis, n = 50, 100, 200, 400, 600, 800, 1000). This entire process is repeated 100 times. ... The accuracy measures ... are averaged over 1000 bootstraps (resampling n observations out of 100,000) and the standard deviations are in the parentheses. |
| Hardware Specification | Yes | All computation is done on a 2019 Intel i7 quad-core CPU desktop computer. |
| Software Dependencies | No | We use R implementations of the hill climbing and the PC algorithm from bnlearn and pcalg packages, respectively, along with all the default options and parameters. We implemented the polytree-adapted PC algorithm in Python. (Explanation: While software packages and languages are mentioned, specific version numbers for these dependencies are not provided, which is required for reproducibility.) |
| Experiment Setup | Yes | In all experiments, we set the threshold ρ_crit (Algorithm 2) for rejecting a pair of nodes being independent based on testing zero correlation for Gaussian distributions. Specifically, ρ_crit = √(1 − 1/(1 + t²_{α/2}/(n−2))), where t_{α/2} is the 1−α/2 quantile of a t-distribution with df = n−2, and we use α = 0.1. ... An α = 0.01 is used for the PC algorithm as recommended in Kalisch & Bühlmann (2007). |
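The pseudocode row lists a Chow-Liu step (Algorithm 1) for recovering the polytree skeleton. A minimal sketch of that idea, not the authors' implementation: for Gaussian data the pairwise mutual information is a monotone function of |ρ|, so a maximum-weight spanning tree on absolute sample correlations recovers the skeleton. The function name `chow_liu_skeleton` is a hypothetical illustration.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def chow_liu_skeleton(X):
    """Sketch of Chow-Liu skeleton recovery (illustrative, not the paper's code).

    X: (n, p) sample matrix. Edge weight is |sample correlation|, which for
    Gaussian variables orders edges the same way as mutual information.
    """
    R = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(R, 0.0)
    # scipy computes a *minimum* spanning tree, so negate to get a maximum one.
    mst = minimum_spanning_tree(-R)
    rows, cols = mst.nonzero()
    return sorted((min(i, j), max(i, j)) for i, j in zip(rows, cols))
```

On data drawn from a chain SEM X1 → X2 → X3, the recovered skeleton is the edge set {(0, 1), (1, 2)}.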
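The threshold in the Experiment Setup row is the standard inversion of the t-test for zero correlation: t = r√(n−2)/√(1−r²) solved for r at the 1−α/2 quantile. A short sketch of that computation (the helper name `rho_crit` is ours, not from the paper):

```python
from scipy.stats import t

def rho_crit(n, alpha=0.1):
    """Correlation threshold for rejecting zero correlation at level alpha.

    Implements rho_crit = sqrt(1 - 1/(1 + t_{alpha/2}^2 / (n - 2))),
    where t_{alpha/2} is the 1 - alpha/2 quantile of a t-distribution
    with n - 2 degrees of freedom.
    """
    t_q = t.ppf(1 - alpha / 2, df=n - 2)
    return (1 - 1 / (1 + t_q**2 / (n - 2))) ** 0.5
```

As expected, the threshold shrinks as the sample size grows, since smaller correlations become statistically distinguishable from zero.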