Differentiable Structure Learning with Ancestral Constraints

Authors: Taiyu Ban, Changxin Rong, Xiangyu Wang, Lyuzhou Chen, Xin Wang, Derui Lyu, Qinrui Zhu, Huanhuan Chen

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present partial experimental results and analysis here, with the full results available in the appendix. Synthetic Data. Random DAGs are generated using Erdős–Rényi (ER) and scale-free (SF) models with node degrees in {2, 4} and numbers of nodes d ∈ {10, 20, 30, 50}. For observational data, uniformly random weights are assigned to the weighted adjacency matrix B. Given B, samples are generated using the structural equation model X = BᵀX + z, X ∈ ℝᵈ, with noise models {Gaussian (gauss), Exponential (exp)}. The sample size is n ∈ {20, 1000}. Real-World Data. We use real-world protein and phospholipid expression data from (Sachs et al., 2005), which measures interactions in human immune system cells. This dataset is a standard benchmark in graphical modeling due to its consensus network of 11 nodes and 17 edges, validated through experimental annotations. Methods. The backbone algorithms include NOTEARS (Zheng et al., 2018; 2020), DAGMA (Bello et al., 2022), and GOLEM (Ng et al., 2022). Path existence (PE)-based structure learning is denoted as PE-alg, where alg is the backbone algorithm used. Metrics. We evaluate performance using structural Hamming distance (SHD), true positive rate (TPR), false discovery rate (FDR), F1 score, and path recovery rate (satisfied constraints / total constraints). Setup. Default parameters: edge threshold ϵ₀ = 0.3, path threshold ϵ = 10, L1 weight λ = 0.1, path existence weight γ = 1, path percentage q = 80. Other parameters follow the defaults of the backbone algorithms. Experiments run on an AMD Ryzen 9 7950X (4.5 GHz) CPU, an NVIDIA RTX 3090 GPU, and 32 GB RAM. Figure 1. Comparison and ablation results of main metrics. Table 1. Results of our method on the real-world Sachs dataset with various sample sizes and PE constraint numbers.
Researcher Affiliation | Academia | School of Computer Science and Technology, University of Science and Technology of China, Hefei, China. Correspondence to: Xiangyu Wang <EMAIL>, Huanhuan Chen <EMAIL>.
Pseudocode | Yes | Algorithm 1: Differentiable Structure Learning with Path Existence Constraints
Open Source Code | No | The paper does not provide an open-source code link or state that the code for the described methodology is being released.
Open Datasets | Yes | We use real-world protein and phospholipid expression data from (Sachs et al., 2005), which measures interactions in human immune system cells.
Dataset Splits | No | The paper varies the total sample size (n) and generates synthetic data, but it does not specify explicit training/validation/test splits (percentages or exact counts) for either synthetic or real-world data.
Hardware Specification | Yes | Experiments run on an AMD Ryzen 9 7950X (4.5 GHz) CPU, an NVIDIA RTX 3090 GPU, and 32 GB RAM.
Software Dependencies | No | The paper mentions backbone algorithms such as NOTEARS, DAGMA, and GOLEM but does not provide version numbers for these or for other software libraries and environments.
Experiment Setup | Yes | Default parameters: edge threshold ϵ₀ = 0.3, path threshold ϵ = 10, L1 weight λ = 0.1, path existence weight γ = 1, path percentage q = 80. Other parameters follow the defaults of the backbone algorithms.
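The synthetic-data setup quoted above (ER random DAGs with a target node degree, then samples from the linear SEM X = BᵀX + z) can be sketched as follows. This is a minimal sketch, not the authors' code: the weight range ±[0.5, 2.0], unit noise scales, and the lower-triangular ER sampling scheme are assumptions, since the excerpt only says "uniformly random weights".

```python
import numpy as np

def random_er_dag(d, degree, rng):
    """Erdos-Renyi-style random DAG: edges are sampled strictly below the
    diagonal, so the graph is acyclic by construction; the edge probability
    p is chosen so the expected node degree matches `degree`."""
    p = degree / (d - 1)
    return np.tril(rng.random((d, d)) < p, k=-1).astype(float)

def sample_sem(B, n, noise="gauss", rng=None):
    """Draw n samples of the linear SEM x = B^T x + z, x in R^d.
    Solving per sample gives x = (I - B^T)^{-1} z, which for row-stacked
    samples is X = Z @ inv(I - B)."""
    if rng is None:
        rng = np.random.default_rng()
    d = B.shape[0]
    if noise == "gauss":
        Z = rng.standard_normal((n, d))
    else:  # "exp"
        Z = rng.exponential(size=(n, d))
    return Z @ np.linalg.inv(np.eye(d) - B)

# Example matching one setup cell: d = 10, degree 2, Gaussian noise, n = 1000.
rng = np.random.default_rng(0)
A = random_er_dag(10, 2, rng)
W = A * rng.uniform(0.5, 2.0, A.shape) * rng.choice([-1.0, 1.0], A.shape)
X = sample_sem(W, 1000, "gauss", rng)
```

Because B is strictly lower triangular, I − B is always invertible, so the SEM has a unique solution for every noise draw.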
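The graph-recovery metrics listed in the setup (SHD, TPR, FDR, F1) can be sketched as below. Exact conventions vary across papers; this follows a common one in which SHD counts extra, missing, and reversed edges, with each reversal counted once, and is not guaranteed to match this paper's implementation.

```python
import numpy as np

def structure_metrics(B_true, B_est):
    """SHD, TPR, FDR, and F1 for two directed adjacency matrices,
    treating any nonzero entry as an edge."""
    T = (np.asarray(B_true) != 0).astype(int)
    E = (np.asarray(B_est) != 0).astype(int)
    tp = int(((E == 1) & (T == 1)).sum())   # correctly oriented edges
    fp = int(((E == 1) & (T == 0)).sum())   # predicted edges not in truth
    fn = int(((E == 0) & (T == 1)).sum())   # true edges not predicted
    # Reversed: estimated edge whose reverse is a true edge.
    rev = int(((E == 1) & (T == 0) & (T.T == 1)).sum())
    # Extra: estimated edge absent from the true skeleton entirely.
    extra = int(((E == 1) & (T == 0) & (T.T == 0)).sum())
    # Missing: true edge absent from the estimated skeleton entirely.
    miss = int(((T == 1) & (E == 0) & (E.T == 0)).sum())
    shd = extra + miss + rev
    tpr = tp / max(tp + fn, 1)
    fdr = fp / max(tp + fp, 1)
    prec = 1.0 - fdr
    f1 = 2 * prec * tpr / max(prec + tpr, 1e-12)
    return {"shd": shd, "tpr": tpr, "fdr": fdr, "f1": f1}
```

For example, if the true graph has edges 0→1 and 1→2 while the estimate has 0→1, 2→1 (reversed), and 0→2 (extra), this yields SHD = 2 and TPR = 0.5.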