Differentiable Structure Learning with Ancestral Constraints

Authors: Taiyu Ban, Changxin Rong, Xiangyu Wang, Lyuzhou Chen, Xin Wang, Derui Lyu, Qinrui Zhu, Huanhuan Chen

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present partial experimental results and analysis here, with the full results available in the appendix. Synthetic Data. Random DAGs are generated using Erdős–Rényi (ER) and scale-free (SF) models with node degrees in {2, 4} and numbers of nodes d ∈ {10, 20, 30, 50}. For observational data, uniformly random weights are assigned to the weighted adjacency matrix B. Given B, samples are generated using the structural equation model X = BᵀX + z, X ∈ ℝᵈ, with noise models {Gaussian (gauss), Exponential (exp)}. The sample size is n ∈ {20, 1000}. Real-World Data. We use real-world protein and phospholipid expression data from (Sachs et al., 2005), which measures interactions in human immune system cells. This dataset is a standard benchmark in graphical modeling due to its consensus network of 11 nodes and 17 edges, validated through experimental annotations. Methods. The backbone algorithms include NOTEARS (Zheng et al., 2018; 2020), DAGMA (Bello et al., 2022), and GOLEM (Ng et al., 2022). Path existence (PE)-based structure learning is denoted as PE-alg, where alg is the backbone algorithm used. Metrics. We evaluate performance using structural Hamming distance (SHD), true positive rate (TPR), false discovery rate (FDR), F1 score, and path recovery rate (satisfied constraints / total constraints). Setup. Default parameters: edge threshold ϵ₀ = 0.3, path threshold ϵ = 10, L1 weight λ = 0.1, path existence weight γ = 1, path percentage q = 80. Other parameters follow the defaults of the backbone algorithms. Experiments run on an AMD Ryzen 9 7950X (4.5 GHz) CPU, an NVIDIA RTX 3090 GPU, and 32 GB RAM. Figure 1. Comparison and ablation results of main metrics. Table 1. Results of our method on the real-world Sachs dataset with various sample sizes and PE constraint numbers.
Researcher Affiliation | Academia | School of Computer Science and Technology, University of Science and Technology of China, Hefei, China. Correspondence to: Xiangyu Wang <EMAIL>, Huanhuan Chen <EMAIL>.
Pseudocode | Yes | Algorithm 1: Differentiable Structure Learning with Path Existence Constraints
Open Source Code | No | The paper does not provide an open-source code link or state that the code for the described methodology is being released.
Open Datasets | Yes | We use real-world protein and phospholipid expression data from (Sachs et al., 2005), which measures interactions in human immune system cells.
Dataset Splits | No | The paper varies the total sample size (n) and generates synthetic data, but it does not specify explicit training/validation/test splits (percentages or exact counts) for either synthetic or real-world data.
Hardware Specification | Yes | Experiments run on an AMD Ryzen 9 7950X (4.5 GHz) CPU, an NVIDIA RTX 3090 GPU, and 32 GB RAM.
Software Dependencies | No | The paper mentions backbone algorithms such as NOTEARS, DAGMA, and GOLEM but does not provide version numbers for these or for other software libraries and environments.
Experiment Setup | Yes | Default parameters: edge threshold ϵ₀ = 0.3, path threshold ϵ = 10, L1 weight λ = 0.1, path existence weight γ = 1, path percentage q = 80. Other parameters follow the defaults of the backbone algorithms.
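The synthetic-data setup quoted above (ER random DAGs with a target node degree, then samples from the linear SEM X = BᵀX + z) can be sketched as follows. This is a minimal sketch, not the authors' code: the weight range ±[0.5, 2.0], unit noise scales, and the lower-triangular ER sampling scheme are assumptions, since the excerpt only says "uniformly random weights".

```python
import numpy as np

def random_er_dag(d, degree, rng):
    """Erdos-Renyi-style random DAG: edges are sampled strictly below the
    diagonal, so the graph is acyclic by construction; the edge probability
    p is chosen so the expected node degree matches `degree`."""
    p = degree / (d - 1)
    return np.tril(rng.random((d, d)) < p, k=-1).astype(float)

def sample_sem(B, n, noise="gauss", rng=None):
    """Draw n samples of the linear SEM x = B^T x + z, x in R^d.
    Solving per sample gives x = (I - B^T)^{-1} z, which for row-stacked
    samples is X = Z @ inv(I - B)."""
    if rng is None:
        rng = np.random.default_rng()
    d = B.shape[0]
    if noise == "gauss":
        Z = rng.standard_normal((n, d))
    else:  # "exp"
        Z = rng.exponential(size=(n, d))
    return Z @ np.linalg.inv(np.eye(d) - B)

# Example matching one setup cell: d = 10, degree 2, Gaussian noise, n = 1000.
rng = np.random.default_rng(0)
A = random_er_dag(10, 2, rng)
W = A * rng.uniform(0.5, 2.0, A.shape) * rng.choice([-1.0, 1.0], A.shape)
X = sample_sem(W, 1000, "gauss", rng)
```

Because B is strictly lower triangular, I − B is always invertible, so the SEM has a unique solution for every noise draw.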
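The graph-recovery metrics listed in the setup (SHD, TPR, FDR, F1) can be sketched as below. Exact conventions vary across papers; this follows a common one in which SHD counts extra, missing, and reversed edges, with each reversal counted once, and is not guaranteed to match this paper's implementation.

```python
import numpy as np

def structure_metrics(B_true, B_est):
    """SHD, TPR, FDR, and F1 for two directed adjacency matrices,
    treating any nonzero entry as an edge."""
    T = (np.asarray(B_true) != 0).astype(int)
    E = (np.asarray(B_est) != 0).astype(int)
    tp = int(((E == 1) & (T == 1)).sum())   # correctly oriented edges
    fp = int(((E == 1) & (T == 0)).sum())   # predicted edges not in truth
    fn = int(((E == 0) & (T == 1)).sum())   # true edges not predicted
    # Reversed: estimated edge whose reverse is a true edge.
    rev = int(((E == 1) & (T == 0) & (T.T == 1)).sum())
    # Extra: estimated edge absent from the true skeleton entirely.
    extra = int(((E == 1) & (T == 0) & (T.T == 0)).sum())
    # Missing: true edge absent from the estimated skeleton entirely.
    miss = int(((T == 1) & (E == 0) & (E.T == 0)).sum())
    shd = extra + miss + rev
    tpr = tp / max(tp + fn, 1)
    fdr = fp / max(tp + fp, 1)
    prec = 1.0 - fdr
    f1 = 2 * prec * tpr / max(prec + tpr, 1e-12)
    return {"shd": shd, "tpr": tpr, "fdr": fdr, "f1": f1}
```

For example, if the true graph has edges 0→1 and 1→2 while the estimate has 0→1, 2→1 (reversed), and 0→2 (extra), this yields SHD = 2 and TPR = 0.5.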