reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Efficient and Trustworthy Causal Discovery with Latent Variables and Complex Relations

Authors: Xiuchuan Li, Tongliang Liu

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We first use four causal graphs shown as Fig. 9 to generate synthetic data. For each graph, we draw 10 sample sets of size 2k, 5k, 10k respectively. Each causal strength is sampled from a uniform distribution over [-2.0, -0.5] ∪ [0.5, 2.0] and each noise is generated from the seventh power of uniform distribution. We compare our methods with GIN (Xie et al., 2020), LaHME (Xie et al., 2022), and PO-LiNGAM (Jin et al., 2024). We use 3 metrics to evaluate the performance, including (i) Error in Latent Variables, the absolute difference between the estimated number of latent variables and the ground-truth one; (ii) Correct-Ordering Rate, the number of correctly estimated causal ordering divided by the number of causal ordering in the ground-truth graph; (iii) F1-Score of causal edges. The results are summarized in Tab. 1, where we also report the running time.
Researcher Affiliation	Academia	Xiu-Chuan Li, Tongliang Liu. Sydney AI Centre, University of Sydney. Correspondence to Tongliang Liu (EMAIL).
Pseudocode	Yes	An overview of our algorithm in stage 1 is shown as Alg. 1 while a detailed version is deferred to Alg. 3 in App. E. It has O(R|O_0|^3) complexity where R is the number of iterations. An overview of our algorithm in stage 2 is shown as Alg. 2 while a detailed version is deferred to Alg. 4 in App. E. It has O(|V_a|^3) time complexity.
Open Source Code	Yes	Our code is available at: https://github.com/XiuchuanLi/ICLR2025-ETCD
Open Datasets	No	We first use four causal graphs shown as Fig. 9 to generate synthetic data. For each graph, we draw 10 sample sets of size 2k, 5k, 10k respectively. Each causal strength is sampled from a uniform distribution over [-2.0, -0.5] ∪ [0.5, 2.0] and each noise is generated from the seventh power of uniform distribution. We also apply our proposed algorithm to real-world data; more details are deferred to App. D. (Appendix D refers to 'multitasking behavior model' but does not provide public access to the data.)
Dataset Splits	No	For each graph, we draw 10 sample sets of size 2k, 5k, 10k respectively. (This describes dataset sizes for synthetic data but no training/validation/test splits. The paper does not mention dataset splits for real-world data.)
Hardware Specification	No	The paper does not contain any specific hardware details such as GPU/CPU models, processor types, or memory specifications used for running the experiments.
Software Dependencies	No	The paper does not contain any specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup	No	Each causal strength is sampled from a uniform distribution over [-2.0, -0.5] ∪ [0.5, 2.0] and each noise is generated from the seventh power of uniform distribution. (This describes the data generation process for synthetic data but not specific hyperparameters or training configurations for the proposed algorithm itself.)
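
The sampling scheme and two of the metrics quoted in the table can be illustrated with a minimal Python sketch. This is not the authors' code. The function names are hypothetical, and the noise range of (-1, 1) is an assumption: the quoted text says only "the seventh power of uniform distribution" without giving its support. Edges are assumed to be represented as (parent, child) tuples.

```python
import random


def sample_causal_strength(rng):
    """Draw one coefficient uniformly from [-2.0, -0.5] U [0.5, 2.0].

    Both intervals have the same length, so a fair random sign combined
    with a uniform magnitude in [0.5, 2.0] is uniform over the union.
    """
    magnitude = rng.uniform(0.5, 2.0)
    sign = rng.choice((-1.0, 1.0))
    return sign * magnitude


def sample_noise(rng, n, low=-1.0, high=1.0):
    """Draw n noise values, each the seventh power of a uniform variate.

    ASSUMPTION: the support (low, high) is not stated in the quoted text.
    """
    return [rng.uniform(low, high) ** 7 for _ in range(n)]


def latent_error(n_estimated, n_true):
    """Error in Latent Variables: absolute difference in latent counts."""
    return abs(n_estimated - n_true)


def f1_edges(estimated, truth):
    """F1-score of causal edges, with edges given as (parent, child) tuples."""
    estimated, truth = set(estimated), set(truth)
    true_positives = len(estimated & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(estimated)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same values
    print(sample_causal_strength(rng))
    print(sample_noise(rng, 3))
    print(latent_error(3, 5))
    print(f1_edges({("L1", "X1"), ("L1", "X2")}, {("L1", "X1"), ("L1", "X3")}))
```

Seeding the generator explicitly, as in the usage lines at the bottom, is one way to make each of the 10 sample sets per graph repeatable; whether the released code does this is not stated in the quoted text.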