reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Local Identifying Causal Relations in the Presence of Latent Variables

Authors: Zheng Li, Zeyu Liu, Feng Xie, Hao Zhang, Chunchen Liu, Zhi Geng

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on benchmark networks and two real-world datasets further validate the effectiveness and efficiency of our method.
Researcher Affiliation	Collaboration	1 Department of Applied Statistics, Beijing Technology and Business University, Beijing, China; 2 SIAT, Chinese Academy of Sciences, Shenzhen, China; 3 Ling Yang Co. Ltd, Alibaba Group, Hangzhou, China.
Pseudocode	Yes	Algorithm 1: Local Learning Conditional Sets; Algorithm 2: LocICR
Open Source Code	Yes	Our source code is available at https://github.com/zhengli0060/LocICR.
Open Datasets	Yes	We use four benchmark networks varying dimensionality: MILDEW, ALARM, WIN95PTS, and ANDES... Details of these networks can be found at https://www.bnlearn.com/bnrepository/. General Social Survey Data. ... available online https://gss.norc.org/us/en/gss.html. Gene Expression Data. We applied our proposed method to the gene expression dataset from Wille et al. (2004)
Dataset Splits	No	For each network, 100 datasets were randomly generated, with latent variables randomly selected for each dataset. Two observed variables were then randomly chosen as the target pair (X, Y) for each dataset.
Hardware Specification	Yes	All experiments were performed with an Intel 2.70GHz CPU and 64 GB of memory.
Software Dependencies	No	For the LV-IDA algorithm, we utilized the R implementation available at https://github.com/dmalinsk/lv-ida, along with the RFCI and PC algorithms provided in the R package pcalg (Kalisch et al., 2012). The ICD algorithm was implemented using the Python code from https://github.com/IntelLabs/causality-lab, while the M3HC algorithm was implemented in MATLAB using the repository at https://github.com/mensxmachina/M3HC.
Experiment Setup	Yes	the benchmark networks are parameterized as linear Gaussian structural causal models, with the causal strengths chosen uniformly at random from the range (0.5, 1), and noises drawn from the standard Gaussian distribution. The number of latent variables is set to 4, 4, 6, and 10 for the respective network.
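
The data-generation procedure quoted under Experiment Setup and Dataset Splits can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function names, the toy 6-node DAG, the sample size, and the number of hidden variables are all hypothetical stand-ins (the paper uses the bnlearn benchmark networks, and the sample size is not stated in the excerpts above). Only the uniform (0.5, 1) causal strengths, the standard Gaussian noise, the random choice of latent variables, and the random target pair come from the quoted text.

```python
import numpy as np


def simulate_linear_gaussian_scm(adjacency, n_samples, rng):
    """Sample from a linear Gaussian SCM over a DAG.

    adjacency[i, j] == 1 means i -> j. Nodes are assumed to be listed in
    topological order, i.e. the matrix is strictly upper triangular.
    """
    adjacency = np.asarray(adjacency)
    d = adjacency.shape[0]
    # Causal strengths drawn uniformly at random from the range (0.5, 1).
    # NumPy samples the half-open interval [0.5, 1).
    weights = adjacency * rng.uniform(0.5, 1.0, size=(d, d))
    data = np.zeros((n_samples, d))
    for j in range(d):
        # Each variable is a weighted sum of its parents plus N(0, 1) noise.
        noise = rng.standard_normal(n_samples)
        data[:, j] = data[:, :j] @ weights[:j, j] + noise
    return data, weights


def hide_latents_and_pick_targets(data, n_latent, rng):
    """Randomly hide n_latent columns, then pick a target pair (X, Y)."""
    d = data.shape[1]
    latent = rng.choice(d, size=n_latent, replace=False)
    observed = np.setdiff1d(np.arange(d), latent)
    x, y = rng.choice(observed, size=2, replace=False)
    return data[:, observed], observed, (int(x), int(y)), latent


rng = np.random.default_rng(0)
# Hypothetical toy DAG: every earlier node points to every later node.
toy_dag = np.triu(np.ones((6, 6), dtype=int), k=1)
data, weights = simulate_linear_gaussian_scm(toy_dag, n_samples=1000, rng=rng)
observed_data, observed, targets, latent = hide_latents_and_pick_targets(
    data, n_latent=2, rng=rng
)
```

Repeating the last two calls 100 times per network, with a fresh random draw each time, would mirror the "100 datasets" protocol described in the Dataset Splits row.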