Local Identifying Causal Relations in the Presence of Latent Variables

Authors: Zheng Li, Zeyu Liu, Feng Xie, Hao Zhang, Chunchen Liu, Zhi Geng

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on benchmark networks and two real-world datasets further validate the effectiveness and efficiency of our method.
Researcher Affiliation | Collaboration | (1) Department of Applied Statistics, Beijing Technology and Business University, Beijing, China; (2) SIAT, Chinese Academy of Sciences, Shenzhen, China; (3) Ling Yang Co. Ltd, Alibaba Group, Hangzhou, China.
Pseudocode | Yes | Algorithm 1: Local Learning Conditional Sets; Algorithm 2: LocICR
Open Source Code | Yes | Our source code is available at https://github.com/zhengli0060/LocICR.
Open Datasets | Yes | We use four benchmark networks varying dimensionality: MILDEW, ALARM, WIN95PTS, and ANDES... Details of these networks can be found at https://www.bnlearn.com/bnrepository/. General Social Survey Data. ... available online https://gss.norc.org/us/en/gss.html. Gene Expression Data. We applied our proposed method to the gene expression dataset from Wille et al. (2004)
Dataset Splits | No | For each network, 100 datasets were randomly generated, with latent variables randomly selected for each dataset. Two observed variables were then randomly chosen as the target pair (X, Y) for each dataset.
Hardware Specification | Yes | All experiments were performed with an Intel 2.70GHz CPU and 64 GB of memory.
Software Dependencies | No | For the LV-IDA algorithm, we utilized the R implementation available at https://github.com/dmalinsk/lv-ida, along with the RFCI and PC algorithms provided in the R package pcalg (Kalisch et al., 2012). The ICD algorithm was implemented using the Python code from https://github.com/IntelLabs/causality-lab, while the M3HC algorithm was implemented in MATLAB using the repository at https://github.com/mensxmachina/M3HC.
Experiment Setup | Yes | the benchmark networks are parameterized as linear Gaussian structural causal models, with the causal strengths chosen uniformly at random from the range (0.5, 1), and noises drawn from the standard Gaussian distribution. The number of latent variables is set to 4, 4, 6, and 10 for the respective network.
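
The data-generation procedure quoted in the Dataset Splits and Experiment Setup entries can be illustrated with a minimal Python sketch. It is not the authors' code. The five-node `toy_dag`, the function names, the seeding scheme, and the sample size are all hypothetical choices made for illustration, since the excerpt does not state them. The real experiments use the MILDEW, ALARM, WIN95PTS, and ANDES structures rather than a toy graph.

```python
import random

# Hypothetical toy DAG (not one of the benchmark networks): node -> parents,
# with keys listed in topological order (parents before children).
toy_dag = {"A": [], "B": ["A"], "C": ["A", "B"], "D": ["C"], "E": ["B", "D"]}


def simulate_linear_gaussian_scm(parents, n_samples, rng):
    """Sample rows from a linear Gaussian SCM over the given DAG."""
    # One causal strength per edge, drawn uniformly from the range 0.5 to 1.
    weights = {
        (p, child): rng.uniform(0.5, 1.0)
        for child, ps in parents.items()
        for p in ps
    }
    rows = []
    for _ in range(n_samples):
        values = {}
        for node, ps in parents.items():
            noise = rng.gauss(0.0, 1.0)  # standard Gaussian noise term
            values[node] = sum(weights[(p, node)] * values[p] for p in ps) + noise
        rows.append(values)
    return rows, weights


def make_dataset(parents, n_samples, n_latent, seed):
    """Generate one dataset, hide random latent variables, pick a target pair."""
    rng = random.Random(seed)
    rows, weights = simulate_linear_gaussian_scm(parents, n_samples, rng)
    nodes = list(parents)
    latent = set(rng.sample(nodes, n_latent))        # randomly selected latents
    observed = [v for v in nodes if v not in latent]
    target_pair = tuple(rng.sample(observed, 2))     # random observed (X, Y)
    data = [{v: row[v] for v in observed} for row in rows]
    return data, latent, target_pair, weights


# 100 datasets per network, as in the quoted setup; n_samples=200 is an assumption.
datasets = [make_dataset(toy_dag, n_samples=200, n_latent=2, seed=s) for s in range(100)]
```

Each entry of `datasets` holds the observed columns only, so a method under evaluation never sees the values of the variables chosen as latent.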