Gradient-based Causal Feature Selection

Authors: Zhaolong Ling, Mengxiang Guo, Xingyu Wu, Debo Cheng, Peng Zhou, Tianci Li, Zhangling Duan

IJCAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental To validate the effectiveness and accuracy of GCFS, we conducted experiments on various synthetic datasets generated by two distinct mechanisms, as well as on real datasets. GCFS was compared against nine causal feature selection algorithms, including divide-and-conquer methods (MMMB, STMB [Gao and Ji, 2017]), simultaneous methods (FBED, EAMB), alternating PC-Spouse methods (BAMB [Ling et al., 2019], EEMB [Wang et al., 2020]), score-based methods (SLL, S2TMB), and a mutual information-based method (CFSMI [Ling et al., 2022a]). We used standard evaluation metrics: Precision measures the proportion of true positives (TP) among all outputs. Recall is the ratio of TP to the total number of actual positives. The F1 Score is the harmonic mean of Precision and Recall, where F1 = 1 is the best case and F1 = 0 is the worst case [Xie et al., 2024]. CITs denote the number of conditional independence tests performed. Runtime refers to the algorithm's execution time. If an algorithm requires 1 hour for one run, its runtime is denoted as .
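The Precision/Recall/F1 definitions above translate directly into code; a minimal Python sketch, assuming the algorithm's output and the true Markov blanket are given as sets of variable indices (function and variable names here are illustrative, not from the paper):

```python
def precision_recall_f1(selected, true_mb):
    """Precision, Recall, and F1 of a selected feature set
    against the ground-truth Markov blanket."""
    selected, true_mb = set(selected), set(true_mb)
    tp = len(selected & true_mb)                      # true positives
    precision = tp / len(selected) if selected else 0.0
    recall = tp / len(true_mb) if true_mb else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1
```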
Researcher Affiliation Academia 1 Anhui University; 2 Hong Kong Polytechnic University; 3 University of South Australia; 4 Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Pseudocode Yes
Algorithm 1 GCFS
Require: x: data, XT: target variable, θ1, θ2: hyperparameters, Φ1, Φ2, Q, λ, µ, ρ: initial parameters
Ensure: [P, C, SP]: MB of XT
1: CMBT ← Search Phase(x, XT, Φ1, Φ2, Q, λ, µ, ρ)
2: Initialize Φ1, Φ2 of Auto-Encoder for each Xk ∈ CMBT
3: for t = 1, 2, . . . do
4:   Xpl(T), M ← get MBGraph(A, Q)
5:   Optimize Eq. (9) using the Adam algorithm
6:   Update parameters µ and ρ by Eq. (10) and Eq. (11)
7:   if h(M ∘ A) ≤ θ1 or ρ ≥ θ2 then
8:     break
9:   end if
10: end for
11: [P, C] ← adjacency matrix A
12: for each c ∈ C do
13:   SP ← Update adjacency matrix A
14:   SP = SP \ {XT}
15: end for
16: return [P, C, SP]
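The outer loop of Algorithm 1 (lines 3-10) follows the augmented-Lagrangian pattern common to gradient-based structure learners. Since the paper's Eqs. (9)-(11) are not reproduced in this summary, the sketch below substitutes a standard NOTEARS-style polynomial acyclicity measure and dual/penalty updates; every constant and helper name is a hypothetical placeholder, not the paper's definition:

```python
import numpy as np

def acyclicity(A):
    """Polynomial acyclicity measure: zero iff the weighted graph A is a DAG
    (a common stand-in for the h(.) constraint in gradient-based DAG learning)."""
    d = A.shape[0]
    M = np.eye(d) + A * A / d
    return np.trace(np.linalg.matrix_power(M, d)) - d

def outer_loop(A, theta1=1e-8, theta2=1e16, mu=0.0, rho=1.0, max_iter=100):
    """Mirror of lines 3-10: optimize, update (mu, rho), stop via theta1/theta2.
    The inner minimization of Eq. (9) is omitted; constants are illustrative."""
    for _ in range(max_iter):
        # line 5 would minimize the objective of Eq. (9) w.r.t. A here
        h = acyclicity(A)
        mu += rho * h           # dual update, stand-in for Eq. (10)
        rho *= 10.0             # penalty increase, stand-in for Eq. (11)
        if h <= theta1 or rho >= theta2:   # line 7 stopping rule
            break
    return A, mu, rho
```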
Open Source Code Yes https://github.com/MxGuoz/Appendix
Open Datasets Yes We conducted experiments on real biological datasets, specifically using the Sachs [Sachs et al., 2005] dataset. The dataset records the expression levels of proteins and phospholipids in human cells using multi-parameter single-cell technology, and its ground truth network consists of 11 nodes and 17 edges. As a commonly used benchmark in graphical models, the Sachs dataset has a known consensus network (based on experimentally annotated gold standard networks), making it widely accepted in the biological community.
Dataset Splits No The paper mentions data samples generated with sizes n ∈ {1000, 5000} and that each sample group was subjected to 10 independent experiments. However, it does not specify explicit training, validation, or test dataset splits, percentages, or sample counts for model evaluation or reproduction.
Hardware Specification No The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory specifications).
Software Dependencies No The paper does not list concrete software packages or versions; it only notes: "In practice, the parameters can be updated efficiently by using the Autograd feature of deep learning frameworks such as TensorFlow with the Adam optimizer [Kinga et al., 2015]."
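As a reference point for the optimizer the paper relies on, here is a minimal NumPy sketch of a single Adam update (default constants follow Kingma and Ba, 2015); in practice a framework's autograd, e.g. TensorFlow, would supply `grad`, so this function is only an illustration:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for parameter theta at step t (t starts at 1)."""
    m = b1 * m + (1 - b1) * grad             # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2        # second-moment estimate
    m_hat = m / (1 - b1 ** t)                # bias-corrected moments
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

For example, iterating `adam_step` on the gradient of f(x) = x² drives x toward the minimum at 0.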
Experiment Setup No Algorithm 1 mentions `θ1, θ2: hyperparameters; Φ1, Φ2, Q, λ, µ, ρ: initial parameters` and references equations for updating `µ` and `ρ`, but concrete values for these hyperparameters and initial parameters are not provided in the main text. The paper also mentions using the Adam optimizer but does not specify its learning rate or other settings.