A Conditional Independence Test in the Presence of Discretization
Authors: Boyang Sun, Yu Yao, Guang-Yuan Hao, Qiu, Kun Zhang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Theoretical analysis, along with empirical validation on various datasets, rigorously demonstrates the effectiveness of our testing methods. We applied the proposed method DCT to synthetic data to evaluate its practical performance and compare it with Fisher-Z test... The experiments investigating its robustness, performance in denser graphs and effectiveness in a real-world dataset can be found in App. H. |
| Researcher Affiliation | Academia | 1 Mohamed bin Zayed University of Artificial Intelligence 2 Carnegie Mellon University 3 Peking University 4 The University of Sydney |
| Pseudocode | Yes | The pseudocode of DCT is provided in App. D. Algorithm 1 DCT: Discretization-Aware CI Test |
| Open Source Code | Yes | Our code implementation can be found in https://github.com/boyangaaaaa/DCT. |
| Open Datasets | Yes | To further validate DCT, we employ it on a real-world dataset: Big Five Personality https://openpsychometrics.org/, which includes 50 personality indicators and over 19000 data samples. |
| Dataset Splits | No | The paper uses synthetic data generated under specific conditions to evaluate statistical tests and causal discovery algorithms. It describes data generation processes and evaluation metrics (e.g., Type I/II error, F1, SHD) but does not specify traditional training/test/validation dataset splits as would be common for supervised learning tasks. |
| Hardware Specification | Yes | All the experiments are run using Intel(R) Xeon(R) CPU E5-2680 v4 with 55 processors. |
| Software Dependencies | No | The paper mentions using "Causal-DAG (Chandler Squires, 2018)" and Python implicitly through its GitHub repository, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Our experiment investigates the variations in Type I and Type II error (1 minus power) probabilities under two conditions. In the first scenario, we focus on the effects of modifying the sample size, denoted as n = (100, 500, 1000, 2000), while conditioning on a single variable. In the second, the sample size is held constant at 2000, and we vary the cardinality of the conditioning set, represented as D = (1, 2, ..., 5). ... We repeat each test 1500 times. The data are then discretized into K = (2, 4, 8, 12) levels, with boundaries randomly set based on the variable range. |
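
The experiment setup quoted in the table can be illustrated with a small simulation. The sketch below is not the authors' DCT implementation or their data generator. It only shows how a Type I error rate might be estimated for the Fisher-Z baseline when continuous data are discretized into K levels. The linear-Gaussian model `X <- Z -> Y`, the uniform sampling of cut points inside the observed range, and the names `discretize`, `fisher_z_pvalue`, and `rejection_rate` are all assumptions made for illustration.

```python
import math
import random
from bisect import bisect_right


def discretize(values, k, rng):
    """Map values to k ordinal levels using k-1 cut points drawn uniformly
    inside the observed range (an assumed reading of 'boundaries randomly
    set based on the variable range')."""
    lo, hi = min(values), max(values)
    cuts = sorted(rng.uniform(lo, hi) for _ in range(k - 1))
    return [bisect_right(cuts, v) for v in values]


def corr(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    if saa == 0 or sbb == 0:
        return 0.0
    return sab / math.sqrt(saa * sbb)


def fisher_z_pvalue(x, y, z):
    """Two-sided Fisher-Z p-value for X independent of Y given one variable Z."""
    r_xy, r_xz, r_yz = corr(x, y), corr(x, z), corr(y, z)
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        return 1.0  # degenerate case: no evidence either way
    r = (r_xy - r_xz * r_yz) / denom           # first-order partial correlation
    r = max(min(r, 0.999999), -0.999999)       # keep atanh finite
    stat = math.sqrt(len(x) - 1 - 3) * abs(math.atanh(r))  # sqrt(n - |Z| - 3)
    return math.erfc(stat / math.sqrt(2))      # equals 2 * (1 - Phi(stat))


def rejection_rate(n, k, trials, alpha=0.05, seed=0):
    """Fraction of trials rejecting independence when X and Y are in fact
    conditionally independent given Z (an estimate of the Type I error).
    Pass k=None to test the continuous data without discretization."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        z = [rng.gauss(0, 1) for _ in range(n)]
        x = [zi + rng.gauss(0, 1) for zi in z]  # assumed model: X = Z + noise
        y = [zi + rng.gauss(0, 1) for zi in z]  # assumed model: Y = Z + noise
        if k is not None:
            x, y, z = (discretize(v, k, rng) for v in (x, y, z))
        if fisher_z_pvalue(x, y, z) < alpha:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    for k in (None, 2, 4, 8, 12):
        print(k, rejection_rate(n=500, k=k, trials=200))
```

Under these assumptions, comparing the rate for `k=None` with the rates for the discretized settings gives a rough sense of why a discretization-aware test is of interest. The trial count here is kept far below the 1500 repetitions reported in the paper so that the sketch runs quickly.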