A Conditional Independence Test in the Presence of Discretization

Authors: Boyang Sun, Yu Yao, Guang-Yuan Hao, Qiu, Kun Zhang

ICLR 2025

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | Theoretical analysis, along with empirical validation on various datasets, rigorously demonstrates the effectiveness of our testing methods. We applied the proposed method DCT to synthetic data to evaluate its practical performance and compare it with the Fisher-Z test... The experiments investigating its robustness, performance in denser graphs, and effectiveness on a real-world dataset can be found in App. H.

Researcher Affiliation | Academia | 1 Mohamed bin Zayed University of Artificial Intelligence; 2 Carnegie Mellon University; 3 Peking University; 4 The University of Sydney

Pseudocode | Yes | The pseudocode of DCT is provided in App. D. Algorithm 1 DCT: Discretization-Aware CI Test

Open Source Code | Yes | Our code implementation can be found at https://github.com/boyangaaaaa/DCT.

Open Datasets | Yes | To further validate DCT, we employ it on a real-world dataset: Big Five Personality (https://openpsychometrics.org/), which includes 50 personality indicators and over 19,000 data samples.

Dataset Splits | No | The paper uses synthetic data generated under specific conditions to evaluate statistical tests and causal discovery algorithms. It describes the data-generation process and evaluation metrics (e.g., Type I/II error, F1, SHD) but does not specify traditional training/validation/test splits, as would be common for supervised learning tasks.

Hardware Specification | Yes | All the experiments are run using Intel(R) Xeon(R) CPU E5-2680 v4 with 55 processors.

Software Dependencies | No | The paper mentions "Causal-DAG (Chandler Squires, 2018)" and, implicitly through its GitHub repository, Python, but does not provide version numbers for these or other software dependencies.

Experiment Setup | Yes | Our experiment investigates the variations in Type I and Type II error (1 minus power) probabilities under two conditions. In the first scenario, we focus on the effects of modifying the sample size, denoted as n = (100, 500, 1000, 2000), while conditioning on a single variable. In the second, the sample size is held constant at 2000, and we vary the cardinality of the conditioning set, represented as D = (1, 2, ..., 5). ... We repeat each test 1500 times. The data are then discretized into K = (2, 4, 8, 12) levels, with boundaries randomly set based on the variable range.
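The discretization step in the setup above is easy to reproduce. As an illustrative sketch (not the authors' implementation; the helper `discretize` is hypothetical), a continuous sample can be binned into K levels using K - 1 cut points placed uniformly at random over the variable's observed range:

```python
import numpy as np

def discretize(x, k, rng):
    # Map a continuous sample to k ordinal levels using k - 1 cut points
    # drawn uniformly at random over the observed range of the variable,
    # mirroring the "boundaries randomly set based on the variable range"
    # description in the experiment setup.
    lo, hi = x.min(), x.max()
    cuts = np.sort(rng.uniform(lo, hi, size=k - 1))
    return np.digitize(x, cuts)  # integer labels in {0, ..., k - 1}

rng = np.random.default_rng(0)
x = rng.normal(size=2000)        # n = 2000, as in the second scenario
for k in (2, 4, 8, 12):          # the K values used in the paper
    levels = discretize(x, k, rng)
    assert 0 <= levels.min() and levels.max() <= k - 1
```

Because the cut points are resampled for every variable and repetition, the resulting discretizations vary in how evenly they cover the range, which is what makes the robustness comparison against the Fisher-Z test informative.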