Toward Falsifying Causal Graphs Using a Permutation-Based Test
Authors: Elias Eulig, Atalanti A. Mastakouri, Patrick Blöbaum, Michaela Hardt, Dominik Janzing
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluating on both simulated and real data sets from various domains, including biology and cloud monitoring, we demonstrate that the true graph is not falsified by our metric, whereas the wrong graphs given by a hypothetical user are likely to be falsified. |
| Researcher Affiliation | Collaboration | 1 German Cancer Research Center (DKFZ); 2 Heidelberg University; 3 Amazon Research Tübingen; 4 University Hospital Tübingen. EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes a permutation test and related concepts in text, but does not include a clearly labeled pseudocode block or algorithm section. |
| Open Source Code | Yes | An implementation of our metric is available in the Python package DoWhy (Blöbaum et al. 2024). Project: https://eeulig.github.io/dag-falsification |
| Open Datasets | Yes | Protein Signaling Network (Sachs et al. 2005): This open dataset contains quantitative measurements... Auto MPG (Quinlan 1993): The Auto MPG dataset contains eight attributes... |
| Dataset Splits | No | The paper mentions using N=10^3 observations for synthetic data, and specific N values for real-world datasets (e.g., N=853, N=398, N=432), but does not specify any training, validation, or test splits for these datasets. |
| Hardware Specification | No | The paper includes a runtime table (Table 3) for different graph sizes but does not specify the CPU, GPU, or any other hardware components used for these measurements. |
| Software Dependencies | No | The paper mentions software like the 'Python package DoWhy' and 'The R package dagitty' and algorithms like 'GCM with boosted decision trees', 'LiNGAM', 'CAM', and 'NOTEARS', but does not provide specific version numbers for any of these software components or libraries. |
| Experiment Setup | Yes | For all experiments on synthetic data we sample T = 10^3 node permutations and use datasets with N = 10^3 observations. To investigate the effect of N and T on p_LMC we run ablation studies on nonlinear data with N, T ∈ {10^1, 10^2, 10^3, 10^4}. For α = 5% we reject the hypotheses that the graphs are as bad as random ones. |
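
The node-permutation idea in the Experiment Setup row can be illustrated with a minimal sketch. This is not the DoWhy implementation. All function names are hypothetical, and the set `HOLDS` is an assumed oracle of conditional independencies standing in for statistical tests on data. The sketch counts how many local Markov conditions (LMCs) of a given DAG are violated, repeats the count for T random node relabelings, and reports the fraction of relabeled graphs that do at least as well.

```python
import random


def descendants(node, edges):
    """Return all nodes reachable from `node` along directed edges."""
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def lmc_statements(nodes, edges):
    """LMCs of a DAG: each node is independent of its non-descendants given its parents."""
    statements = set()
    for x in nodes:
        parents = frozenset(u for u, v in edges if v == x)
        non_desc = set(nodes) - descendants(x, edges) - parents - {x}
        for y in non_desc:
            statements.add((frozenset({x, y}), parents))
    return statements


def permute_graph(nodes, edges, rng):
    """Relabel the nodes with a random permutation, keeping the graph structure."""
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    mapping = dict(zip(nodes, shuffled))
    return [(mapping[u], mapping[v]) for u, v in edges]


def permutation_p_value(nodes, edges, count_violations, n_permutations=1000, seed=0):
    """Fraction of node-permuted graphs with at most as many violations as the given graph."""
    rng = random.Random(seed)
    observed = count_violations(edges)
    as_good = sum(
        count_violations(permute_graph(nodes, edges, rng)) <= observed
        for _ in range(n_permutations)
    )
    return as_good / n_permutations


# Toy setup (assumption): the data are taken to satisfy exactly X independent of Z given Y.
NODES = ["X", "Y", "Z"]
HOLDS = {(frozenset({"X", "Z"}), frozenset({"Y"}))}


def count_violations(edges):
    """Number of LMCs implied by the graph that the assumed oracle does not confirm."""
    return len(lmc_statements(NODES, edges) - HOLDS)


p_true = permutation_p_value(NODES, [("X", "Y"), ("Y", "Z")], count_violations)
p_wrong = permutation_p_value(NODES, [("X", "Z"), ("Z", "Y")], count_violations)
print(f"chain X->Y->Z: p = {p_true:.3f}; chain X->Z->Y: p = {p_wrong:.3f}")
```

In this toy case the wrong chain gets a p-value of 1, because every relabeling is at least as good. With only three nodes, about a third of the relabelings match the correct chain, so even the correct graph cannot reach a 5% threshold here; larger graphs leave more room for small p-values.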