Falsification of Unconfoundedness by Testing Independence of Causal Mechanisms
Authors: Rickard Karlsson, Jh Krijthe
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To showcase the practical relevance of our approach, we show that our method is able to efficiently detect confounding on both simulated and semi-synthetic data. Section 6 (Experiments): We conducted a series of experiments to compare the proposed MINT algorithm with alternative baseline approaches. First, we investigate efficiency with respect to number of samples and number of environments. Next, we validate our theoretical findings by investigating necessary mechanism changes that allow for falsification. We then assessed the sensitivity of our algorithm to (mis)specification in its working models. Lastly, we evaluated all methods under more realistic conditions using semi-synthetic data based on the real-world Twins dataset (Almond et al., 2005), which includes birth data across different geographical locations used as environment labels. |
| Researcher Affiliation | Academia | Department of Intelligent Systems, Delft University of Technology, the Netherlands. Correspondence to: Rickard Karlsson <EMAIL>. |
| Pseudocode | Yes | Section 5 (Algorithm): We now introduce the Mechanism INdependent Test (MINT) algorithm, which operationalizes our falsification strategy for testing mechanism independence using data from multiple environments. We will use the following notation: for all environments $s = 1, \dots, K$, we denote the observed data matrices as $A_s = [A_1, \dots, A_{n_s}]^\top$, $Y_s = [Y_1, \dots, Y_{n_s}]^\top$, $\tilde{\Psi}_s = [\tilde{\psi}(X_1), \dots, \tilde{\psi}(X_{n_s})]^\top$, and $\tilde{\Phi}_s = [\tilde{\phi}(X_1, A_1), \dots, \tilde{\phi}(X_{n_s}, A_{n_s})]^\top$. The MINT algorithm can be divided into two steps: In the first stage, for all $s = 1, \dots, K$, we estimate the parameters $(\omega_s, \gamma_s)$. The estimates are obtained through solving the least-squares problems $\hat{\omega}_s = \arg\min_{\omega} \lVert A_s - \tilde{\Psi}_s \omega \rVert_2^2$ and $\hat{\gamma}_s = \arg\min_{\gamma} \lVert Y_s - \tilde{\Phi}_s \gamma \rVert_2^2$, where $\lVert \cdot \rVert_2$ denotes the $\ell_2$-norm. We denote all estimated parameters as $\hat{\omega} = [\hat{\omega}_1, \dots, \hat{\omega}_K]$ and $\hat{\gamma} = [\hat{\gamma}_1, \dots, \hat{\gamma}_K]$. In the second stage, we perform a statistical independence test for the null hypothesis $H_0 : P(\omega, \gamma) = P(\omega)P(\gamma)$ using the estimated parameters $\hat{\omega}$ and $\hat{\gamma}$. |
| Open Source Code | Yes | The code for reproducing our experiments is available at our GitHub repository: https://github.com/RickardKarl/falsification-unconfoundedness |
| Open Datasets | Yes | In the final experiment, we used data from twin births in the USA between 1989-1991 (Almond et al., 2005) to construct a multi-environment observational dataset with a known causal structure. |
| Dataset Splits | No | The paper describes how synthetic and semi-synthetic data were generated and used in experiments, including parameters like 'N samples per environment' and 'K environments'. However, it does not specify explicit training, validation, or test dataset splits in the conventional sense for model training and evaluation. |
| Hardware Specification | No | The paper mentions that research was "facilitated by the computational resources and support of the Delft AI Cluster (DAIC) at TU Delft." However, it does not provide specific details such as GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions using specific software such as the 'Pearson partial correlation test', the 'non-parametric kernel conditional independence test (KCIT)', and states that they adopted 'the KCIT implementation from the causallearn Python package (Zheng et al., 2024)'. However, it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We measured performance using the falsification rate (probability of falsification) and set the significance level to α = 0.05 to control the Type I error rate. We used K = 100 environments with N = 50 samples per environment and d = 1 observed confounder, and set the polynomial degree to p = 2. The noise variables $\varepsilon_A$ and $\varepsilon_Y$ were drawn from zero-mean normal distributions with standard deviation 0.5. |
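
The two-stage procedure quoted in the Pseudocode row can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). Several pieces are assumptions made purely for illustration: the feature maps are taken to be degree-2 polynomials in a scalar X (with A appended for the outcome model), the simulated data-generating process and the helper names (`simulate`, `fit_environment`, `mint_pvalue`) are hypothetical, and the second-stage independence test is replaced by a simple permutation test on the largest cross-correlation rather than the Pearson partial correlation test or KCIT mentioned in the table.

```python
import numpy as np

def poly_features(X, degree=2):
    """Assumed working-model feature map: [1, x, x^2, ...] for scalar X."""
    return np.column_stack([X**j for j in range(degree + 1)])

def fit_environment(X, A, Y, degree=2):
    """Stage 1: least-squares estimates (omega_hat, gamma_hat) for one environment."""
    Psi = poly_features(X, degree)      # features for the treatment model
    Phi = np.column_stack([Psi, A])     # features for the outcome model (assumed form)
    omega_hat, *_ = np.linalg.lstsq(Psi, A, rcond=None)
    gamma_hat, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return omega_hat, gamma_hat

def max_abs_cross_corr(W, G):
    """Largest absolute Pearson correlation between a column of W and a column of G."""
    Wc = (W - W.mean(0)) / W.std(0)
    Gc = (G - G.mean(0)) / G.std(0)
    return np.abs(Wc.T @ Gc / len(W)).max()

def permutation_pvalue(W, G, n_perm=999, seed=0):
    """Stage 2 stand-in: permute environments in G to break any dependence on W."""
    rng = np.random.default_rng(seed)
    observed = max_abs_cross_corr(W, G)
    exceed = sum(
        max_abs_cross_corr(W, G[rng.permutation(len(G))]) >= observed
        for _ in range(n_perm)
    )
    return (1 + exceed) / (1 + n_perm)

def mint_pvalue(envs, degree=2):
    """Run both stages on a list of (X, A, Y) arrays, one tuple per environment."""
    estimates = [fit_environment(X, A, Y, degree) for X, A, Y in envs]
    omega = np.array([w for w, _ in estimates])   # shape (K, degree + 1)
    gamma = np.array([g for _, g in estimates])   # shape (K, degree + 2)
    return permutation_pvalue(omega, gamma)

def simulate(K=100, N=50, confounded=False, seed=0):
    """Hypothetical linear data: mechanism parameters are redrawn independently per environment."""
    rng = np.random.default_rng(seed)
    envs = []
    for _ in range(K):
        w = rng.normal(size=2)                        # treatment mechanism parameters
        g = rng.normal(size=3)                        # outcome mechanism parameters
        X = rng.normal(size=N)                        # observed covariate (d = 1)
        U = rng.normal(size=N) * float(confounded)    # unobserved confounder, zero if absent
        A = w[0] + w[1] * X + U + rng.normal(scale=0.5, size=N)
        Y = g[0] + g[1] * X + g[2] * A + U + rng.normal(scale=0.5, size=N)
        envs.append((X, A, Y))
    return envs

if __name__ == "__main__":
    alpha = 0.05
    for flag in (False, True):
        p = mint_pvalue(simulate(confounded=flag))
        print(f"confounded={flag}: p-value={p:.3f}, falsified={p < alpha}")
```

In this toy setup the unobserved variable U biases the outcome regression in a way that depends on the treatment parameters, so the per-environment estimates become correlated and the test tends to reject. The defaults K = 100, N = 50, degree 2, and noise standard deviation 0.5 mirror the Experiment Setup row; everything else is illustrative.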