Adjustment for Confounding using Pre-Trained Representations
Authors: Rickmer Schulte, David Rügamer, Thomas Nagler
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the following, we will complement our theoretical results from the previous section with empirical evidence from several experiments. The experiments include both images and text as non-tabular data, which act as the source of confounding in the ATE setting. Further experiments can be found in Appendix D. |
| Researcher Affiliation | Academia | 1Department of Statistics, LMU Munich, Munich, Germany 2Munich Center for Machine Learning (MCML), Munich, Germany. Correspondence to: Rickmer Schulte <EMAIL>. |
| Pseudocode | No | The paper describes methods using mathematical formulations and natural language, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code to reproduce the results of the experiments can be found at https://github.com/rickmer-schulte/Pretrained-Causal-Adjust. |
| Open Datasets | Yes | Text Data We utilize the IMDb Movie Reviews dataset from Lhoest et al. (2021) consisting of 50,000 movie reviews labeled for sentiment analysis. [...] Image Data We further use the dataset from Kermany et al. (2018) that contains 5,863 chest X-ray images of children. |
| Dataset Splits | Yes | Generally, DML is used with sample splitting and with two folds for cross-validation. [...] After preprocessing and extraction of pre-trained representations, we sub-sampled 1,000 and 4,000 pre-trained representations for the two confounding setups to make the simulation study tractable. [...] The following experiment is based on 500 sampled images from the X-Ray dataset, where five-layer CNNs are used in the non-pre-trained DML version. [...] DML without pre-training (DML (CNN)) for ATE estimation using 500 (Left) and all 3769 (Right) images from the X-Ray dataset. |
| Hardware Specification | Yes | All computations were performed on a user PC with Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz, 8 cores, and 16 GB RAM. |
| Software Dependencies | No | The paper mentions software components like 'scikit-learn', 'Causal ML (Chen et al., 2020)', 'Double ML (Bach et al., 2022)', and 'Adam for optimization', but it does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | In the Complex Confounding experiments, we also use neural network-based nuisance estimators for DML and the S-Learner. For this purpose, we employed neural networks with a depth of 100 and a width of 50 while using ReLU activation and Adam for optimization. [...] The experiment of Figure 7 uses a five-layer CNN with 3 × 3 convolutions, batch normalization, ReLU activation, and max pooling, followed by a model head consisting of fully connected layers with dropout. Training uses Adam optimization with early stopping. |
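The DML procedure quoted above (sample splitting with two folds for cross-validation, ML-based nuisance estimators) can be sketched as cross-fitted residual-on-residual regression in the partially linear model. This is a minimal illustration only, not the authors' code: the nuisance learner, fold count, and function name are placeholders, and `X` stands in for the pre-trained representations of the confounding images or text.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def dml_plr_ate(y, d, X, learner=None, n_folds=2, seed=0):
    """Cross-fitted DML estimate of theta in the partially linear model
    Y = theta * D + g(X) + eps.

    Nuisance functions E[Y|X] and E[D|X] are fit on the training folds
    and predicted on held-out folds (sample splitting); theta is then a
    residual-on-residual regression. Learner and fold count are
    illustrative defaults, not the paper's exact configuration.
    """
    y, d, X = np.asarray(y, float), np.asarray(d, float), np.asarray(X, float)
    learner = learner if learner is not None else LinearRegression()
    y_res, d_res = np.empty_like(y), np.empty_like(d)
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        # Residualize outcome and treatment on the confounder representation.
        y_res[test] = y[test] - clone(learner).fit(X[train], y[train]).predict(X[test])
        d_res[test] = d[test] - clone(learner).fit(X[train], d[train]).predict(X[test])
    return float(d_res @ y_res / (d_res @ d_res))
```

Any scikit-learn regressor (e.g., a neural network nuisance estimator as in the Complex Confounding experiments) can be passed as `learner`; packages such as DoubleML, which the paper cites, provide production-grade versions of this estimator.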
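The five-layer CNN described in the Experiment Setup row (3 × 3 convolutions, batch normalization, ReLU, max pooling, then a fully connected head with dropout) could look roughly as follows in PyTorch. This is a hypothetical reconstruction: the channel widths, input resolution (1 × 128 × 128), head width, and dropout rate are guesses not stated in the paper.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # One of five blocks: 3x3 convolution -> batch norm -> ReLU -> 2x2 max pool,
    # matching the components listed in the paper's description.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(),
        nn.MaxPool2d(2),
    )

class FiveLayerCNN(nn.Module):
    """Sketch of the non-pre-trained nuisance network (DML (CNN)).

    Channel progression and input size are assumptions; only the layer
    types follow the paper's description.
    """
    def __init__(self, n_out=1):
        super().__init__()
        chans = [1, 16, 32, 64, 128, 128]
        self.features = nn.Sequential(
            *[conv_block(chans[i], chans[i + 1]) for i in range(5)]
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 4 * 4, 64),  # 128x128 input -> 4x4 after 5 poolings
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(64, n_out),
        )

    def forward(self, x):
        return self.head(self.features(x))
```

Training would then use Adam with early stopping, as stated in the setup description.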