Isolated Causal Effects of Natural Language
Authors: Victoria Lin, Louis-Philippe Morency, Eli Ben-Michael
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate the validity of our framework on both semi-synthetic and real-world data. Using evaluation settings where the ground truth is known, we observe that our estimation framework is able to recover the true isolated effect across multiple interventions. |
| Researcher Affiliation | Academia | Victoria Lin, Louis-Philippe Morency, Eli Ben-Michael; Carnegie Mellon University, Pittsburgh, PA, USA. Correspondence to: Victoria Lin <EMAIL>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It provides mathematical derivations in Appendix A but no algorithm pseudo-code. |
| Open Source Code | Yes | Our data and code are publicly available at https://github.com/torylin/isolated-text-effects. |
| Open Datasets | Yes | The Amazon dataset (McAuley & Leskovec, 2013) consists of reviews from the Amazon e-commerce site... The SvT dataset (Dhawan et al., 2024) consists of posts from weight-loss communities on the social media site Reddit... |
| Dataset Splits | Yes | For each non-focal language representation a^c_s(X), we use 5-fold cross-fitting to train an outcome model ĝ to predict Y given a^c_s(X) and a classifier to predict a(X) given a^c_s(X). Within the training folds, we conduct 5-fold cross-validation to select model hyperparameters. |
| Hardware Specification | No | All experiments were conducted on consumer-level machines. Experiments involving language models, such as those with MPNet and SenteCon embeddings, were conducted using consumer-level NVIDIA GPUs. |
| Software Dependencies | Yes | To implement our lexicons, we use the third-party liwc Python library and the empath library released by its creators. SenteCon-LIWC and SenteCon-Empath representations are obtained using the sentecon library released by its creators. BERT and RoBERTa embeddings are obtained via the Hugging Face transformers library using the pre-trained models bert-base-uncased and roberta-base, respectively. MPNet and MiniLM embeddings are obtained via the Hugging Face sentence-transformers library using the pre-trained models all-mpnet-base-v2 and all-MiniLM-L6-v2, respectively. Finally, LLM (GPT-3.5) prompting covariates are taken directly from the SvT dataset released by Dhawan et al. (2024). Additional technical details are provided in Table 2. ... All outcome models and a(X) classifiers are implemented using the scikit-learn Python library (version 1.3.0). |
| Experiment Setup | Yes | Gradient boosting models use a subsample proportion of 0.7, i.e., 70% of training samples are used to fit the individual base learners. Neural networks used for outcome models in the nonlinear Amazon setting are implemented with the MLPRegressor class and tuned over the following possible layer counts and sizes: (128,), (128, 128), (128, 256, 128). Logistic and linear regression models are tuned for L1 ratio over the grid [0.0, 0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1.0], where 1.0 corresponds to L1 penalty only and 0.0 corresponds to L2 penalty only. Logistic regression models are further tuned for C (inverse regularization strength) over the following search space: [0.001, 0.01, 0.1, 1.0, 10, 100]. For all interventions, the optimal hyperparameters are a linear regression L1 ratio of 0.5, logistic regression L1 ratio of 0.0, and C of 0.001. |
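The "Dataset Splits" row describes 5-fold cross-fitting: for each fold, an outcome model ĝ and a classifier for a(X) are trained on the other four folds and used to produce out-of-fold predictions for the held-out fold. The sketch below illustrates that procedure with scikit-learn (the library named in the table); the variable names, toy data, and estimator settings are illustrative assumptions, not taken from the released code.

```python
# Minimal sketch of 5-fold cross-fitting, assuming toy data in place of
# the paper's non-focal language representation a^c_s(X), focal attribute
# a(X), and outcome Y. Estimator choices here are placeholders.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import ElasticNet, LogisticRegression

rng = np.random.default_rng(0)
X_nonfocal = rng.normal(size=(100, 8))   # stand-in for a^c_s(X)
a_X = rng.integers(0, 2, size=100)       # stand-in for binary a(X)
y = rng.normal(size=100)                 # stand-in for outcome Y

y_hat = np.empty_like(y)                 # out-of-fold outcome predictions
a_hat = np.empty(100)                    # out-of-fold P(a(X) = 1)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X_nonfocal):
    # Outcome model ĝ: predict Y from the non-focal representation,
    # fit only on the training folds.
    g = ElasticNet(alpha=0.1, max_iter=10_000)
    g.fit(X_nonfocal[train_idx], y[train_idx])
    y_hat[test_idx] = g.predict(X_nonfocal[test_idx])

    # Classifier: predict a(X) from the non-focal representation.
    clf = LogisticRegression(max_iter=10_000)
    clf.fit(X_nonfocal[train_idx], a_X[train_idx])
    a_hat[test_idx] = clf.predict_proba(X_nonfocal[test_idx])[:, 1]
```

Because each prediction comes from models that never saw that sample, the fitted nuisance functions can be plugged into the downstream effect estimator without the overfitting bias that in-sample predictions would introduce.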
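The hyperparameter grids in the "Experiment Setup" row map directly onto scikit-learn's grid search with the 5-fold cross-validation depth stated in the table. This is a hedged sketch of that tuning setup on toy data; the solver choice and toy inputs are assumptions, while the grids themselves are copied from the table.

```python
# Sketch of the paper's stated hyperparameter grids, wired into 5-fold
# GridSearchCV. Only the grid values come from the table; the data and
# solver settings below are illustrative.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import ElasticNet, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))
y_cont = X @ rng.normal(size=6) + 0.1 * rng.normal(size=120)
y_bin = (y_cont > 0).astype(int)

l1_ratios = [0.0, 0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1.0]  # 1.0 = pure L1
Cs = [0.001, 0.01, 0.1, 1.0, 10, 100]                    # inverse reg. strength

# Linear regression with elastic-net penalty, tuned over the L1 ratio.
lin_search = GridSearchCV(ElasticNet(max_iter=10_000),
                          {"l1_ratio": l1_ratios}, cv=5)
lin_search.fit(X, y_cont)

# Logistic regression tuned jointly over the L1 ratio and C
# (saga is the sklearn solver that supports elastic-net penalties).
log_search = GridSearchCV(
    LogisticRegression(penalty="elasticnet", solver="saga", max_iter=10_000),
    {"l1_ratio": l1_ratios, "C": Cs}, cv=5)
log_search.fit(X, y_bin)

# MLPRegressor layer grid from the table (search not run here for brevity).
mlp_layer_grid = {"hidden_layer_sizes": [(128,), (128, 128), (128, 256, 128)]}
```

On the paper's data this search reportedly selects an L1 ratio of 0.5 for linear regression and an L1 ratio of 0.0 with C = 0.001 for logistic regression; on the toy data above the selected values will of course differ.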