Counterfactual Learning with Multioutput Deep Kernels
Authors: Alberto Caron, Ioanna Manolopoulou, Gianluca Baio
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the first part of the work, we rely on Structural Causal Models (SCM) to formally introduce the setup and the problem of identifying counterfactual quantities under observed confounding. We then discuss the benefits of tackling the task of causal effects estimation via stacked coregionalized Gaussian Processes and Deep Kernels. Finally, we demonstrate the use of the proposed methods on simulated experiments that span individual causal effects estimation, off-policy evaluation and optimization. We evaluate the performance of counterfactual GPs and counterfactual DKL on a data generating process with three different tasks, and on a real-world example combining experimental and observational data. |
| Researcher Affiliation | Academia | Alberto Caron, Department of Statistical Science, University College London, and The Alan Turing Institute, London, UK. Gianluca Baio, Department of Statistical Science, University College London. Ioanna Manolopoulou, Department of Statistical Science, University College London. |
| Pseudocode | No | The paper describes methods and architectures (e.g., Figure 3), but it does not contain any clearly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | We demonstrate the use of CounterDKL with simulated experiments on causal effects estimation, off-policy evaluation (OPE) and learning off-policy (OPL) problems (Dudík et al., 2011; Dudík et al., 2014; Farajtabar et al., 2018; Kallus, 2021), also providing a Python implementation of the models, based on GPyTorch. Full code at: https://github.com/albicaron/CounterDKL |
| Open Datasets | Yes | We demonstrate the efficiency of CounterDKL also on a second experiment taken from Shalit et al. (2017), involving a popular real-world study on a job training program, dating back to LaLonde (1986). Finally, we compare CounterDKL with a few other recent methods for causal effects estimation, on a popular simulated experiment utilizing the Infant Health Development Program (IHDP) data, originally found in Hill (2011), and more recently in several contributions on Conditional Average Treatment Effects (CATE) estimation. We make use of some of the popular datasets for classification in the open-source UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/index.php) |
| Dataset Splits | Yes | ICE: The first is the prediction of Individual Causal Effects (ICE). This tackles the estimation of the average causal effect of playing action Ai = a on outcome Yi, given a certain realization of the covariates space, Xi = x, i.e. the estimation of ICE: E(Yi|do(Ai = a), Xi = xi). This is carried out using an 80% training set, and evaluated via RMSE on a 20% left-out test set. Results on performance are gathered in Table 1, in terms of 70%-30% train and test set Mean Absolute Error (MAE) on ATT, Policy Risk Rpol and average runtime, accompanied by 10-fold cross-validated 95% error intervals. Results reported in Table 2 refer to 1000 replications of the experiment on an 80%-20% train-test split as in Alaa & van der Schaar (2017). |
| Hardware Specification | Yes | All experiments were run on an Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz with 8 GB RAM. |
| Software Dependencies | No | The paper mentions a 'Python implementation', 'GPyTorch', and the 'Adam solver'. However, it does not provide specific version numbers for these software components in the main text to ensure reproducibility. |
| Experiment Setup | Yes | DKL models employed a feedforward neural network with three hidden layers of sizes [50, 50, 2] before the GP layer, which itself employs an RBF base kernel. The multitask and multioutput models (both GPs and DKLs) all make use of the Intrinsic Coregionalization Model (ICM), such that K(xi, xi') = BY ⊗ BA ⊗ Kq(xi, xi'). All models were optimized through the Adam solver. The autoencoder deep structure employed for the "AutoEnc + GP" and "AutoEnc + Counter GP" models similarly learns a 2-dimensional encoded lower-dimensional representation, where the encoder has two hidden layers of sizes [10, 5] before the 2-dim representation and the decoder has hidden layers of sizes [5, 10] before the reconstruction loss. Our Counterfactual DKL (CounterDKL) uses hidden layers of sizes [100, 100, 2]. |
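The ICM kernel quoted in the Experiment Setup row takes the Kronecker product of a coregionalization matrix with a base input kernel. A minimal NumPy sketch of that structure, where a single illustrative coregionalization matrix `B` stands in for the paper's BY ⊗ BA product (all names and values here are assumptions, not taken from the paper's code):

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    # Base RBF (squared-exponential) kernel K_q(x, x')
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def icm_kernel(X1, X2, B, lengthscale=1.0):
    # Intrinsic Coregionalization Model: K = B ⊗ K_q(x, x'),
    # where B is a (tasks x tasks) coregionalization matrix
    # capturing covariance across outputs/treatment arms.
    return np.kron(B, rbf_kernel(X1, X2, lengthscale))

# Toy example: 2 tasks (e.g. treatment arms), 3 input points.
X = np.random.default_rng(0).normal(size=(3, 2))
B = np.array([[1.0, 0.6],
              [0.6, 1.0]])  # illustrative cross-task covariance
K = icm_kernel(X, X, B)
print(K.shape)  # → (6, 6): 2 tasks x 3 points
```

In the paper's DKL variants, the raw inputs would first pass through the feedforward network (e.g. the [50, 50, 2] layers) before entering the base kernel; the Kronecker structure over tasks is unchanged.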