Causal Aggregation: Estimation and Inference of Causal Effects by Constraint-Based Data Fusion

Authors: Jaime Roquero Gimenez, Dominik Rothenhäusler

JMLR 2022

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "We demonstrate the effectiveness of the proposed method on simulated and semi-synthetic data. Keywords: causal inference, structural equation models, data fusion, randomized experiments. ... We run simulations based on synthetic and semi-synthetic data."

Researcher Affiliation | Academia | Jaime Roquero Gimenez (EMAIL), Department of Statistics, Stanford University, Stanford, CA 94305, USA; Dominik Rothenhäusler (EMAIL), Department of Statistics, Stanford University, Stanford, CA 94305, USA

Pseudocode | Yes | "Algorithm 1: Non-linear Causal Aggregation via Backfitting" ... "Algorithm 2: Causal Aggregation Boosting"

Open Source Code | No | The paper contains no explicit statement about releasing code and provides no link to a code repository.

Open Datasets | Yes | "We finally validate our procedure on a semi-synthetic data set that we create from a single-cell RNA sequencing (scRNA-seq) data set published in Gasperini et al. (2019)."

Dataset Splits | No | The paper mentions a "test set for evaluating the performance of our method" and "randomly partitioning the initial data set samples in three smaller datasets" to create environments, but it does not specify explicit percentages or sample counts for standard training, validation, or test splits used in model evaluation.

Hardware Specification | No | The paper gives no details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments.

Software Dependencies | No | The paper mentions the "R-package causaleffect" but does not give its version number or any other software dependencies with version information.

Experiment Setup | Yes | "Algorithm 2: Causal Aggregation Boosting. Initialize f̂_e = 0 and f̂ = Σ_e f̂_e; set the learning rate η, the penalty weight ν, a convergence threshold δ₀ > 0, and the current update gap δ = 2δ₀. ... Finally, we empirically observe that shrinking the updates by a learning rate 0 < η < 1 improves convergence, so our final proposed update is given by: ... hyper-parameters such as the learning rate η and the penalty weight ν can be chosen by keeping a separate validation data set within each environment and evaluating the unpenalized orthogonality constraint loss in the minimization problem (29)."
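The quoted setup follows the standard boosting-with-shrinkage pattern: repeat fitting an incremental update, scale it by the learning rate η, and stop once the update gap falls below δ₀. The following is a minimal sketch of that generic loop only, not the paper's actual Algorithm 2 (which fits per-environment updates against a penalized orthogonality-constraint loss); here a plain squared-error residual fit and the helper name `fit_weak_learner` stand in for illustration.

```python
import numpy as np

def boosting_with_shrinkage(X, y, fit_weak_learner, eta=0.1, delta0=1e-4, max_iter=200):
    """Generic boosting loop with learning-rate shrinkage.

    Illustrative stand-in for Algorithm 2's control flow: the real
    method aggregates per-environment updates under a penalized
    orthogonality-constraint loss; here the update is fit to a
    squared-error residual instead.
    """
    f_hat = np.zeros(len(y))   # current aggregate fit (plays the role of f̂)
    delta = 2 * delta0         # current update gap, initialized as in the quote
    it = 0
    while delta > delta0 and it < max_iter:
        residual = y - f_hat                    # pseudo-target for the next update
        update = fit_weak_learner(X, residual)  # unshrunk incremental fit
        f_hat = f_hat + eta * update            # shrink the update by the learning rate
        delta = eta * np.max(np.abs(update))    # gap of the applied update
        it += 1
    return f_hat

# Toy weak learner: predicts the mean of the pseudo-target everywhere.
fit_mean = lambda X, r: np.full(len(r), r.mean())
y = np.array([1.0, 2.0, 3.0])
f = boosting_with_shrinkage(None, y, fit_mean)  # f.mean() converges toward y.mean()
```

As the quote notes, in practice η and ν would be tuned on held-out validation data within each environment rather than fixed a priori.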