Causal Aggregation: Estimation and Inference of Causal Effects by Constraint-Based Data Fusion

Authors: Jaime Roquero Gimenez, Dominik Rothenhäusler

JMLR 2022

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "We demonstrate the effectiveness of the proposed method on simulated and semi-synthetic data. Keywords: causal inference, structural equation models, data fusion, randomized experiments. ... We run simulations based on synthetic and semi-synthetic data."

Researcher Affiliation | Academia | Jaime Roquero Gimenez (EMAIL), Department of Statistics, Stanford University, Stanford, CA 94305, USA; Dominik Rothenhäusler (EMAIL), Department of Statistics, Stanford University, Stanford, CA 94305, USA

Pseudocode | Yes | "Algorithm 1: Non-linear Causal Aggregation via Backfitting" ... "Algorithm 2: Causal Aggregation Boosting"

Open Source Code | No | The paper contains no explicit statement about releasing code and provides no link to a code repository.

Open Datasets | Yes | "We finally validate our procedure on a semi-synthetic data set that we create from a single-cell RNA sequencing (scRNA-seq) data set published in Gasperini et al. (2019)."

Dataset Splits | No | The paper mentions a "test set for evaluating the performance of our method" and "randomly partitioning the initial data set samples in three smaller datasets" to create environments, but it does not specify explicit percentages or sample counts for standard training, validation, or test splits used in model evaluation.

Hardware Specification | No | The paper gives no details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments.

Software Dependencies | No | The paper mentions the "R-package causaleffect" but does not give its version number or any other software dependencies with version information.

Experiment Setup | Yes | "Algorithm 2: Causal Aggregation Boosting. Initialize f̂_e = 0 and f̂ = Σ_e f̂_e; set the learning rate η, the penalty weight ν, a convergence threshold δ₀ > 0, and the current update gap δ = 2δ₀. ... Finally, we empirically observe that shrinking the updates by a learning rate 0 < η < 1 improves convergence, so our final proposed update is given by: ... hyper-parameters such as the learning rate η and the penalty weight ν can be chosen by keeping a separate validation data set within each environment and evaluating the unpenalized orthogonality constraint loss in the minimization problem (29)."
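The quoted setup follows the standard boosting-with-shrinkage pattern: repeat fitting an incremental update, scale it by the learning rate η, and stop once the update gap falls below δ₀. The following is a minimal sketch of that generic loop only, not the paper's actual Algorithm 2 (which fits per-environment updates against a penalized orthogonality-constraint loss); here a plain squared-error residual fit and the helper name `fit_weak_learner` stand in for illustration.

```python
import numpy as np

def boosting_with_shrinkage(X, y, fit_weak_learner, eta=0.1, delta0=1e-4, max_iter=200):
    """Generic boosting loop with learning-rate shrinkage.

    Illustrative stand-in for Algorithm 2's control flow: the real
    method aggregates per-environment updates under a penalized
    orthogonality-constraint loss; here the update is fit to a
    squared-error residual instead.
    """
    f_hat = np.zeros(len(y))   # current aggregate fit (plays the role of f̂)
    delta = 2 * delta0         # current update gap, initialized as in the quote
    it = 0
    while delta > delta0 and it < max_iter:
        residual = y - f_hat                    # pseudo-target for the next update
        update = fit_weak_learner(X, residual)  # unshrunk incremental fit
        f_hat = f_hat + eta * update            # shrink the update by the learning rate
        delta = eta * np.max(np.abs(update))    # gap of the applied update
        it += 1
    return f_hat

# Toy weak learner: predicts the mean of the pseudo-target everywhere.
fit_mean = lambda X, r: np.full(len(r), r.mean())
y = np.array([1.0, 2.0, 3.0])
f = boosting_with_shrinkage(None, y, fit_mean)  # f.mean() converges toward y.mean()
```

As the quote notes, in practice η and ν would be tuned on held-out validation data within each environment rather than fixed a priori.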