reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Causal Inference through a Witness Protection Program

Authors: Ricardo Silva, Robin Evans

JMLR 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Section 9 contains experiments with synthetic and real data. In Section 9, we provide evidence for this claim. We simulate 100 data sets for each one of the four cases (hard case/easy case, with theoretical solution/without theoretical solution), 5000 points per data set, 1000 Monte Carlo samples per decision.
Researcher Affiliation	Academia	Ricardo Silva, Department of Statistical Science and CSML, University College London, London WC1E 6BT, UK; Robin Evans, Department of Statistics, University of Oxford, Oxford OX1 3TG, UK
Pseudocode	Yes	Algorithm 1: A simplified Witness Protection Program algorithm, assuming the observable distribution P(W, X, Y) is known. Algorithm 2: The outline of the Witness Protection Program algorithm. Algorithm 3: The iterative back-substitution procedure for bounding $L_{xw} \le \omega_{xw} \le U_{xw}$ for all combinations of x and w in $\{0, 1\}^2$.
Open Source Code	Yes	Ongoing updates of software for WPP is provided as part of the R package CausalFX, available at the Comprehensive R Network and GitHub. A snapshot of the code used in this paper is available at http://www.homepages.ucl.ac.uk/~ucgtrbd/wpp.
Open Datasets	Yes	Our empirical study concerns the effect of influenza vaccination on a patient being later on hospitalized with chest problems. ... The study was originally discussed by McDonald et al. (1992). ... We performed an empirical study with the 1976 Panel Study of Income Dynamics. ... The data was discussed by Mroz (1987) and can be obtained from the R package AER (Kleiber and Zeileis, 2008).
Dataset Splits	No	The paper mentions overall sample sizes for datasets: '5000 points per data set' for synthetic studies, '2,681 patients' for the influenza study, and 'sample size is 753' for the income study. It also mentions '1000 Monte Carlo samples per decision' as part of the experimental setup. However, it does not provide explicit training, testing, or validation splits for these datasets.
Hardware Specification	Yes	Experiments were run on an Intel Xeon E5-1650 at 3.20GHz.
Software Dependencies	No	The paper mentions several R packages used, such as 'rcdd', 'huge', 'sbgcop', and 'AER', and that 'All code was written in R'. However, it does not specify concrete version numbers for any of these software components or the R environment itself.
Experiment Setup	Yes	In the first batch, we set $\epsilon_x = \epsilon_y = \epsilon_w = 0.2$, and $\underline{\beta} = 0.9$, $\bar{\beta} = 1.1$. In the second batch, we change parameters so that $\underline{\beta} = \bar{\beta} = 1$. We simulate 100 data sets for each one of the four cases (hard case/easy case, with theoretical solution/without theoretical solution), 5000 points per data set, 1000 Monte Carlo samples per decision.
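
The synthetic-study sizes quoted in the Experiment Setup row can be written down as a small configuration grid. The sketch below is only an illustration of that bookkeeping, not the authors' code (which was written in R): the class name `SimulationConfig`, its field names, and the helper `build_grid` are hypothetical, and only the numbers (four cases, 100 data sets per case, 5000 points per data set, 1000 Monte Carlo samples per decision) come from the quoted text. Whether these counts apply once or once per batch is not stated in the row, so the sketch covers a single pass over the four cases.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SimulationConfig:
    """One synthetic case; names are hypothetical, sizes are from the quoted setup."""
    difficulty: str                 # "hard" or "easy"
    has_theoretical_solution: bool  # with / without theoretical solution
    n_datasets: int = 100           # data sets simulated per case
    n_points: int = 5000            # points per data set
    n_mc_samples: int = 1000        # Monte Carlo samples per decision


def build_grid():
    """Enumerate the four cases: (hard, easy) x (with, without theoretical solution)."""
    return [
        SimulationConfig(difficulty, solved)
        for difficulty, solved in product(("hard", "easy"), (True, False))
    ]


grid = build_grid()
total_datasets = sum(cfg.n_datasets for cfg in grid)
total_points = sum(cfg.n_datasets * cfg.n_points for cfg in grid)

if __name__ == "__main__":
    for cfg in grid:
        print(cfg)
    print("data sets:", total_datasets)
    print("simulated points:", total_points)
```

Under these assumptions, one pass over the grid covers 4 x 100 data sets and 4 x 100 x 5000 simulated points.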