reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Instrumental Variables with Structural and Non-Gaussianity Assumptions

Authors: Ricardo Silva, Shohei Shimizu

JMLR 2017 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present instrumental variable discovery methods that systematically characterize which set of causal effects can and cannot be discovered under local graphical criteria that define instrumental variables, without reconstructing full causal graphs. We also introduce the first methods to exploit non-Gaussianity assumptions, highlighting identifiability problems and solutions. ... In Section 5, we assess how IV-TETRAD+ compares to other methods in a series of simulations. We then provide an example on how the method can be used in an empirical study and assess its robustness.
Researcher Affiliation	Academia	Ricardo Silva EMAIL Department of Statistical Science and Centre for Computational Statistics and Machine Learning University College London, WC1E 6BT, UK The Alan Turing Institute, 96 Euston Road, London NW1 2DB, UK Shohei Shimizu EMAIL The Center for Data Science Education and Research Shiga University, 1-1-1 Banba Hikone, Shiga 522-8522, Japan The Institute of Scientific and Industrial Research, Osaka University, Japan RIKEN Center for Advanced Intelligence Project, Japan
Pseudocode	Yes	Algorithm 1 IV-BY-MAJORITY ... Algorithm 2 IV-TETRAD ... Algorithm 3 IV-TETRAD+ ... Algorithm 4 IV-TETRAD+ ... Algorithm 5 IV-TETRAD++
Open Source Code	Yes	Code for all experiments is available at http://www.homepages.ucl.ac.uk/~ucgtrbd/code/iv_discovery.
Open Datasets	Yes	We consider the application of our method to the study by Sachs et al. (2005). That study collected cell activity measurements (concentration of proteins and lipids) for single cell data under a variety of conditions.
Dataset Splits	No	Section 5.1: We generate 100 synthetic problems with a data set of 10,000 points each and assess methods using also subsets of size 1000 and 5000. ... Section 5.3: There are 11 variables and 853 data points, of which we selected the manipulation of concentration levels of molecule Erk as the treatment, and concentration of Akt as the outcome. ... First, we show a simple comparison based on the bootstrap: we sample the data with replacement and show the corresponding (bootstrapped) sampling distribution of the estimates obtained by NAIVE3, sisVIVE and the single-output IV-TETRAD+ with K = 3. The paper discusses data sizes and bootstrapping but does not define explicit train/test/validation splits for reproducibility.
Hardware Specification	No	The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies	No	In our experiments in the next Section, Line 3 is implemented by running LARS (Efron et al., 2004), regressing separately X on O and Y on O. ... We use a Shapiro-Wilk test of Gaussianity followed by the HSIC test as implemented in the R package dHSIC, both at a level 0.05. No version numbers are provided for LARS or dHSIC.
Experiment Setup	Yes	Section 5.1: All error variables and latent variables are zero-mean Laplacian distributed, and coefficients are sampled from standard Gaussians and re-scaled such that the observed variables have a variance of 1. A simulation is rejected until |λyx| > 0.05. Coefficients between {X, Y} and latent variables {U1, U2} are set such that λxu1 = λyu1 = λyu2 = λyu2, at two levels, (0.125, 0.25). ... In Line 10, we formally check the validity of this candidate set by performing Wishart tests of conditional tetrad constraints (Wishart, 1928; Spirtes et al., 2000) for every possible pair in Wj against pair {X, Y}. Function Test Tetrads returns true if and only if more than half of the tested pairs return a test p-value greater than a particular pre-defined level. In our experiments in the next section, this level was set at 0.05. ... IV-TETRAD+ ... using all variables O as input and K = 10. ... For this run we chose a window size K = 3, as K = 2 is in general too noisy of a choice since it cannot exploit any redundancies of the tetrad constraints implied by the candidate instrument sets.
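
The acceptance rule quoted in the Experiment Setup row (a candidate set passes if and only if more than half of the tested pairs have a p-value above the chosen level, 0.05 in the paper's experiments) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `majority_passes`, the decision to return False for an empty list, and the assumption that the Wishart tetrad-test p-values have already been computed elsewhere are all hypothetical choices made for illustration.

```python
from typing import Sequence


def majority_passes(p_values: Sequence[float], level: float = 0.05) -> bool:
    """Return True iff strictly more than half of the p-values exceed `level`.

    `p_values` is assumed to hold one tetrad-test p-value per tested pair.
    Returning False for an empty input is an assumption of this sketch,
    not something stated in the source.
    """
    if len(p_values) == 0:
        return False
    # Count the pairs whose test does not reject at the given level.
    n_not_rejected = sum(1 for p in p_values if p > level)
    # "More than half" is a strict majority, so an exact tie fails.
    return 2 * n_not_rejected > len(p_values)
```

For instance, under these assumptions `majority_passes([0.20, 0.30, 0.01])` accepts the candidate set (two of three pairs pass), while `majority_passes([0.20, 0.01])` rejects it, because exactly half is not "more than half".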