reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Semiparametric Inference For Causal Effects In Graphical Models With Hidden Variables

Authors: Rohit Bhattacharya, Razieh Nabi, Ilya Shpitser

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we describe a set of simulations to illustrate the key results presented in this paper. For each experiment, we generate data according to hidden variable DAGs that give rise to the latent projection ADMGs used in the motivating examples throughout the paper. [...] We analyzed the bias, variance, and robustness behaviors of our proposed estimators (Primal IPW, Dual IPW, Augmented Primal IPW, and Nested IPW) and compared them with the plug-in estimators. We further, evaluated the performance of the efficient influence function in the mb-shielded ADMG of Figure 3.
Researcher Affiliation	Academia	Rohit Bhattacharya EMAIL Department of Computer Science Williams College Williamstown, MA 01267, USA. Razieh Nabi EMAIL Department of Biostatistics and Bioinformatics Emory University Atlanta, GA 30322, USA. Ilya Shpitser EMAIL Department of Computer Science Johns Hopkins University Baltimore, MD 21218, USA.
Pseudocode	Yes	Algorithm 1 Check Nonparametric Saturation (G) [...] Algorithm 2 Nested IPW Functional (G(V ), p(V ), τ)
Open Source Code	Yes	For Python implementations, see the open source package Ananke (link: https://ananke.readthedocs.io/en/latest/)
Open Datasets	No	In this section, we describe a set of simulations to illustrate the key results presented in this paper. For each experiment, we generate data according to hidden variable DAGs that give rise to the latent projection ADMGs used in the motivating examples throughout the paper. [...] We provide an example of a data generating process in Appendix G.
Dataset Splits	No	The paper describes generating data for simulations (e.g., 'generate data according to hidden variable DAGs', 'For a given sample size, we iterate over 100 replications') but does not specify any train/test/validation splits for a fixed dataset, as the data is simulated on the fly for each run.
Hardware Specification	No	The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the simulations or experiments.
Software Dependencies	No	The paper mentions using 'generalized additive models' and an 'open source package Ananke' for Python implementations, but does not specify version numbers for these or any other key software components (e.g., Python, R, specific libraries, or solvers).
Experiment Setup	No	The paper describes simulation scenarios, model misspecification strategies, and the number of replications for simulations ('For a given sample size, we iterate over 100 replications'). However, it does not provide specific experimental setup details like hyperparameter values (e.g., learning rates, batch sizes for generalized additive models) or specific training configurations for the models used in the experiments.
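
The rows above describe a simulation protocol: draw data from a known generating process, repeat 100 times per sample size, and summarize the bias and variance of each estimator. The following is a minimal sketch of that protocol, not the paper's setup. The toy DAG (C -> A, C -> Y, A -> Y, with no hidden variables), its coefficients, and every function name are assumptions made for illustration; the paper's own data generating process is in its Appendix G, and its estimators are implemented in Ananke, whose API is not used here. A plain inverse probability weighted (IPW) estimator is compared with an unadjusted difference in means.

```python
import math
import random
import statistics

TRUE_ACE = 2.0  # causal effect of A on Y in this toy model (assumed, not from the paper)


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n, rng):
    """Draw n samples (c, a, y) from the toy DAG C -> A, C -> Y, A -> Y."""
    rows = []
    for _ in range(n):
        c = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < expit(-0.5 + 1.0 * c) else 0
        y = 1.0 + TRUE_ACE * a + 1.5 * c + rng.gauss(0.0, 1.0)
        rows.append((c, a, y))
    return rows


def ipw_ace(rows):
    """Normalized IPW estimate of E[Y(1)] - E[Y(0)].

    p(A = 1 | C) is estimated by the treated fraction within each stratum of C.
    """
    p_treat = {}
    for c in (0, 1):
        stratum = [a for (ci, a, _) in rows if ci == c]
        p_treat[c] = sum(stratum) / len(stratum)
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for c, a, y in rows:
        p = p_treat[c] if a == 1 else 1.0 - p_treat[c]
        w = 1.0 / p
        num[a] += w * y
        den[a] += w
    return num[1] / den[1] - num[0] / den[0]


def naive_ace(rows):
    """Unadjusted difference in mean outcome between treated and untreated."""
    treated = [y for (_, a, y) in rows if a == 1]
    control = [y for (_, a, y) in rows if a == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def run_experiment(sample_sizes, n_reps=100, seed=0):
    """For each sample size, run n_reps replications and report bias and variance."""
    rng = random.Random(seed)
    summary = {}
    for n in sample_sizes:
        ipw, naive = [], []
        for _ in range(n_reps):
            rows = simulate(n, rng)
            ipw.append(ipw_ace(rows))
            naive.append(naive_ace(rows))
        summary[n] = {
            "ipw_bias": statistics.fmean(ipw) - TRUE_ACE,
            "ipw_var": statistics.variance(ipw),
            "naive_bias": statistics.fmean(naive) - TRUE_ACE,
            "naive_var": statistics.variance(naive),
        }
    return summary


if __name__ == "__main__":
    for n, stats in run_experiment([200, 1000]).items():
        print(n, {k: round(v, 4) for k, v in stats.items()})
```

Fixing the seed makes a run repeatable. Recording it together with the package versions would address the kind of gaps flagged in the Software Dependencies and Experiment Setup rows.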