The Proximal ID Algorithm

Authors: Ilya Shpitser, Zach Wood-Doughty, Eric J. Tchetgen Tchetgen

JMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental We illustrate our approach by simulation studies and a data application. ... 7. Simulations ... We now turn to an array of simulation studies to demonstrate how the identifying assumptions of the proximal ID algorithm can enable unbiased estimation. ... 8. Analysis Of The Effect Of Methotrexate ... We now apply our proximal front-door estimator to an analysis of the effect of methotrexate (MTX) on tender joint count in patients with rheumatoid arthritis.
Researcher Affiliation Academia Ilya Shpitser, Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA; Zach Wood-Doughty, Department of Computer Science, Northwestern University, Evanston, IL 60208, USA; Eric J. Tchetgen Tchetgen, Department of Statistics, The Wharton School, 3620 Locust Walk, Philadelphia, PA 19104, USA
Pseudocode No The paper describes the 'ID algorithm' and 'proximal ID algorithm' conceptually, explaining their steps and operations (e.g., 'fixing operator φ'), and provides mathematical formulas. However, it does not present these algorithms in a structured pseudocode block or a clearly labeled algorithm environment. The description is primarily prose and mathematical notation.
Open Source Code Yes Our code implementing our methods and generating our datasets may be found in the following online repository: https://github.com/zachwooddoughty/proximal_id_algorithm. ... Code for preprocessing the data and reproducing these results are provided in the following repository: https://github.com/zachwooddoughty/proximal_id_algorithm.
Open Datasets No The dataset we examine, originally described in Choi et al. (2002), has been studied in several analyses (Fewell et al., 2004; Whittle and Hughes, 2004). ... Access to the dataset itself may be requested by contacting the third author.
Dataset Splits No The paper mentions generating synthetic datasets (e.g., 'we sample 64 datasets from each of four DGPs', 'each dataset contains 4000 samples') and using bootstrap resampling ('nonparametric bootstrap with 64 resamplings'). For the real-world application, it specifies patient counts ('1,010 patients'). However, it does not provide specific training, validation, or test splits (percentages or counts) for any of these datasets.
Hardware Specification No The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory specifications, or cloud computing instances) used for running its simulations or data application.
Software Dependencies No The paper does not specify any particular software libraries, frameworks, or tools with their version numbers that were used for implementing the methods or running the experiments.
Experiment Setup Yes First, we estimate a propensity score model for M, p(M | A, Z, C). Using this model to weight... we estimate this function using generalized method of moments (GMM). ... We truncate weights at the 2.5th and 97.5th percentiles. ... we sample 100 trajectories of Y(a = 1) and Y(a = 0) ... For each of the four DGPs we consider, we modify the parameters of the sampling distribution by changing the A → Y coefficient to a value β_AY ∈ {0, 0.2, 0.4, 0.8}. For each value β_AY, we sample 256 datasets of 4000 samples. ... nonparametric bootstrap with 64 resamplings to produce a 95% confidence interval.
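Two steps quoted in the Experiment Setup row, truncating weights at the 2.5th and 97.5th percentiles and forming a 95% interval from a nonparametric bootstrap with 64 resamplings, can be illustrated with a minimal Python sketch. It is not taken from the authors' repository. All function names are hypothetical, "truncate" is assumed to mean clipping to the percentile bounds rather than dropping observations, and a plain inverse-probability-weighted mean stands in for the paper's GMM-based estimator.

```python
import random


def percentile(values, q):
    """Linear-interpolation percentile of `values` for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def truncate_weights(weights, lower_q=2.5, upper_q=97.5):
    """Clip weights to the [lower_q, upper_q] percentile range (assumed meaning of 'truncate')."""
    lo = percentile(weights, lower_q)
    hi = percentile(weights, upper_q)
    return [min(max(w, lo), hi) for w in weights]


def ipw_mean(rows):
    """Stand-in estimator: weighted mean of outcomes with truncated 1/propensity weights.

    Each row is a hypothetical (outcome, propensity) pair.
    """
    weights = truncate_weights([1.0 / p for _, p in rows])
    total = sum(w * y for w, (y, _) in zip(weights, rows))
    return total / sum(weights)


def bootstrap_ci(rows, estimator, n_resamples=64, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample rows with replacement, re-estimate each time."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = [rng.choice(rows) for _ in rows]
        estimates.append(estimator(sample))
    return (percentile(estimates, 100 * alpha / 2),
            percentile(estimates, 100 * (1 - alpha / 2)))


# Toy data for illustration only: (outcome, propensity) pairs.
toy_rows = [(float(i % 5), 0.2 + 0.01 * (i % 50)) for i in range(200)]
lower, upper = bootstrap_ci(toy_rows, ipw_mean)
```

Because the estimator is passed in as a function, the weights are re-truncated inside every resample, so the interval reflects the variability of the truncation bounds as well as of the weighted mean.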