reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

High-Dimensional Inference for Generalized Linear Models with Hidden Confounding

Authors: Jing Ouyang, Kean Ming Tan, Gongjun Xu

JMLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The finite sample performance of the proposed method is demonstrated through extensive numerical studies and an application to a genetic data set.
Researcher Affiliation	Academia	Jing Ouyang EMAIL Kean Ming Tan EMAIL Gongjun Xu EMAIL Department of Statistics University of Michigan Ann Arbor, MI 48109, USA
Pseudocode	No	The paper describes methods in Section 3 'Estimation Method' using prose and mathematical equations but does not present any formal pseudocode or algorithm blocks.
Open Source Code	No	The paper does not contain an unambiguous statement from the authors stating they are releasing code for the methodology described, nor does it provide a direct link to a source-code repository.
Open Datasets	Yes	In this section, we apply the proposed method to a genetic data containing gene expression quantifications and stimulation statuses in mouse bone marrow derived dendritic cells. The data were also previously analyzed in Shalek et al. (2014) and Cai et al. (2023).
Dataset Splits	No	The paper describes the composition of the dataset: 'three groups of cells including 64 PAM stimulated cells, 96 PIC stimulated cells, and 96 LPS stimulated cells, respectively. Moreover, each of the three groups contains 96 control cells without any stimulation.' However, it does not specify any training, validation, or test splits for experimental reproduction.
Hardware Specification	No	The paper mentions numerical studies and simulations but does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments.
Software Dependencies	No	The paper describes the methods and numerical studies, but it does not specify any software names with version numbers for replication.
Experiment Setup	Yes	The sparsity tuning parameters for all of the aforementioned methods are selected using 10-fold cross-validation. Our proposed method involves estimating the dimension of the unmeasured confounders, which we estimate using the parallel analysis (Horn, 1965; Dinno, 2009).
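
The Experiment Setup entry mentions parallel analysis (Horn, 1965) for estimating the dimension of the unmeasured confounders. Since the paper releases no code, the following is only an illustrative sketch of the general technique, not the authors' implementation. The function name `parallel_analysis` and its parameters (`n_sims`, `quantile`, `seed`) are hypothetical, and comparing against a quantile of simulated eigenvalues from Gaussian noise is one common variant assumed here.

```python
import numpy as np


def parallel_analysis(X, n_sims=100, quantile=0.95, seed=0):
    """Estimate the number of latent factors in X (n samples x p variables).

    Illustrative sketch of Horn's parallel analysis; all names and
    defaults are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

    # Eigenvalues of correlation matrices of pure-noise data of the same shape.
    simulated = np.empty((n_sims, p))
    for s in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[s] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]

    # Per-rank threshold: the chosen quantile of the simulated eigenvalues.
    threshold = np.quantile(simulated, quantile, axis=0)

    # Count the leading observed eigenvalues above their threshold,
    # stopping at the first one that falls below it.
    exceeds = observed > threshold
    return int(p) if exceeds.all() else int(np.argmin(exceeds))
```

The returned count could then serve as the assumed number of hidden confounders in a downstream factor-adjustment step; the 10-fold cross-validation for the sparsity tuning parameters is a separate step and is not sketched here.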