Method of Contraction-Expansion (MOCE) for Simultaneous Inference in Linear Models

Authors: Fei Wang, Ling Zhou, Lu Tang, Peter X.K. Song

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — "We establish key theoretical results for inference from the proposed MOCE procedure. Once the expanded model is properly selected, the theoretical guarantees hold and simultaneous confidence regions can be constructed from the joint asymptotic normal distribution. ... Through simulation experiments, Section 6 illustrates the performance of MOCE in comparison with existing methods."
Researcher Affiliation: Collaboration — Fei Wang (CarGurus, Cambridge, MA 02141, USA, and Tencent, Shenzhen, Guangdong 518057, China); Ling Zhou (Southwestern University of Finance and Economics, Chengdu, Sichuan 611130, China); Lu Tang (University of Pittsburgh, Pittsburgh, PA 15261, USA); Peter X.K. Song (University of Michigan, Ann Arbor, MI 48109, USA)
Pseudocode: Yes — Algorithm 1: algorithm for model expansion via the method of forward screening; Algorithm 2: algorithm for ridge parameter selection.
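The details of Algorithm 1 are not quoted in this report; below is a minimal sketch of one common form of forward screening, assuming (as an illustration, not as MOCE's actual algorithm) that the expansion grows the LASSO-selected set one variable at a time, adding the predictor most correlated with the current least-squares residual until the target size s is reached:

```python
import numpy as np

def expand_model(X, y, selected, s):
    """Hypothetical forward-screening expansion (a sketch, not the
    paper's Algorithm 1): grow the initial selected set one column
    at a time, each time adding the predictor with the largest
    absolute correlation with the current residual, until size s."""
    active = list(selected)
    while len(active) < s:
        Xa = X[:, active]
        # residual from a least-squares fit on the current active set
        coef, *_ = np.linalg.lstsq(Xa, y, rcond=None)
        resid = y - Xa @ coef
        score = np.abs(X.T @ resid)
        score[active] = -np.inf  # exclude already-selected columns
        active.append(int(np.argmax(score)))
    return sorted(active)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(100)
A = expand_model(X, y, selected=[0], s=4)
```

The expanded set always contains the initial LASSO selection, mirroring the paper's requirement that the expanded model nest the selected one.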
Open Source Code: No — The paper contains no explicit statement about releasing source code and provides no link to a code repository. It mentions using existing R packages (glmnet, hdi) but does not release its own implementation.
Open Datasets: No — "We simulate 500 datasets according to the following linear model: y = Xβ + ϵ, ϵ = (ϵ_1, …, ϵ_n)^T, ϵ_i i.i.d. N(0, σ²), i = 1, …, n, where σ = 0.5; the s_0 signal parameters in set A are generated from the uniform distribution U(0.1, 0.5), and the remaining p − s_0 parameters in A^c are all set to 0. Each row of the design matrix X is simulated from a p-variate normal distribution N(0, σ²R(α)), where R(α) is a first-order autoregressive (AR-1) correlation matrix with correlation parameter α ∈ {0.5, 0.7}. Each of the p columns of X is normalized to have ℓ2-norm 1."
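The simulation design quoted above is simple enough to reproduce directly. A minimal NumPy sketch (the values of n, p, and s_0 below are illustrative choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, s0 = 200, 50, 5   # illustrative sizes; the paper varies these
sigma, alpha = 0.5, 0.5

# AR-1 correlation matrix R(alpha): R[i, j] = alpha ** |i - j|
idx = np.arange(p)
R = alpha ** np.abs(idx[:, None] - idx[None, :])

# rows of X ~ N(0, sigma^2 * R(alpha)), via a Cholesky factor,
# then each column normalized to have l2-norm 1
L = np.linalg.cholesky(sigma**2 * R)
X = rng.standard_normal((n, p)) @ L.T
X /= np.linalg.norm(X, axis=0)

# s0 signal coefficients drawn from U(0.1, 0.5);
# the remaining p - s0 coefficients are exactly 0
beta = np.zeros(p)
beta[:s0] = rng.uniform(0.1, 0.5, size=s0)

eps = rng.normal(0.0, sigma, size=n)
y = X @ beta + eps
```

Repeating this draw 500 times gives the 500 simulated datasets described in the quote.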
Dataset Splits: Yes — "To apply MOCE, we begin with the LASSO estimate β̂_λ that is calculated by the R package glmnet with the tuning parameter λ selected by 10-fold cross-validation."
Hardware Specification: No — The paper does not specify the hardware (CPU or GPU models, memory, etc.) used to run the simulations or experiments.
Software Dependencies: No — "To apply MOCE, we begin with the LASSO estimate β̂_λ that is calculated by the R package glmnet with the tuning parameter λ selected by 10-fold cross-validation. ... To calculate the competing LDP estimator proposed by Zhang and Zhang (2014), denoted by β̂_LDP, we use the existing R package hdi." The paper names specific R packages but gives neither their version numbers nor the version of R itself.
Experiment Setup: Yes — "We simulate 500 datasets according to the following linear model: y = Xβ + ϵ, ϵ = (ϵ_1, …, ϵ_n)^T, ϵ_i i.i.d. N(0, σ²), i = 1, …, n, where σ = 0.5; the s_0 signal parameters in set A are generated from the uniform distribution U(0.1, 0.5), and the remaining p − s_0 parameters in A^c are all set to 0. Each row of the design matrix X is simulated from a p-variate normal distribution N(0, σ²R(α)), where R(α) is a first-order autoregressive (AR-1) correlation matrix with correlation parameter α ∈ {0.5, 0.7}. Each of the p columns of X is normalized to have ℓ2-norm 1. We run 500 rounds of simulations to draw summary statistics in the evaluation. ... the tuning parameter λ is selected by 10-fold cross-validation, and the variance parameter σ² is estimated by σ̂² = ‖y − Xβ̂_λ‖²₂ / (n − ŝ) ... Starting with the LASSO-selected model Â, we construct the expanded model A via Algorithm 1 with target size s = ŝ + 0.05p. The two ridge parameters τ_a (in 𝛕_a = τ_a I) and τ_c (in 𝛕_c = τ_c I) are chosen with the utility of Algorithm 2. Here we set η = 0.05 to allow 5% of the LASSO-estimated null parameters to enter the expanded model."
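The quoted pipeline (cross-validated LASSO, selected-model size ŝ, variance estimate σ̂², expanded-model target size s) can be sketched in Python, substituting scikit-learn's LassoCV for the paper's glmnet call; the data-generating step and all dimensions below are illustrative, not the paper's exact configuration:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
n, p, s0, sigma = 200, 50, 5, 0.5   # illustrative sizes
X = rng.standard_normal((n, p))
X /= np.linalg.norm(X, axis=0)       # columns normalized to l2-norm 1
beta = np.zeros(p)
beta[:s0] = rng.uniform(0.1, 0.5, size=s0)
y = X @ beta + rng.normal(0.0, sigma, size=n)

# LASSO with lambda chosen by 10-fold cross-validation
# (stand-in for the paper's glmnet call in R)
fit = LassoCV(cv=10).fit(X, y)
beta_hat = fit.coef_
s_hat = int(np.count_nonzero(beta_hat))   # size of the selected model

# variance estimate: sigma2_hat = ||y - X beta_hat||_2^2 / (n - s_hat)
sigma2_hat = np.sum((y - X @ beta_hat) ** 2) / (n - s_hat)

# target size of the expanded model: s = s_hat + 0.05 * p (here eta = 0.05)
s_target = s_hat + int(np.ceil(0.05 * p))
```

With σ = 0.5 and unit-norm columns the per-variable signal is weak, so the cross-validated LASSO may select few variables here; the sketch only illustrates how each quantity in the quote is computed.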