reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Pattern Alternating Maximization Algorithm for Missing Data in High-Dimensional Problems

Authors: Nicolas Städler, Daniel J. Stekhoven, Peter Bühlmann

JMLR 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show on simulated and real data that the new method often improves upon other modern imputation techniques such as k-nearest neighbors imputation, nuclear norm minimization or a penalized likelihood approach with an ℓ1-penalty on the concentration matrix. Keywords: missing data, observed likelihood, (partial) E- and M-Step, Lasso, penalized variational free energy. 4. Numerical Experiments In this section we explore the performance of MissPALasso in recovering missing entries and we report on computational efficiency of the algorithm.
Researcher Affiliation	Collaboration	Nicolas Städler EMAIL The Netherlands Cancer Institute Plesmanlaan 121 1066 CX Amsterdam, The Netherlands Daniel J. Stekhoven EMAIL Quantik AG Bahnhofstrasse 57 8965 Berikon, Switzerland Peter Bühlmann EMAIL Seminar for Statistics, ETH Zurich Rämistrasse 101 8092 Zurich, Switzerland
Pseudocode	Yes	Algorithm 1: MissPA ... Algorithm 2: MissPALasso
Open Source Code	No	The paper does not explicitly state that the code for the described methodology is open-source, nor does it provide a link to a code repository.
Open Datasets	Yes	4.1.2 Real Data Examples We consider the following four publicly available data sets: Isoprenoid gene network in Arabidopsis thaliana: ... Wille et al. (2004). Colon cancer: ... Alon et al. (1999). Lymphoma: ... Alizadeh et al. (2000). Yeast cell-cycle: ... Spellman et al. (1998).
Dataset Splits	Yes	In each run we generate n = 50 i.i.d. samples from the model. We then delete randomly 5%, 10% and 15% of the values in the data matrix, apply an imputation method and compute the NRMSE.
Hardware Specification	No	The paper discusses 'CPU times' but does not specify the type or model of CPU, GPU, or any other specific hardware used for the experiments.
Software Dependencies	No	We end this section by illustrating the computational timings of MissPALasso and MissGLasso implemented with the statistical computing language R. The paper mentions the use of 'R' but does not provide a specific version number for R or any specific libraries used.
Experiment Setup	Yes	In all of our experiments we select the tuning parameters to obtain optimal prediction of the missing entries in terms of NRMSE. ... For a fixed λ we stop the algorithm if the relative change in imputation satisfies $\|\hat{X}^{(r+1)} - \hat{X}^{(r)}\|^2 / \|\hat{X}^{(r+1)}\|^2 \le 10^{-5}$.
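
The evaluation protocol quoted in the Dataset Splits and Experiment Setup rows (delete a fraction of entries at random, impute, score by NRMSE, and stop iterating once the relative change in the imputation is small) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' R implementation. The NRMSE definition used here (root of the mean squared error over the deleted cells divided by the variance of their true values) is an assumption, since the excerpts above do not define it. The stopping rule is read as a ratio of squared Frobenius norms, which is also an assumption. The column-mean imputer is a placeholder standing in for MissPALasso, and all function names and the 10-column matrix width are hypothetical.

```python
import math
import random


def mask_entries(X, frac, rng):
    """Copy X and set round(frac * n * p) randomly chosen cells to None."""
    n, p = len(X), len(X[0])
    cells = [(i, j) for i in range(n) for j in range(p)]
    missing = rng.sample(cells, round(frac * n * p))
    X_obs = [row[:] for row in X]
    for i, j in missing:
        X_obs[i][j] = None
    return X_obs, missing


def mean_impute(X_obs):
    """Placeholder imputer (column means); this is NOT MissPALasso."""
    p = len(X_obs[0])
    means = []
    for j in range(p):
        col = [row[j] for row in X_obs if row[j] is not None]
        means.append(sum(col) / len(col) if col else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in X_obs]


def nrmse(X_true, X_hat, missing):
    """Assumed definition: sqrt(MSE on deleted cells / variance of their true values)."""
    truth = [X_true[i][j] for i, j in missing]
    est = [X_hat[i][j] for i, j in missing]
    mse = sum((t - e) ** 2 for t, e in zip(truth, est)) / len(truth)
    mu = sum(truth) / len(truth)
    var = sum((t - mu) ** 2 for t in truth) / len(truth)
    return math.sqrt(mse / var)


def relative_change(X_new, X_old):
    """Squared Frobenius norm of the update, relative to that of the new imputation."""
    num = sum((a - b) ** 2
              for row_new, row_old in zip(X_new, X_old)
              for a, b in zip(row_new, row_old))
    den = sum(a ** 2 for row in X_new for a in row)
    return num / den


rng = random.Random(0)
# n = 50 samples as in the quoted setup; 10 columns is an arbitrary choice.
X = [[rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(50)]
for frac in (0.05, 0.10, 0.15):
    X_obs, missing = mask_entries(X, frac, rng)
    X_hat = mean_impute(X_obs)
    print(frac, len(missing), nrmse(X, X_hat, missing))
```

An iterative imputer would call `relative_change(X_new, X_old)` after each sweep and stop once the value is at most `1e-5`, the threshold quoted in the Experiment Setup row.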