An algorithmic view of L2 regularization and some path-following algorithms

Authors: Yunzhang Zhu, Renxiong Liu

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we use ℓ2-regularized logistic regression as an illustrating example to demonstrate the effectiveness of the proposed path-following algorithms. (...) In this section, we use ℓ2-regularized logistic regression as an illustrating example to study the operating characteristics of the various proposed methods."
Researcher Affiliation | Academia | "Yunzhang Zhu EMAIL, Department of Statistics, The Ohio State University, Columbus, OH 43210, USA; Renxiong Liu EMAIL, Department of Statistics, The Ohio State University, Columbus, OH 43210, USA"
Pseudocode | No | The paper describes its algorithms (Newton's method, gradient descent, Euler's method, the Runge-Kutta method) through mathematical equations and prose (e.g., equations (7), (9), (11), (12)), but it does not include any explicitly labeled pseudocode or algorithm blocks with structured, step-by-step procedures.
Open Source Code | No | The paper states: "We implement all methods in R using Rcpp (Eddelbuettel et al., 2011; Eddelbuettel, 2013), except for glmnet for which we use the R package glmnet." While this describes their implementation, the paper provides no explicit statement about releasing the code, nor a link to a repository for the methods it describes.
Open Datasets | No | "For the nonseparable case, we sample the components of the response vector Y ∈ ℝⁿ from a Bernoulli distribution, where P(Yi = +1) = 1/2 and P(Yi = −1) = 1/2 for i = 1, 2, . . . , n. Conditioned on Yi, we generate the Xi's independently from N_p(Yi µ, σ² I_{p×p}), where µ ∈ ℝᵖ and σ² > 0. (...) For the separable case, we generate the Xi's independently from N_p(Yi µ, I_{p×p}) where µ = (1/√p, . . . , 1/√p) until Yi µᵀXi > 1, which makes the two classes linearly separable. (...) Specifically, we first generate the predictors X1, . . . , Xn ∈ ℝᵖ from the normal distribution N_p(0, I_{p×p})."
Dataset Splits | No | The paper describes generating synthetic data for its numerical studies under several scenarios (nonseparable, separable, and a generative model for logistic regression). It specifies how the data are sampled but does not mention any explicit training, validation, or test splits in terms of percentages, counts, or predefined benchmarks.
Hardware Specification | No | The paper gives implementation details: "We implement all methods in R using Rcpp (...) except for glmnet for which we use the R package glmnet." However, it does not mention the hardware used to run the experiments, such as CPU or GPU models, memory, or cloud-computing specifications.
Software Dependencies | No | The paper mentions using "Rcpp (Eddelbuettel et al., 2011; Eddelbuettel, 2013)" and "the R package glmnet." While these are software tools, specific version numbers for Rcpp, the R environment, and the glmnet package are not provided in the text.
Experiment Setup | Yes | "For both scenarios, three choices of problem dimensions are considered: (n, p) = (1000, 500), (n, p) = (1000, 1000), and (n, p) = (1000, 2000). (...) In all simulations, we use N = 100. We first compare the four second-order methods: Newton, Euler, Runge-Kutta, and the method of Rosset (2004), as they all involve solving linear systems. (...) We also consider two initial step sizes: α1 = 0.01, 0.1 for the Newton method to see the impact of α1 on the suboptimality. (...) we use tmax = 10 in all of the numerical experiments."
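The data-generation procedure quoted in the Open Datasets row can be sketched in code. The following is a minimal illustration, not the authors' R/Rcpp implementation: the function names are hypothetical, the default noise scale σ = 1 is an assumption (the paper only states σ² > 0), and reusing µ = (1/√p, …, 1/√p) for the nonseparable case is an assumption borrowed from the separable setting.

```python
import numpy as np

def generate_nonseparable(n, p, sigma=1.0, rng=None):
    """Nonseparable case: Y_i = +1 or -1 with probability 1/2 each,
    then X_i ~ N_p(Y_i * mu, sigma^2 * I_{p x p})."""
    rng = np.random.default_rng(rng)
    y = rng.choice([-1.0, 1.0], size=n)
    mu = np.full(p, 1.0 / np.sqrt(p))  # assumed mean vector for this sketch
    X = y[:, None] * mu[None, :] + sigma * rng.standard_normal((n, p))
    return X, y

def generate_separable(n, p, rng=None):
    """Separable case: resample X_i ~ N_p(Y_i * mu, I_{p x p}) until
    Y_i * mu^T X_i > 1, which makes the two classes linearly separable."""
    rng = np.random.default_rng(rng)
    y = rng.choice([-1.0, 1.0], size=n)
    mu = np.full(p, 1.0 / np.sqrt(p))
    X = np.empty((n, p))
    for i in range(n):
        while True:
            x = y[i] * mu + rng.standard_normal(p)
            if y[i] * (mu @ x) > 1:  # accept only margin-violating-free draws
                X[i] = x
                break
    return X, y

# One of the problem dimensions used in the paper: (n, p) = (1000, 500).
X, y = generate_nonseparable(1000, 500, rng=0)
```

Since µᵀµ = 1 here, the acceptance condition Yi µᵀXi > 1 holds with probability about 1/2 per draw, so the rejection loop in the separable case terminates quickly.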