An algorithmic view of L2 regularization and some path-following algorithms

Authors: Yunzhang Zhu, Renxiong Liu

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we use ℓ2-regularized logistic regression as an illustrating example to demonstrate the effectiveness of the proposed path-following algorithms. (...) In this section, we use ℓ2-regularized logistic regression as an illustrating example to study the operating characteristics of the various proposed methods."
Researcher Affiliation | Academia | "Yunzhang Zhu EMAIL, Department of Statistics, The Ohio State University, Columbus, OH 43210, USA; Renxiong Liu EMAIL, Department of Statistics, The Ohio State University, Columbus, OH 43210, USA"
Pseudocode | No | The paper describes its algorithms (Newton's method, gradient descent, Euler's method, the Runge-Kutta method) through mathematical equations and prose (e.g., equations (7), (9), (11), (12)), but it does not include any explicitly labeled pseudocode or algorithm blocks with structured, step-by-step procedures.
Open Source Code | No | The paper states: "We implement all methods in R using Rcpp (Eddelbuettel et al., 2011; Eddelbuettel, 2013), except for glmnet for which we use the R package glmnet." While this describes their implementation, the paper provides no explicit statement about releasing the code, nor a link to a repository for the methods it describes.
Open Datasets | No | "For the nonseparable case, we sample the components of the response vector Y ∈ ℝⁿ from a Bernoulli distribution, where P(Yi = +1) = 1/2 and P(Yi = −1) = 1/2 for i = 1, 2, . . . , n. Conditioned on Yi, we generate the Xi's independently from N_p(Yi µ, σ² I_{p×p}), where µ ∈ ℝᵖ and σ² > 0. (...) For the separable case, we generate the Xi's independently from N_p(Yi µ, I_{p×p}) where µ = (1/√p, . . . , 1/√p) until Yi µᵀXi > 1, which makes the two classes linearly separable. (...) Specifically, we first generate the predictors X1, . . . , Xn ∈ ℝᵖ from the normal distribution N_p(0, I_{p×p})."
Dataset Splits | No | The paper describes generating synthetic data for its numerical studies under several scenarios (nonseparable, separable, and a generative model for logistic regression). It specifies how the data are sampled but does not mention any explicit training, validation, or test splits in terms of percentages, counts, or predefined benchmarks.
Hardware Specification | No | The paper gives implementation details: "We implement all methods in R using Rcpp (...) except for glmnet for which we use the R package glmnet." However, it does not mention the hardware used to run the experiments, such as CPU or GPU models, memory, or cloud-computing specifications.
Software Dependencies | No | The paper mentions using "Rcpp (Eddelbuettel et al., 2011; Eddelbuettel, 2013)" and "the R package glmnet." While these are software tools, specific version numbers for Rcpp, the R environment, and the glmnet package are not provided in the text.
Experiment Setup | Yes | "For both scenarios, three choices of problem dimensions are considered: (n, p) = (1000, 500), (n, p) = (1000, 1000), and (n, p) = (1000, 2000). (...) In all simulations, we use N = 100. We first compare the four second-order methods: Newton, Euler, Runge-Kutta, and the method of Rosset (2004), as they all involve solving linear systems. (...) We also consider two initial step sizes: α1 = 0.01, 0.1 for the Newton method to see the impact of α1 on the suboptimality. (...) we use tmax = 10 in all of the numerical experiments."
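The data-generation procedure quoted in the Open Datasets row can be sketched in code. The following is a minimal illustration, not the authors' R/Rcpp implementation: the function names are hypothetical, the default noise scale σ = 1 is an assumption (the paper only states σ² > 0), and reusing µ = (1/√p, …, 1/√p) for the nonseparable case is an assumption borrowed from the separable setting.

```python
import numpy as np

def generate_nonseparable(n, p, sigma=1.0, rng=None):
    """Nonseparable case: Y_i = +1 or -1 with probability 1/2 each,
    then X_i ~ N_p(Y_i * mu, sigma^2 * I_{p x p})."""
    rng = np.random.default_rng(rng)
    y = rng.choice([-1.0, 1.0], size=n)
    mu = np.full(p, 1.0 / np.sqrt(p))  # assumed mean vector for this sketch
    X = y[:, None] * mu[None, :] + sigma * rng.standard_normal((n, p))
    return X, y

def generate_separable(n, p, rng=None):
    """Separable case: resample X_i ~ N_p(Y_i * mu, I_{p x p}) until
    Y_i * mu^T X_i > 1, which makes the two classes linearly separable."""
    rng = np.random.default_rng(rng)
    y = rng.choice([-1.0, 1.0], size=n)
    mu = np.full(p, 1.0 / np.sqrt(p))
    X = np.empty((n, p))
    for i in range(n):
        while True:
            x = y[i] * mu + rng.standard_normal(p)
            if y[i] * (mu @ x) > 1:  # accept only margin-violating-free draws
                X[i] = x
                break
    return X, y

# One of the problem dimensions used in the paper: (n, p) = (1000, 500).
X, y = generate_nonseparable(1000, 500, rng=0)
```

Since µᵀµ = 1 here, the acceptance condition Yi µᵀXi > 1 holds with probability about 1/2 per draw, so the rejection loop in the separable case terminates quickly.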