L0Learn: A Scalable Package for Sparse Learning using L0 Regularization

Authors: Hussein Hazimeh, Rahul Mazumder, Tim Nonet

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our main goal in these experiments is to compare the running time of L0Learn with similar toolkits designed for sparse learning problems. Specifically, we compare with glmnet, ncvreg, picasso, and abess. Due to space constraints, we focus on linear regression, and refer the reader to Dedieu et al. (2021) for sparse classification experiments. Our experiments also shed some light on the statistical performance of the different approaches: for in-depth studies of statistical properties, see Hazimeh and Mazumder (2020); Hastie et al. (2020); Mazumder et al. (2023).
Researcher Affiliation | Collaboration | Hussein Hazimeh (EMAIL), Google Research; Rahul Mazumder (EMAIL), Massachusetts Institute of Technology; Tim Nonet (EMAIL), Massachusetts Institute of Technology
Pseudocode | No | The paper describes algorithms in prose (e.g., "L0Learn uses a combination of (i) cyclic CD and (ii) local combinatorial optimization") but does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | We present L0Learn: an open-source package for sparse linear regression and classification using ℓ0 regularization. L0Learn is available on both CRAN and GitHub. Links: https://cran.r-project.org/package=L0Learn and https://github.com/hazimehh/L0Learn
Open Datasets | No | Following Hazimeh and Mazumder (2020), we consider synthetic data as per a linear regression model under the fixed design setting (exponential correlation model with ρ = 0.3).
Dataset Splits | Yes | cv_fit <- L0Learn.cvfit(x, y, penalty="L0", nFolds=5) # 5-fold cross-validation. All competing methods are tuned to minimize MSE on a validation set with the same size as the training set.
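For context, the cross-validation call quoted in this row comes from L0Learn's R interface. A minimal end-to-end sketch is below; the synthetic data and the coefficient-extraction step are illustrative assumptions, not taken from the paper, and the accessor fields (`cvMeans`, `fit$lambda`) follow the package's documented return structure.

```r
# Sketch of the 5-fold CV workflow around L0Learn.cvfit (assumed synthetic data).
library(L0Learn)

set.seed(1)
n <- 500; p <- 100
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
beta <- c(rep(1, 10), rep(0, p - 10))   # 10 true nonzero coefficients
y <- X %*% beta + rnorm(n)

# 5-fold cross-validation over the L0 regularization path
cv_fit <- L0Learn.cvfit(X, y, penalty = "L0", nFolds = 5)

# Select the lambda minimizing CV error and extract the fitted coefficients
# (for penalty = "L0" there is a single gamma value, gamma = 0)
best <- which.min(cv_fit$cvMeans[[1]])
lambda_best <- cv_fit$fit$lambda[[1]][best]
coef(cv_fit, lambda = lambda_best, gamma = 0)
```

In the paper's experiments, tuning instead minimizes MSE on a held-out validation set of the same size as the training set; the CV call above is the package's built-in alternative.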
Hardware Specification | Yes | Experiments were performed on a Linux c5n.2xlarge EC2 instance running R 4.0.2.
Software Dependencies | Yes | Experiments were performed on a Linux c5n.2xlarge EC2 instance running R 4.0.2.
Experiment Setup | Yes | In L0Learn, we used the default CD algorithm with the ℓ0ℓ2 penalty. In picasso, we used ℓ1 regularization and changed the convergence threshold (prec) to 10^-10 so that its solutions roughly match those of glmnet. In ncvreg, we used the (default) MCP penalty. All competing methods are tuned to minimize MSE on a validation set with the same size as the training set. In L0Learn, ncvreg, and abess, we tune over a two-dimensional grid consisting of 100 λ values (chosen automatically by the toolkits) and 100 γ values (in the range [10^-2, 10^2] for L0Learn, [1.5, 10^3] for ncvreg, [1, 10^3] for abess).