Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Gap Safe Screening Rules for Sparsity Enforcing Penalties
Authors: Eugene Ndiaye, Olivier Fercoq, Alexandre Gramfort, Joseph Salmon
JMLR 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present results obtained with the Gap Safe rules on various data sets. Implementation has been done in Python and Cython (Behnel et al., 2011) for low level critical parts. A coordinate descent algorithm is used with a scaled dual gap stopping criterion, i.e., we normalize the targeted accuracy $\epsilon$ (in the stopping criterion) in order to have a running time that is independent of the data scaling: $\epsilon \leftarrow \epsilon \lVert y \rVert_2^2$ for the regression cases and $\epsilon \leftarrow \epsilon \min(n_1, n_2)/n$, where $n_i$ is the number of observations in class $i$, for the logistic cases. Note that in the Lasso case, to compare our method with the un-safe strong rules by Tibshirani et al. (2012) and with sequential screening rules such as the eddp+ by Wang et al. (2012), we have added an approximated KKT post-processing step. We do this following Footnote 8, since they require the previous (exact) dual optimal solution, which is not available in practice (Ndiaye et al., 2016b, Appendix B). |
| Researcher Affiliation | Academia | Eugene Ndiaye EMAIL LTCI, Télécom ParisTech, Université Paris-Saclay, 75013, Paris, France; Olivier Fercoq EMAIL LTCI, Télécom ParisTech, Université Paris-Saclay, 75013, Paris, France; Alexandre Gramfort EMAIL Inria, Université Paris-Saclay, 91120, Palaiseau, France and LTCI, Télécom ParisTech, Université Paris-Saclay, 75013, Paris, France; Joseph Salmon EMAIL LTCI, Télécom ParisTech, Université Paris-Saclay, 75013, Paris, France. |
| Pseudocode | Yes | Algorithm 1: Pathwise algorithm with active warm start. Algorithm 2: Iterative solver with GAP safe rules: Solver$(X, y, \beta, \epsilon, K, f^{ce}, \lambda)$. |
| Open Source Code | Yes | The source code can be found in https://github.com/EugeneNdiaye. |
| Open Datasets | Yes | We used the classic dense data set Leukemia, and the large sparse financial data set E2006-log1p available from LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/). We consider the data set NCEP/NCAR Reanalysis 1 (Kalnay et al., 1996), which contains monthly means of climate data measurements spread across the globe in a grid of 2.5° × 2.5° resolution (longitude and latitude 144 × 73) from 1948/1/1 to 2015/10/31. |
| Dataset Splits | Yes | We choose the parameter $\tau$ in the set $\{0, 0.1, \dots, 0.9, 1\}$ by splitting the observations in half, and run a training-test validation procedure. For each value of $\tau$, we require a duality gap of $10^{-8}$ on the training set and pick the best one in terms of prediction accuracy on the test set. |
| Hardware Specification | No | No specific hardware details are provided in the paper. |
| Software Dependencies | No | Implementation has been done in Python and Cython (Behnel et al., 2011) for low level critical parts. The source code can be found in https://github.com/EugeneNdiaye. This mentions Python and Cython but does not provide specific version numbers, which are required for reproducibility. |
| Experiment Setup | Yes | The tuning of the parameter $\lambda$ in Problem (3) is crucial and is usually done by cross-validation, which requires evaluation over a grid of parameter values. A standard grid considered in the literature is $\lambda_t = \lambda_{\max} 10^{-\delta t/(T-1)}$ with a small $\delta$, say $\delta = 10^{-2}$ or $10^{-3}$, see for instance (Bühlmann and van de Geer, 2011)[2.12.1] or the glmnet package (Friedman et al., 2010). In all our experiments, we have set this screening frequency parameter to $f^{ce} = 10$. We choose the parameter $\tau$ in the set $\{0, 0.1, \dots, 0.9, 1\}$ by splitting the observations in half, and run a training-test validation procedure. For each value of $\tau$, we require a duality gap of $10^{-8}$ on the training set and pick the best one in terms of prediction accuracy on the test set. Since the prediction error degrades increasingly for $\lambda \le \lambda_{\max}/10^{2.5}$, we fix $\delta = 2.5$. We have fixed the weight $w_g = 1$ since all groups have the same size. |
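
The regularization grid and the scaled stopping tolerance quoted in the table can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' code: the function names `lambda_grid` and `scaled_tolerance`, the default grid size `T=100`, and the convention of passing `lambda_max` in precomputed are all assumptions made here for the example.

```python
def lambda_grid(lambda_max, delta=2.5, T=100):
    """Geometric grid lambda_t = lambda_max * 10**(-delta * t / (T - 1)), t = 0..T-1.

    delta=2.5 is one value quoted in the table; T=100 is an assumption.
    """
    if T < 2:
        return [float(lambda_max)]
    return [lambda_max * 10.0 ** (-delta * t / (T - 1)) for t in range(T)]


def scaled_tolerance(eps, y, task="regression"):
    """Rescale the target accuracy eps so it does not depend on data scaling.

    regression: eps * ||y||_2^2
    logistic:   eps * min(n_1, n_2) / n, with n_i the size of class i
    """
    if task == "regression":
        return eps * sum(v * v for v in y)
    if task == "logistic":
        labels = sorted(set(y))
        if len(labels) != 2:
            raise ValueError("logistic scaling expects exactly two classes")
        counts = [sum(1 for v in y if v == label) for label in labels]
        return eps * min(counts) / len(y)
    raise ValueError("task must be 'regression' or 'logistic'")


# Hypothetical usage: a 3-point grid spanning two decades below lambda_max.
grid = lambda_grid(1.0, delta=2.0, T=3)
tol = scaled_tolerance(1e-8, [3.0, 4.0])  # 1e-8 * (9 + 16)
```

The grid runs from `lambda_max` down to `lambda_max / 10**delta`, so a larger `delta` extends the path toward less regularized, more expensive problems.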