Accelerating Cross-Validation in Multinomial Logistic Regression with $\ell_1$-Regularization

Authors: Tomoyuki Obuchi, Yoshiyuki Kabashima

JMLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The usefulness of the approximate formula is demonstrated on simulated data and the ISOLET dataset from the UCI machine learning repository. MATLAB and python codes implementing the approximate formula are distributed in (Obuchi, 2017; Takahashi and Obuchi, 2017). Keywords: classification, multinomial logistic regression, cross-validation, linear perturbation, self-averaging approximation"
Researcher Affiliation | Academia | "Tomoyuki Obuchi EMAIL, Yoshiyuki Kabashima EMAIL, Department of Mathematical and Computing Science, Tokyo Institute of Technology, 2-12-1, Ookayama, Meguro-ku, Tokyo, Japan"
Pseudocode | Yes | "Algorithm 1 Approximate CV of the MLR. 1: procedure ACV($\hat{W}(\lambda_1, \lambda_2)$, $D^M$, $\lambda_2$) ... Algorithm 2 Self-averaging approximate CV of the MLR. 1: procedure SAACV($\hat{W}(\lambda_1, \lambda_2)$, $D^M$, $\lambda_2$)"
Open Source Code | Yes | "MATLAB and python codes implementing the approximate formula are distributed in (Obuchi, 2017; Takahashi and Obuchi, 2017)."
Open Datasets | Yes | "The usefulness of the approximate formula is demonstrated on simulated data and the ISOLET dataset from the UCI machine learning repository."
Dataset Splits | Yes | "In principle, we should compare our approximate result with that of the LOO CV ($k = M$) because our formula approximates it. However, for large $M$, the literal LOO CV requires a huge computational burden, even though its result is empirically not much different from that of $k$-fold CV with moderate $k$. Hence, in some of the following experiments with large $M$, we use 10-fold CV instead of the LOO CV."
Hardware Specification | Yes | "In all of the experiments, we used a single Intel(R) Xeon(R) E5-2630 v3 2.4GHz CPU."
Software Dependencies | Yes | "To solve the optimization problems in eqs. (4, 6), we employed Glmnet (Friedman et al., 2010), which is implemented as a MEX subroutine in MATLAB®."
Experiment Setup | Yes | "Unless explicitly mentioned, we set this as $\delta = 10^{-8}$, which is tighter than the default value. This is necessary since we treat problems of rather large size. A looser choice of $\delta$ rather strongly affects the literal CV result, while it does not change the full solution or the training error as much. As a result, our approximations, which employ only the full solution, are rather robust against the choice of $\delta$ compared to the literal CV."
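The Dataset Splits evidence contrasts leave-one-out CV ($k = M$) with $k$-fold CV at moderate $k$. As an illustration only (this is not the paper's code; the function name and layout are ours), a minimal Python sketch of the fold bookkeeping shows why literal LOO costs $M$ model fits while 10-fold costs just 10:

```python
# Hedged sketch (not from the paper): k-fold CV index splits over M
# samples. Setting k = M recovers leave-one-out (LOO) CV, so LOO
# requires M separate model fits versus k fits for k-fold CV.

def k_fold_splits(M, k):
    """Yield (train_indices, test_indices) pairs for k-fold CV."""
    # Distribute M samples as evenly as possible across k folds.
    fold_sizes = [M // k + (1 if i < M % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, M))
        yield train, test
        start += size

M = 6
loo_folds = list(k_fold_splits(M, M))   # LOO: M folds, one sample each
two_folds = list(k_fold_splits(M, 2))   # 2-fold: 2 folds of 3 samples
print(len(loo_folds), len(two_folds))
```

Each yielded pair would drive one refit of the model on `train` and one evaluation on `test`, which is exactly the per-fold cost the approximate formula avoids.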
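The Experiment Setup row concerns the solver's convergence tolerance $\delta$. The following is a hedged sketch, not the paper's Glmnet setup: a plain ISTA-style solver for binary $\ell_1$-regularized logistic regression on toy data, where $\delta$ plays the role of the stopping threshold. A looser $\delta$ stops earlier and perturbs the fitted weights, which is the sensitivity the quoted passage describes.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fit_l1_logistic(X, y, lam, delta=1e-8, lr=0.1, max_iter=200000):
    """ISTA (proximal gradient) for l1-regularized logistic regression.
    Iteration stops when the update norm falls below the tolerance delta,
    mimicking a solver convergence threshold."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))        # predicted probabilities
        grad = X.T @ (p - y) / len(y)           # gradient of logistic loss
        w_new = soft_threshold(w - lr * grad, lr * lam)
        if np.linalg.norm(w_new - w) < delta:
            return w_new
        w = w_new
    return w

# Toy data: only the first two features carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = (X @ np.array([2.0, -1.0, 0.0, 0.0, 0.0])
     + 0.1 * rng.standard_normal(200) > 0).astype(float)

w_tight = fit_l1_logistic(X, y, lam=0.05, delta=1e-8)  # tight tolerance
w_loose = fit_l1_logistic(X, y, lam=0.05, delta=1e-2)  # loose tolerance
print(w_tight, w_loose)
```

Comparing `w_tight` and `w_loose` shows how the stopping threshold alone shifts the fitted solution; the paper's point is that its approximate CV formulas, which use only the full-data solution, are less sensitive to this choice than a literal CV loop would be.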