Accelerating Cross-Validation in Multinomial Logistic Regression with $\ell_1$-Regularization
Authors: Tomoyuki Obuchi, Yoshiyuki Kabashima
JMLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The usefulness of the approximate formula is demonstrated on simulated data and the ISOLET dataset from the UCI machine learning repository. MATLAB and Python codes implementing the approximate formula are distributed in (Obuchi, 2017; Takahashi and Obuchi, 2017). |
| Researcher Affiliation | Academia | Tomoyuki Obuchi EMAIL Yoshiyuki Kabashima EMAIL Department of Mathematical and Computing Science Tokyo Institute of Technology 2-12-1, Ookayama, Meguro-ku, Tokyo, Japan |
| Pseudocode | Yes | Algorithm 1 (Approximate CV of the MLR): procedure ACV(Ŵ(λ1, λ2), D_M, λ2) ... Algorithm 2 (Self-averaging approximate CV of the MLR): procedure SAACV(Ŵ(λ1, λ2), D_M, λ2) |
| Open Source Code | Yes | MATLAB and python codes implementing the approximate formula are distributed in (Obuchi, 2017; Takahashi and Obuchi, 2017). |
| Open Datasets | Yes | The usefulness of the approximate formula is demonstrated on simulated data and the ISOLET dataset from the UCI machine learning repository. |
| Dataset Splits | Yes | In principle, we should compare our approximate result with that of the LOO CV (k = M) because our formula approximates it. However, for large M the literal LOO CV requires a huge computational burden, even though its result is empirically not much different from that of k-fold CV with moderate k. Hence, in some of the following experiments with large M, we use 10-fold CV instead of the LOO CV. |
| Hardware Specification | Yes | In all of the experiments, we used a single CPU of Intel(R) Xeon(R) E5-2630 v3 2.4GHz. |
| Software Dependencies | Yes | To solve the optimization problems in eqs. (4, 6), we employed Glmnet (Friedman et al., 2010), which is implemented as a MEX subroutine in MATLAB. |
| Experiment Setup | Yes | Unless explicitly mentioned, we set this as δ = 10^{-8}, tighter than the default value. This is necessary since we treat problems of rather large sizes. A looser choice of δ rather strongly affects the literal CV result, while it does not change the full solution or the training error as much. As a result, our approximations, which employ only the full solution, are rather robust against the choice of δ compared to the literal CV. |
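The Dataset Splits row contrasts literal LOO CV (k = M, one refit per sample) with k-fold CV at moderate k. As a minimal sketch of that generic protocol — not the paper's approximate formula, and with hypothetical stand-in `fit` and `loss` functions rather than ℓ1-regularized multinomial logistic regression — the fold loop looks like:

```python
def kfold_indices(M, k):
    """Assign each of M samples to one of k folds (round-robin)."""
    return [i % k for i in range(M)]

def cv_error(X, y, fit, loss, k=10):
    """Generic k-fold CV estimate; k = len(y) recovers leave-one-out CV."""
    M = len(y)
    folds = kfold_indices(M, k)
    fold_errs = []
    for f in range(k):
        train = [i for i in range(M) if folds[i] != f]
        test = [i for i in range(M) if folds[i] == f]
        # One refit per fold: k refits total, M refits for literal LOO.
        model = fit([X[i] for i in train], [y[i] for i in train])
        fold_errs.append(sum(loss(model, X[i], y[i]) for i in test) / len(test))
    return sum(fold_errs) / k

# Hypothetical toy model just to make the loop runnable: a mean predictor
# with squared-error loss (the paper instead fits the MLR via Glmnet).
fit = lambda X, y: sum(y) / len(y)
loss = lambda model, x, yi: (model - yi) ** 2

X = list(range(20))
y = [2.0 * xi for xi in X]
err_10fold = cv_error(X, y, fit, loss, k=10)  # 10 refits
err_loo = cv_error(X, y, fit, loss, k=20)     # literal LOO: 20 refits
```

The cost contrast is the point: literal LOO requires M refits, which is why the paper substitutes 10-fold CV as the baseline for large M and why its approximation, built from the full solution alone, avoids refitting entirely.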