Fast Computation of Leave-One-Out Cross-Validation for $k$-NN Regression
Authors: Motonobu Kanagawa
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments confirm the validity of the fast computation method. We empirically check the validity of the formula (7) for efficient LOOCV computation. We consider a real-valued regression problem where $X = \mathbb{R}^d$ and $Y = \mathbb{R}$, using two real datasets from scikit-learn: Diabetes and Wine. |
| Researcher Affiliation | Academia | Motonobu Kanagawa EMAIL Data Science Department EURECOM |
| Pseudocode | No | The paper describes the method using mathematical formulas (Lemma 1, Corollary 1) and natural language, without structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for reproducing the experiments is available on https://github.com/motonobuk/LOOCV-kNN |
| Open Datasets | Yes | We consider a real-valued regression problem where $X = \mathbb{R}^d$ and $Y = \mathbb{R}$, using two real datasets from scikit-learn: Diabetes and Wine. |
| Dataset Splits | Yes | LOOCV for $k$-NN regression is defined as follows. ... For each $\ell = 1, \dots, n$, consider the training dataset (1) with the $\ell$-th pair $(x_\ell, y_\ell)$ removed: $D_n \setminus \{(x_\ell, y_\ell)\} = \{(x_1, y_1), \dots, (x_{\ell-1}, y_{\ell-1}), (x_{\ell+1}, y_{\ell+1}), \dots, (x_n, y_n)\}$. |
| Hardware Specification | Yes | CPU: 1.1 GHz Quad-Core Intel Core i5. Memory: 8 GB 3733 MHz LPDDR4X. |
| Software Dependencies | No | The paper mentions using 'scikit-learn' for implementing k-NN regression but does not specify a version number for scikit-learn or any other software dependency. Footnote 2 points to a general stable documentation URL rather than a specific version. |
| Experiment Setup | Yes | We standardized each input feature to have mean zero and unit variance. ... We show the LOOCV scores computed by the two methods for different values of k... for fixed k = 5. |
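
The comparison described in the rows above can be illustrated with a minimal sketch. It is not the author's code (see the linked repository for that). It assumes that the fast computation takes the form "training mean squared error of $(k+1)$-NN regression on the full data, multiplied by $(k+1)^2/k^2$", and that all inputs are distinct so each point is its own nearest neighbor. The function names `knn_predict`, `loocv_naive`, and `loocv_fast` and the synthetic data are hypothetical and chosen for illustration; the paper's experiments use the Diabetes and Wine datasets instead.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points nearest to `query`."""
    # Sort indices by Euclidean distance; sorted() is stable, so ties
    # are broken by index order.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return sum(train_y[i] for i in order[:k]) / k


def loocv_naive(xs, ys, k):
    """LOOCV mean squared error: predict each held-out pair from the other n - 1."""
    n = len(xs)
    total = 0.0
    for l in range(n):
        rest_x = xs[:l] + xs[l + 1:]
        rest_y = ys[:l] + ys[l + 1:]
        total += (knn_predict(rest_x, rest_y, xs[l], k) - ys[l]) ** 2
    return total / n


def loocv_fast(xs, ys, k):
    """Training MSE of (k + 1)-NN on the full data, scaled by ((k + 1) / k) ** 2.

    Assumes distinct inputs, so that each point is its own nearest neighbor.
    """
    n = len(xs)
    train_mse = sum(
        (knn_predict(xs, ys, xs[l], k + 1) - ys[l]) ** 2 for l in range(n)
    ) / n
    return ((k + 1) / k) ** 2 * train_mse


# Synthetic data (an assumption for illustration, not the paper's datasets).
random.seed(0)
xs = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(40)]
ys = [x[0] - 2 * x[1] + random.gauss(0, 0.1) for x in xs]

for k in (1, 3, 5):
    print(k, loocv_naive(xs, ys, k), loocv_fast(xs, ys, k))
```

Under these assumptions the two columns printed for each $k$ should agree up to floating-point rounding. The reason is that the $(k+1)$-NN fit at a training point averages that point's own target together with the targets of its $k$ nearest other points, so its residual is the leave-one-out residual scaled by $k/(k+1)$.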