Improved Random Features for Dot Product Kernels
Authors: Jonas Wacker, Motonobu Kanagawa, Maurizio Filippone
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We describe the improvements brought by these contributions with extensive experiments on a variety of tasks and datasets. Keywords: Random features, randomized sketches, dot product kernels, polynomial kernels, large scale learning |
| Researcher Affiliation | Academia | Jonas Wacker, Motonobu Kanagawa (Data Science Department, EURECOM, France); Maurizio Filippone (Statistics Program, KAUST, Saudi Arabia) |
| Pseudocode | Yes | Algorithm 1: Real and Complex Tensor SRHT Algorithm 2: Incremental Algorithm Algorithm 3: Extended Incremental Algorithm Algorithm 4: Improved Random Maclaurin (RM) Features |
| Open Source Code | Yes | Software package. We provide a GitHub repository with modern implementations for all the methods studied in this work, supporting GPU acceleration and automatic differentiation in PyTorch (Paszke et al., 2019). Since version 1.8, PyTorch natively supports numerous linear algebra operations on complex numbers. The same is true for NumPy (Harris et al., 2020) and TensorFlow (Abadi et al., 2016). Our code is available at: https://github.com/joneswack/dp-rfs |
| Open Datasets | Yes | All the datasets come from the UCI benchmark (Dua and Graff, 2017) except for Cod-RNA (Uzilov et al., 2006), Fashion-MNIST (Xiao et al., 2017), and MNIST (Lecun et al., 1998). |
| Dataset Splits | Yes | The train/test split is 90/10 and is recomputed for every random seed for the UCI datasets; otherwise it is predefined. For each dataset, we use its random subsets of size m = min(5000, Ntrain) and m = min(5000, Ntest) to define training and test data in an experiment, respectively, where Ntrain and Ntest are the sizes of the original training and test datasets. |
| Hardware Specification | Yes | We recorded the time measurements on an NVIDIA P100 GPU and PyTorch version 1.10 with native complex linear algebra support. |
| Software Dependencies | Yes | We recorded the time measurements on an NVIDIA P100 GPU and PyTorch version 1.10 with native complex linear algebra support. The PyTorch 1.8 release notes are available at: https://github.com/pytorch/pytorch/releases/tag/v1.8.0 |
| Experiment Setup | Yes | For the optimized Maclaurin approach in Algorithm 3, we set pmin = 2 and pmax = 10. We use the training subset Xsub = {x1, . . . , xm} to precompute the U-statistics in Eq. (49) and Eq. (50). Regularization parameters. We select the regularization parameter in GP classification and regression by a training-validation procedure. That is, we use 90% of the training data for training and the remaining 10% for validation, and select the regularization parameter that minimizes the MNLL on the validation set. For GP classification, we choose the regularization parameter from the range α ∈ {10^-5, . . . , 10^0}. For GP regression, we choose the noise variance from the range σ²_noise ∈ {2^-15, . . . , 2^15}. |
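
The dataset protocol quoted above (a 90/10 train/test split recomputed per random seed, then subsets of size m = min(5000, N) for training and test) could be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name and signature are hypothetical.

```python
import numpy as np

def make_experiment_split(X, y, seed, test_frac=0.1, m_cap=5000):
    """Sketch of the paper's data protocol: a 90/10 train/test split
    recomputed for every random seed, followed by random subsets of
    size m = min(5000, N_train) and m = min(5000, N_test)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    test_idx, train_idx = perm[:n_test], perm[n_test:]

    def subsample(idx):
        # cap the subset size at m_cap, as in m = min(5000, N)
        m = min(m_cap, len(idx))
        return rng.choice(idx, size=m, replace=False)

    tr, te = subsample(train_idx), subsample(test_idx)
    return (X[tr], y[tr]), (X[te], y[te])
```

Recomputing the split from the seed keeps every experiment self-contained: each seed fully determines both the split and the subsampling.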
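
The training-validation selection of the noise variance described in the setup row could look roughly like the sketch below. The closed-form Bayesian linear regression is a stand-in for the paper's GP models, and the function name is an assumption; the grid 2^-15, …, 2^15 and the 90/10 validation split follow the quoted text.

```python
import numpy as np

def select_noise_variance(X, y, seed=0):
    """Sketch of the training-validation procedure: hold out 10% of the
    training data for validation, fit on the rest for each candidate
    noise variance, and keep the value with the lowest validation MNLL
    (mean negative log-likelihood). Bayesian linear regression with a
    unit-variance weight prior stands in for the GP regression model."""
    grid = 2.0 ** np.arange(-15, 16)  # sigma^2_noise in {2^-15, ..., 2^15}
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    n_val = max(1, len(X) // 10)
    val, tr = perm[:n_val], perm[n_val:]
    Xtr, ytr, Xval, yval = X[tr], y[tr], X[val], y[val]

    best_var, best_mnll = None, np.inf
    d = X.shape[1]
    for var in grid:
        # weight posterior: A = X'X + var * I, mean w = A^{-1} X'y
        A = Xtr.T @ Xtr + var * np.eye(d)
        w = np.linalg.solve(A, Xtr.T @ ytr)
        mean = Xval @ w
        # diagonal of the predictive variance: var + var * x' A^{-1} x
        cov = var + var * np.einsum("ij,ji->i", Xval, np.linalg.solve(A, Xval.T))
        mnll = np.mean(0.5 * np.log(2 * np.pi * cov)
                       + 0.5 * (yval - mean) ** 2 / cov)
        if mnll < best_mnll:
            best_var, best_mnll = var, mnll
    return best_var
```

For GP classification the same loop would run over α ∈ {10^-5, …, 10^0} with the classifier's validation MNLL as the score.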