Hierarchically Compositional Kernels for Scalable Nonparametric Learning
Authors: Jie Chen, Haim Avron, Vikas Sindhwani
JMLR 2017 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Although other randomized approximate kernels entail a similar complexity, empirical results show that the proposed kernel achieves a matching performance with a smaller r. We demonstrate comprehensive experiments to show the effective use of the proposed kernel on data sizes up to the order of millions. Keywords: Hierarchical kernels, Nonparametric Learning |
| Researcher Affiliation | Collaboration | Jie Chen EMAIL IBM Thomas J. Watson Research Center Haim Avron EMAIL Tel Aviv University Vikas Sindhwani EMAIL Google Brain, NYC |
| Pseudocode | Yes | For completeness, we summarize the pesudocode in Algorithm 1 as a reference for computer implementation. Algorithm 1 Computing y = Ab Algorithm 2 Computing A = A 1 Algorithm 3 Computing z = w T khierarchical(X, x) for x / X |
| Open Source Code | No | The paper states: "The programs were implemented in C++. The linear algebra routines (BLAS and LAPACK) were linked against the IBM ESSL library and some obviously parallelizable for-loops were decorated by Open MP pragmas, so that the experiments with large data sets can be completed in a reasonable time frame." This describes the implementation details and libraries used but does not provide concrete access to the authors' own source code for the methodology described. |
| Open Datasets | Yes | Table 1 summarizes the data sets used for experiments. They were all downloaded from http://www.csie.ntu.edu.tw/~cjlin/libsvm/ and are widely used as benchmarks of kernel methods. |
| Dataset Splits | Yes | Some of the data sets come with a training and a testing part; for those not, we performed a 4:1 split to create the two parts, respectively. |
| Hardware Specification | Yes | The experiments were performed on one node of a computing cluster with 512 GB memory and an IBM POWER8 12-core processor, unless otherwise stated. |
| Software Dependencies | No | The programs were implemented in C++. The linear algebra routines (BLAS and LAPACK) were linked against the IBM ESSL library and some obviously parallelizable for-loops were decorated by Open MP pragmas, so that the experiments with large data sets can be completed in a reasonable time frame. However, the programs were neither fully optimized nor fully parallelized, leaving room for improvement on both memory consumption and execution time. |
| Experiment Setup | Yes | We use the Gaussian kernel (5) as an example. As hinted earlier, the choice of the range parameter σ affects the quality of various kernels. Therefore, the experiment setup is to use a reasonable regularization λ = 0.01 and to vary the choice of σ in a large interval (between 0.01 and 100) such that the optimal σ falls within the interval. We set the rank r (and the leaf size n0) according to (22) with three particular choices: r = 32, 129, and 516. |