Hierarchically Compositional Kernels for Scalable Nonparametric Learning

Authors: Jie Chen, Haim Avron, Vikas Sindhwani

JMLR 2017

Reproducibility Variable Result LLM Response
Research Type Experimental Although other randomized approximate kernels entail a similar complexity, empirical results show that the proposed kernel achieves matching performance with a smaller r. We present comprehensive experiments that demonstrate the effective use of the proposed kernel on data sizes up to the order of millions. Keywords: Hierarchical kernels, Nonparametric Learning
Researcher Affiliation Collaboration Jie Chen EMAIL IBM Thomas J. Watson Research Center Haim Avron EMAIL Tel Aviv University Vikas Sindhwani EMAIL Google Brain, NYC
Pseudocode Yes For completeness, we summarize the pseudocode in Algorithm 1 as a reference for computer implementation. Algorithm 1: Computing y = Ab; Algorithm 2: Computing A = A^{-1}; Algorithm 3: Computing z = w^T k_hierarchical(X, x) for x not in X
Open Source Code No The paper states: "The programs were implemented in C++. The linear algebra routines (BLAS and LAPACK) were linked against the IBM ESSL library and some obviously parallelizable for-loops were decorated by OpenMP pragmas, so that the experiments with large data sets can be completed in a reasonable time frame." This describes the implementation details and libraries used but does not provide concrete access to the authors' own source code for the methodology described.
Open Datasets Yes Table 1 summarizes the data sets used for experiments. They were all downloaded from http://www.csie.ntu.edu.tw/~cjlin/libsvm/ and are widely used as benchmarks of kernel methods.
Dataset Splits Yes Some of the data sets come with a training and a testing part; for those that do not, we performed a 4:1 split to create the two parts.
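The 4:1 split described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function name, the use of NumPy, and the fixed seed are assumptions for reproducibility of the sketch.

```python
import numpy as np

def split_4_to_1(X, y, seed=0):
    """Randomly split (X, y) into an 80% training part and a 20%
    testing part, i.e. a 4:1 ratio as described in the paper."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = (4 * len(X)) // 5          # 4:1 ratio -> 80% train
    tr, te = idx[:cut], idx[cut:]
    return X[tr], y[tr], X[te], y[te]
```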
Hardware Specification Yes The experiments were performed on one node of a computing cluster with 512 GB memory and an IBM POWER8 12-core processor, unless otherwise stated.
Software Dependencies No The programs were implemented in C++. The linear algebra routines (BLAS and LAPACK) were linked against the IBM ESSL library and some obviously parallelizable for-loops were decorated by OpenMP pragmas, so that the experiments with large data sets can be completed in a reasonable time frame. However, the programs were neither fully optimized nor fully parallelized, leaving room for improvement on both memory consumption and execution time.
Experiment Setup Yes We use the Gaussian kernel (5) as an example. As hinted earlier, the choice of the range parameter σ affects the quality of various kernels. Therefore, the experiment setup is to use a reasonable regularization λ = 0.01 and to vary the choice of σ in a large interval (between 0.01 and 100) such that the optimal σ falls within the interval. We set the rank r (and the leaf size n0) according to (22) with three particular choices: r = 32, 129, and 516.
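The setup above (fixed regularization λ = 0.01, σ swept over [0.01, 100]) can be sketched with a plain Gaussian kernel ridge regression. This is an illustrative stand-in only: it uses the exact (non-hierarchical) Gaussian kernel, a common parameterization exp(-||x-z||²/(2σ²)), and a log-spaced σ grid, none of which are taken verbatim from the paper's implementation.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma):
    """Gaussian kernel matrix K[i, j] = exp(-||x_i - z_j||^2 / (2 sigma^2)).
    One common parameterization; the paper's eq. (5) may differ by a constant."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_predict(Xtr, ytr, Xte, sigma, lam=0.01):
    """Kernel ridge regression: solve (K + lam I) alpha = y, then predict."""
    K = gaussian_kernel(Xtr, Xtr, sigma)
    alpha = np.linalg.solve(K + lam * np.eye(len(Xtr)), ytr)
    return gaussian_kernel(Xte, Xtr, sigma) @ alpha

# Sweep sigma over a log-spaced grid spanning [0.01, 100],
# as in the described experiment setup.
sigmas = np.logspace(-2, 2, 9)
```

A full run would evaluate test error at each σ on the grid and report the best, confirming the optimum falls inside the interval.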