SPSD Matrix Approximation vis Column Selection: Theories, Algorithms, and Extensions
Authors: Shusen Wang, Luo Luo, Zhihua Zhang
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6 we conduct experiments to compare among the column sampling algorithms. In Section 8 we empirically evaluate the proposed spectral shifting model. Experiments demonstrate that MEKA can be significantly improved by spectral shifting. We empirically compare three column selection algorithms: uniform sampling, uniform+adaptive², and the near-optimal+adaptive sampling algorithm. We perform experiments on several datasets collected from the LIBSVM website. |
| Researcher Affiliation | Academia | Shusen Wang (Department of Statistics, University of California at Berkeley, Berkeley, CA 94720); Luo Luo and Zhihua Zhang (Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai, China 200240) |
| Pseudocode | Yes | Algorithm 1: Computing the Prototype Model in O(nc + nd) Memory. Algorithm 2: The Adaptive Sampling Algorithm. Algorithm 3: The Uniform+Adaptive² Algorithm. Algorithm 4: The Incomplete Uniform+Adaptive² Algorithm. Algorithm 5: The Spectral Shifting Method. |
| Open Source Code | No | The paper does not provide concrete access to its own source code. It mentions using 'the code released by the authors with default settings' for MEKA (a third-party tool), but no statement or link for the code developed in this paper. |
| Open Datasets | Yes | We perform experiments on several datasets collected from the LIBSVM website: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ |
| Dataset Splits | Yes | We perform a five-fold cross-validation without using kernel approximation to predetermine the two parameters σ and γ, and the same parameters are used for all the kernel approximation methods. For each of the compared methods, we randomly hold out 80% of the samples for training and the rest for test; we repeat this procedure 50 times and record the average MSE. |
| Hardware Specification | Yes | We run the algorithms on a workstation with Intel Xeon 2.40GHz CPUs, 24GB memory, and a 64-bit Windows Server 2008 system. |
| Software Dependencies | No | The models and algorithms are all implemented in MATLAB. We set MATLAB to single-thread mode via the command maxNumCompThreads(1). The paper mentions MATLAB but does not specify a version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | We set the target rank k to be k = n/100 in all the experiments unless otherwise specified. We evaluate the performance by Approximation Error $= \|K - \tilde{K}\|_F / \|K\|_F$. We set γ in the following way. Letting p = 0.05n, we define $\eta = \sum_{i=1}^{p} \lambda_i^2(K) \big/ \sum_{i=1}^{n} \lambda_i^2(K) = \|K_p\|_F^2 / \|K\|_F^2$, which denotes the ratio of the top 5% squared eigenvalues of the kernel matrix K to all the squared eigenvalues. For each dataset, we use two different settings of γ such that η = 0.5 or η = 0.9. We use the Gaussian RBF kernel and tune two parameters: the variance σ² and the kernel scaling parameter γ. We list the obtained parameters in Table 4. |
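The two quantities quoted in the Experiment Setup row are straightforward to compute. The sketch below is a hypothetical NumPy rendering (not the authors' MATLAB code) of the relative approximation error $\|K - \tilde{K}\|_F / \|K\|_F$ and the eigenvalue-energy ratio $\eta$; the function names and the `frac` parameter are our own for illustration.

```python
import numpy as np

def approximation_error(K, K_approx):
    """Relative Frobenius-norm error: ||K - K_approx||_F / ||K||_F."""
    return np.linalg.norm(K - K_approx, "fro") / np.linalg.norm(K, "fro")

def eigen_ratio(K, frac=0.05):
    """Ratio eta of the top-frac squared eigenvalues of symmetric K
    to the sum of all squared eigenvalues (= ||K_p||_F^2 / ||K||_F^2)."""
    n = K.shape[0]
    p = max(1, int(frac * n))            # p = frac * n, at least 1
    lam = np.linalg.eigvalsh(K)          # eigenvalues of symmetric K, ascending
    lam2 = np.sort(lam ** 2)[::-1]       # squared eigenvalues, descending
    return lam2[:p].sum() / lam2.sum()
```

In the paper's setup, γ would be tuned so that `eigen_ratio(K)` hits the target value (0.5 or 0.9); since all squared eigenvalues are nonnegative, the ratio always lies in (0, 1].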