Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
ThunderSVM: A Fast SVM Library on GPUs and CPUs
Authors: Zeyi Wen, Jiashuai Shi, Qinbin Li, Bingsheng He, Jian Chen
JMLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results show that ThunderSVM is generally an order of magnitude faster than LibSVM while producing identical SVMs. |
| Researcher Affiliation | Academia | Zeyi Wen, Jiashuai Shi, Qinbin Li, Bingsheng He, Jian Chen. School of Computing, National University of Singapore, 117418, Singapore; School of Software Engineering, South China University of Technology, Guangzhou, 510006, China. |
| Pseudocode | No | The paper describes the SMO algorithm steps in prose but does not provide a formal pseudocode block or algorithm figure. For example: "For solving each subproblem in the training, we use the SMO algorithm which consists of three key steps. Step (i): Find two extreme training instances..." |
| Open Source Code | Yes | Documentation, examples, and more about ThunderSVM are available at https://github.com/zeyiwen/thundersvm. The full version of ThunderSVM, which is released under Apache License 2.0, can be found on GitHub at https://github.com/zeyiwen/thundersvm. |
| Open Datasets | Yes | Five representative data sets are listed in Table 1. We conducted our experiments on a workstation running Linux with two Xeon E5-2640 v4 10 core CPUs, 256GB main memory and a Tesla P100 GPU of 12GB memory. ThunderSVM are implemented in CUDA-C and C++ with OpenMP. We used the Gaussian kernel. Five pairs of hyper-parameters (C, γ) for the data sets are (10, 0.125), (100, 0.125), (0.01, 1), (256, 0.125), and (64, 7.8125) and are the same as the existing studies (Wen et al., 2014, 2018). |
| Dataset Splits | No | The paper lists five datasets (mnist8m, rcv1 test, epsilon, e2006-tfidf, webdata) and mentions cross-validation as a functionality, but it does not specify the exact training/test/validation splits (e.g., percentages or counts) or the methodology used for splitting these datasets. |
| Hardware Specification | Yes | We conducted our experiments on a workstation running Linux with two Xeon E5-2640 v4 10 core CPUs, 256GB main memory and a Tesla P100 GPU of 12GB memory. |
| Software Dependencies | No | ThunderSVM are implemented in CUDA-C and C++ with OpenMP. We use matrix operations from the high-performance library cuSparse (Nvidia, 2008) provided by NVIDIA. While software components are mentioned, specific version numbers for CUDA-C, C++, OpenMP, or cuSparse are not provided, which is necessary for reproducibility. |
| Experiment Setup | Yes | We used the Gaussian kernel. Five pairs of hyper-parameters (C, γ) for the data sets are (10, 0.125), (100, 0.125), (0.01, 1), (256, 0.125), and (64, 7.8125) and are the same as the existing studies (Wen et al., 2014, 2018). |
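
The experiment setup quoted in the table can be written down as a small configuration sketch. The Python below is illustrative only and is not taken from the paper or from the ThunderSVM code base: the names `HYPERPARAMS`, `DATASETS`, `gaussian_kernel`, and `setup_table` are hypothetical, the kernel is assumed to take the common form K(x, y) = exp(-γ‖x − y‖²), and pairing each (C, γ) with a data set by position is an assumption, since the quoted text does not state the pairing explicitly.

```python
import math

# (C, gamma) pairs exactly as quoted in the Experiment Setup row.
HYPERPARAMS = [(10, 0.125), (100, 0.125), (0.01, 1), (256, 0.125), (64, 7.8125)]

# Data set names as quoted in the Dataset Splits row.
DATASETS = ["mnist8m", "rcv1 test", "epsilon", "e2006-tfidf", "webdata"]


def gaussian_kernel(x, y, gamma):
    """Gaussian (RBF) kernel, assumed form: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def setup_table():
    """Pair data sets with (C, gamma) by list position (an assumption)."""
    return {
        name: {"C": c, "gamma": g}
        for name, (c, g) in zip(DATASETS, HYPERPARAMS)
    }


if __name__ == "__main__":
    for name, params in setup_table().items():
        print(name, params)
```

Keeping the pairs in one table makes it straightforward to check a rerun against the reported settings, provided the positional pairing is first confirmed against Table 1 of the paper.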