reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ThunderSVM: A Fast SVM Library on GPUs and CPUs

Authors: Zeyi Wen, Jiashuai Shi, Qinbin Li, Bingsheng He, Jian Chen

JMLR 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experimental results show that ThunderSVM is generally an order of magnitude faster than LIBSVM while producing identical SVMs.
Researcher Affiliation	Academia	Zeyi Wen, Jiashuai Shi, Qinbin Li, Bingsheng He, Jian Chen. School of Computing, National University of Singapore, 117418, Singapore; School of Software Engineering, South China University of Technology, Guangzhou, 510006, China.
Pseudocode	No	The paper describes the SMO algorithm steps in prose but does not provide a formal pseudocode block or algorithm figure. For example: "For solving each subproblem in the training, we use the SMO algorithm which consists of three key steps. Step (i): Find two extreme training instances..."
Open Source Code	Yes	Documentation, examples, and more about ThunderSVM are available at https://github.com/zeyiwen/thundersvm. The full version of ThunderSVM, which is released under Apache License 2.0, can be found on GitHub at https://github.com/zeyiwen/thundersvm.
Open Datasets	Yes	Five representative data sets are listed in Table 1. We conducted our experiments on a workstation running Linux with two Xeon E5-2640 v4 10 core CPUs, 256GB main memory and a Tesla P100 GPU of 12GB memory. ThunderSVM are implemented in CUDA-C and C++ with OpenMP. We used the Gaussian kernel. Five pairs of hyper-parameters (C, γ) for the data sets are (10, 0.125), (100, 0.125), (0.01, 1), (256, 0.125), and (64, 7.8125) and are the same as the existing studies (Wen et al., 2014, 2018).
Dataset Splits	No	The paper lists five datasets (mnist8m, rcv1_test, epsilon, e2006-tfidf, webdata) and mentions cross-validation as a functionality, but it does not specify the exact training/test/validation splits (e.g., percentages or counts) or the methodology used for splitting these datasets.
Hardware Specification	Yes	We conducted our experiments on a workstation running Linux with two Xeon E5-2640 v4 10 core CPUs, 256GB main memory and a Tesla P100 GPU of 12GB memory.
Software Dependencies	No	ThunderSVM are implemented in CUDA-C and C++ with OpenMP. We use matrix operations from the high-performance library cuSparse (Nvidia, 2008) provided by NVIDIA. Software components are mentioned, but the specific version numbers for CUDA, the C++ compiler, OpenMP, and cuSparse that reproducibility requires are not provided.
Experiment Setup	Yes	We used the Gaussian kernel. Five pairs of hyper-parameters (C, γ) for the data sets are (10, 0.125), (100, 0.125), (0.01, 1), (256, 0.125), and (64, 7.8125) and are the same as the existing studies (Wen et al., 2014, 2018).
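
The Experiment Setup row names the Gaussian kernel and five (C, γ) pairs but does not spell out the kernel itself. The following is a minimal sketch of how γ enters the computation, assuming the usual definition K(x, y) = exp(-γ‖x − y‖²). The function name `gaussian_kernel` and the toy vectors are hypothetical illustrations; they come neither from the paper nor from the ThunderSVM API. The pairs are kept in the order quoted above, and no pairing with specific data sets is implied.

```python
import math

# (C, gamma) pairs exactly as quoted in the Experiment Setup row.
HYPER_PARAMETERS = [
    (10, 0.125),
    (100, 0.125),
    (0.01, 1),
    (256, 0.125),
    (64, 7.8125),
]


def gaussian_kernel(x, y, gamma):
    """Return exp(-gamma * ||x - y||^2) for two equal-length vectors.

    Assumption: this is the standard Gaussian (RBF) kernel form; the
    summary above does not restate the formula.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    # Toy vectors for illustration only; they are not from any data set.
    x, y = [1.0, 0.0], [0.0, 1.0]
    for c, gamma in HYPER_PARAMETERS:
        value = gaussian_kernel(x, y, gamma)
        print(f"C={c}, gamma={gamma}: K(x, y) = {value:.6f}")
```

C does not appear in the kernel: it is the regularization parameter of the SVM objective, so only γ changes the kernel values printed here.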