The Nyström method for convex loss functions

Authors: Andrea Della Vecchia, Ernesto De Vito, Jaouad Mourtada, Lorenzo Rosasco

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we complement our theoretical results by numerically investigating the statistical and computational trade-offs in a relevant setting. More precisely, we report simple experiments in the context of kernel methods, considering Nyström techniques. In particular, we choose the hinge loss, hence SVM for classification. Keeping in mind Theorem 10, we expect to match the performance of kernel-SVM using a Nyström approximation with only m ≪ n centers. [...] We compare with linear SVM (used only as a baseline) and K-SVM; see Table 5. For all the datasets, the Nyström-Pegasos approach achieves performance comparable to K-SVM with much better time requirements (except for the small-size Usps). Moreover, note that K-SVM cannot be run on millions of points (SUSY), whereas Nyström-Pegasos is still fast and provides much better results than linear SVM. Further comparisons with state-of-the-art algorithms for SVM are left for future work. Finally, in Figure 2 we illustrate the interplay between λ and m for Nyström-Pegasos on the SUSY dataset.
Researcher Affiliation | Academia | Andrea Della Vecchia EMAIL, EPFL and Swiss Finance Institute, Lausanne, Switzerland; Ernesto De Vito EMAIL, MaLGa Center, DIMA, Università di Genova, Italy; Jaouad Mourtada EMAIL, CREST, ENSAE, Institut Polytechnique de Paris, France; Lorenzo Rosasco EMAIL, MaLGa Center, DIBRIS, Università di Genova, Italy; CBMM, Massachusetts Institute of Technology, Cambridge, MA, USA; Istituto Italiano di Tecnologia, Genoa, Italy
Pseudocode | No | The paper describes methods textually and mathematically but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper cites a Python implementation from https://github.com/ejlb/pegasos. This is a third-party implementation of Pegasos, not the authors' own code for the methodology described in this paper. No explicit statement about releasing their own code, nor a link to their repository, is provided.
Open Datasets | Yes | Datasets are available from the LIBSVM website (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/) and from (Jose et al., 2013) (http://manikvarma.org/code/LDKL/download.html#Jose13). Specific datasets mentioned in Appendix G include SUSY, MNIST binary, Usps, Webspam, a9a, and CIFAR.
Dataset Splits | Yes | We report classification error and, for data sets with no fixed test set, we set apart 20% of the data. (From Section 8 and Appendix G)
Hardware Specification | Yes | Table 5: Architecture: single machine with an AMD EPYC 7301 16-Core Processor and 256 GB of RAM.
Software Dependencies | No | The paper mentions a 'Python implementation', the 'LibSVM library', 'Scikit-learn', and 'Pegasos' but does not provide specific version numbers for any of these software components.
Experiment Setup | Yes | For Nyström SVM with Pegasos we tuned the kernel parameter σ and the regularizer λ with a simple grid search (σ ∈ [0.1, 20], λ ∈ [10⁻⁸, 10⁻¹], initially with a coarse grid and then more refined around the best candidates). [...] SUSY (... Gaussian kernel with σ = 4, λ = 3·10⁻⁶ ...); MNIST binary (... Gaussian kernel with σ = 10, λ = 3·10⁻⁶ ...); Usps (... Gaussian kernel with σ = 10, λ = 5·10⁻⁶ ...); Webspam (... Gaussian kernel with σ = 0.25, λ = 8·10⁻⁷ ...); a9a (... Gaussian kernel with σ = 10, λ = 1·10⁻⁵ ...); CIFAR (... Gaussian kernel with σ = 10, λ = 2·10⁻⁶ ...)
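The Nyström-Pegasos approach assessed above can be sketched in a few lines: sample m ≪ n centers, map the data into the m-dimensional Nyström feature space of a Gaussian kernel, then run Pegasos (stochastic sub-gradient descent on the regularised hinge loss) on the resulting linear problem. This is a minimal toy illustration in plain NumPy, not the authors' implementation; the data, function names, and parameter values are invented for the example:

```python
import numpy as np

def nystrom_features(X, centers, sigma):
    """Map data to the m-dimensional Nystrom feature space of a Gaussian kernel."""
    def gauss(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    K_mm = gauss(centers, centers)
    w, V = np.linalg.eigh(K_mm)
    w = np.clip(w, 1e-10, None)            # guard against tiny negative eigenvalues
    # features are K_nm @ K_mm^{-1/2}, so inner products approximate the full kernel
    return gauss(X, centers) @ V @ np.diag(w ** -0.5) @ V.T

def pegasos(Phi, y, lam, epochs=5, seed=0):
    """Pegasos: stochastic sub-gradient descent on the regularised hinge loss."""
    rng = np.random.default_rng(seed)
    w, t = np.zeros(Phi.shape[1]), 0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            t += 1
            eta = 1.0 / (lam * t)
            hinge_active = y[i] * (Phi[i] @ w) < 1
            w *= 1 - eta * lam             # shrinkage from the L2 penalty
            if hinge_active:
                w += eta * y[i] * Phi[i]   # sub-gradient of the hinge loss
    return w

# toy usage: two Gaussian blobs, m = 20 centers sampled uniformly from the data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.hstack([-np.ones(200), np.ones(200)])
centers = X[rng.choice(len(X), 20, replace=False)]
Phi = nystrom_features(X, centers, sigma=2.0)
w = pegasos(Phi, y, lam=1e-4)
train_acc = np.mean(np.sign(Phi @ w) == y)
```

Because training happens entirely in the m-dimensional feature space, cost scales with m rather than n, which is the source of the time savings over K-SVM reported in Table 5.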
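The evaluation protocol quoted under "Dataset Splits" (hold out 20% when no fixed test set exists, with data in LIBSVM format) can be reproduced with standard scikit-learn calls. The file below is a toy stand-in written on the fly, not one of the paper's datasets:

```python
import os
import tempfile
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import train_test_split

# toy file in LIBSVM sparse format: "label index:value ..."
sample = "+1 1:0.5 3:1.2\n-1 2:0.7\n+1 1:0.1 2:0.9\n-1 3:0.4\n+1 2:0.2\n"
path = os.path.join(tempfile.mkdtemp(), "toy.libsvm")
with open(path, "w") as f:
    f.write(sample)

X, y = load_svmlight_file(path)  # sparse feature matrix, label vector
# for data sets with no fixed test set, set apart 20% of the data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
```

The same two calls apply unchanged to the LIBSVM downloads listed under "Open Datasets".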
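The coarse-then-refined grid search described under "Experiment Setup" can be sketched generically. The quadratic `score` below is a hypothetical stand-in for the held-out error of Nyström-Pegasos, with its optimum placed at the SUSY values σ = 4, λ = 3·10⁻⁶ purely for illustration:

```python
import numpy as np
from itertools import product

def grid_search(score, s_grid, l_grid):
    """Return the (sigma, lambda) pair minimising `score` over the grid."""
    return min(product(s_grid, l_grid), key=lambda p: score(*p))

# Stand-in objective: in practice this would be the held-out classification
# error of a model trained with (sigma, lam).
def score(sigma, lam):
    return (np.log10(sigma) - np.log10(4)) ** 2 + (np.log10(lam) - np.log10(3e-6)) ** 2

# coarse pass over the full ranges reported in the paper
s0, l0 = grid_search(score,
                     np.logspace(np.log10(0.1), np.log10(20), 8),
                     np.logspace(-8, -1, 8))
# refined pass around the best coarse candidate (half a decade each side)
s1, l1 = grid_search(score,
                     np.logspace(np.log10(s0) - 0.5, np.log10(s0) + 0.5, 11),
                     np.logspace(np.log10(l0) - 0.5, np.log10(l0) + 0.5, 11))
```

Log-spaced grids are the natural choice here since the reported λ values span several orders of magnitude.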