Synergy of Monotonic Rules

Authors: Vladimir Vapnik, Rauf Izmailov

JMLR 2016

Reproducibility Assessment (Variable — Result — LLM Response)
Research Type: Experimental. "We selected the following 9 calibration data sets from UCI Machine Learning Repository (Lichman (2013)): Covertype, Adult, Tic-tac-toe, Diabetes, Australian, Spambase, MONK's-1, MONK's-2, and Bank marketing. Our selection of these specific data sets was driven by the desire to ensure statistical reliability of targeted estimates, which translated into availability of relatively large test data set (containing at least 150 samples). Specific breakdowns for the corresponding training and test sets are listed in Table 1. For each of these 9 data sets, we constructed 10 random realizations of training and test data sets; for each of these 10 realizations, we trained three SVMs with different kernels: with RBF kernel, with INK-Spline kernel, and with linear kernel. The averaged test errors of the constructed SVMs are listed in Table 1. [...] Table 2: Synergy of SVMs with RBF, INK-spline, and linear kernels."
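The protocol quoted above — train several SVMs, turn each one's score into a monotonic estimate of the conditional probability P(y=1 | score), and combine the estimates — can be sketched in plain Python. This is a hedged illustration, not the paper's implementation: it uses pool-adjacent-violators (isotonic) regression as a stand-in for the paper's monotonic probability estimators, it assumes the "synergy" rule averages the calibrated probabilities and thresholds at 1/2, and all names (`pav`, `calibrate`, `synergy`) are hypothetical.

```python
from bisect import bisect_right

def pav(values):
    """Pool-adjacent-violators: least-squares nondecreasing fit to a sequence."""
    blocks = []  # each block is [mean, weight]
    for v in values:
        blocks.append([float(v), 1])
        # Merge adjacent blocks that violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    fit = []
    for mean, w in blocks:
        fit.extend([mean] * w)
    return fit

def calibrate(scores, labels):
    """Monotonic estimate of P(y=1 | score) from labeled scores (isotonic stand-in)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    xs = [scores[i] for i in order]
    probs = pav([labels[i] for i in order])
    def predict_proba(s):
        i = bisect_right(xs, s) - 1  # stepwise extension of the fitted values
        return probs[max(i, 0)]
    return predict_proba

def synergy(rule_probs):
    """Average the monotonic probability estimates of several rules; threshold at 1/2."""
    avg = [sum(p) / len(p) for p in zip(*rule_probs)]
    return [1 if a > 0.5 else 0 for a in avg]
```

In this sketch each SVM would contribute one calibrated rule (via `calibrate` on a held-out calibration set), and `synergy` combines them into a single classifier.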
Researcher Affiliation: Collaboration. Vladimir Vapnik (EMAIL), Columbia University, New York, NY 10027, USA; Facebook AI Research, New York, NY 10017, USA. Rauf Izmailov (EMAIL), Applied Communication Sciences, Basking Ridge, NJ 07920-2021, USA.
Pseudocode: No. The paper describes its methods of estimating monotonic conditional probability functions, and their applications, in detail, but does not present them in structured pseudocode or algorithm blocks; the procedures are given as mathematical equations and textual explanations.
Open Source Code: No. The paper contains no explicit statement about releasing source code and no links to code repositories.
Open Datasets: Yes. "We selected the following 9 calibration data sets from UCI Machine Learning Repository (Lichman (2013)): Covertype, Adult, Tic-tac-toe, Diabetes, Australian, Spambase, MONK's-1, MONK's-2, and Bank marketing."
Dataset Splits: Yes. "Specific breakdowns for the corresponding training and test sets are listed in Table 1. For each of these 9 data sets, we constructed 10 random realizations of training and test data sets; for each of these 10 realizations, we trained three SVMs with different kernels: with RBF kernel, with INK-Spline kernel, and with linear kernel."

Table 1:
Data set      Training  Test   Features
Covertype     300       3000   54
Adult         300       26147  123
Tic-tac-toe   300       658    27
Diabetes      576       192    8
Australian    517       173    14
Spambase      300       4301   57
MONK's-1      124       432    6
MONK's-2      169       432    6
Bank          300       4221   16
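Constructing "10 random realizations of training and test data sets" with the per-dataset sizes from Table 1 can be sketched as follows. This is an assumed reading of the protocol (disjoint random train/test index draws per realization); the function name `realizations` and the fixed seed are illustrative, not from the paper.

```python
import random

def realizations(n_total, n_train, n_test, n_reps=10, seed=0):
    """Yield n_reps random disjoint train/test index splits of a dataset
    with n_total samples, as in the paper's repeated-realization protocol."""
    rng = random.Random(seed)  # fixed seed for reproducibility of the sketch
    for _ in range(n_reps):
        idx = list(range(n_total))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:n_train + n_test]
```

For example, the Diabetes row of Table 1 (576 training, 192 test, 768 samples total) would correspond to `realizations(768, 576, 192)`.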
Hardware Specification: No. The paper gives no details about the hardware (GPU, CPU, or memory) used to run the experiments.
Software Dependencies: No. The paper presents algorithms and mathematical formulations, but lists no specific software dependencies or versions used for implementation.
Experiment Setup: No. The paper discusses the different SVM kernels (RBF, INK-Spline, linear) and methods for estimating conditional probabilities, but does not specify concrete hyperparameters or system-level training settings for the SVMs or the proposed synergy method.
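Of the three kernels named above, RBF and linear are standard; the INK-Spline kernel is less common and is not defined in this summary. The sketch below assumes it denotes the order-1 spline kernel with an infinite number of knots on [0, 1] (the form appearing in Vapnik's earlier work), taken as a product over coordinates, with features assumed scaled to [0, 1] — all of which is my assumption, not a quote from the paper.

```python
def ink_spline_1d(x, y):
    """Per-coordinate order-1 infinite-knot spline kernel (assumed form):
    k(x, y) = 1 + x*y + integral_0^min(x,y) (x - t)(y - t) dt."""
    m = min(x, y)
    return 1.0 + x * y + x * y * m - (x + y) / 2.0 * m ** 2 + m ** 3 / 3.0

def ink_spline(u, v):
    """Multivariate kernel as a product over coordinates."""
    p = 1.0
    for x, y in zip(u, v):
        p *= ink_spline_1d(x, y)
    return p
```

A kernel must be symmetric in its arguments; that property is easy to check on a couple of points and holds here since `min(x, y)` and the other terms are symmetric.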