Agnostic Pointwise-Competitive Selective Classification

Authors: Yair Wiener, Ran El-Yaniv

JAIR 2015

Reproducibility assessment. Each variable is listed with its result and the supporting quote extracted from the paper.
Research Type: Experimental
"We thus consider a heuristic approximation procedure that is based on SVMs, and show empirically that this algorithm consistently outperforms a traditional rejection mechanism based on distance from decision boundary. ... In Section 7 we present some numerical examples over medical classification problems and examine the empirical performance of the new algorithm and compare its performance with that of the widely used selective classification method for rejection, based on distance from decision boundary."
Researcher Affiliation: Academia
"Yair Wiener (EMAIL), Ran El-Yaniv (EMAIL), Computer Science Department, Technion, Israel Institute of Technology, Haifa 32000, Israel"
Pseudocode: Yes
"Strategy 1: Agnostic low-error selective strategy (LESS)
Input: S_m, m, δ, d
Output: a pointwise-competitive selective classifier (h, g) w.p. 1 − δ
1: Set f̂ = ERM(F, S_m), i.e., f̂ is any empirical risk minimizer from F w.r.t. S_m
2: Set G = V̂(f̂, 2σ(m, δ/4, d)) (see Eq. (2) and (3))
3: Construct g such that g(x) = 1 ⇔ x ∈ X \ DIS(G)"
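The quoted strategy can be illustrated concretely. Below is a minimal sketch of LESS for a toy finite class of 1-D threshold classifiers h_t(x) = 1[x ≥ t]: step 1 takes the empirical risk minimizer, step 2 collects the "low-error" set of hypotheses within 2σ of the minimum empirical risk (here the slack 2σ(m, δ/4, d) of Eq. (2)–(3) is stood in for by a fixed constant `slack`), and step 3 rejects exactly the points on which that set disagrees. All names are illustrative, not from the paper's code.

```python
import numpy as np

def less(X, y, thresholds, slack):
    """Toy LESS: return (erm_threshold, g), where g(x)=1 means predict, 0 means reject."""
    # Empirical risk of each threshold hypothesis h_t(x) = 1[x >= t]
    risks = np.array([np.mean((X >= t).astype(int) != y) for t in thresholds])
    erm = thresholds[np.argmin(risks)]                 # step 1: f_hat = ERM(F, S_m)
    G = thresholds[risks <= risks.min() + 2 * slack]   # step 2: low-error set (version space V_hat)
    def g(x):                                          # step 3: accept iff all of G agree on x
        preds = {int(x >= t) for t in G}
        return 1 if len(preds) == 1 else 0
    return erm, g
```

For example, with X = [0..5], y = [0,0,0,1,1,1] and slack = 0.1, the low-error set contains the thresholds {2, 3, 4}, so a query at 2.5 (where these disagree) is rejected while queries at 0 or 5 are accepted.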
Open Source Code: No
"For implementation we used LIBSVM (Chang & Lin, 2011). We tested our algorithm on standard medical diagnosis problems from the UCI repository, including all datasets used by Grandvalet, Rakotomamonjy, Keshet, and Canu (2008). ... Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm"
Open Datasets: Yes
"We tested our algorithm on standard medical diagnosis problems from the UCI repository, including all datasets used by Grandvalet, Rakotomamonjy, Keshet, and Canu (2008)."
Dataset Splits: Yes
"In each iteration we choose uniformly at random non-overlapping training set (100 samples) and test set (200 samples) for each dataset."
Hardware Specification: No
The paper does not provide specific hardware details (such as CPU/GPU models or memory) used for running the experiments; it only discusses the software and datasets used.
Software Dependencies: No
"For implementation we used LIBSVM (Chang & Lin, 2011)."
Experiment Setup: Yes
"Using support vector machines (SVMs) we use a high C value (10^5 in our experiments) to penalize more on training errors than on small margin (see definitions of the SVM parameters in, e.g., Chang & Lin, 2011). In order to estimate R̂(f_x) we have to restrict the SVM optimizer to only consider hypotheses that classify the point x in a specific way. To accomplish this we use a weighted SVM for unbalanced data. We add the point x as another training point with weight 10 times larger than the weight of all training points combined. ... First we generate an odd number k of different samples (S_m^1, S_m^2, ..., S_m^k) using bootstrap sampling (we used k = 11)."
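The constrained-ERM heuristic quoted above can be sketched as follows. This is not the authors' implementation (they used LIBSVM directly); it is a minimal illustration using scikit-learn's SVC, whose `sample_weight` argument plays the role of the weighted SVM for unbalanced data. The query point x is appended to the training set with weight 10 times the combined weight of all training points to force its label, the resulting empirical risk R̂(f_x) is measured on the original sample, and a point is accepted only when one of the two forced labels cannot be realized by a low-error hypothesis. The function names and the `slack` threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def constrained_risk(X, y, x, forced_label, C=1e5):
    """Empirical risk of the best SVM forced to assign `forced_label` to point x."""
    Xa = np.vstack([X, x])              # append x as an extra training point
    ya = np.append(y, forced_label)
    w = np.ones(len(ya))
    w[-1] = 10 * len(X)                 # weight 10x all training points combined
    clf = SVC(C=C, kernel="linear").fit(Xa, ya, sample_weight=w)
    return np.mean(clf.predict(X) != y)  # risk measured on the original sample only

def accept(X, y, x, slack, C=1e5):
    """g(x)=1 iff only one forced label admits a hypothesis within `slack` of the ERM risk."""
    base = np.mean(SVC(C=C, kernel="linear").fit(X, y).predict(X) != y)
    r0 = constrained_risk(X, y, x, 0, C)
    r1 = constrained_risk(X, y, x, 1, C)
    # x lies outside the disagreement region iff forcing one of the labels
    # necessarily pushes empirical risk above the low-error band
    return (r0 > base + slack) or (r1 > base + slack)
```

On two well-separated clusters, a query far beyond the positive cluster is accepted (forcing the negative label ruins the fit), while a query midway between the clusters is rejected, since low-error hypotheses can label it either way.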