A View of Margin Losses as Regularizers of Probability Estimates
Authors: Hamed Masnadi-Shirazi, Nuno Vasconcelos
JMLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Various experiments provide evidence of the benefits of probability regularization for both classification and estimation of posterior class probabilities. |
| Researcher Affiliation | Academia | Hamed Masnadi-Shirazi, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran; Nuno Vasconcelos, Statistical Visual Computing Laboratory, University of California, San Diego, La Jolla, CA 92093, USA |
| Pseudocode | Yes | Algorithm 1: Boost LR. Input: training set D = {(x1, y1), ..., (xn, yn)}, where yi ∈ {−1, 1} is the class label of example xi, regularization gain σ, and number T of weak learners in the final decision rule. Initialization: set G(0)(xi) = 0 and w(1)(xi) = [1 − [f_φσ]⁻¹(yi G(0)(xi))] β′_φσ(yi G(0)(xi)) ∀ xi. for t = 1, ..., T do: choose weak learner g*(x) = arg max_g Σi yi w(t)(xi) g(xi); update predictor G(t)(x) = G(t−1)(x) + g*(x); update weights w(t+1)(xi) = [1 − [f_φσ]⁻¹(yi G(t)(xi))] β′_φσ(yi G(t)(xi)) ∀ xi. end for. Output: decision rule h(x) = sgn[G(T)(x)]. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | The next set of experiments used ten binary UCI data sets of relatively small size: (#1) sonar, (#2) breast cancer prognostic, (#3) breast cancer diagnostic, (#4) original Wisconsin breast cancer, (#5) Cleveland heart disease, (#6) tic-tac-toe, (#7) echo-cardiogram, (#8) Haberman's survival, (#9) Pima-diabetes, and (#10) liver disorder. ... To investigate the benefits of loss regularization for larger data sets, we considered the ADULT, LETTER.p1 and LETTER.p2 data sets, which are widely used for comparing ensemble methods (Niculescu-Mizil and Caruana, 2005; Caruana et al., 2004). |
| Dataset Splits | Yes | The very small sample regime, where the training set contained N = 5 examples per class, the moderate sample size regime, where N = 40, and the large sample regime, where N = 1,000. Classifiers were learned with training sets of variable size and evaluated with a test set of 10,000 examples. ... Each data set was split into five folds, four of which were used for training and one for testing. This created four train-test pairs per data set, over which the results were averaged. In all experiments, three of the four training folds were used for classifier training and one as a validation set for parameter selection. ... Missing values in the ADULT training and testing sets were omitted, leading to 30,162 training examples, of which 7,508 are positive and 22,654 negative. The test set consists of 15,060 examples, of which 3,700 are positive and 11,360 negative. The LETTER data was converted into two binary data sets (Caruana et al., 2004). ... Both datasets contain 4,000 training and 16,000 test examples. ... More precisely, the training set was subsampled by a factor of 2 (DIV2) and 4 (DIV4). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions various algorithms and methods like 'Boost LR', 'GradientBoost', 'AdaBoost', 'LogitBoost', and 'histogram-based weak learners', but does not specify any software libraries or tools with version numbers used for implementation. |
| Experiment Setup | Yes | Classifiers were learned with Boost LR under the three regimes, for a range of values of σ in the interval [0.5, 1000]. ... Boost LR was run for 50 iterations, using histogram-based weak learners and regularization gains σ ∈ [0.3, 500]. ... For both algorithms the regularization gain σ was cross-validated among 10 values in [1, 10]. The α parameter of Boost LR was cross-validated among 5 values in [0, 1/2]. ... Each boosting algorithm was run for 100 iterations. |
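For a reproduction attempt, the structure of Algorithm 1 (Boost LR) can be sketched in a few lines. This is a minimal sketch, not the paper's implementation: the paper's weight rule uses its regularized loss φσ and inverse link [f_φσ]⁻¹, whereas here a sigmoidal weight w_i = 1/(1 + exp(σ·y_i·G(x_i))) stands in as an assumed placeholder, and the weak learners are brute-force decision stumps rather than the paper's histogram-based learners.

```python
import numpy as np

def fit_stump(X, y, w):
    """Pick the single-feature threshold stump g(x) in {-1, +1}
    maximizing sum_i y_i w_i g(x_i), as in the Algorithm 1 weak-learner step."""
    best = (-np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                g = s * np.where(X[:, j] > t, 1.0, -1.0)
                score = np.sum(y * w * g)
                if score > best[0]:
                    best = (score, j, t, s)
    _, j, t, s = best
    return lambda Xq, j=j, t=t, s=s: s * np.where(Xq[:, j] > t, 1.0, -1.0)

def boost_lr_sketch(X, y, sigma=1.0, T=10):
    """Sketch of the Boost LR loop: additive predictor G, reweight, repeat.
    The weight formula below is an assumption standing in for the paper's
    [1 - [f_phi]^{-1}(y G)] * beta'_phi(y G) rule."""
    G = np.zeros(len(y))
    stumps = []
    for _ in range(T):
        # down-weight examples already classified with large margin
        w = 1.0 / (1.0 + np.exp(sigma * y * G))
        g = fit_stump(X, y, w)
        stumps.append(g)
        G = G + g(X)  # G(t) = G(t-1) + g*
    # final decision rule h(x) = sgn[G(T)(x)]
    return lambda Xq: np.sign(sum(g(Xq) for g in stumps))
```

On a linearly separable toy set a single stump already separates the classes, so the loop mainly illustrates the reweighting dynamics; the regularization gain σ controls how sharply well-classified examples are down-weighted.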