Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Unsupervised Evaluation and Weighted Aggregation of Ranked Classification Predictions
Authors: Mehmet Eren Ahsen, Robert M Vogel, Gustavo A Stolovitzky
JMLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the performance of SUMMA using a synthetic example as well as two real world problems. Keywords: Ensemble learning, Ensemble classifier, Unsupervised Learning, AUC, Spectral Decomposition (...) In this section we apply the SUMMA methodology and assess its performance in different example settings, including i) synthetic data, ii) predictions submitted to a crowd-sourced challenge and iii) several classification problems in different domains using datasets available from the UCI Machine Learning Repository (Lichman 2013). |
| Researcher Affiliation | Collaboration | Mehmet Eren Ahsen1,2 EMAIL Robert M Vogel2,3 EMAIL Gustavo A Stolovitzky2,3 EMAIL 1University of Illinois at Urbana-Champaign, Department of Business Administration, 1206 S 6th St, Champaign, IL 61820, USA 2Icahn School of Medicine at Mount Sinai, Department of Genetics and Genomic Sciences, One Gustave Levy Place, Box 1498, New York, NY, USA 3IBM T.J. Watson Research Center, 1101 Kitchawan Road, Route 134, Yorktown Heights, NY 10598, USA. |
| Pseudocode | Yes | Algorithm 1 Find rank 1 matrix from off-diagonal observations of covariance matrix (...) Algorithm 2 Find rank 1 matrix from off-diagonal observations of covariance tensor |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We also apply SUMMA to two datasets taken from real applications (...) In this section we apply the SUMMA methodology and assess its performance in different example settings, including i) synthetic data, ii) predictions submitted to a crowd-sourced challenge and iii) several classification problems in different domains using datasets available from the UCI Machine Learning Repository (Lichman 2013). |
| Dataset Splits | Yes | With the exception of the Bank Marketing data, we divided each dataset into half, and used the first half to train the base classifiers and the second half to evaluate SUMMA. For the Bank Marketing data, which had 45,211 samples, we randomly selected 1000 samples for training and another 1000 samples for the evaluation. |
| Hardware Specification | Yes | For the computations in this section we have used the R language on a personal laptop which has 4 computational cores and 16GB of RAM. |
| Software Dependencies | No | For the computations in this section we have used the R language on a personal laptop (...) We trained base classifiers using the R package caret (Kuhn et al., 2008b), which we chose for its ease of use, its inclusion of large diversity of popular classifiers, and its automatic layout for doing cross-validation (Kuhn et al., 2008a). |
| Experiment Setup | Yes | We chose M = 22 base classifiers as shown in (Table 3), and used ten-fold cross validation for their training. |
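The Pseudocode row above refers to the paper's Algorithm 1, which recovers a rank-1 matrix from the off-diagonal entries of a covariance matrix. The sketch below illustrates the general idea behind that kind of recovery, not the authors' exact procedure: it alternates between imputing the unobserved diagonal and projecting onto the nearest rank-1 matrix via the leading eigenpair. The function name `rank1_from_offdiag` and all parameter choices are illustrative assumptions.

```python
import numpy as np

def rank1_from_offdiag(C, n_iter=200, tol=1e-12):
    """Illustrative sketch (not the paper's exact Algorithm 1):
    recover lam so that np.outer(lam, lam) matches the off-diagonal
    entries of the symmetric matrix C, treating the diagonal as
    unobserved."""
    Q = np.array(C, dtype=float)
    np.fill_diagonal(Q, 0.0)          # diagonal entries are not observed
    lam = np.zeros(len(Q))
    for _ in range(n_iter):
        w, V = np.linalg.eigh(Q)      # eigh returns ascending eigenvalues
        lead = np.sqrt(max(w[-1], 0.0)) * V[:, -1]
        if lead.sum() < 0:            # resolve the eigenvector sign ambiguity
            lead = -lead
        np.fill_diagonal(Q, lead**2)  # impute the diagonal from the estimate
        if np.linalg.norm(lead - lam) < tol:
            break
        lam = lead
    return lam
```

Under this alternating scheme, a matrix whose off-diagonal entries are exactly rank-1 is recovered up to the usual eigenvector sign ambiguity; in SUMMA's setting the recovered vector is then used to weight the base classifiers.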