Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Unsupervised Evaluation and Weighted Aggregation of Ranked Classification Predictions
Authors: Mehmet Eren Ahsen, Robert M Vogel, Gustavo A Stolovitzky
JMLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the performance of SUMMA using a synthetic example as well as two real world problems. Keywords: Ensemble learning, Ensemble classifier, Unsupervised Learning, AUC, Spectral Decomposition (...) In this section we apply the SUMMA methodology and assess its performance in different example settings, including i) synthetic data, ii) predictions submitted to a crowd-sourced challenge and iii) several classification problems in different domains using datasets available from the UCI Machine Learning Repository (Lichman 2013). |
| Researcher Affiliation | Collaboration | Mehmet Eren Ahsen1,2 EMAIL Robert M Vogel2,3 EMAIL Gustavo A Stolovitzky2,3 EMAIL 1University of Illinois at Urbana-Champaign, Department of Business Administration, 1206 S 6th St, Champaign, IL 61820, USA 2Icahn School of Medicine at Mount Sinai, Department of Genetics and Genomic Sciences, One Gustave Levy Place, Box 1498, New York, NY, USA 3IBM T.J. Watson Research Center, 1101 Kitchawan Road, Route 134, Yorktown Heights, NY 10598, USA. |
| Pseudocode | Yes | Algorithm 1 Find rank 1 matrix from off-diagonal observations of covariance matrix (...) Algorithm 2 Find rank 1 matrix from off-diagonal observations of covariance tensor |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We also apply SUMMA to two datasets taken from real applications (...) In this section we apply the SUMMA methodology and assess its performance in different example settings, including i) synthetic data, ii) predictions submitted to a crowd-sourced challenge and iii) several classification problems in different domains using datasets available from the UCI Machine Learning Repository (Lichman 2013). |
| Dataset Splits | Yes | With the exception of the Bank Marketing data, we divided each dataset into half, and used the first half to train the base classifiers and the second half to evaluate SUMMA. For the Bank Marketing data, which had 45,211 samples, we randomly selected 1000 samples for training and another 1000 samples for the evaluation. |
| Hardware Specification | Yes | For the computations in this section we have used the R language on a personal laptop which has 4 computational cores and 16GB of RAM. |
| Software Dependencies | No | For the computations in this section we have used the R language on a personal laptop (...) We trained base classifiers using the R package caret (Kuhn et al., 2008b), which we chose for its ease of use, its inclusion of large diversity of popular classifiers, and its automatic layout for doing cross-validation (Kuhn et al., 2008a). |
| Experiment Setup | Yes | We chose M = 22 base classifiers as shown in (Table 3), and used ten-fold cross validation for their training. |
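The Pseudocode row above refers to the paper's Algorithm 1, which recovers a rank-1 matrix from the off-diagonal entries of a covariance matrix. The sketch below illustrates the general idea behind that kind of recovery, not the authors' exact procedure: it alternates between imputing the unobserved diagonal and projecting onto the nearest rank-1 matrix via the leading eigenpair. The function name `rank1_from_offdiag` and all parameter choices are illustrative assumptions.

```python
import numpy as np

def rank1_from_offdiag(C, n_iter=200, tol=1e-12):
    """Illustrative sketch (not the paper's exact Algorithm 1):
    recover lam so that np.outer(lam, lam) matches the off-diagonal
    entries of the symmetric matrix C, treating the diagonal as
    unobserved."""
    Q = np.array(C, dtype=float)
    np.fill_diagonal(Q, 0.0)          # diagonal entries are not observed
    lam = np.zeros(len(Q))
    for _ in range(n_iter):
        w, V = np.linalg.eigh(Q)      # eigh returns ascending eigenvalues
        lead = np.sqrt(max(w[-1], 0.0)) * V[:, -1]
        if lead.sum() < 0:            # resolve the eigenvector sign ambiguity
            lead = -lead
        np.fill_diagonal(Q, lead**2)  # impute the diagonal from the estimate
        if np.linalg.norm(lead - lam) < tol:
            break
        lam = lead
    return lam
```

Under this alternating scheme, a matrix whose off-diagonal entries are exactly rank-1 is recovered up to the usual eigenvector sign ambiguity; in SUMMA's setting the recovered vector is then used to weight the base classifiers.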