reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Ensemble Methods for Structured Prediction

Authors: Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri

ICML 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We also report the results of extensive experiments with these algorithms in several structured prediction tasks.
Researcher Affiliation	Collaboration	Corinna Cortes, Google Research, 111 8th Avenue, New York, NY 10011; Vitaly Kuznetsov, Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY 10012; Mehryar Mohri, Courant Institute and Google Research, 251 Mercer Street, New York, NY 10012
Pseudocode	Yes	Algorithm 1 WMWP algorithm. Inputs: sample {(x_1, y_1), ..., (x_T, y_T)}; set of experts {h_1, ..., h_p}; parameter β ∈ (0, 1). for j = 1 to p and k = 1 to l do... Algorithm 2 ESPBoost Algorithm. Inputs: S = ((x_1, y_1), ..., (x_m, y_m)); set of experts {h_1, ..., h_p}. for i = 1 to m and k = 1 to l do...
Open Source Code	No	The paper mentions and links to several third-party software packages used (e.g., CRFsuite, SVMstruct, Stanford Classifier), but it does not state that the authors' own code for the described methodology is open-source or provide a link to it.
Open Datasets	Yes	Rob Kassel's OCR data set is available for download from http://ai.stanford.edu/~btaskar/ocr/. The Penn Treebank 2 data set is available through LDC license at http://www.cis.upenn.edu/~treebank/ and contains 251,854 sentences with a total of 6,080,493 tokens and 45 different parts of speech.
Dataset Splits	Yes	For each data set, we performed 10-fold cross-validation with disjoint training sets.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud computing instances) used for running the experiments.
Software Dependencies	No	The paper mentions several software packages used (e.g., 'CRFsuite', 'SVMstruct', 'Stanford Classifier') with citations, but it does not specify their version numbers.
Experiment Setup	Yes	More details on the data set and the experimental parameters can be found in Appendix H.1. Table 1. Average Normalized Hamming Loss, ADS1 and ADS2. β_ADS1 = 0.95, β_ADS2 = 0.95, T_SLE = 100, δ = 0.05. Table 2. Average Normalized Hamming Loss, PDS1 and PDS2. β_PDS1 = 0.85, β_PDS2 = 0.97, T_SLE = 100, δ = 0.05. Table 3. Average Normalized Hamming Loss, TR1 and TR2. β_TR1 = 0.95, β_TR2 = 0.98, T_SLE = 100, δ = 0.05.
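
The tables cited in the Experiment Setup row report an average normalized Hamming loss. As a rough illustration of that metric only, the sketch below assumes the common definition: for each pair of equal-length label sequences, the fraction of positions that differ, averaged over all pairs. The function names are hypothetical and this is not the authors' code; the paper's exact definition may differ.

```python
from typing import Iterable, Sequence, Tuple


def normalized_hamming_loss(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of positions at which two equal-length sequences differ.

    Assumed definition (hypothetical helper, not taken from the paper's code).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(a != b for a, b in zip(y_true, y_pred))
    return mismatches / len(y_true)


def average_normalized_hamming_loss(
    pairs: Iterable[Tuple[Sequence, Sequence]]
) -> float:
    """Mean of the per-sequence normalized Hamming losses over all pairs."""
    losses = [normalized_hamming_loss(t, p) for t, p in pairs]
    if not losses:
        raise ValueError("at least one (true, predicted) pair is required")
    return sum(losses) / len(losses)
```

For example, comparing the label sequence "abcd" with the prediction "abcf" gives a loss of 0.25 under this definition, since one of four positions differs.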