reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence

Authors: Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin J. Wainwright

JMLR 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The paper includes sections like "4.2 Experiments and simulations", "4.2.1 Experiments on synthetic data", and "4.2.2 Experiments on MTurk", and presents figures (e.g., "Figure 2: Simulation results under the Thurstone model", "Figure 3: Estimation error under different topologies for different generative processes in the synthetic simulations") and tables (e.g., "Table 1: Comparison of the average amount of error when ordinal data is collected directly versus when cardinal data is collected and converted to ordinal.") demonstrating empirical evaluation and data analysis.
Researcher Affiliation	Academia	All authors are affiliated with universities: "University of California, Berkeley" and "Carnegie Mellon University". Their email addresses also correspond to academic domains (e.g., "eecs.berkeley.edu", "stat.cmu.edu").
Pseudocode	No	The paper describes mathematical models and theoretical analyses but does not include any explicitly labeled pseudocode or algorithm blocks. Methods are presented in prose and mathematical equations.
Open Source Code	No	The paper states: "Our implementation, which is not optimized for speed, employs the sequential least squares programming subroutine of the Scipy package in the Python programming language." This refers to the use of a third-party library, not the release of the authors' own source code for their methodology. There is no explicit statement or link for the code.
Open Datasets	Yes	The paper states: "The complete dataset pertaining to these experiments is available on the first author's website." and "The entire data pertaining to these experiments, including the interface seen by the workers and the data obtained from their work, is available on the first author's website." This indicates that the datasets generated from their experiments are made publicly available.
Dataset Splits	Yes	For the MTurk experiments, the paper states: "Each run of the estimation procedure employs the data provided by five randomly chosen workers from the pool of workers who performed that task." and "the estimator is supplied the best-fitting value of σ obtained via 3-fold cross-validation." This describes specific splitting and sampling strategies.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running experiments. It only mentions the programming language and libraries used for implementation.
Software Dependencies	No	The paper mentions: "Our implementation, which is not optimized for speed, employs the sequential least squares programming subroutine of the Scipy package in the Python programming language." However, it does not specify version numbers for either Scipy or Python.
Experiment Setup	Yes	The paper provides specific details regarding the experimental setup. For synthetic data, it states: "The value of σ and B are both fixed to be 1. Given the n samples, inference is performed via the maximum likelihood estimator for the Thurstone model. Each point in the plots is an average of 20 such trials." For MTurk experiments: "Each run of the estimation procedure employs the data provided by five randomly chosen workers from the pool of workers who performed that task." and "the estimator is supplied the best-fitting value of σ obtained via 3-fold cross-validation."
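
The synthetic setup quoted in the table (σ = B = 1, maximum likelihood under the Thurstone model, 20 trials per plotted point, SciPy's sequential least squares programming routine) can be illustrated with a minimal sketch. This is not the authors' code, which the table notes is not released. It assumes the standard Thurstone form P(i beats j) = Φ((w_i − w_j)/σ) with weights constrained to sum to zero and lie in [−B, B]. The number of items `d`, the sample size `n`, the true weights `w_true`, and the uniformly random choice of pairs are hypothetical values picked for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def simulate_comparisons(w_true, pairs, sigma, rng):
    """Draw one binary outcome per pair: y = 1 if item i beats item j."""
    i, j = pairs[:, 0], pairs[:, 1]
    p_win = norm.cdf((w_true[i] - w_true[j]) / sigma)
    return (rng.random(len(pairs)) < p_win).astype(float)


def thurstone_mle(pairs, y, d, sigma=1.0, B=1.0):
    """Maximize the likelihood subject to sum(w) = 0 and |w_k| <= B via SLSQP."""
    i, j = pairs[:, 0], pairs[:, 1]
    sign = 2.0 * y - 1.0  # +1 if i won, -1 if j won

    def neg_log_lik(w):
        # Negative sum of log Phi(sign * (w_i - w_j) / sigma) over comparisons.
        return -np.sum(norm.logcdf(sign * (w[i] - w[j]) / sigma))

    result = minimize(
        neg_log_lik,
        x0=np.zeros(d),
        method="SLSQP",
        bounds=[(-B, B)] * d,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w)}],
    )
    return result.x


# Hypothetical problem size and ground truth (not taken from the paper).
d, n = 5, 500
sigma, B = 1.0, 1.0               # both fixed to 1, as quoted in the table
w_true = np.linspace(-B, B, d)    # sums to zero and respects the box constraint
rng = np.random.default_rng(0)

errors = []
for _ in range(20):               # 20 trials per point, as quoted in the table
    first = rng.integers(0, d, size=n)
    second = (first + rng.integers(1, d, size=n)) % d  # guarantees second != first
    pairs = np.column_stack([first, second])
    y = simulate_comparisons(w_true, pairs, sigma, rng)
    w_hat = thurstone_mle(pairs, y, d, sigma, B)
    errors.append(float(np.sum((w_hat - w_true) ** 2)))

print("mean squared L2 error over 20 trials:", np.mean(errors))
```

Because the table records no version numbers for Python or SciPy, exact numerical results from a sketch like this may differ across installations.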