reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Bayesian Decision Process for Cost-Efficient Dynamic Ranking via Crowdsourcing

Authors: Xi Chen, Kevin Jiao, Qihang Lin

JMLR 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental evaluations on both synthetic and real data show that the proposed policy achieves high ranking accuracy with a lower labeling cost. Keywords: crowdsourced ranking, Bayesian, Markov decision process, dynamic programming, knowledge gradient, moment matching
Researcher Affiliation	Academia	Xi Chen EMAIL Stern School of Business New York University New York, New York, 10012, USA; Kevin Jiao EMAIL Stern School of Business New York University New York, New York, 10012, USA; Qihang Lin EMAIL Tippie College of Business University of Iowa Iowa City, Iowa, 52242, USA
Pseudocode	Yes	Algorithm 1: Approximated Knowledge Gradient Policy with Homogeneous Workers Algorithm 2: Approximated Knowledge Gradient Policy with Heterogeneous Workers
Open Source Code	No	The paper does not provide any explicit statements about releasing source code for the described methodology, nor does it include links to a code repository.
Open Datasets	Yes	We now apply the proposed AKG policy (Algorithm 2) to a real dataset on reading difficulty levels (Collins-Thompson and Callan, 2004).
Dataset Splits	No	The paper describes generating data for simulated studies and the total amount of available pairwise comparisons for the real dataset, but it does not specify explicit training, validation, or test splits for reproduction, as the problem is framed as an active sampling process rather than a static dataset split for model training and evaluation.
Hardware Specification	No	The paper mentions computation time for different scenarios and algorithms but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper does not provide specific software names with version numbers for any libraries, frameworks, or programming languages used in the implementation of the algorithms.
Experiment Setup	Yes	We set the prior of θ to be the uniform distribution on the simplex (i.e., α0 is set to be an all-one vector). ... The parameter γ, which balances the exploitation-exploration trade-off in Chen et al. (2013), is set to 1 in this experiment. ... We set the prior of θ to be the uniform distribution on the simplex (i.e., α0 is set to be an all-one vector) and choose µ_w^0 = 4, ν_w^0 = 1 for each worker w = 1, 2, ..., M. ... We run experiments in two different settings. The first one assumes that all workers are homogeneous and fully reliable. ... The second experiment incorporates the heterogeneous reliability of workers.
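
The prior settings quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code (the page reports that none was released). It assumes that the all-one α0 parametrizes a Dirichlet prior on θ, and that the per-worker values 4 and 1 are the two parameters of a Beta prior on worker reliability. The function names `make_priors`, `sample_dirichlet`, and `beta_mean`, and the item and worker counts, are hypothetical.

```python
import random


def make_priors(num_items, num_workers, mu0=4.0, nu0=1.0):
    """Build the prior parameters quoted in the Experiment Setup row."""
    # An all-one Dirichlet parameter gives the uniform distribution on the simplex.
    alpha0 = [1.0] * num_items
    # Assumption: (mu0, nu0) are Beta parameters for each worker's reliability.
    worker_priors = [(mu0, nu0) for _ in range(num_workers)]
    return alpha0, worker_priors


def sample_dirichlet(alpha, rng):
    """Draw one point from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def beta_mean(mu, nu):
    """Mean of a Beta(mu, nu) distribution."""
    return mu / (mu + nu)


rng = random.Random(0)  # fixed seed so the draw is repeatable
# Hypothetical sizes: 5 items to rank, 3 workers.
alpha0, worker_priors = make_priors(num_items=5, num_workers=3)
theta = sample_dirichlet(alpha0, rng)

print(theta)                          # one draw of item scores; sums to 1
print(beta_mean(*worker_priors[0]))   # prior mean reliability under Beta(4, 1)
```

Under these assumptions, a Beta(4, 1) prior has mean 4 / (4 + 1) = 0.8, so each worker is assumed a priori to be fairly reliable.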