Fast and Robust Rank Aggregation against Model Misspecification
Authors: Yuangang Pan, Ivor W. Tsang, Weijie Chen, Gang Niu, Masashi Sugiyama
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the end, we apply Coarsen Rank on four real-world data sets. Experiments show that Coarsen Rank is fast and robust, achieving consistent improvements over baseline methods. |
| Researcher Affiliation | Academia | Center for Frontier AI Research, Agency for Science, Technology and Research (A*STAR), Singapore; Australian Artificial Intelligence Institute, University of Technology Sydney, NSW 2007, Australia; Zhijiang College, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China; Center for Advanced Intelligence Project, RIKEN, Tokyo 103-0027, Japan; Graduate School of Frontier Sciences, University of Tokyo, Chiba 277-8561, Japan |
| Pseudocode | Yes | Algorithm 1 Closed Form EM for Coarsened Rank Aggregation (Coarsen Rank), Algorithm 2 Gibbs Sampling for Coarsened Rank Aggregation (Coarsen Rank) |
| Open Source Code | No | The text does not contain an explicit statement about the release of source code or a direct link to a code repository. |
| Open Datasets | Yes | The Readlevel data set (Chen et al., 2013), the SUSHI data set (Kamishima, 2003), the Baby Face data set (Han et al., 2018), and the Peer Grading data set (Sajjadi et al., 2016). |
| Dataset Splits | No | The paper mentions generating training data for the SUSHI dataset by replacing preferences but does not provide explicit training/validation/test splits or reproducible splitting methodology for any of the datasets used. |
| Hardware Specification | Yes | Empirical analyses were performed on an Intel i5 processor (2.30 GHz) with 8 GB of random-access memory (RAM). |
| Software Dependencies | No | The paper describes algorithms and models but does not explicitly list specific software dependencies with version numbers. |
| Experiment Setup | Yes | We fixed C = M/2 in our experiment for simplicity. The calibration method is applied to all EM-based approaches, i.e., Coarsen BT, Coarsen PL, and PL-EM. Following Crowd BT, we use virtual node regularization (Chen et al., 2013). The number of samplings is set to 50 in our experiment. For the sake of fair comparison, the inner iteration is fixed to 15 for all methods. |
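The paper's Coarsen Rank wraps a coarsened posterior around standard ranking models such as Plackett-Luce, and compares against PL-EM-style baselines. For background, here is a minimal sketch of the classical MM algorithm for Plackett-Luce maximum likelihood (Hunter, 2004), which such baselines build on. This is not the paper's Coarsen Rank method; the function name and API are illustrative only.

```python
import numpy as np

def plackett_luce_mm(rankings, n_items, n_iter=200, tol=1e-8):
    """Hunter's (2004) MM algorithm for Plackett-Luce MLE (illustrative sketch).

    rankings: list of rankings, each a list of item indices, best first.
    Returns a normalized score vector gamma of length n_items.
    """
    gamma = np.ones(n_items) / n_items
    # w[i] = number of times item i appears in a non-last position
    w = np.zeros(n_items)
    for r in rankings:
        for i in r[:-1]:
            w[i] += 1.0
    for _ in range(n_iter):
        denom = np.zeros(n_items)
        for r in rankings:
            idx = np.array(r)
            # suffix sums: total score of the remaining candidate set at each stage
            tail = np.cumsum(gamma[idx][::-1])[::-1]
            for j in range(len(r) - 1):
                denom[idx[j:]] += 1.0 / tail[j]
        new_gamma = w / np.maximum(denom, 1e-12)
        new_gamma /= new_gamma.sum()
        if np.max(np.abs(new_gamma - gamma)) < tol:
            gamma = new_gamma
            break
        gamma = new_gamma
    return gamma
```

Given rankings that consistently place one item first, the fitted scores rank that item highest; e.g. `plackett_luce_mm([[0, 1, 2], [0, 1, 2], [1, 0, 2]], 3)` assigns item 0 the largest score and item 2 (always last) the smallest.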