Data-driven Rank Breaking for Efficient Rank Aggregation

Authors: Ashish Khetan, Sewoong Oh

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Figure 2, we verify the scaling of the resulting error via numerical simulations. We fix d = 1024 and κj = κ = 128, and vary the number of separators ℓj = ℓ for fixed n = 128,000 (left), and vary the number of samples n for fixed ℓj = ℓ = 16 (middle). Each point is averaged over 100 instances. The plot confirms that the mean squared error scales as 1/(ℓn). ... On real-world data sets on sushi preferences (Kamishima, 2003), we show that the data-driven rank-breaking improves over the Generalized Method-of-Moments (GMM) proposed by Azari Soufiani et al. (2013). ... Figure 9: The data-driven rank-breaking achieves smaller error compared to the state-of-the-art GMM approach.
Researcher Affiliation | Academia | Ashish Khetan (EMAIL), Sewoong Oh (EMAIL), Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Pseudocode | No | The paper describes the 'Consistent rank-breaking estimator' by providing mathematical formulas for the convex optimization problem (Equations 2 and 3). It also mentions 'the algorithm described in (33)'. However, it does not present any of these in a clearly labeled or structured pseudocode or algorithm block, relying instead on textual descriptions and mathematical expressions.
Open Source Code | No | The paper mentions using 'Hunter (2004) implementation of the ML estimator' and compares performance with the 'GMM algorithm' (Azari Soufiani et al. (2013)). These refer to third-party tools or methods by other researchers. The authors do not state that they are releasing their own code for the methodology described in this paper, nor do they provide any links to a code repository.
Open Datasets | Yes | On real-world data sets on sushi preferences (Kamishima, 2003), we show that the data-driven rank-breaking improves over the Generalized Method-of-Moments (GMM) proposed by Azari Soufiani et al. (2013). ... We compare our algorithm with the GMM algorithm on two other real-world data-sets as well. We use jester data set (Goldberg et al., 2001)... We perform similar experiments on American Psychological Association (APA) data-set (Diaconis, 1989).
Dataset Splits | No | The paper describes how partial rankings were simulated from existing full datasets (e.g., 'removing the known ordering among bottom 10 − ℓ alternatives for each sample' or 'simulating random-ℓ separators'). It also states 'Each point is average over 100 instances', indicating repeated experiments. However, it does not provide specific train/test/validation dataset splits (e.g., percentages, sample counts, or references to predefined splits for model training and evaluation).
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions the 'Hunter (2004) implementation of the ML estimator' for computing estimates and the 'Generalized Method-of-Moments (GMM) proposed by Azari Soufiani et al. (2013)'. However, it does not list any specific software or library names with version numbers that would be necessary to reproduce the experimental environment.
Experiment Setup | Yes | In Figure 2, we verify the scaling of the resulting error via numerical simulations. We fix d = 1024 and κj = κ = 128, and vary the number of separators ℓj = ℓ for fixed n = 128,000 (left), and vary the number of samples n for fixed ℓj = ℓ = 16 (middle). Each point is averaged over 100 instances. ... We fix d = 1024 items and the underlying preference vector θ is uniformly distributed over [−b, b] for b = 2. We generate n rankings over sets Sj of size κ for j ∈ [n] according to the PL model with parameter θ.
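The quoted setup (rankings over κ-item subsets drawn from the Plackett-Luce model with a preference vector θ uniform on [−b, b]) can be sketched as below. The parameter values come from the quote; the function name and sampling code are illustrative, not from the paper.

```python
import numpy as np

def sample_pl_ranking(theta, rng):
    """Sample one full ranking from the Plackett-Luce model: items are
    picked sequentially, each with probability proportional to
    exp(theta_i) among the items not yet ranked."""
    weights = np.exp(theta)
    items = np.arange(len(theta))
    ranking = []
    for _ in range(len(theta)):
        p = weights[items] / weights[items].sum()
        pick = rng.choice(len(items), p=p)
        ranking.append(int(items[pick]))
        items = np.delete(items, pick)
    return ranking

rng = np.random.default_rng(0)
d, kappa, b = 1024, 128, 2           # values from the quoted setup
theta = rng.uniform(-b, b, size=d)   # preference vector on [-b, b]

# One sample: draw a subset S_j of size kappa, then rank it under PL.
S_j = rng.choice(d, size=kappa, replace=False)
ranking = sample_pl_ranking(theta[S_j], rng)  # positions within S_j
```

Repeating this for j = 1, ..., n and breaking each ranking at ℓ separators reproduces the kind of data the paper's simulations are run on.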