Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Generalized Rank-Breaking: Computational and Statistical Tradeoffs
Authors: Ashish Khetan, Sewoong Oh
JMLR 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive numerical results on simulated and real-world datasets confirming our theoretical results and performance gains of the generalized rank-breaking algorithm over the pairwise rank-breaking algorithm. Our numerical experiments show that the dependence of MSE on n, d, κj, rj,a, mj,a, ℓj as given in Theorem 11, Equation (15) holds true, even when the conditions for the theorem to hold are not met. In the third panel, the GRB with M = 3 achieves decreasing MSE, whereas for PRB the increased bias dominates the MSE. We provide numerical experiments under this canonical setting in Figure 4 (left) with d = 256 and M ∈ {1, 2, 3, 4, 5}, illustrating the trade-off in practice. |
| Researcher Affiliation | Academia | Ashish Khetan EMAIL Sewoong Oh EMAIL Department of Industrial and Enterprise Systems Engineering University of Illinois at Urbana-Champaign Urbana, IL 61801, USA |
| Pseudocode | Yes | Algorithm 1 Finding Maximal Ordered Partition Algorithm 2 Constructing Rank-Breaking Graph Algorithm 3 Estimate θ given DAG Gj s. |
| Open Source Code | No | The paper discusses algorithmic solutions and their properties but does not provide any specific links to source code, nor does it state that the code for their methodology is released or available in supplementary materials. |
| Open Datasets | Yes | On sushi preferences (Kamishima, 2003) and the jester dataset (Goldberg et al., 2001), we improve over pairwise breaking and achieve the same performance as the oracle MLE. Sushi dataset. There are d = 100 types of sushi. Full rankings over subsets Sj of size κ = 10 are provided by n = 5000 individuals. Jester dataset. It consists of continuous ratings between −10 and +10 of 100 jokes on sets of size κ, 36 ≤ κ ≤ 100, by 24,983 users. |
| Dataset Splits | No | The paper describes using `n` (number of users/samples) and `d` (number of items) for simulated data, and for real-world datasets like Sushi and Jester, it specifies `n` and `κ` (size of subsets). For example, it states: "Full rankings over subsets Sj of size κ = 10 are provided by n = 5000 individuals" and "The jester dataset which has d = 100, n = 24,983, and 36 ≤ κj ≤ 100." However, it does not explicitly provide details about training/testing/validation splits or cross-validation setup for these datasets, which are necessary for reproduction. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as CPU, GPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using "off-the-shelf convex optimization tools" in Section 2, but it does not specify any software libraries or tools with their version numbers. |
| Experiment Setup | Yes | Require: DAGs {Gj}, 1 ≤ j ≤ n, generated under the PL model with parameter θ∗, rank-breaking order M, error threshold ϵ. Ensure: θ̂, an estimate of θ∗ ... In the third panel, the GRB with M = 3 achieves decreasing MSE, whereas for PRB the increased bias dominates the MSE. The PL weights are chosen uniformly spaced over [−2, 2]. |
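The canonical setup quoted above (Plackett-Luce model with weights uniformly spaced over [−2, 2], rankings broken into pairwise comparisons, MSE of the estimate measured against θ∗) can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the dimensions `d`, `n`, `kappa`, the gradient-ascent optimizer, and all variable names are assumptions for illustration, and only plain pairwise rank-breaking (the paper's PRB baseline) is shown, not the generalized rank-breaking algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, kappa = 8, 2000, 5       # toy scale: items, rankings, subset size
theta = np.linspace(-2, 2, d)  # PL weights uniformly spaced over [-2, 2], as in the paper
theta -= theta.mean()          # PL parameters are identifiable only up to a shift

def sample_pl_ranking(items):
    """Sample a full ranking of `items` under the Plackett-Luce model:
    positions are filled sequentially, picking each remaining item with
    probability proportional to exp(theta)."""
    items = list(items)
    ranking = []
    while items:
        w = np.exp(theta[items])
        pick = rng.choice(len(items), p=w / w.sum())
        ranking.append(items.pop(pick))
    return ranking

# Generate n rankings over random subsets of size kappa, then break each
# ranking into all (kappa choose 2) pairwise comparisons (winner, loser).
pairs = []
for _ in range(n):
    subset = rng.choice(d, size=kappa, replace=False)
    r = sample_pl_ranking(subset)
    for i in range(kappa):
        for j in range(i + 1, kappa):
            pairs.append((r[i], r[j]))
pairs = np.array(pairs)

# Pairwise rank-breaking MLE reduces to a logistic-regression likelihood on
# the comparisons; plain gradient ascent suffices for this sketch.
est = np.zeros(d)
for _ in range(500):
    diff = est[pairs[:, 0]] - est[pairs[:, 1]]
    resid = 1.0 - 1.0 / (1.0 + np.exp(-diff))  # 1 - sigmoid(winner - loser)
    grad = np.zeros(d)
    np.add.at(grad, pairs[:, 0], resid)
    np.add.at(grad, pairs[:, 1], -resid)
    est += grad / len(pairs)
    est -= est.mean()  # re-center to keep the shift fixed

mse = np.mean((est - theta) ** 2)
```

Since each full ranking over a subset is broken into all of its pairwise comparisons, this breaking is consistent for the PL model and `mse` shrinks as `n` grows; the bias the paper attributes to PRB arises under partial orderings, where naive pairwise breaking uses comparisons that are not independent samples.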