Bipartite Ranking From Multiple Labels: On Loss Versus Label Aggregation

Authors: Michal Lukasik, Lin Chen, Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Felix X. Yu, Sashank J. Reddi, Gang Fu, Mohammadhossein Bateni, Sanjiv Kumar

ICML 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a suite of synthetic and real-world experiments to empirically validate our theoretical findings. |
| Researcher Affiliation | Industry | Google Research. Correspondence to: Michal Lukasik <EMAIL>, Lin Chen <EMAIL>. |
| Pseudocode | No | The paper describes and analyzes its methods, but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about providing source code or a link to a code repository. |
| Open Datasets | Yes | We consider the UCI Banking dataset, composed of information about bank customers, advertising campaign details, and the success thereof (Moro et al., 2014). HelpSteer (Wang et al., 2023b) consists of evaluations of LLM responses across 5 categories. We next consider MSLR Web30k, a dataset of user query-document interactions (Qin & Liu, 2013). |
| Dataset Splits | Yes | We train a 3-layer MLP model with hidden dimension 256 and ReLU activation over 8K examples for 50 epochs. We evaluate on 2K held-out examples. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions using the TensorFlow Ranking library but does not specify versions for any libraries or programming languages. |
| Experiment Setup | Yes | We train a 3-layer MLP model with hidden dimension 256 and ReLU activation over 8K examples for 50 epochs. We train a linear model on numerical features using Adam for 100 epochs. |
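As a rough, self-contained sketch of the second quoted setup (a linear model on numerical features trained with Adam for 100 epochs, evaluated on a held-out split), the following uses synthetic data in place of the paper's datasets, with sizes reduced from the quoted 8K train / 2K held-out; all names and hyperparameters beyond those quoted are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the numerical features described in the paper
# (the real experiments use UCI Banking / HelpSteer / MSLR Web30k;
# sizes are reduced here from the quoted 8K train / 2K held-out).
n_train, n_eval, d = 800, 200, 16
X = rng.normal(size=(n_train + n_eval, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.1 * rng.normal(size=n_train + n_eval) > 0).astype(float)
X_tr, y_tr = X[:n_train], y[:n_train]
X_ev, y_ev = X[n_train:], y[n_train:]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Linear scorer with a logistic loss, optimized by hand-rolled Adam
# for 100 full-batch "epochs" (hyperparameters are Adam's usual defaults).
w, b = np.zeros(d), 0.0
m = np.zeros(d + 1)          # Adam first-moment estimate
v = np.zeros(d + 1)          # Adam second-moment estimate
lr, b1, b2, eps = 0.01, 0.9, 0.999, 1e-8

for t in range(1, 101):
    p = sigmoid(X_tr @ w + b)
    # Gradient of mean logistic loss w.r.t. (w, b).
    g = np.concatenate([X_tr.T @ (p - y_tr) / n_train,
                        [np.mean(p - y_tr)]])
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g**2
    step = lr * (m / (1 - b1**t)) / (np.sqrt(v / (1 - b2**t)) + eps)
    w -= step[:-1]
    b -= step[-1]

# Evaluate on the held-out split, mirroring the train/eval protocol.
acc = np.mean((sigmoid(X_ev @ w + b) > 0.5) == y_ev)
print(f"held-out accuracy: {acc:.3f}")
```

The 3-layer MLP variant in the first quoted setup would replace the linear scorer with two ReLU hidden layers of width 256 and train for 50 epochs; the sketch above keeps only the simpler linear case to stay short.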