Learning Monotonic Probabilities with a Generative Cost Model
Authors: Yongxiang Tang, Yanhua Cheng, Xiaocheng Liu, Chenchen Jiao, Yanxiang Zeng, Ning Luo, Pengjia Yuan, Xialong Liu, Peng Jiang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further validate our approach with a numerical simulation of quantile regression and conduct multiple experiments on public datasets, showing that our method significantly outperforms existing monotonic modeling techniques. |
| Researcher Affiliation | Industry | 1Kuaishou, Beijing, China. Correspondence to: Yongxiang Tang <EMAIL>, Yanhua Cheng <EMAIL>. |
| Pseudocode | No | The paper describes the methodology using mathematical formulations and textual descriptions rather than structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for our experiments can be found at https://github.com/tyxaaron/GCM. |
| Open Datasets | Yes | To further evaluate the monotonic problem with a multivariate revenue variable r, we conduct experiments on six publicly available datasets: the Adult dataset (Becker & Kohavi, 1996), the COMPAS dataset (Larson et al., 2016), the Diabetes dataset (Teboul), the Blog Feedback dataset (Buza, 2014), the Loan Defaulter dataset and the Auto MPG dataset (Quinlan, 1993). |
| Dataset Splits | Yes | The datasets are divided into training and testing sets in a 4 : 1 ratio. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions that network parameters are optimized using the stochastic gradient descent algorithm and refers to ADAM (Kingma & Ba, 2015) as an optimizer, but does not provide specific version numbers for software libraries or dependencies like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | Table 5 (Hyperparameters of the experiments) lists specific hyperparameters for each dataset, including hidden dimension, sample number, latent dimension, max epoch, optimizer, batch size, and learning rate. For example, for the Adult dataset, it specifies a batch size of 256, learning rate of 0.001, and a max epoch of 40. |
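The split and setup rows above can be sketched as a minimal reproduction script. This is an illustrative sketch, not the authors' code: the 4:1 train/test split and the Adult-dataset hyperparameters (batch size 256, learning rate 0.001, max epoch 40, ADAM) come from the table, while the function and variable names, the shuffle seed, and the use of a plain Python shuffle are assumptions, since the paper does not specify them.

```python
import random

# Hyperparameters reported for the Adult dataset (Table 5 of the paper).
# The dict keys and the seed below are illustrative choices, not the authors'.
ADULT_CONFIG = {
    "batch_size": 256,
    "learning_rate": 1e-3,
    "max_epoch": 40,
    "optimizer": "adam",  # ADAM (Kingma & Ba, 2015)
}

def train_test_split(records, train_fraction=0.8, seed=0):
    """Shuffle and split records into train/test sets at a 4:1 ratio."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Example: 1000 records -> 800 train / 200 test.
train, test = train_test_split(list(range(1000)))
```

Because the paper reports neither a random seed nor hardware details, exact reproduction of the reported numbers would additionally require fixing these unstated choices.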