reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Data-centric Machine Learning Research (DMLR) - 2025

Reproducibility Score: reproducibility score based on Gundersen et al. (2025).
Documentation Score: global mean, i.e. the average score over the seven reproducibility variables for empirical research papers.
% Empirical: percentage of papers that are empirical research rather than theoretical research.
% Industry: percentage of empirical research papers with at least one author from industry.

Venue	Year	Papers	Reproducibility Score	Documentation Score	% Empirical	% Industry	Website
DMLR	2025	13	0.76	4.55	84.62%	18.18%
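
The two percentages use different denominators: % Empirical is a share of all papers, while % Industry is a share of the empirical papers only. The short Python sketch below recovers the paper counts these figures imply. It assumes the percentages are rounded ratios of whole paper counts; the variable names are illustrative and not taken from the site.

```python
# Venue-level figures copied from the DMLR 2025 row.
papers = 13
pct_empirical = 84.62  # share of all papers
pct_industry = 18.18   # share of empirical papers, not of all papers

# Assumption: each percentage is a rounded ratio of whole paper counts.
empirical = round(papers * pct_empirical / 100)
industry = round(empirical * pct_industry / 100)

print(empirical, industry)
# Recompute the percentages from the inferred counts as a consistency check.
print(f"{empirical / papers:.2%}", f"{industry / empirical:.2%}")
```

Under this assumption the row is consistent with 11 empirical papers out of 13, 2 of which have at least one industry author.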

Paper	Pseudocode	Open Source Code	Open Datasets	Dataset Splits	Hardware Specification	Software Dependencies	Experiment Setup	Score
Challenge design roadmap	❌	❌	✅	✅	✅	❌	✅	4
Chronicling Germany: An Annotated Historical Newspaper Dataset	❌	✅	✅	✅	✅	❌	✅	5
Constructing Confidence Intervals for “the” Generalization Error – a Comprehensive Benchmark Study	✅	✅	✅	✅	✅	✅	✅	7
Data Acquisition: A New Frontier in Data-centric AI	❌	✅	✅	❌	❌	❌	❌	2
Deep Learning for Accurate Diagnosis of Viral Infections through scRNA-seq Analysis: A Comprehensive Benchmark Study	❌	❌	✅	❌	❌	❌	❌	1
FlowBench: A Large Scale Benchmark for Flow Simulation over Complex Geometries	❌	✅	✅	✅	✅	❌	✅	5
MONSTER: Monash Scalable Time Series Evaluation Repository	❌	✅	✅	✅	✅	❌	✅	5
SuperBench: A Super-Resolution Benchmark Dataset for Scientific Machine Learning	❌	✅	✅	✅	✅	❌	✅	5
Synthetic Datasets for Machine Learning on Spatio-Temporal Graphs using PDEs	❌	✅	✅	✅	✅	✅	✅	6
Text Quality-Based Pruning for Efficient Training of Language Models	❌	❌	✅	✅	❌	❌	✅	3
The FIX Benchmark: Extracting Features Interpretable to eXperts	❌	✅	✅	✅	✅	❌	❌	4
Towards impactful challenges: post-challenge paper, benchmarks and other dissemination actions	❌	❌	❌	❌	❌	❌	❌	0
V-LoL: A Diagnostic Dataset for Visual Logical Learning	✅	✅	✅	✅	✅	✅	✅	7
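
The number at the end of each row appears to equal the count of ✅ marks across the seven variables. The Python sketch below recomputes those totals and the number of papers that document each variable. The names `ROWS` and `VARIABLES` and the Y/N encoding (Y for ✅, N for ❌) are illustrative choices, not part of the site or its pipeline.

```python
from statistics import mean

# Column order of the seven reproducibility variables, as in the table.
VARIABLES = [
    "Pseudocode", "Open Source Code", "Open Datasets", "Dataset Splits",
    "Hardware Specification", "Software Dependencies", "Experiment Setup",
]

# Rows copied from the table: (title, one Y/N mark per variable in column order).
ROWS = [
    ("Challenge design roadmap", "NNYYYNY"),
    ("Chronicling Germany: An Annotated Historical Newspaper Dataset", "NYYYYNY"),
    ("Constructing Confidence Intervals for “the” Generalization Error – a Comprehensive Benchmark Study", "YYYYYYY"),
    ("Data Acquisition: A New Frontier in Data-centric AI", "NYYNNNN"),
    ("Deep Learning for Accurate Diagnosis of Viral Infections through scRNA-seq Analysis: A Comprehensive Benchmark Study", "NNYNNNN"),
    ("FlowBench: A Large Scale Benchmark for Flow Simulation over Complex Geometries", "NYYYYNY"),
    ("MONSTER: Monash Scalable Time Series Evaluation Repository", "NYYYYNY"),
    ("SuperBench: A Super-Resolution Benchmark Dataset for Scientific Machine Learning", "NYYYYNY"),
    ("Synthetic Datasets for Machine Learning on Spatio-Temporal Graphs using PDEs", "NYYYYYY"),
    ("Text Quality-Based Pruning for Efficient Training of Language Models", "NNYYNNY"),
    ("The FIX Benchmark: Extracting Features Interpretable to eXperts", "NYYYYNN"),
    ("Towards impactful challenges: post-challenge paper, benchmarks and other dissemination actions", "NNNNNNN"),
    ("V-LoL: A Diagnostic Dataset for Visual Logical Learning", "YYYYYYY"),
]

# Assumption: a paper's score is the number of variables marked Y.
scores = [marks.count("Y") for _, marks in ROWS]

# Number of papers that document each variable.
per_variable = {
    name: sum(marks[i] == "Y" for _, marks in ROWS)
    for i, name in enumerate(VARIABLES)
}

print(scores)
print(per_variable)
print(round(mean(scores), 2))  # mean over all 13 listed papers
```

The mean over all 13 listed papers is 4.15, lower than the venue-level Documentation Score of 4.55. The venue figure is defined over empirical papers only, and the table does not flag which papers are empirical, so the sketch does not try to reproduce it.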