reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Mathematical Framework for AI-Human Integration in Work

Authors: L. Elisa Celis, Lingxiao Huang, Nisheeth K. Vishnoi

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present a mathematical framework that models jobs, workers, and worker-job fit, introducing a novel decomposition of skills into decision-level and action-level subskills to reflect the complementary strengths of humans and Gen AI. We analyze how changes in subskill abilities affect job success, identifying conditions for sharp transitions in success probability. We also establish sufficient conditions under which combining workers with complementary subskills significantly outperforms relying on a single worker. This explains phenomena such as productivity compression, where Gen AI assistance yields larger gains for lower-skilled workers. We demonstrate the framework's practicality using data from O*NET and Big-bench Lite, aligning real-world data with our model via subskill-division methods. (Abstract)
Researcher Affiliation	Academia	¹Yale University, USA. ²State Key Laboratory of Novel Software Technology, New Cornerstone Science Laboratory, Nanjing University, China. Correspondence to: Nisheeth K. Vishnoi <EMAIL>.
Pseudocode	No	The paper describes methodologies in prose and mathematical equations. There are no explicitly labeled pseudocode or algorithm blocks, nor any structured, code-like procedures presented.
Open Source Code	No	The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide any links to a code repository.
Open Datasets	Yes	A key resource we draw on is the Occupational Information Network (O*NET) (U.S. Department of Labor, Employment and Training Administration, 2023), a comprehensive database maintained by the U.S. Department of Labor that provides standardized descriptions of thousands of jobs.
Dataset Splits	No	The paper demonstrates the application of a mathematical framework to existing datasets (O*NET, Big-bench Lite) and describes data processing methods like subskill division using GPT-4o. However, it does not specify traditional machine learning dataset splits (e.g., training, validation, test sets) as it does not involve training a new model.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, cloud resources, or memory specifications) used for running its analyses or simulations. GPT-4o is mentioned for data processing tasks, but that mention does not describe the authors' own experimental hardware.
Software Dependencies	No	The paper mentions GPT-4o as a tool for data processing but does not specify its version. No other specific software dependencies or library versions (e.g., Python, PyTorch, TensorFlow, specific solvers) are listed for the implementation or simulation of the mathematical framework.
Experiment Setup	Yes	Choice of error functions and threshold. We set the skill error function as h(ζ₁, ζ₂) = ζ₁ + ζ₂... Task and job error functions, g and f, are weighted averages... We set the threshold τ = 0.45...
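
The setup quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code. Only the additive skill error h, the use of weighted averages for g and f, and the threshold τ = 0.45 come from the quoted text. The data layout, the function names, the weights, the example numbers, and the rule that a job succeeds when its error is at most τ are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

TAU = 0.45  # threshold quoted in the Experiment Setup row

# Hypothetical layout: a skill is (zeta1, zeta2, weight); a task is a list of skills.
Skill = Tuple[float, float, float]


def skill_error(zeta1: float, zeta2: float) -> float:
    """Skill error h(zeta1, zeta2) = zeta1 + zeta2, as quoted."""
    return zeta1 + zeta2


def weighted_average(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average; stands in for both g (task level) and f (job level)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


def job_error(tasks: List[List[Skill]], task_weights: Sequence[float]) -> float:
    """Aggregate skill errors into task errors (g), then into a job error (f)."""
    task_errors = []
    for skills in tasks:
        errors = [skill_error(z1, z2) for z1, z2, _ in skills]
        weights = [w for _, _, w in skills]
        task_errors.append(weighted_average(errors, weights))
    return weighted_average(task_errors, task_weights)


def job_succeeds(tasks: List[List[Skill]], task_weights: Sequence[float],
                 tau: float = TAU) -> bool:
    """Assumption: the job counts as a success when its error is at most tau."""
    return job_error(tasks, task_weights) <= tau


if __name__ == "__main__":
    # Made-up subskill errors and weights, purely for illustration.
    example_tasks = [
        [(0.1, 0.2, 1.0), (0.3, 0.1, 1.0)],
        [(0.2, 0.3, 2.0)],
    ]
    example_weights = [1.0, 1.0]
    print(job_error(example_tasks, example_weights))
    print(job_succeeds(example_tasks, example_weights))
```

The weights inside each task and across tasks are normalized by their sum, so they need not add up to one. Whether the paper normalizes in the same way is not stated in the quoted excerpt.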