Paid with Models: Optimal Contract Design for Collaborative Machine Learning
Authors: Bingchen Wang, Zhaoxuan Wu, Fusheng Liu, Bryan Kian Hsiang Low
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a detailed analysis of the properties that an optimal contract must satisfy when models serve as the rewards, and we explore the potential benefits and welfare implications of these contract-driven CML schemes through numerical experiments. ... We illustrate the potential and the welfare implications of optimally designed contracts through numerical experiments, showing that, inter alia, it could help small parties surmount the cost barrier of model training and reap the reward of emergent technologies. ... 6 Experiments: To gain numerical insights into optimal contract design, we conduct a series of experiments with specified forms of the accuracy function and the valuation function. ... Figure 3 shows the simulation results for the three scenarios. |
| Researcher Affiliation | Academia | ¹Institute of Data Science, National University of Singapore; ²Singapore-MIT Alliance for Research and Technology, Republic of Singapore; ³Department of Computer Science, National University of Singapore. EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes mathematical formulations and theoretical analysis for optimal contract design but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository or mention code in supplementary materials. |
| Open Datasets | No | The paper uses specified forms of accuracy and valuation functions for numerical experiments and defines per-unit data costs for simulation scenarios. It does not use external, publicly available datasets in the traditional sense, but rather a simulated environment based on mathematical functions and parameters. Therefore, no concrete access information for a public dataset is provided. |
| Dataset Splits | No | The paper conducts numerical experiments and simulations based on specified functions and parameters, rather than using traditional datasets that would require training/test/validation splits. Therefore, the concept of dataset splits is not applicable in this context. |
| Hardware Specification | No | The paper discusses 'GPUs' conceptually as resources contributed by parties in Collaborative Machine Learning, but it does not specify any hardware (like specific GPU/CPU models, processors, or memory) used to run the numerical experiments or simulations presented in the paper. |
| Software Dependencies | No | The paper mentions that the optimization problem can be solved by 'numerical optimization methods, such as the trust-region interior-point algorithm', citing Nocedal and Wright (2006). However, it does not specify any particular software packages, libraries, or their version numbers used for these methods. |
| Experiment Setup | Yes | Following Karimireddy, Guo, and Jordan (2022), we adopt the standard generalization bound (Mohri, Rostamizadeh, and Talwalkar 2018) as the accuracy function, expressed as follows: a(m) := max{ a_opt − √((2k(2 + log(m/k)) + 4)/m), 0 }, where m measures the quantity of data used for model training, a_opt is the optimal accuracy achievable by the model, and k captures the difficulty of the learning task. We set k = 1 and a_opt = 1 for the experiments. We assume a constant return to the model accuracy, v(x) := 100x, so a model with perfect accuracy is worth 100 in monetary terms. ... We specify the per-unit data cost for the high-cost type as c1 = 0.02 and that for the low-cost type as c2 = 0.01. ... for varied probability of high-cost type p1 ∈ [0, 1] and total number of participants N ∈ [2, 100] ... we set N = 10, I = 5, and pi = 0.2 for all i ∈ I, but vary the private costs c. We consider three different scenarios: 1) all types find it in their interest to train a model on their own, with c = {0.2, 0.16, 0.12, 0.08, 0.04}; 2) no type would train a model on its own due to high per-unit costs, with c = {1, 0.85, 0.7, 0.55, 0.4}; 3) some types would train the model on their own and others would not, with c = {0.5, 0.4, 0.03, 0.02, 0.001}. |
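The experiment-setup cell above can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: the exact expression inside the square root is our reconstruction of the paper's garbled formula, while the parameter values (k = 1, a_opt = 1, v(x) = 100x) are taken verbatim from the quoted setup.

```python
import math

def accuracy(m: float, k: float = 1.0, a_opt: float = 1.0) -> float:
    """Generalization-bound accuracy curve a(m).

    The form a_opt - sqrt((2k(2 + log(m/k)) + 4) / m), clipped at 0,
    is a reconstruction of the paper's (garbled) equation.
    """
    if m <= 0:
        return 0.0
    return max(a_opt - math.sqrt((2 * k * (2 + math.log(m / k)) + 4) / m), 0.0)

def valuation(x: float) -> float:
    """Constant-return valuation v(x) := 100x from the experiment setup."""
    return 100.0 * x

# With the paper's settings, accuracy is 0 for tiny m and rises toward
# a_opt = 1 as the quantity of training data m grows.
for m in (1, 100, 10_000):
    print(m, round(accuracy(m), 4), round(valuation(accuracy(m)), 2))
```

Under this reconstruction the model's monetary worth, v(a(m)), approaches 100 only for large pooled datasets, which is consistent with the paper's point that small parties face a cost barrier when training alone.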