Competition over data: how does data purchase affect users?

Authors: Yongchan Kwon, Tony A Ginart, James Zou

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we propose a general competition environment and study what happens when competing ML predictors can actively acquire user data. Our main contributions are as follows. We propose a novel environment that can simulate various real-world competitions. Our environment allows ML predictors to use AL algorithms to purchase labeled data within a finite budget while competing against each other (Section 2). Surprisingly, our results show that when competing ML predictors purchase data, the quality of the predictions selected by each user can decrease even as competing ML predictors get better (Section 3.1). We demonstrate that data purchase makes competing predictors similar to each other, leading to this counterintuitive finding (Section 3.2). Our finding is robust and is consistently observed across various competition situations (Section 3.3).
Researcher Affiliation | Academia | Yongchan Kwon (EMAIL), Department of Statistics, Columbia University; Tony Ginart (EMAIL), Department of Electrical Engineering, Stanford University; James Zou (EMAIL), Department of Biomedical Data Science, Stanford University
Pseudocode | Yes | Environment 1: A competition environment with data purchase
Input: number of competition rounds T; user distribution P_{X,Y}; number of predictors M; competing predictors C^(i) = (n_s^(i), n_b^(i), f^(i), π^(i)) for i ∈ [M].
Procedure:
For all i ∈ [M], a model f^(i) is trained using the n_s^(i) seed data points
for t ∈ [T] do
    Draw (X_t, Y_t) from P_{X,Y} and initialize the set of buyers B = ∅
    for i ∈ [M] do
        if (n_b^(i) ≥ 1) and (π^(i)(X_t) = 1) then
            B ← B ∪ {C^(i)}
        else
            Predict f^(i)(X_t)
        end if
    end for
    if |B| ≥ 1 then
        A user selects one predictor W_t from B uniformly at random
        n_b^(W_t) ← n_b^(W_t) − 1
    else
        A user selects one predictor W_t based on (1)
    end if
    C^(W_t) receives the user label Y_t and updates f^(W_t)
end for
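The competition loop above can be sketched in plain Python. This is a minimal, hypothetical illustration (not the authors' implementation): the `Predictor` model is a toy placeholder, the entropy threshold is an assumed parameter, and the no-buyer fallback simplifies the paper's selection rule (1) to picking the most confident predictor.

```python
import math
import random

class Predictor:
    """Toy stand-in for a competing predictor C^(i) = (n_s, n_b, f, pi)."""

    def __init__(self, budget, threshold):
        self.budget = budget          # n_b^(i): remaining purchase budget
        self.threshold = threshold    # assumed entropy threshold for pi^(i)
        self.labeled = []             # labeled data acquired so far

    def predict_proba(self, x):
        # Placeholder model f^(i): confidence grows with the data bought.
        p = 0.5 + 0.4 * min(len(self.labeled), 10) / 10
        return p if x >= 0 else 1 - p

    def wants_to_buy(self, x):
        # Entropy-based buying policy: buy when the prediction is uncertain.
        p = self.predict_proba(x)
        entropy = -p * math.log(p) - (1 - p) * math.log(1 - p)
        return self.budget >= 1 and entropy > self.threshold

def competition_round(predictors, x, y, rng):
    """One round t of Environment 1 for a single user query (x, y)."""
    buyers = [c for c in predictors if c.wants_to_buy(x)]
    if buyers:
        winner = rng.choice(buyers)   # user picks a buyer uniformly at random
        winner.budget -= 1
    else:
        # Simplification of selection rule (1): pick the predictor most
        # confident in the true label.
        winner = max(predictors,
                     key=lambda c: c.predict_proba(x) if y == 1
                     else 1 - c.predict_proba(x))
    winner.labeled.append((x, y))     # winner receives the label and updates
    return winner

rng = random.Random(0)
predictors = [Predictor(budget=5, threshold=0.6) for _ in range(3)]
for _ in range(100):
    x = rng.uniform(-1, 1)
    y = 1 if x >= 0 else 0
    competition_round(predictors, x, y, rng)
```

Exactly one predictor wins each round and receives the label, so labeled data accumulates only at winners, which is what drives the homogenization effect the paper studies.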
Open Source Code | Yes | Our Python-based implementations are available at https://github.com/ykwon0407/data_purchase_in_comp.
Open Datasets | Yes | Our experiments consider seven real datasets to describe various user distributions P_{X,Y}, namely the Insurance (Van Der Putten & van Someren, 2000), Adult (Dua & Graff, 2017), Postures (Gardner et al., 2014), Skin-nonskin (Chang & Lin, 2011), MNIST (LeCun et al., 2010), Fashion-MNIST (Xiao et al., 2017), and CIFAR10 (Krizhevsky et al., 2009) datasets.
Dataset Splits | Yes | For all datasets, we first split each dataset into competition and evaluation datasets: the competition dataset is used during the T = 10^4 competition rounds and the evaluation dataset is used to evaluate metrics after the competition. For Fashion-MNIST, MNIST, and CIFAR10, we use the original training and test sets as the competition and evaluation datasets, respectively. For Insurance, Adult, Postures, and Skin-nonskin, we randomly sample 5000 data points from the original dataset to form the evaluation dataset and use the remaining data points as the competition dataset.
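The tabular-dataset split described above can be sketched as follows; this is an illustrative reconstruction (function name, seed, and list representation are assumptions, not the repository's code):

```python
import random

def split_dataset(data, eval_size=5000, seed=0):
    """Randomly hold out `eval_size` points for evaluation; the rest
    form the competition dataset used during the T = 10^4 rounds."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    eval_idx = set(idx[:eval_size])
    evaluation = [data[i] for i in range(len(data)) if i in eval_idx]
    competition = [data[i] for i in range(len(data)) if i not in eval_idx]
    return competition, evaluation

# Example with a dummy dataset of 12000 points.
data = list(range(12000))
competition, evaluation = split_dataset(data)
```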
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. It mentions using Adam optimization and specific model types, but no GPU, CPU, or cloud resource details.
Software Dependencies | No | The paper mentions using "Adam optimization (Kingma & Ba, 2014)" and "Python-based implementations" but does not specify version numbers for Python or any libraries like PyTorch or TensorFlow.
Experiment Setup | Yes | Throughout the experiments, the total number of competition rounds is T = 10^4, the number of predictors is M = 18, and the quality function is the correctness function, i.e., q(Y1, Y2) = 1{Y1 = Y2} for all Y1, Y2 ∈ Y. We set the number of seed data points n_s^(i) between 50 and 200 depending on the user dataset. We use either a logistic model or a neural network model with one hidden layer for f^(i). As for the buying policy π^(i), we use a standard entropy-based AL rule (Settles & Craven, 2008). We consider various competition situations by varying the budget n_b ∈ {0, 100, 200, 400} and the temperature parameter α ∈ {0, 1, 2, 4}. We use Adam optimization (Kingma & Ba, 2014) with the specified learning rate and number of epochs. The batch size is fixed to 64. If a predictor is selected, its ML model is updated with one iteration on the newly obtained data point, and we retrain the model whenever a retrain period of new samples has been obtained. Table 2 provides a summary of hyperparameters by dataset.
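The role of the temperature parameter α in the user's choice can be illustrated with a softmax-style selection over predictor qualities. This is a hedged sketch of what a temperature-parameterized selection rule (the paper's rule (1)) could look like, not the paper's exact equation; the function name and quality values are assumptions.

```python
import math
import random

def select_predictor(qualities, alpha, rng):
    """Pick a predictor index with probability proportional to
    exp(alpha * quality). alpha = 0 yields a uniform choice; larger
    alpha concentrates probability on the highest-quality predictor."""
    weights = [math.exp(alpha * q) for q in qualities]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1

rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(1000):
    counts[select_predictor([0.9, 0.5, 0.1], alpha=4, rng=rng)] += 1
```

With α = 4 the best predictor is chosen most of the time, while α = 0 would split the 1000 draws roughly evenly; this matches the paper's sweep over α ∈ {0, 1, 2, 4} as a knob from indifferent to quality-sensitive users.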