Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Model of Fake Data in Data-driven Analysis

Authors: Xiaofan Li, Andrew B. Whinston

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility variables, classified results, and supporting LLM responses:

Research Type: Experimental
"We fully solve the model and employ numerical examples to illustrate the players' strategies and payoffs for insights. Specifically, our results show that maintaining some suspicion about the data sources and understanding that the sender can be strategic are very helpful to the data receiver. [...] In this section, we illustrate the fact that one cannot be perfectly accurate in detecting fake data by looking at each piece of data separately when the distributions of true and fake data have the same support, meaning that the data have strictly positive likelihoods in the distributions of both the true and fake data. We then use simulation data to show that our method addresses this problem."

Researcher Affiliation: Academia
"Xiaofan Li EMAIL, Andrew B. Whinston EMAIL, McCombs School of Business, The University of Texas at Austin, 2110 Speedway, Austin, TX 78705, USA"

Pseudocode: No
The paper describes the model and its solution using mathematical equations and derivations, but it does not provide any explicitly labeled pseudocode or algorithm blocks.

Open Source Code: No
The paper does not provide concrete access to source code for the methodology described. There are no links to repositories, explicit code release statements, or mentions of code in supplementary materials.

Open Datasets: No
"We use point processes to model the data traffic, where each piece of data can occur at any discrete moment in a continuous time flow. [...] To generate simulation data, we use our model and set the parameters L = 3, c = 3, r = 0.1, Λ0 = 1, q(t0) = 0.1, p0 = 0.1, pa = 0.4. We look at how the belief of both types of receivers is updated in the first 200 units of time. An example is shown in Figure 8." The paper focuses on a theoretical model and generates its own simulation data; it does not use or provide access to external open datasets.

Dataset Splits: No
The paper develops a game-theoretic model and uses simulation data generated by the model itself, not external datasets that would require specific training/test/validation splits.

Hardware Specification: No
The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its numerical examples or simulations.

Software Dependencies: No
The paper does not specify any software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers that would be needed to replicate the experiments or simulations.

Experiment Setup: Yes
"Specifically, we set L = 3, c = 3, r = 0.1, Λ0 = 1, p0 = 0.1, and compare the strategies and payoffs between cases where pa = 0.3 and pa = 0.4. [...] Specifically, we keep assuming L = 3, c = 3, r = 0.1, Λ0 = 1, p0 = 0.1 while setting pa = 0.3 and pa = 0.4. [...] To generate simulation data, we use our model and set the parameters L = 3, c = 3, r = 0.1, Λ0 = 1, q(t0) = 0.1, p0 = 0.1, pa = 0.4."
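The quoted setup fixes parameter values but the paper's actual belief-update equations are not reproduced in this report. As a rough, hypothetical illustration of the kind of receiver belief updating the excerpts describe (a receiver revising its suspicion of a data source over the first 200 units of time), here is a minimal Bayesian sketch. The observation model, the update rule, and all function names are assumptions for illustration, not the paper's model; only the numbers q(t0) = 0.1, p0 = 0.1, pa = 0.4 are taken from the quoted setup.

```python
import random

# Hypothetical sketch, NOT the paper's model: a receiver updates its belief q
# that a data source is adversarial. p0 and pa reuse the quoted parameter
# values as the per-observation probability of a "suspicious" data point under
# an honest vs. adversarial source; this observation model is an assumption.
def update_belief(q, suspicious, p0=0.1, pa=0.4):
    """One Bayes update of the belief q that the source is adversarial."""
    like_adv = pa if suspicious else 1.0 - pa
    like_hon = p0 if suspicious else 1.0 - p0
    return q * like_adv / (q * like_adv + (1.0 - q) * like_hon)

def simulate(T=200, q0=0.1, adversarial=True, seed=0):
    """Run T discrete time steps (one data point per step) and return the
    receiver's final belief, starting from the prior q(t0) = q0."""
    rng = random.Random(seed)
    q = q0
    p_susp = 0.4 if adversarial else 0.1
    for _ in range(T):
        q = update_belief(q, rng.random() < p_susp)
    return q
```

Under these assumed likelihoods the belief drifts toward 1 against an adversarial source and toward 0 against an honest one within the 200 time units mentioned in the quote, consistent with the report's note that "maintaining some suspicion about the data sources" helps the receiver.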