Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Model of Fake Data in Data-driven Analysis

Authors: Xiaofan Li, Andrew B. Whinston

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility variables, classified results, and supporting LLM responses:

Research Type: Experimental
"We fully solve the model and employ numerical examples to illustrate the players' strategies and payoffs for insights. Specifically, our results show that maintaining some suspicion about the data sources and understanding that the sender can be strategic are very helpful to the data receiver. [...] In this section, we illustrate the fact that one cannot be perfectly accurate in detecting fake data by looking at each piece of data separately when the distributions of true and fake data have the same support, meaning that the data have strictly positive likelihoods in the distributions of both the true and fake data. We then use simulation data to show that our method addresses this problem."

Researcher Affiliation: Academia
"Xiaofan Li EMAIL, Andrew B. Whinston EMAIL, McCombs School of Business, The University of Texas at Austin, 2110 Speedway, Austin, TX 78705, USA"

Pseudocode: No
The paper describes the model and its solution using mathematical equations and derivations, but it does not provide any explicitly labeled pseudocode or algorithm blocks.

Open Source Code: No
The paper does not provide concrete access to source code for the methodology described. There are no links to repositories, explicit code release statements, or mentions of code in supplementary materials.

Open Datasets: No
"We use point processes to model the data traffic, where each piece of data can occur at any discrete moment in a continuous time flow. [...] To generate simulation data, we use our model and set the parameters L = 3, c = 3, r = 0.1, Λ0 = 1, q(t0) = 0.1, p0 = 0.1, pa = 0.4. We look at how the belief of both types of receivers is updated in the first 200 units of time. An example is shown in Figure 8." The paper focuses on a theoretical model and generates its own simulation data; it does not use or provide access to external open datasets.

Dataset Splits: No
The paper develops a game-theoretic model and uses simulation data generated by the model itself, not external datasets that would require specific training/test/validation splits.

Hardware Specification: No
The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its numerical examples or simulations.

Software Dependencies: No
The paper does not specify any software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers that would be needed to replicate the experiments or simulations.

Experiment Setup: Yes
"Specifically, we set L = 3, c = 3, r = 0.1, Λ0 = 1, p0 = 0.1, and compare the strategies and payoffs between cases where pa = 0.3 and pa = 0.4. [...] Specifically, we keep assuming L = 3, c = 3, r = 0.1, Λ0 = 1, p0 = 0.1 while setting pa = 0.3 and pa = 0.4. [...] To generate simulation data, we use our model and set the parameters L = 3, c = 3, r = 0.1, Λ0 = 1, q(t0) = 0.1, p0 = 0.1, pa = 0.4."
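The quoted setup fixes parameter values but the paper's actual belief-update equations are not reproduced in this report. As a rough, hypothetical illustration of the kind of receiver belief updating the excerpts describe (a receiver revising its suspicion of a data source over the first 200 units of time), here is a minimal Bayesian sketch. The observation model, the update rule, and all function names are assumptions for illustration, not the paper's model; only the numbers q(t0) = 0.1, p0 = 0.1, pa = 0.4 are taken from the quoted setup.

```python
import random

# Hypothetical sketch, NOT the paper's model: a receiver updates its belief q
# that a data source is adversarial. p0 and pa reuse the quoted parameter
# values as the per-observation probability of a "suspicious" data point under
# an honest vs. adversarial source; this observation model is an assumption.
def update_belief(q, suspicious, p0=0.1, pa=0.4):
    """One Bayes update of the belief q that the source is adversarial."""
    like_adv = pa if suspicious else 1.0 - pa
    like_hon = p0 if suspicious else 1.0 - p0
    return q * like_adv / (q * like_adv + (1.0 - q) * like_hon)

def simulate(T=200, q0=0.1, adversarial=True, seed=0):
    """Run T discrete time steps (one data point per step) and return the
    receiver's final belief, starting from the prior q(t0) = q0."""
    rng = random.Random(seed)
    q = q0
    p_susp = 0.4 if adversarial else 0.1
    for _ in range(T):
        q = update_belief(q, rng.random() < p_susp)
    return q
```

Under these assumed likelihoods the belief drifts toward 1 against an adversarial source and toward 0 against an honest one within the 200 time units mentioned in the quote, consistent with the report's note that "maintaining some suspicion about the data sources" helps the receiver.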