Level-0 Models for Predicting Human Behavior in Games
Authors: James R. Wright, Kevin Leyton-Brown
JAIR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated the effects of combining these new level-0 models with several iterative models and observed large improvements in predictive accuracy, evaluated using experimental data (Camerer, 2003). Our own recent work has identified one particular model, quantal cognitive hierarchy, an extension of the cognitive hierarchy model of Camerer, Ho, and Chong (2004), as the state-of-the-art behavioral model for predicting human play in unrepeated, simultaneous-move games (Wright & Leyton-Brown, 2012, 2017). We analyzed data from the ten experimental studies summarized in Table 1. We randomly divided our data into training and test datasets using 10-fold cross-validation. |
| Researcher Affiliation | Academia | James R. Wright EMAIL Computing Science Department, University of Alberta Edmonton, AB, Canada T6G 2E8 Kevin Leyton-Brown EMAIL Computer Science Department, University of British Columbia, Vancouver, BC, Canada V6T 1Z4 |
| Pseudocode | No | The paper describes mathematical definitions and theoretical concepts but does not include any explicitly labeled pseudocode or algorithm blocks. The methods are described in prose and mathematical notation. |
| Open Source Code | No | The paper mentions using third-party software like the 'PyMC software package (Patil, Huard, & Fonnesbeck, 2010)' and 'SMAC (Hutter, Hoos, & Leyton-Brown, 2010, 2011, 2012)', but it does not provide an explicit statement or link for the source code of the methodology developed in this paper. |
| Open Datasets | Yes | We analyzed data from the ten experimental studies summarized in Table 1. Several studies (Stahl & Wilson, 1994, 1995; Haruvy, Stahl, & Wilson, 2001; Haruvy & Stahl, 2007; Stahl & Haruvy, 2008) paid participants according to a randomized procedure in which experimental subjects played normal-form games for points representing a 1% chance (per game) of winning a cash prize. In the work of Costa-Gomes, Crawford, and Broseta (1998), each payoff unit was worth 40 cents, but participants were paid based on the outcome of only one randomly-selected game. |
| Dataset Splits | Yes | We randomly divided our data into training and test datasets using 10-fold cross-validation. Specifically, for each round, we randomly ordered the games and then divided them into 10 equal-sized parts. For each of the 10 ways of selecting 9 parts from the 10, we computed the maximum likelihood estimate of the model's parameters based on the observations associated with the games of those 9 parts. To reduce this variance, we performed 10 rounds of 10-fold cross-validation, and report the average of these 10 rounds. For the experiments described in this section, we randomly selected 10% of the All10 dataset as a held-out test set. The remaining 90% of the data was used as a training data set (80% of the original data) and a validation set (10% of the original data). |
| Hardware Specification | No | The paper mentions 'devoting about 9 CPU months to this search' as a measure of computational effort for Bayesian optimization, but it does not specify any particular CPU models, GPU types, or other hardware components used for the experiments. |
| Software Dependencies | No | The paper mentions using 'the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm (Hansen & Ostermeier, 2001)', 'the PyMC software package (Patil, Huard, & Fonnesbeck, 2010)', and 'SMAC (Hutter, Hoos, & Leyton-Brown, 2010, 2011, 2012)'. However, it does not provide specific version numbers for these software packages or libraries. |
| Experiment Setup | No | The paper describes the general approach to parameter estimation (likelihood maximization using CMA-ES, Bayesian optimization, flat priors for Metropolis-Hastings) and the parameters of the model (τ, λ, and feature weights), but it does not provide specific hyperparameter values (e.g., learning rates, batch sizes, number of epochs for optimization) that define the training process configuration. |
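The split procedure quoted in the Dataset Splits row (10 rounds of 10-fold cross-validation, averaging the per-round test performance) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; `fit` and `evaluate` are hypothetical placeholders for the paper's maximum-likelihood estimation and held-out likelihood evaluation.

```python
import random

def ten_by_ten_fold_cv(games, fit, evaluate, rounds=10, folds=10, seed=0):
    """Average test score over `rounds` rounds of `folds`-fold CV.

    `fit(train_games)` returns fitted parameters (e.g. an MLE);
    `evaluate(params, test_games)` scores them on held-out games.
    Both callables are hypothetical stand-ins for the paper's method.
    """
    rng = random.Random(seed)
    round_scores = []
    for _ in range(rounds):
        order = list(games)
        rng.shuffle(order)                               # randomly order the games
        parts = [order[i::folds] for i in range(folds)]  # divide into 10 parts
        fold_scores = []
        for i in range(folds):
            test = parts[i]
            train = [g for j, p in enumerate(parts) if j != i for g in p]
            params = fit(train)                          # estimate on the other 9 parts
            fold_scores.append(evaluate(params, test))
        round_scores.append(sum(fold_scores) / folds)
    return sum(round_scores) / rounds                    # average of the 10 rounds
```

With 100 games and 10 folds, each part holds exactly 10 games, matching the "equal-sized parts" in the quoted text; shuffling anew each round is what reduces the variance the authors mention.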