Predicting Strategic Behavior from Free Text
Authors: Omer Ben-Porat, Sharon Hirsch, Lital Kuchy, Guy Elad, Roi Reichart, Moshe Tennenholtz
JAIR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments with three well-studied games, our algorithm compares favorably with strong alternative approaches. In ablation analysis, we demonstrate the importance of our modeling choices, namely the representation of the text with the commonsensical personality attributes and our classifier, to the predictive power of our model. |
| Researcher Affiliation | Academia | Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Israel; Faculty of Computer Science, Technion - Israel Institute of Technology, Israel |
| Pseudocode | No | The paper describes the clustering algorithm used in text, stating: "Particularly, we cluster the example set X with a bottom-up agglomerative clustering algorithm using Ward's minimum variance criterion for cluster merging (Ward Jr., 1963), which is the default linkage method in the package we employed, Scikit-learn (Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, Prettenhofer, Weiss, Dubourg, et al., 2011)." However, it does not present this algorithm, or any other procedure, in a structured pseudocode block or a clearly labeled algorithm section. |
| Open Source Code | Yes | Our data set and all the information relevant for the data collection crowd-sourcing tasks are publicly available here: https://github.com/omerbp/Predicting-NLPGT. |
| Open Datasets | Yes | Our data set and all the information relevant for the data collection crowd-sourcing tasks are publicly available here: https://github.com/omerbp/Predicting-NLPGT. |
| Dataset Splits | Yes | We randomly sample train and test sets such that the training set is comprised of 90% of the data. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions "Scikit-learn" as the package employed for clustering but does not provide a specific version number. It also references "IBM Personality Insights service" and "Linguistic Inquiry and Word Count (LIWC)" as tools, but again, no specific version numbers for their usage within the experiments are given. Citations for underlying theories or components (like Glove word embeddings or Scikit-learn's original publication) are provided but do not specify the version used in the experiment. |
| Experiment Setup | Yes | For TAC we focus our evaluation on the range of 2-30 clusters. For K-NN we consider K ∈ {1, ..., 5}. For the clustering with tf-idf representation, we consider 2-30 clusters, as for TAC, and compute tf-idf for the 1904 vocabulary words after removing stop words and punctuation marks. Hyper-parameter values that give the best results (upper table) are: K-NN: (1, 1, 1) neighbors, TAC: (13, 30, 26) clusters, TAC-IBM-13: (30, 25, 8) clusters, TAC-IBM-37: (4, 23, 19) clusters, TAC-LIWC-19: (17, 11, 14) clusters, TAC-LIWC-43: (28, 20, 30) clusters, and Trans Text Cluster: (28, 25, 28) clusters, for (Chicken, Box, Door), respectively. |
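The clustering step quoted above (bottom-up agglomerative clustering with Ward's minimum-variance linkage, via Scikit-learn) can be sketched as follows. This is not the authors' code: the feature matrix here is random toy data standing in for the paper's text representations, and the cluster count 13 is just one value from the 2-30 range the paper searches over.

```python
# Hedged sketch: agglomerative clustering with Ward linkage, as the
# paper reports using via scikit-learn. Toy features, illustrative only.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))  # stand-in for per-player text features

# linkage="ward" is scikit-learn's default for AgglomerativeClustering,
# matching the paper's stated choice of Ward's minimum-variance criterion.
model = AgglomerativeClustering(n_clusters=13, linkage="ward")
labels = model.fit_predict(X)
assert labels.shape == (40,) and len(set(labels)) == 13
```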
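The tf-idf baseline and the 90/10 split described in the table can likewise be sketched. Again a hypothetical illustration, not the paper's pipeline: the corpus below is invented, and stop-word removal is approximated with `TfidfVectorizer`'s built-in English stop-word list rather than the paper's exact preprocessing.

```python
# Hedged sketch: tf-idf representation plus a 90%/10% train/test split,
# mirroring the quoted setup. Texts are toy stand-ins for players'
# free-text answers in the games.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

texts = [
    "I usually swerve to avoid a crash.",
    "I would open the box if it looked safe.",
    "Opening the door seems risky to me.",
    "I cooperate when the other player does.",
] * 5  # 20 toy documents

# Stop words and punctuation are dropped by the vectorizer's tokenizer
# and stop-word list (an approximation of the paper's preprocessing).
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(texts)

# 90% train / 10% test, as stated in the Dataset Splits row.
X_train, X_test = train_test_split(X, test_size=0.1, random_state=0)
assert X_train.shape[0] == 18 and X_test.shape[0] == 2
```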