Cooperation and Learning Dynamics under Wealth Inequality and Diversity in Individual Risk
Authors: Ramona Merhej, Fernando P. Santos, Francisco S. Melo, Francisco C. Santos
JAIR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We draw our conclusions based on social simulations with populations of independent reinforcement learners with diverse levels of risk and wealth. |
| Researcher Affiliation | Academia | Ramona Merhej EMAIL INESC-ID and Instituto Superior Técnico, Lisbon, Portugal; ISIR, CNRS, Sorbonne University, Paris, France. Fernando P. Santos EMAIL Informatics Institute, University of Amsterdam, The Netherlands. Francisco S. Melo EMAIL Francisco C. Santos EMAIL INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Portugal |
| Pseudocode | Yes | Algorithm 1: Roth-Erev RL algorithm in an adaptive population with asynchronous updates of propensities. Algorithm 2: Sampling with assortment bias. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide any links to a code repository. |
| Open Datasets | No | The paper describes conducting "social simulations" and defines the game parameters and agent learning dynamics. It does not utilize or provide access to any pre-existing public datasets, as the data is generated through the simulations themselves. |
| Dataset Splits | No | The paper describes simulation parameters and training steps (e.g., "2.5 x 10^5 learning steps", "averaged over 5 independent runs") but does not mention dataset splits in the traditional sense of dividing a pre-existing dataset into training, validation, and test sets. |
| Hardware Specification | No | The paper mentions "computer simulations" but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run these experiments. |
| Software Dependencies | No | The paper discusses various algorithms (e.g., Roth-Erev Algorithm, Q-learning) but does not specify any software libraries, frameworks, or their version numbers used for implementing the simulations. |
| Experiment Setup | Yes | In all settings, we consider a population of Z = 200 individuals. The average wealth in the population is set to b = 1, yielding W = Z. A contribution represents 10% of an agent's wealth, i.e., c = 0.1. We set the target to be achievable if at least M = N/2 agents in the group contribute, i.e., t = Ncb/2. If the threshold target is not achieved, agents lose an additional 70% of their remaining wealth, i.e., p = 0.7. We test varying risk values r ∈ {0.1, 0.3, 0.5, 0.7, 0.9}, varying group sizes N ∈ {2, 4, 6, 8, 10, 20} and varying risk perception diversity factors δ ∈ {0.1, 0.2, 0.3, 0.4, 0.5}. We sample qi,0(A) from a normal distribution N(µ = 10, σ = 1). The forgetting parameter is set to ϕ = 0.001. In all sections, the evaluation proceeds by allowing the agents to train for a total of 2.5 × 10^5 learning steps, while imposing a minimum number of K = 3 × 10^4 learning steps for every agent. The values reported in the three criteria correspond to the values observed at the end of the training period, averaged over 5 independent runs. |
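Since no code is released, the quoted setup (Z = 200, b = 1, c = 0.1, p = 0.7, ϕ = 0.001, propensities drawn from N(10, 1)) can be sketched as a minimal Roth-Erev simulation of one threshold public goods round. This is not the authors' implementation: the function name `play_round`, the payoff bookkeeping, and the uniform group sampling (the paper also describes an assortment-biased variant in Algorithm 2) are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameters quoted from the paper's setup (Z, b, c, p, phi);
# N and r are one choice from the ranges the paper sweeps over.
Z, b, c, p, phi = 200, 1.0, 0.1, 0.7, 0.001
N, r = 4, 0.5
t = N * c * b / 2  # threshold target, reached if M = N/2 agents contribute

# Roth-Erev propensities for two actions: column 0 = defect, 1 = contribute
q = rng.normal(10.0, 1.0, size=(Z, 2))

def play_round(q):
    """One collective-risk round with asynchronous Roth-Erev updates."""
    group = rng.choice(Z, size=N, replace=False)
    # Choice probabilities proportional to propensities
    probs = q[group] / q[group].sum(axis=1, keepdims=True)
    acts = (rng.random(N) < probs[:, 1]).astype(int)
    success = acts.sum() * c * b >= t
    for i, a in zip(group, acts):
        payoff = b - a * c * b           # contributors pay c*b
        if not success and rng.random() < r:
            payoff *= (1 - p)            # risk event: lose fraction p
        q[i] *= (1 - phi)                # forgetting applied to both actions
        q[i, a] += payoff                # reinforce the chosen action
    return success
```

Run over many rounds, the fraction of rounds where `success` is True would correspond to the paper's target-achievement criterion; the ranges for r, N, and δ in the table would be swept in outer loops.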