Operationalising Rawlsian Ethics for Fairness in Norm Learning Agents
Authors: Jessica Woodgate, Paul Marshall, Nirav Ajmeri
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate RAWL·E agents in simulated harvesting scenarios. We find that norms emerging in RAWL·E agent societies enhance social welfare, fairness, and robustness, and yield higher minimum experience compared to those that emerge in agent societies that do not implement Rawlsian ethics. |
| Researcher Affiliation | Academia | Jessica Woodgate, Paul Marshall, Nirav Ajmeri School of Computer Science, University of Bristol, Bristol BS8 1UB, UK EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Ethics module. Input: U_t, U_t+1; Output: F_t+1. Algorithm 2: Norms module. Input: ν_t, a_t. Algorithm 3: Interaction module. Input: s_t |
| Open Source Code | Yes | Reproducibility Our codebase is publicly available (Woodgate, Marshall, and Ajmeri 2024a). The full version of this paper (Woodgate, Marshall, and Ajmeri 2024b) provides additional details, including computing infrastructure, parameter selection, a complete list of environmental rewards, further descriptions of metrics, a complete set of emerged norms, and additional details on simulation results. [...] Woodgate, J.; Marshall, P.; and Ajmeri, N. 2024a. Codebase for Operationalising Rawlsian Ethics for Fairness in Norm Learning Agents. https://doi.org/10.5281/zenodo.14520386. |
| Open Datasets | No | The paper describes a 'simulated harvesting scenario' but does not mention using any external publicly available datasets. The environment is self-contained and simulated. |
| Dataset Splits | No | The paper describes running simulations for 'e = 2000 times' for 'tmax = 50 steps' which refers to simulation episodes and steps, not dataset splits. It does not provide specific training/test/validation dataset split information. |
| Hardware Specification | No | The paper mentions 'computing infrastructure' in the reproducibility section but defers specific details to the full version or codebase. No specific hardware details (e.g., GPU/CPU models) are provided in the main text. |
| Software Dependencies | No | The paper mentions implementing 'RL with deep Q network (DQN) architecture', but does not specify versions for any ancillary software such as programming languages, libraries, or frameworks (e.g., Python version, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | At the beginning of each episode, the grid is initialised with k = 4 agents and b_initial = 12 berries at random locations. An agent begins with h_initial = 5.0 health. Agents may collect berries, throw berries to other agents, or eat berries. An agent receives a gain in health h_gain = 0.1 when it eats a berry. Agent health decays by h_decay = 0.01 at every time step. [...] For testing, we run each simulation e = 2000 times, with each simulation running until all agents have died, or a maximum of t_max = 50 steps. [...] Table 1 lists the norm parameters, e.g., t_clip behaviours (clip behaviour base frequency) = 10.0 and t_clip norms (clip norm base frequency) = 5.0. |
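The pseudocode row cites an ethics module (Algorithm 1) that maps utilities U_t, U_t+1 to an ethics signal F_t+1. The sketch below illustrates the Rawlsian maximin idea behind such a module: prefer transitions that do not worsen the worst-off agent's utility. The function name, list-based utility representation, and boolean return convention are illustrative assumptions, not the authors' implementation.

```python
def maximin_preference(u_t, u_t1):
    """Approve a transition only if the worst-off agent's utility does not
    decrease (a simplified reading of Rawls's difference principle).
    u_t and u_t1 are the per-agent utilities before and after an action."""
    return min(u_t1) >= min(u_t)

# The worst-off agent improves from 1.0 to 1.5, so the check passes.
print(maximin_preference([1.0, 2.0], [1.5, 1.8]))  # → True
```

In the paper's setting this signal would feed back into norm learning; here it is shown standalone to make the maximin criterion concrete.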
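The experiment-setup row states the harvesting parameters (k = 4 agents, b_initial = 12 berries, h_initial = 5.0, h_gain = 0.1 per berry eaten, h_decay = 0.01 per step, episodes capped at t_max = 50). The sketch below wires those quoted values into a minimal episode loop; the eating policy, grid abstraction, and function shape are assumptions for illustration, not the authors' environment code.

```python
import random

# Parameter values quoted from the paper's experiment setup.
K_AGENTS = 4      # k: agents per episode
B_INITIAL = 12    # b_initial: berries placed at random locations
H_INITIAL = 5.0   # h_initial: starting health
H_GAIN = 0.1      # h_gain: health gained when an agent eats a berry
H_DECAY = 0.01    # h_decay: health lost at every time step
T_MAX = 50        # t_max: maximum steps per episode

def run_episode(seed=0):
    """Run one simplified episode with a hypothetical coin-flip eating policy."""
    rng = random.Random(seed)
    health = [H_INITIAL] * K_AGENTS
    berries = B_INITIAL
    for _ in range(T_MAX):
        for i in range(K_AGENTS):
            if health[i] <= 0:
                continue          # dead agents take no actions
            health[i] -= H_DECAY  # health decays every time step
            if berries > 0 and rng.random() < 0.5:
                berries -= 1      # eat one of the remaining berries
                health[i] += H_GAIN
        if all(h <= 0 for h in health):
            break                 # episode ends when all agents have died
    return health

final = run_episode()
# Minimum experience across agents is the Rawlsian quantity the paper reports.
print(min(final))
```

With these values no agent can die within 50 steps (maximum decay 0.5 against h_initial = 5.0), which matches the setup's role as a testbed for fairness of berry distribution rather than bare survival.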