Operationalising Rawlsian Ethics for Fairness in Norm Learning Agents

Authors: Jessica Woodgate, Paul Marshall, Nirav Ajmeri

AAAI 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We evaluate RAWL-E agents in simulated harvesting scenarios. We find that norms emerging in RAWL-E agent societies enhance social welfare, fairness, and robustness, and yield higher minimum experience than those that emerge in agent societies that do not implement Rawlsian ethics.
Researcher Affiliation | Academia | Jessica Woodgate, Paul Marshall, Nirav Ajmeri; School of Computer Science, University of Bristol, Bristol BS8 1UB, UK. EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1 (Ethics module): input Ut, Ut+1; output Ft+1. Algorithm 2 (Norms module): input νt, at. Algorithm 3 (Interaction module): input st.
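The three-module structure listed above can be sketched as class interfaces. This is a hedged reading, not the authors' code: the class and method names are assumptions matching the listed inputs and outputs, and the ethics computation shown (a maximin comparison, in the spirit of the "higher minimum experience" result) is only one plausible Rawlsian interpretation of Algorithm 1.

```python
from typing import List, Sequence


class EthicsModule:
    """Algorithm 1: maps utilities U_t, U_{t+1} to a fairness signal F_{t+1}."""

    def evaluate(self, u_t: Sequence[float], u_t1: Sequence[float]) -> float:
        # Assumption: reward improvement in the worst-off agent's utility,
        # following Rawls's difference principle (maximin).
        return min(u_t1) - min(u_t)


class NormsModule:
    """Algorithm 2: updates the norm store nu_t given the taken action a_t."""

    def update(self, nu_t: List[str], a_t: str) -> List[str]:
        nu_t.append(a_t)  # placeholder for the paper's norm-emergence logic
        return nu_t


class InteractionModule:
    """Algorithm 3: selects an action for state s_t (policy left abstract)."""

    def act(self, s_t: object) -> str:
        raise NotImplementedError  # realised as a DQN policy in the paper
```

Keeping the ethics signal separate from the interaction policy mirrors the paper's modular design: the fairness term can be added to the environmental reward without changing the learner itself.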
Open Source Code | Yes | Reproducibility: Our codebase is publicly available (Woodgate, Marshall, and Ajmeri 2024a). The full version of this paper (Woodgate, Marshall, and Ajmeri 2024b) provides additional details including computing infrastructure, parameter selection, a complete list of environmental rewards, further descriptions of metrics, a complete set of emerged norms, and additional details on simulation results. [...] Woodgate, J.; Marshall, P.; and Ajmeri, N. 2024a. Codebase for Operationalising Rawlsian Ethics for Fairness in Norm Learning Agents. https://doi.org/10.5281/zenodo.14520386.
Open Datasets | No | The paper describes a 'simulated harvesting scenario' and does not use any external publicly available datasets; the environment is self-contained and simulated.
Dataset Splits | No | The paper reports running each simulation 'e = 2000 times' for up to 'tmax = 50 steps', which refers to simulation episodes and steps, not dataset splits. No training/validation/test split information is provided.
Hardware Specification | No | The paper mentions 'computing infrastructure' in its reproducibility statement but defers the specifics to the full version or codebase. No concrete hardware details (e.g., GPU/CPU models) appear in the main text.
Software Dependencies | No | The paper states that RL is implemented with a deep Q-network (DQN) architecture but does not specify versions for any ancillary software, such as the programming language, libraries, or frameworks (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | At the beginning of each episode, the grid is initialised with k = 4 agents and binitial = 12 berries at random locations. An agent begins with hinitial = 5.0 health. Agents may collect berries, throw berries to other agents, or eat berries. An agent receives a health gain of hgain = 0.1 when it eats a berry, and agent health decays by hdecay = 0.01 at every time step. [...] For testing, we run each simulation e = 2000 times, with each simulation running until all agents have died, or for a maximum of tmax = 50 steps. [...] Table 1 lists the norm parameters: tclip behaviours (clip behaviour base frequency) = 10.0; tclip norms (clip norm base frequency) = 5.0.
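The reported setup parameters and health dynamics can be collected into a minimal sketch. The class and function names below are assumptions for illustration, not the authors' codebase; only the numeric values come from the paper.

```python
from dataclasses import dataclass


@dataclass
class HarvestConfig:
    """Episode parameters as reported in the paper (names are assumptions)."""
    k: int = 4               # agents per episode
    b_initial: int = 12      # berries placed at random grid locations
    h_initial: float = 5.0   # starting health per agent
    h_gain: float = 0.1      # health gained when an agent eats a berry
    h_decay: float = 0.01    # health lost at every time step
    t_max: int = 50          # maximum steps per simulation
    episodes: int = 2000     # test runs per simulation (e)


def step_health(health: float, ate_berry: bool, cfg: HarvestConfig) -> float:
    """One time step of health dynamics: decay always applies, gain if eating."""
    health -= cfg.h_decay
    if ate_berry:
        health += cfg.h_gain
    return health


cfg = HarvestConfig()
h = step_health(cfg.h_initial, ate_berry=True, cfg=cfg)
print(round(h, 2))  # 5.0 - 0.01 + 0.1 = 5.09
```

Note the asymmetry between hgain and hdecay: an agent that eats a berry every ten steps exactly breaks even, so berries are scarce enough that distribution norms matter.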