Operationalising Rawlsian Ethics for Fairness in Norm Learning Agents
Authors: Jessica Woodgate, Paul Marshall, Nirav Ajmeri
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate RAWL·E agents in simulated harvesting scenarios. We find that norms emerging in RAWL·E agent societies enhance social welfare, fairness, and robustness, and yield higher minimum experience compared to those that emerge in agent societies that do not implement Rawlsian ethics. |
| Researcher Affiliation | Academia | Jessica Woodgate, Paul Marshall, Nirav Ajmeri School of Computer Science, University of Bristol, Bristol BS8 1UB, UK EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Ethics module. Input: U_t, U_t+1; Output: F_t+1. Algorithm 2: Norms module. Input: ν_t, a_t. Algorithm 3: Interaction module. Input: s_t |
| Open Source Code | Yes | Reproducibility Our codebase is publicly available (Woodgate, Marshall, and Ajmeri 2024a). The full version of this paper (Woodgate, Marshall, and Ajmeri 2024b) provides additional details, including computing infrastructure, parameter selection, a complete list of environmental rewards, further descriptions of metrics, a complete set of emerged norms, and additional details on simulation results. [...] Woodgate, J.; Marshall, P.; and Ajmeri, N. 2024a. Codebase for Operationalising Rawlsian Ethics for Fairness in Norm Learning Agents. https://doi.org/10.5281/zenodo.14520386. |
| Open Datasets | No | The paper describes a 'simulated harvesting scenario' but does not mention using any external publicly available datasets. The environment is self-contained and simulated. |
| Dataset Splits | No | The paper describes running simulations for 'e = 2000 times' for 'tmax = 50 steps' which refers to simulation episodes and steps, not dataset splits. It does not provide specific training/test/validation dataset split information. |
| Hardware Specification | No | The paper mentions 'computing infrastructure' in the reproducibility section but defers specific details to the full version or codebase. No specific hardware details (e.g., GPU/CPU models) are provided in the main text. |
| Software Dependencies | No | The paper mentions implementing 'RL with deep Q network (DQN) architecture', but does not specify versions for any ancillary software such as programming languages, libraries, or frameworks (e.g., Python version, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | At the beginning of each episode, the grid is initialised with k = 4 agents and b_initial = 12 berries at random locations. An agent begins with h_initial = 5.0 health. Agents may collect berries, throw berries to other agents, or eat berries. An agent receives a gain in health h_gain = 0.1 when it eats a berry. Agent health decays by h_decay = 0.01 at every time step. [...] For testing, we run each simulation e = 2000 times, with each simulation running until all agents have died, or a maximum of t_max = 50 steps. [...] Table 1 lists the norm parameters, e.g., t_clip behaviours (clip behaviour base frequency) = 10.0 and t_clip norms (clip norm base frequency) = 5.0. |
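The pseudocode row cites an ethics module (Algorithm 1) that maps utilities U_t, U_t+1 to an ethics signal F_t+1. The sketch below illustrates the Rawlsian maximin idea behind such a module: prefer transitions that do not worsen the worst-off agent's utility. The function name, list-based utility representation, and boolean return convention are illustrative assumptions, not the authors' implementation.

```python
def maximin_preference(u_t, u_t1):
    """Approve a transition only if the worst-off agent's utility does not
    decrease (a simplified reading of Rawls's difference principle).
    u_t and u_t1 are the per-agent utilities before and after an action."""
    return min(u_t1) >= min(u_t)

# The worst-off agent improves from 1.0 to 1.5, so the check passes.
print(maximin_preference([1.0, 2.0], [1.5, 1.8]))  # → True
```

In the paper's setting this signal would feed back into norm learning; here it is shown standalone to make the maximin criterion concrete.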
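The experiment-setup row states the harvesting parameters (k = 4 agents, b_initial = 12 berries, h_initial = 5.0, h_gain = 0.1 per berry eaten, h_decay = 0.01 per step, episodes capped at t_max = 50). The sketch below wires those quoted values into a minimal episode loop; the eating policy, grid abstraction, and function shape are assumptions for illustration, not the authors' environment code.

```python
import random

# Parameter values quoted from the paper's experiment setup.
K_AGENTS = 4      # k: agents per episode
B_INITIAL = 12    # b_initial: berries placed at random locations
H_INITIAL = 5.0   # h_initial: starting health
H_GAIN = 0.1      # h_gain: health gained when an agent eats a berry
H_DECAY = 0.01    # h_decay: health lost at every time step
T_MAX = 50        # t_max: maximum steps per episode

def run_episode(seed=0):
    """Run one simplified episode with a hypothetical coin-flip eating policy."""
    rng = random.Random(seed)
    health = [H_INITIAL] * K_AGENTS
    berries = B_INITIAL
    for _ in range(T_MAX):
        for i in range(K_AGENTS):
            if health[i] <= 0:
                continue          # dead agents take no actions
            health[i] -= H_DECAY  # health decays every time step
            if berries > 0 and rng.random() < 0.5:
                berries -= 1      # eat one of the remaining berries
                health[i] += H_GAIN
        if all(h <= 0 for h in health):
            break                 # episode ends when all agents have died
    return health

final = run_episode()
# Minimum experience across agents is the Rawlsian quantity the paper reports.
print(min(final))
```

With these values no agent can die within 50 steps (maximum decay 0.5 against h_initial = 5.0), which matches the setup's role as a testbed for fairness of berry distribution rather than bare survival.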