Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Remembering to Be Fair Again: Reproducing Non-Markovian Fairness in Sequential Decision Making
Authors: Domonkos Nagy, Lohithsai Yadala Chanchu, Krystof Bobek, Xin Zhou, Jacobus Smit
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We reproduce and extend their findings by validating their claims and introducing novel enhancements. We confirm that Fair QCM outperforms standard baselines in fairness enforcement and sample efficiency across different environments. |
| Researcher Affiliation | Academia | Domonkos Nagy EMAIL Informatics Institute University of Amsterdam Lohithsai Yadala Chanchu EMAIL Informatics Institute University of Amsterdam Kryštof Bobek EMAIL Informatics Institute University of Amsterdam Xin Zhou EMAIL Informatics Institute University of Amsterdam Martin Smit EMAIL Informatics Institute University of Amsterdam |
| Pseudocode | No | The paper describes methodologies and algorithms (Fair QCM, Fair SCM) but does not provide specific pseudocode blocks or algorithms formatted as figures or distinct sections. |
| Open Source Code | Yes | The original code, modified to be 70% more efficient, and our extensions are available on GitHub: https://github.com/bozo22/remembering-to-be-fair-again. |
| Open Datasets | No | The paper uses the 'Resource Allocation (Donut)' and 'Simulated Lending' environments, citing previous work (Katoh and Ibaraki, 1998; Liu et al., 2018) for their definitions. It also 'created a COVID vaccine allocation gym environment'. These refer to problem formulations or simulation environments rather than publicly accessible datasets in the form of raw data files. |
| Dataset Splits | No | The paper describes reinforcement learning environments and experimental parameters (e.g., 'each consisting of 500 episodes of 100 time steps' for resource allocation, 'Episode Length 24' for COVID-19 simulation), but it does not specify traditional training/test/validation dataset splits, which are not typically applicable to dynamically generated data in RL. |
| Hardware Specification | Yes | We ran experiments using an AMD Ryzen 2600 CPU and a Nvidia RTX 3060 GPU. |
| Software Dependencies | No | The paper mentions using 'Stable-Baselines3 (Raffin et al. (2021))' for the SAC agent's implementation but does not specify a version number for this or any other software component used in the experiments. |
| Experiment Setup | Yes | Table 2: COVID-19 Simulation Hyperparameters; Table 3: Resource Allocation Hyperparameters. These tables list specific hyperparameters such as Episode Length, Learning Rate, Discount Factor (γ), Replay Buffer Size, Batch Size, Soft Update Coefficient (τ), Entropy Regularization Coefficient, and Min Exploration Rate (ϵ). |
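For context on one of the hyperparameters named above: a soft update coefficient τ conventionally governs Polyak averaging of target-network parameters in off-policy RL (as in SAC). The sketch below is illustrative only; the parameter values are hypothetical and not taken from the paper's tables.

```python
# Minimal sketch of a Polyak (soft) target-network update, the mechanism
# typically controlled by the soft update coefficient tau.
# All numeric values here are hypothetical examples, not the paper's settings.

def soft_update(target_params, online_params, tau):
    """Blend online parameters into the target: theta_target <- tau*theta + (1-tau)*theta_target."""
    return [tau * p + (1.0 - tau) * tp for p, tp in zip(online_params, target_params)]

target = [0.0, 1.0]   # current target-network parameters (toy values)
online = [1.0, 3.0]   # current online-network parameters (toy values)
updated = soft_update(target, online, tau=0.5)
print(updated)  # [0.5, 2.0]
```

With a small τ (e.g. 0.005, a common default in Stable-Baselines3's SAC), the target network tracks the online network slowly, which stabilizes bootstrapped value targets.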