reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Posterior Sampling for Reinforcement Learning on Graphs

Authors: Arnaud Robert, Aldo A. Faisal, Ciara Pike-Burke

TMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We also provide empirical validation of our method's performance gain, first on a maximum flow problem and then on a wind farm optimization problem. To summarize, this paper proposes, analyses and evaluates a novel posterior sampling algorithm specifically designed to exploit the graphical structure present in many real-world problems. We then conclude by empirically demonstrating that by harnessing the DAMDP, our algorithm outperforms traditional posterior sampling for Reinforcement Learning in both a maximum flow problem and a real-world wind farm optimisation task.
Researcher Affiliation	Academia	Arnaud Robert, Department of Computing, Imperial College London; A. Aldo Faisal, Department of Computing, Imperial College London; Ciara Pike-Burke, Department of Mathematics, Imperial College London
Pseudocode	Yes	Algorithm 1 Planning on a DAMDP ... Algorithm 2 Posterior sampling on graph MDPs (PSGRL)
Open Source Code	No	The paper mentions the FLORIS simulator code but does not provide its own implementation code for the methodology described in the paper. The only relevant text is: "The code for the FLORIS simulator is available at the following address: https://github.com/NREL/floris"
Open Datasets	No	The paper describes experiments on a "maximum leaky flow problem" and a "wind farm yield optimisation task" using a simulator (FLORIS). It does not provide concrete access information (link, DOI, repository, or formal citation for a specific dataset) for the data used in these experiments or for the simulator's output.
Dataset Splits	No	The paper describes experiments and shows results like regret curves, but it does not specify any training/test/validation dataset splits or cross-validation setup for the data used in the experiments. It mentions running experiments with "ten different seeds" for statistical robustness, which is not dataset splitting.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU models, CPU types, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper mentions "FLORIS, a wind farm simulation software (Annoni et al., 2018)" and provides a GitHub link for it. However, it does not specify a version number for FLORIS or any other software dependencies, which is required for reproducibility.
Experiment Setup	No	The paper describes problem-specific discretizations for the wind farm task (e.g., "discretize the atomic action Y = {-30°, 0°, 30°}" and "discretize the state and consider all increments of 0.1m/s from 6m/s to 10m/s"). However, it does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs) or system-level training settings for the PSGRL or PSRL algorithms themselves.
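
The state discretization quoted in the Experiment Setup row can be illustrated with a short sketch. It only enumerates the wind speeds from 6 m/s to 10 m/s in increments of 0.1 m/s. The function name `wind_speed_grid` and its parameters are hypothetical and do not come from the paper or from FLORIS.

```python
def wind_speed_grid(low=6.0, high=10.0, step=0.1):
    """Enumerate wind speeds from low to high (inclusive) in fixed increments.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Derive an integer step count first to avoid accumulating float error.
    n_steps = round((high - low) / step)
    # Rounding to one decimal assumes the step is a multiple of 0.1.
    return [round(low + i * step, 1) for i in range(n_steps + 1)]


grid = wind_speed_grid()
print(len(grid), grid[0], grid[-1])  # 41 6.0 10.0
```

Under these assumptions the discretized state takes 41 distinct wind speed values.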