reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Formally Verified Approximate Policy Iteration

Authors: Maximilian Schäffeler, Mohammad Abdulaziz

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our evaluation on benchmark problems shows that it is practical. As part of the development, we build verified software to certify linear programming solutions. We discuss the verification process and the modifications we made to the algorithm during formalization. The results of the experiments (Table 1) show that the algorithm can deal with ring networks of half a million states and 20 actions.
Researcher Affiliation	Academia	1Technische Universit at M unchen, Germany 2King s College London, United Kingdom EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1: Approximate Policy Iteration (API) api(t, π, w) := if t tmax or err ϵ or w= then (t, π , w , err, w=) else api(t + 1, π , w ) where w := upd w(π) π := greedy π(w ) err := factored err(π , w ) w= := w = w
Open Source Code	Yes	Code https://github.com/schaeffm/fmdp isabelle
Open Datasets	No	The paper mentions using "the ring and star topologies from (Guestrin et al. 2003)" for evaluation. This refers to problem domains or models described in a referenced paper, not publicly available datasets in the conventional sense (e.g., raw data files). The text does not provide concrete access information (link, DOI, or repository) for any specific dataset.
Dataset Splits	No	The paper evaluates an algorithm on "ring and star topologies" from a benchmark paper. It does not use "datasets" in the typical machine learning sense with explicit training/test/validation splits. Therefore, information about dataset splits is not applicable and not provided.
Hardware Specification	Yes	We run our implementation on an Intel i7-11800H CPU and set the discount factor to 0.9 in all our experiments.
Software Dependencies	No	The paper mentions using 'Isabelle/HOL' for development, and 'QSopt ex' and 'So Plex' as LP solvers. It also states code is exported to Scala. However, no specific version numbers for the utilized software dependencies (e.g., Isabelle/HOL version, So Plex version, Scala version) are provided within the main text that are relevant for replication.
Experiment Setup	Yes	We run our implementation on an Intel i7-11800H CPU and set the discount factor to 0.9 in all our experiments. In all runs, the weights converged after max. 5 iterations.