Multiple Mean-Payoff Optimization Under Local Stability Constraints

Authors: David Klaška, Antonín Kučera, Vojtěch Kůr, Vít Musil, Vojtěch Řehák

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this paper, we design and evaluate the first efficient and scalable solution to this problem applicable to Markov decision processes." "Our Contribution. We design the first efficient and scalable strategy synthesis algorithm for optimizing multiple window mean payoffs in a given Markov decision process. Our experiments show that the algorithm can construct high-quality (and sometimes quite sophisticated) strategies for complicated instances of non-trivial size." "In our experiments, we concentrate on the following: (1) Demonstrating the improvement achieved by the dynamic procedure described above, where the baseline is a naive DFS-based procedure. (2) Evaluating the scalability of our algorithm and the quality of the constructed strategies. (3) Analyzing the power of randomization to compensate for insufficient memory."
Researcher Affiliation | Academia | Masaryk University, Brno, Czechia (EMAIL, EMAIL)
Pseudocode | Yes | Algorithm 1: Evaluation via dynamic programming; Algorithm 2: Evaluation via DFS
Open Source Code | Yes | https://gitlab.fi.muni.cz/formela/2025-aaai-mmp
Open Datasets | No | The paper uses synthetically constructed graphs Dℓ for its experiments rather than publicly available datasets.
Dataset Splits | No | The paper describes constructing graphs Dℓ for experiments but does not provide details on how data within these graphs is split for training, validation, or testing, as it is not a typical supervised learning setup.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | "Our implementation uses PYTORCH framework (Paszke et al. 2019) and its automatic differentiation with ADAM optimizer (Kingma and Ba 2015)." While PyTorch and ADAM are mentioned, specific version numbers for these software components are not provided.
Experiment Setup | Yes | "For all scenarios (ℓ, d) and K ∈ {1, . . . , 5}, we invoked WINMPSYNT 100 times, where K memory states are assigned to every vertex. The number of optimization steps is set to 1000."
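The report only names the paper's two evaluation procedures (Algorithm 1: dynamic programming; Algorithm 2: DFS) without reproducing them. The following is a hypothetical Python sketch of the naive direct check, assuming the standard window mean-payoff "good window" condition: a position of a run is good for threshold t and window bound d if some window of at most d steps starting there has average weight at least t. The function name `good_positions` and the flat-list encoding of edge weights are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch (not the authors' implementation).
# w    : list of edge weights along a finite run
# d    : window length bound
# t    : mean-payoff threshold
# A position i is "good" if some window of l <= d steps starting at i
# has average weight >= t, i.e. sum of the l weights >= t * l.

def good_positions(w, d, t):
    """Return the set of positions that open a good window (direct check)."""
    n = len(w)
    good = set()
    for i in range(n):
        total = 0.0
        for l in range(1, min(d, n - i) + 1):
            total += w[i + l - 1]          # running sum of the window
            if total >= t * l:             # average over l steps >= t
                good.add(i)
                break                      # shortest good window suffices
    return good
```

This direct per-position scan corresponds in spirit to the DFS-style baseline; the paper's dynamic-programming variant (Algorithm 1) avoids re-scanning overlapping windows, which is what the reported speedup measures.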
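The experiment-setup row can be read as a restart loop. The sketch below encodes only the numbers stated in the paper's text (100 invocations per configuration, K ∈ {1, . . . , 5} memory states, 1000 optimization steps); the function `winmpsynt` is a placeholder standing in for the authors' WINMPSYNT synthesizer, whose real interface is not described in the report.

```python
# Hypothetical reconstruction of the reported protocol (not the authors' code).
import random

RESTARTS = 100      # independent WINMPSYNT invocations per configuration
OPT_STEPS = 1000    # optimization steps per invocation (per the paper)

def winmpsynt(instance, memory_states, steps, rng):
    """Placeholder for the synthesizer: returns a mock strategy quality."""
    return rng.random()

def run_configuration(instance, k, rng):
    # Keep the best strategy found across all random restarts.
    return max(winmpsynt(instance, k, OPT_STEPS, rng) for _ in range(RESTARTS))

rng = random.Random(0)
# One result per memory bound K in {1, ..., 5}, as in the paper's setup.
results = {k: run_configuration("D_l", k, rng) for k in range(1, 6)}
```

Taking the maximum over restarts reflects the usual practice for gradient-based synthesis with random initialization, which matches the paper's stated use of randomized restarts to compensate for local optima.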