Counterfactual Strategies for Markov Decision Processes
Authors: Paul Kobialka, Lina Gerlach, Francesco Leofante, Erika Ábrahám, Silvia Lizeth Tapia Tarifa, Einar Broch Johnsen
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on four real-world datasets and demonstrate its practical viability in sophisticated sequential decision-making tasks. |
| Researcher Affiliation | Academia | 1 University of Oslo, Oslo, Norway; 2 RWTH Aachen University, Germany; 3 Imperial College London, United Kingdom |
| Pseudocode | No | The paper describes the optimization problem using mathematical formulations (Constraints (1)-(13)) and prose, but does not include any clearly labeled pseudocode or algorithm blocks. The steps are described in regular paragraph text without structured formatting that would constitute pseudocode. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code, a direct link to a code repository, or mention of code in supplementary materials for the methodology described. It references an extended version on arXiv, but this is not a code release statement. |
| Open Datasets | Yes | In our experiments, we consider four real-world datasets. GrepS records customer interaction with a programming skill evaluation service [Kobialka et al., 2022]. BPIC12 [van Dongen, 2012] and BPIC17 [van Dongen, 2017], which record the loan application procedure in a bank, stem from the Business Process Intelligence Challenge of the IEEE Task Force on Process Mining. MSSD is the Music Streaming Sessions Dataset [Brost et al., 2019] from Spotify; we consider the small version of MSSD, with 10 000 listening sessions. |
| Dataset Splits | No | The paper mentions subsets of the MSSD dataset (e.g., MSSD10, MSSD40 representing 10% and 40% of the dataset) but does not provide specific training/test/validation dataset splits, exact percentages, sample counts, or explicit splitting methodology for reproducing experiments. It only describes how models were constructed based on data volume. |
| Hardware Specification | No | The paper describes the experimental setup and evaluation of the method's performance but does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as programming languages, libraries, or solvers with their exact versions, that would be needed to replicate the experiments. |
| Experiment Setup | Yes | We randomly generate ten initial user strategies for each model and let the target probability range over {0.0001} ∪ {0.1, 0.2, ..., 1}, where 0.0001 represents near-perfect performance. In this work, we use r0 = r1 = r = 1 for the distance weights and a diversity weight of 2, to weight each distance component equally and to weight diversity higher than distances. |
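
The experiment setup quoted in the last row can be read as a small parameter grid. The following is a minimal sketch of that grid, not the authors' code: the names `target_probabilities`, `distance_weights`, `diversity_weight`, and `experiment_grid` are hypothetical, and the model list is a placeholder built from the four dataset names (the paper's actual set of models, for example the MSSD subsets, may differ).

```python
from itertools import product

# Target probabilities as quoted: {0.0001} ∪ {0.1, 0.2, ..., 1}.
# round() avoids floating-point artifacts such as 0.30000000000000004.
target_probabilities = [0.0001] + [round(0.1 * k, 1) for k in range(1, 11)]

# Hypothetical names for the quoted weights: three equal distance
# weights and a larger weight on diversity.
distance_weights = (1, 1, 1)
diversity_weight = 2

# Ten randomly generated initial user strategies per model (quoted above);
# here they are only represented by their index.
NUM_INITIAL_STRATEGIES = 10


def experiment_grid(models):
    """Yield one (model, strategy_index, target_probability) triple per run."""
    yield from product(models, range(NUM_INITIAL_STRATEGIES), target_probabilities)


# Placeholder model names (assumption): one model per dataset.
placeholder_models = ["GrepS", "BPIC12", "BPIC17", "MSSD"]
runs = list(experiment_grid(placeholder_models))
```

Each triple in `runs` would correspond to one call to the counterfactual search, with `distance_weights` and `diversity_weight` held fixed across all runs.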