Quantum Algorithms for Finite-horizon Markov Decision Processes

Authors: Bin Luo, Yuwen Huang, Jonathan Allcock, Xiaojun Lin, Shengyu Zhang, John C.S. Lui

ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical In this work, we design quantum algorithms that are more efficient than classical algorithms to solve time-dependent and finite-horizon Markov Decision Processes (MDPs) in two distinct settings: (1) In the exact dynamics setting, where the agent has full knowledge of the environment s dynamics (i.e., transition probabilities), we prove that our Quantum Value Iteration (QVI) algorithm QVI-1 achieves a quadratic speedup... (2) In the generative model setting... we prove that our algorithms QVI-3 and QVI-4 achieve improvements in sample complexity... More importantly, we prove quantum lower bounds to show that QVI-3 and QVI-4 are asymptotically optimal...
Researcher Affiliation Collaboration 1The Chinese University of Hong Kong, Hong Kong, China 2Tencent Quantum Laboratory, Hong Kong, China. Correspondence to: Yuwen Huang <EMAIL>.
Pseudocode Yes Algorithm 1 Quantum Value Iteration QVI-1(M, δ) ... Algorithm 2 Quantum Value Iteration QVI-2(M, ϵ, δ) ... Algorithm 3 Quantum Mean Estimation with Binary Oracles QMEBOδ(p Tf, Bp, Bf, ϵ) ... Algorithm 4 Quantum Value Iteration QVI-3(M, ϵ, δ) ... Algorithm 5 Quantum Value Iteration QVI-4(M, ϵ, δ) ... Algorithm 6 Value Iteration (Backward Induction) Algorithm for Finite Horizon MDPs
Open Source Code No The paper does not contain any explicit statement about providing open-source code for the methodology described, nor does it provide any links to a code repository.
Open Datasets No The paper discusses theoretical models (Markov Decision Processes, generative models) for which algorithms are designed and analyzed. It does not utilize or mention specific datasets for empirical evaluation, hence no information on open datasets is provided.
Dataset Splits No The paper is theoretical and focuses on algorithm design and complexity analysis for Markov Decision Processes. It does not involve empirical evaluation on datasets, and therefore, no dataset splits are discussed or specified.
Hardware Specification No The paper is theoretical and focuses on quantum algorithms and their computational complexities. It does not describe any experiments or simulations that would require specific hardware, thus no hardware specifications are mentioned.
Software Dependencies No The paper is theoretical, presenting quantum algorithms and their complexity analysis. It does not mention specific software or libraries with version numbers that would be required to implement or reproduce the work.
Experiment Setup No The paper is theoretical and focuses on the design and analysis of quantum algorithms for MDPs. It does not include any empirical experiments, and therefore, no experimental setup details like hyperparameters or training configurations are provided.