Polynomial-Time Approximability of Constrained Reinforcement Learning
Authors: Jeremy Mcmahan
ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the computational complexity of approximating general constrained Markov decision processes. Our primary contribution is the design of a polynomial time (0, ̈)-additive bicriteria approximation algorithm for finding optimal constrained policies across a broad class of recursively computable constraints, including almost-sure, chance, expectation, and their anytime variants. Matching lower bounds imply our approximation guarantees are optimal so long as P = NP. The generality of our approach results in answers to several long-standing open complexity questions in the constrained reinforcement learning literature. |
| Researcher Affiliation | Academia | Department of Computer Science, University of Wisconsin Madison, Wisconsin, USA. Correspondence to: Jeremy Mc Mahan <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Reduction Algorithm 2 Augmented Interaction Algorithm 3 Approximate Backward Induction |
| Open Source Code | No | The paper does not contain any explicit statements about code availability, links to repositories, or supplementary materials containing source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and focuses on algorithm design and computational complexity analysis of constrained Markov decision processes. It does not conduct experiments or use any specific datasets, therefore, no information about publicly available datasets is provided. |
| Dataset Splits | No | The paper is theoretical and does not involve experiments with datasets, so there are no dataset splits mentioned. |
| Hardware Specification | No | The paper is theoretical and focuses on algorithm design and analysis. It does not describe any experimental setups or specify hardware used. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithms and complexity analysis. It does not mention any specific software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and describes algorithms and their complexity without detailing any practical experimental setup, hyperparameters, or training configurations. |