Stateful Posted Pricing with Vanishing Regret via Dynamic Deterministic Markov Decision Processes
Authors: Yuval Emek, Ron Lavi, Rad Niazadeh, Yangguang Shi
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We then prove that if the Markov decision process is guaranteed to admit an oracle that can simulate any given policy from any initial state with bounded loss (a condition that is satisfied in the DRACC problem), then the online learning problem can be solved with vanishing regret. Our proof technique is based on a reduction to online learning with switching cost, in which an online decision maker incurs an extra cost every time she switches from one arm to another. |
| Researcher Affiliation | Academia | Yuval Emek, Technion - Israel Institute of Technology, Haifa, Israel; Ron Lavi, Technion - Israel Institute of Technology, Haifa, Israel; Rad Niazadeh, University of Chicago Booth School of Business, Chicago, IL, United States; Yangguang Shi, Technion - Israel Institute of Technology, Haifa, Israel |
| Pseudocode | Yes | ALGORITHM 1: Online Dd-MDP algorithm C&S |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not describe experiments using a dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments using a dataset. |
| Hardware Specification | No | The paper is theoretical and does not report on experiments requiring specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not report on experiments requiring specific software dependencies. |
| Experiment Setup | No | The paper is theoretical and does not report on experiments, thus no experimental setup details are provided. |
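
The Research Type entry above mentions a reduction to online learning with switching cost, where the decision maker pays extra each time she changes arms. The following is a minimal sketch of that cost model only, not the paper's Algorithm 1 (C&S). The function names, the full-information loss table, and the blocked follow-the-leader rule are all assumptions made for illustration; the rule is not claimed to achieve vanishing regret.

```python
from typing import List, Sequence


def cost_with_switching(
    choices: Sequence[int],
    losses: Sequence[Sequence[float]],
    switch_cost: float,
) -> float:
    """Total loss of a sequence of arm choices plus a fee for every switch.

    losses[t][a] is the (hypothetical) loss of arm a in round t.
    """
    total = 0.0
    prev = None
    for t, arm in enumerate(choices):
        total += losses[t][arm]
        # Pay the extra cost whenever the arm differs from the previous round.
        if prev is not None and arm != prev:
            total += switch_cost
        prev = arm
    return total


def blocked_follow_the_leader(
    losses: Sequence[Sequence[float]],
    block_len: int,
) -> List[int]:
    """Illustrative rule: re-pick the best arm so far only at block boundaries.

    Committing to one arm per block caps the number of switches at
    (number of blocks - 1). Assumes full information: every arm's loss
    is observed after each round.
    """
    n_arms = len(losses[0])
    cumulative = [0.0] * n_arms
    choices: List[int] = []
    current = 0
    for t, round_losses in enumerate(losses):
        if t % block_len == 0:
            # Ties are broken in favor of the lowest arm index.
            current = min(range(n_arms), key=lambda a: cumulative[a])
        choices.append(current)
        for a in range(n_arms):
            cumulative[a] += round_losses[a]
    return choices


if __name__ == "__main__":
    toy_losses = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    picks = blocked_follow_the_leader(toy_losses, block_len=2)
    print(picks, cost_with_switching(picks, toy_losses, switch_cost=0.5))
```

Longer blocks mean fewer switches but slower reaction to changing losses; that trade-off is what a switching-cost analysis has to balance.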