Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Authors: Paria Rashidinejad, Banghua Zhu, Cong Ma, Jiantao Jiao, Stuart Russell
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We do not include any experiments. |
| Researcher Affiliation | Academia | Paria Rashidinejad, Department of EECS, UC Berkeley, Berkeley, CA 94709; Banghua Zhu, Department of EECS, UC Berkeley, Berkeley, CA 94709; Cong Ma, Department of Statistics, University of Chicago, Chicago, IL 60637; Jiantao Jiao, Department of EECS, UC Berkeley, Berkeley, CA 94709; Stuart Russell, Department of EECS, UC Berkeley, Berkeley, CA 94709 |
| Pseudocode | Yes | Algorithm 1 LCB for bandits and contextual bandits; Algorithm 2 Offline value iteration with LCB (VI-LCB) |
| Open Source Code | No | We do not include any experiments. Our work does not use any assets. |
| Open Datasets | No | The paper states 'We do not include any experiments.' and 'Our work does not use any assets.', which indicates that the authors neither used nor released a dataset. |
| Dataset Splits | No | We do not include any experiments. |
| Hardware Specification | No | We do not include any experiments. |
| Software Dependencies | No | We do not include any experiments. |
| Experiment Setup | No | We do not include any experiments. |