Adaptation Augmented Model-based Policy Optimization
Authors: Jian Shen, Hang Lai, Minghuan Liu, Han Zhao, Yong Yu, Weinan Zhang
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on challenging continuous control tasks show that FAMPO and IAMPO, coupled with our model usage technique, achieve superior performance against baselines, which demonstrates the effectiveness of the proposed methods. Keywords: model-based reinforcement learning, distribution shift, occupancy measure, Integral Probability Metric, importance sampling |
| Researcher Affiliation | Academia | Jian Shen, Hang Lai, Minghuan Liu, Han Zhao, Yong Yu, Weinan Zhang. Affiliations: Department of Computer Science, Shanghai Jiao Tong University; Department of Computer Science, University of Illinois, Urbana-Champaign |
| Pseudocode | Yes | Algorithm 1 FAMPO ... Algorithm 2 IAMPO |
| Open Source Code | No | The paper states: "We implement all our experiments using TensorFlow." However, it does not explicitly state that the code for the described methodology is released, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We evaluate our methods and other baselines on six MuJoCo continuous control tasks from OpenAI Gym (Brockman et al., 2016) |
| Dataset Splits | No | The paper describes dynamic data collection into environment and model buffers (D_env and D_model) and how samples are drawn from them for training. It does not provide fixed dataset splits (e.g., percentages or counts for training, validation, and test sets) in the traditional supervised learning sense. |
| Hardware Specification | No | The paper mentions implementing experiments using TensorFlow and using MuJoCo environments, but it does not specify any particular hardware components (e.g., GPU models, CPU types, or cloud computing specifications) used for conducting the experiments. |
| Software Dependencies | No | The paper states: "We implement all our experiments using TensorFlow." While a software library is mentioned, a specific version number for TensorFlow or any other software dependency is not provided. |
| Experiment Setup | Yes | Other important hyperparameters used in our methods are chosen by grid search and detailed hyperparameter settings can be found in Appendix E. Table 1: Common hyperparameters for FAMPO and IAMPO. Table 2: Distinct hyperparameters for FAMPO. Table 3: Distinct hyperparameters for IAMPO. |