Improving Generalization with Approximate Factored Value Functions
Authors: Shagun Sodhani, Sergey Levine, Amy Zhang
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically verify the effectiveness of our approach in terms of faster training (better sample complexity) and robust zero-shot transfer (better generalization) on the Procgen benchmark and the MiniGrid environments. |
| Researcher Affiliation | Collaboration | Shagun Sodhani (Meta AI), Sergey Levine (University of California, Berkeley), Amy Zhang (Meta AI / UT Austin) |
| Pseudocode | Yes | Algorithm 1: the AFaR algorithm. |
| Open Source Code | No | The paper does not explicitly state that the source code for AFaR is being released or provide a direct link to a code repository for AFaR. It mentions a project website and lists implementations for the baseline algorithms used (RIDE, DrAC, IDAAC), but not for the proposed AFaR method itself. |
| Open Datasets | Yes | We use the Procgen benchmark (Cobbe et al., 2020) and the MiniGrid environments (Chevalier-Boisvert et al., 2018) to evaluate the effectiveness of the proposed AFaR algorithm. |
| Dataset Splits | Yes | Following the setup in Raileanu et al. (2020), we train the agent on a fixed set of 200 levels while testing on the full distribution of levels. In practice, this is simulated by sampling levels at random during evaluation. ... We also perform an ablation where we train the systems using just 10 levels (instead of 200 levels)... In the first case, we train and evaluate the agents on a given environment. ... In the second case, we train the agent on one environment and evaluate it on a different environment in a zero-shot manner. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models. It mentions that more details about the experimental setup can be found in Appendix E, but Appendix E does not contain hardware specifications. |
| Software Dependencies | No | The paper lists several open-source libraries in Appendix C.1 (PyTorch, Hydra, NumPy, pandas, and the RIDE, DrAC, and IDAAC implementations) but does not specify version numbers for these dependencies, which is required for a reproducible description. |
| Experiment Setup | No | The main text of the paper states: "More details about our experimental setup and hyperparameters can be found in Appendix E." Specific hyperparameter values and training configurations, such as learning rates or batch sizes, are not provided in the main body of the paper and are instead relegated to tables in the appendices. |
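The dataset-split protocol quoted above (train on a fixed set of 200 levels, evaluate on the full distribution by sampling levels at random) can be sketched with a minimal stdlib-only simulation. The constant and helper names below are illustrative assumptions, not taken from the paper's code:

```python
import random

# Fixed training set of level seeds, as in the Raileanu et al. (2020) setup
NUM_TRAIN_LEVELS = 200

def sample_level(training: bool, rng: random.Random) -> int:
    """Return a level seed: from the fixed training set during training,
    or from the (effectively unbounded) full distribution during evaluation."""
    if training:
        return rng.randrange(NUM_TRAIN_LEVELS)  # seeds 0..199 only
    return rng.randrange(2**31 - 1)             # simulates the full level distribution

rng = random.Random(0)
train_seeds = {sample_level(True, rng) for _ in range(1000)}
# Training never leaves the fixed 200-level set
assert train_seeds <= set(range(NUM_TRAIN_LEVELS))
```

In the actual Procgen benchmark this split is expressed through the environment's `num_levels` and `start_level` options rather than by sampling seeds manually; the sketch only illustrates why zero-shot test levels are almost surely unseen during training.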