Improving Generalization with Approximate Factored Value Functions

Authors: Shagun Sodhani, Sergey Levine, Amy Zhang

TMLR 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We empirically verify the effectiveness of our approach in terms of faster training (better sample complexity) and robust zero-shot transfer (better generalization) on the Procgen benchmark and the MiniGrid environments." |
| Researcher Affiliation | Collaboration | Shagun Sodhani (Meta AI), Sergey Levine (University of California, Berkeley), Amy Zhang (Meta AI, UT Austin) |
| Pseudocode | Yes | "Algorithm 1: AFaR algorithm." |
| Open Source Code | No | The paper does not explicitly state that the source code for AFaR is released, nor does it link to a code repository for AFaR. It mentions a project website and lists implementations for the baseline algorithms (RIDE, DrAC, IDAAC), but not for the proposed AFaR method itself. |
| Open Datasets | Yes | "We use the Procgen benchmark (Cobbe et al., 2020) and the MiniGrid environments (Chevalier-Boisvert et al., 2018) to evaluate the effectiveness of the proposed AFaR algorithm." |
| Dataset Splits | Yes | "Following the setup in Raileanu et al. (2020), we train the agent on a fixed set of 200 levels while testing on the full distribution of levels. In practice, this is simulated by sampling levels at random during evaluation. ... We also perform an ablation where we train the systems using just 10 levels (instead of 200 levels)... In the first case, we train and evaluate the agents on a given environment. ... In the second case, we train the agent on one environment and evaluate it on a different environment in a zero-shot manner." |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU or CPU models. It refers to Appendix E for further details on the experimental setup, but Appendix E does not contain hardware specifications. |
| Software Dependencies | No | The paper lists several open-source libraries in Appendix C.1 (PyTorch, Hydra, NumPy, Pandas, and the RIDE, DrAC, and IDAAC implementations) but does not specify version numbers for these dependencies, which a reproducible description requires. |
| Experiment Setup | No | The main text states: "More details about our experimental setup and hyperparameters can be found in Appendix E." Specific hyperparameter values and training configurations, such as learning rates and batch sizes, are not provided in the main body of the paper; they are instead relegated to tables in the appendices. |