Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Meta-Learning with Implicit Gradients

Authors: Aravind Rajeswaran, Chelsea Finn, Sham M. Kakade, Sergey Levine

NeurIPS 2019 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimentally, we show that these benefits of implicit MAML translate into empirical gains on few-shot image recognition benchmarks. In our experimental evaluation, we aim to answer the following questions empirically: (1) Does the iMAML algorithm asymptotically compute the exact meta-gradient? (2) With finite iterations, does iMAML approximate the meta-gradient more accurately compared to MAML? (3) How does the computation and memory requirements of iMAML compare with MAML? (4) Does iMAML lead to better results in realistic meta-learning problems? We have answered (1)-(3) through our theoretical analysis, and now attempt to validate it through numerical simulations. For (1) and (2), we will use a simple synthetic example for which we can compute the exact meta-gradient and compare against it (exact-solve error, see definition 3). For (3) and (4), we will use the common few-shot image recognition domains of Omniglot and MiniImageNet.
Researcher Affiliation Academia Aravind Rajeswaran (University of Washington, EMAIL), Chelsea Finn (University of California, Berkeley, EMAIL), Sham M. Kakade (University of Washington, EMAIL), Sergey Levine (University of California, Berkeley, EMAIL)
Pseudocode Yes Algorithm 1 Implicit Model-Agnostic Meta-Learning (iMAML) and Algorithm 2 Implicit Meta-Gradient Computation
Open Source Code Yes Project page: http://sites.google.com/view/imaml (This project page links to https://github.com/rllab/imaml_code, which hosts the iMAML code.)
Open Datasets Yes To study (3), we turn to the Omniglot dataset [30] which is a popular few-shot image recognition domain. ... Finally, we study empirical performance of iMAML on the Omniglot and MiniImageNet domains.
Dataset Splits No The paper describes how tasks are sampled from a distribution P(T) and how each task has D_i^tr and D_i^test sets. However, it does not specify a global train/validation/test split for the overall Omniglot or MiniImageNet datasets, only the structure within individual tasks for meta-learning.
Hardware Specification Yes On the other hand, memory for MAML grows linearly in grad steps, reaching the capacity of a 12 GB GPU in approximately 16 steps.
Software Dependencies No The paper mentions "implemented iMAML in PyTorch" but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup Yes iMAML with gradient descent (GD) uses 16 and 25 steps for 5-way and 20-way tasks respectively. iMAML with Hessian-free uses 5 CG steps to compute the search direction and performs line-search to pick step size. Both versions of iMAML use λ = 2.0 for regularization, and 5 CG steps to compute the task meta-gradient. We used λ = 0.5 and 10 gradient steps in the inner loop. ... 5 CG steps were used to compute the meta-gradient. The Hessian-free version also uses 5 CG steps for the search direction.
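To make the quoted setup concrete, the sketch below illustrates the kind of computation the paper's Algorithm 2 describes: the iMAML meta-gradient (I + H/λ)⁻¹ ∇L_test(φ*) obtained with a few conjugate-gradient steps that touch the inner-loss Hessian H only through Hessian-vector products. This is not the authors' code; it is a toy quadratic example (our own construction, with a made-up inner loss L_tr(φ) = ½φᵀAφ − bᵀφ and test loss L_test(φ) = ½‖φ − c‖²) chosen so the exact meta-gradient has a closed form to compare against; λ = 2.0 and 5 CG steps follow the paper's reported settings.

```python
import numpy as np

rng = np.random.default_rng(0)
d, lam = 5, 2.0                       # lambda = 2.0 as in the paper

# Toy inner loss L_tr(phi) = 0.5 phi^T A phi - b^T phi (A is SPD),
# toy test loss L_test(phi) = 0.5 ||phi - c||^2. Both are assumptions
# made purely so the exact meta-gradient is available in closed form.
M = rng.standard_normal((d, d))
A = M @ M.T + np.eye(d)
b = rng.standard_normal(d)
c = rng.standard_normal(d)
theta = rng.standard_normal(d)        # meta-parameters

# Inner solve: phi* = argmin_phi L_tr(phi) + (lam/2)||phi - theta||^2.
# For this quadratic it is exact: (A + lam I) phi* = b + lam theta.
phi_star = np.linalg.solve(A + lam * np.eye(d), b + lam * theta)

def hvp(v):
    """Hessian-vector product of L_tr at phi* (here simply A @ v)."""
    return A @ v

def cg(matvec, g, steps=5):
    """Plain conjugate gradient for matvec(x) = g, 5 steps as reported."""
    x = np.zeros_like(g)
    r = g - matvec(x)
    p = r.copy()
    for _ in range(steps):
        Ap = matvec(p)
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return x

# Implicit meta-gradient: solve (I + H/lam) x = grad L_test(phi*).
g_test = phi_star - c
meta_grad = cg(lambda v: v + hvp(v) / lam, g_test)

# Closed-form check for this toy problem:
# dL_test/dtheta = lam (A + lam I)^{-1} (phi* - c).
exact = lam * np.linalg.solve(A + lam * np.eye(d), g_test)
print(np.allclose(meta_grad, exact, atol=1e-6))
```

Note that the CG solver never materializes the Hessian or the inner optimization path, which is the source of the memory advantage over MAML quoted in the Hardware Specification row.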