Riemannian Geometric-based Meta Learning

Authors: JuneYoung Park, YuMi Lee, Tae-Joon Kim, Jang-Hwan Choi

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on benchmark datasets including Omniglot, Mini-ImageNet, FC-100, and CUB demonstrate that Stiefel-MAML consistently outperforms traditional MAML, achieving superior performance across various few-shot learning tasks. Our findings highlight the potential of Riemannian geometry to enhance meta-learning, paving the way for future research on optimizing over different geometric structures.
Researcher Affiliation | Collaboration | JuneYoung Park (1,2,*), YuMi Lee (3,*), Tae-Joon Kim (2), Jang-Hwan Choi (3). Affiliations: 1 Opt-AI Inc.; 2 Ajou University School of Medicine; 3 Ewha Womans University. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | Yes |
Algorithm 1: Stiefel-MAML Algorithm
Require: Task distribution p(T), learning rates α, β, number of inner-loop steps K
Ensure: Model parameters θ
 1: Initialize θ on the Stiefel manifold St(n, p)
 2: for each iteration do
 3:   Sample a batch of tasks {T_i} ~ p(T)
 4:   for each task T_i do
 5:     Initialize task-specific parameters θ_i = θ
 6:     for k = 1 to K do
 7:       (Inner loop: Riemannian gradient descent on St(n, p))
 8:       Compute loss L_{T_i}(θ_i)
 9:       Compute Riemannian gradient grad L_{T_i}(θ_i)
10:       Update parameters: θ_i ← R_{θ_i}(−α grad L_{T_i}(θ_i))
11:     end for
12:   end for
13:   Compute meta-gradient: ∇_θ Σ_{T_i} L_{T_i}(R_{θ_i}(−α grad L_{T_i}(θ_i)))
14:   Update meta-parameters: θ ← R_θ(−β ∇_θ Σ_{T_i} L_{T_i}(R_{θ_i}(−α grad L_{T_i}(θ_i))))
15: end for
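The two manifold operations the pseudocode relies on, the Riemannian gradient and the retraction R, can be sketched as follows. This is a minimal NumPy illustration, assuming a QR-based retraction and the standard projection of the Euclidean gradient onto the tangent space of St(n, p); the paper's actual implementation may use different choices.

```python
import numpy as np

def project_tangent(X, G):
    # Project Euclidean gradient G onto the tangent space of St(n, p) at X:
    # grad f(X) = G - X * sym(X^T G), where sym(A) = (A + A^T) / 2
    XtG = X.T @ G
    return G - X @ (0.5 * (XtG + XtG.T))

def qr_retraction(X, V):
    # Retract X + V back onto the Stiefel manifold via a thin QR decomposition
    Q, R = np.linalg.qr(X + V)
    # Fix column signs (diagonal of R positive) so the retraction is continuous
    return Q * np.sign(np.diag(R))

def riemannian_step(X, G, alpha):
    # One inner-loop update: theta_i <- R_{theta_i}(-alpha * grad L)
    return qr_retraction(X, -alpha * project_tangent(X, G))

# Example: the update keeps the parameters on St(n, p), i.e. X^T X = I
rng = np.random.default_rng(0)
X, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # a point on St(6, 3)
G = rng.standard_normal((6, 3))                    # a Euclidean gradient
X_new = riemannian_step(X, G, 0.01)
print(np.allclose(X_new.T @ X_new, np.eye(3)))     # True
```

The QR retraction is a cheap first-order approximation of the exponential map; exponential or Cayley retractions are common alternatives with the same interface.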
Open Source Code | No | The paper does not explicitly state that the source code for the methodology described in this paper is openly available, nor does it provide a specific repository link. It mentions using the 'torchmeta library (Deleu et al. 2019)', which is a third-party library, not the authors' own implementation code.
Open Datasets | Yes | Experimental results on benchmark datasets including Omniglot, Mini-ImageNet, FC-100, and CUB demonstrate that Stiefel-MAML consistently outperforms traditional MAML, achieving superior performance across various few-shot learning tasks.
Dataset Splits | Yes | We conducted 1-shot and 5-shot learning for 3-way, 5-way, and 10-way tasks, reporting both accuracy and 95% confidence intervals (CI), as detailed in Table 1.
Hardware Specification | Yes | The experiments were conducted using an NVIDIA A6000 (48 GB) GPU.
Software Dependencies | Yes | We implemented our methodology using Python 3.8, PyTorch 1.8.1, and the torchmeta library (Deleu et al. 2019), ensuring a consistent experimental environment.
Experiment Setup | Yes | Following the approach of Finn et al. (2017), we sampled 60,000 episodes for our experiments. We adopted the 4-convolution architecture described by Vinyals et al. (2016) to implement our parameter update method. The learning rates were set to 0.01 for the inner loop and 0.001 for the outer loop. Additionally, we matched the number of gradient steps in the inner loop to those used in the experiments by Finn et al. (2017).
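The reported hyperparameters (inner-loop rate α = 0.01, outer-loop rate β = 0.001, a matched number of inner steps) slot into the standard MAML training loop. Below is a minimal first-order sketch on a hypothetical toy regression problem; the linear model and task sampler are stand-ins for illustration only, not the paper's 4-convolution network, and the first-order meta-gradient is a common simplification of full MAML.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, K = 0.01, 0.001, 1        # inner LR, outer LR, inner steps (values from the paper)

def sample_task(base):
    # Hypothetical task family: linear regression toward base + small task-specific shift
    w_true = base + 0.1 * rng.standard_normal(base.shape)
    X = rng.standard_normal((20, base.size))
    return X, X @ w_true

def grad(w, X, y):
    # Gradient of mean squared error for the linear model y ~ X w
    return X.T @ (X @ w - y) / len(y)

base = np.ones(5)                      # shared structure the meta-learner should recover
theta = np.zeros(5)                    # meta-parameters
for _ in range(2000):                  # outer iterations
    meta_grad = np.zeros_like(theta)
    for _ in range(4):                 # batch of tasks
        X, y = sample_task(base)
        w = theta.copy()
        for _ in range(K):             # inner loop: task-specific adaptation
            w -= alpha * grad(w, X, y)
        meta_grad += grad(w, X, y)     # first-order approximation of the meta-gradient
    theta -= beta * meta_grad / 4      # outer (meta) update
```

In Stiefel-MAML both the inner and outer updates would additionally project the gradient to the tangent space and retract back onto St(n, p), as in Algorithm 1, rather than taking plain Euclidean steps.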