Provable Generalization of Overparameterized Meta-learning Trained with SGD

Authors: Yu Huang, Yingbin Liang, Longbo Huang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical findings are further validated by experiments. Figures 1, 2, and 3 provide the experimental results.
Researcher Affiliation | Academia | Yu Huang (IIIS, Tsinghua University); Yingbin Liang (Department of ECE, The Ohio State University); Longbo Huang (IIIS, Tsinghua University).
Pseudocode | Yes | Algorithm 1: MAML with SGD (a minimal sketch of this kind of procedure follows the table).
Open Source Code | No | The paper's self-assessment indicates that code is provided in the supplemental material, but the main text contains no specific statement or URL for open-source code.
Open Datasets | No | The paper uses a 'mixed linear regression model' and defines the data distributions directly (e.g., x ∈ R^d is zero-mean with covariance operator Σ = E[xx^⊤]). While it describes this generative model, it does not reference or provide access information for any named public dataset (a data-generation sketch follows the table).
Dataset Splits | Yes | Suppose that D_t is randomly split into training and validation sets, denoted respectively as D_t^in = (X_t^in, y_t^in) and D_t^out = (X_t^out, y_t^out), correspondingly containing n1 and n2 samples (i.e., N = n1 + n2).
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | Figure 1 caption: d = 500, T = 300, λ_i = 1/(i·log²(i+1)), β_tr = 0.02, β_te = 0.2. Figure 2 caption: d = 200, T = 100, Σ_θ = (0.8²/d)·I, β_te = 0.2. Proposition 4: let s = T·log^p(T) and d = T·log^q(T), where p, q > 0; suppose P_x is Gaussian and the spectrum of Σ satisfies λ_k = 1/s for k ≤ s and λ_k = 1/(d − s) for s + 1 ≤ k ≤ d; suppose the spectral parameter ν_i of Σ_θ is O(1), and let the step size α = 1/(2·c(β_tr, Σ)·tr(Σ)) (a configuration sketch for these spectra follows the table).
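
The Pseudocode row above points to the paper's Algorithm 1, MAML with SGD. The following is a minimal NumPy sketch of that kind of procedure, not the authors' implementation: one-step MAML on linear regression tasks under squared loss, with inner step size alpha and outer SGD step size beta (mirroring the paper's α and β_tr notation); the function name and interface are hypothetical.

```python
import numpy as np

def maml_sgd(tasks, d, alpha=0.2, beta=0.02, epochs=1, seed=0):
    """One-step MAML trained with SGD on linear regression tasks (sketch).

    Each task is a tuple (X_in, y_in, X_out, y_out): the inner (training)
    split is used for one adaptation step, the outer (validation) split
    for the meta-update. Hypothetical interface, not the paper's code.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(d)  # meta-parameter initialization
    for _ in range(epochs):
        for t in rng.permutation(len(tasks)):  # SGD over tasks
            X_in, y_in, X_out, y_out = tasks[t]
            # Inner step: one gradient step on the task's training split
            grad_in = X_in.T @ (X_in @ w - y_in) / len(y_in)
            w_t = w - alpha * grad_in
            # Outer step: SGD on the validation loss of the adapted w_t,
            # differentiating through the inner step:
            # d(w_t)/d(w) = I - alpha * H_in
            H_in = X_in.T @ X_in / len(y_in)
            grad_out = X_out.T @ (X_out @ w_t - y_out) / len(y_out)
            w = w - beta * (np.eye(d) - alpha * H_in) @ grad_out
    return w
```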
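
The Open Datasets and Dataset Splits rows indicate that data are generated from a mixed linear regression model rather than loaded from a public dataset. Below is a sketch of sampling one task under the quoted assumptions: x is zero-mean with covariance Σ (Gaussian here for concreteness), and the N = n1 + n2 samples are split into D_t^in and D_t^out. The Gaussian task prior and the noise level are illustrative choices, not values from the paper.

```python
import numpy as np

def sample_task(Sigma, Sigma_theta, n1, n2, noise_std=0.1, rng=None):
    """Sample one mixed-linear-regression task and split it (sketch).

    x is zero-mean with covariance Sigma, the task parameter theta_t has
    covariance Sigma_theta, and y = x^T theta_t + noise. The N = n1 + n2
    samples are randomly split into D_t^in (n1 samples) and D_t^out
    (n2 samples). noise_std is illustrative, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = Sigma.shape[0]
    theta_t = rng.multivariate_normal(np.zeros(d), Sigma_theta)
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n1 + n2)
    y = X @ theta_t + noise_std * rng.standard_normal(n1 + n2)
    perm = rng.permutation(n1 + n2)  # random split, as in the paper
    return X[perm[:n1]], y[perm[:n1]], X[perm[n1:]], y[perm[n1:]]
```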
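
The Experiment Setup row lists two covariance spectra. The sketch below assembles them as diagonal matrices: the polynomially decaying spectrum from the Figure 1 caption and the two-level spectrum from Proposition 4. The values p = 1 and q = 2 are illustrative (the paper only requires p, q > 0), and the constant c(β_tr, Σ) in the step-size rule α = 1/(2·c(β_tr, Σ)·tr(Σ)) is paper-specific and not reproduced here.

```python
import numpy as np

def figure1_spectrum(d=500):
    """Figure 1 spectrum: lambda_i = 1 / (i * log^2(i + 1))."""
    i = np.arange(1, d + 1)
    return 1.0 / (i * np.log(i + 1) ** 2)

def proposition4_spectrum(T, p=1.0, q=2.0):
    """Proposition 4 spectrum: s = T log^p(T), d = T log^q(T),
    lambda_k = 1/s for k <= s and 1/(d - s) for s + 1 <= k <= d.
    p = 1, q = 2 are illustrative; the paper only requires p, q > 0."""
    s = int(T * np.log(T) ** p)
    d = int(T * np.log(T) ** q)
    lam = np.empty(d)
    lam[:s] = 1.0 / s
    lam[s:] = 1.0 / (d - s)
    return lam

# Diagonal covariances matching the quoted settings
Sigma_fig1 = np.diag(figure1_spectrum(d=500))      # Figure 1: d = 500, T = 300
Sigma_theta_fig2 = (0.8 ** 2 / 200) * np.eye(200)  # Figure 2: Sigma_theta = (0.8^2/d) I, d = 200
```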