Meta-Learning via Classifier(-free) Diffusion Guidance
Authors: Elvis Nava, Seijin Kobayashi, Yifei Yin, Robert K. Katzschmann, Benjamin F. Grewe
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our approaches outperform existing multi-task and meta-learning methods in a series of zero-shot learning experiments on our Meta-VQA dataset. |
| Researcher Affiliation | Academia | Elvis Nava EMAIL ETH AI Center & INI & Soft Robotics Lab, ETH Zurich Seijin Kobayashi EMAIL Dept. of Computer Science, ETH Zurich Yifei Yin EMAIL Dept. of Computer Science, ETH Zurich Robert K. Katzschmann EMAIL Soft Robotics Lab, D-MAVT, ETH Zurich Benjamin F. Grewe EMAIL Institute of Neuroinformatics, University of Zurich & ETH Zurich |
| Pseudocode | Yes | Algorithm 1 HyperCLIP Training; Algorithm 2 Unconditional Multitask Training; Algorithm 3 Unconditional MNet-MAML Training; Algorithm 4 Unconditional HNet-MAML Training; Algorithm 5 Conditional Multitask Training; Algorithm 6 Conditional Multitask FiLM Training; Algorithm 7 Conditional HNet-MAML Training; Algorithm 8 HVAE Training; Algorithm 9 HNet + HyperCLIP Training; Algorithm 10 HVAE + HyperCLIP Training; Algorithm 11 HyperCLIP guidance (Inference time); Algorithm 12 HNet + HyperLDM Training; Algorithm 13 HVAE + HyperLDM Training; Algorithm 14 HyperLDM Inference |
| Open Source Code | Yes | Our code is available at https://github.com/elvisnava/hyperclip. |
| Open Datasets | Yes | We demonstrate the usefulness of our methods on Meta-VQA, our modification of the VQA v2.0 dataset (Goyal et al., 2017) built to reflect the multi-task setting with natural language task descriptors. |
| Dataset Splits | Yes | In the end, our Meta-VQA dataset is composed of 1234 unique tasks (questions), split into 870 training tasks and 373 test tasks, for a total of 104112 image-answer pairs. There are on average 9.13 answer choices per question/task. The average size of the support set is 57.85 examples, while the average size of the query set is 25.9 examples. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments. It mentions the base network model (CLIP-Adapter with ViT-L/14@336px CLIP encoder) but gives no details about GPUs, CPUs, or memory. |
| Software Dependencies | No | The paper mentions several techniques and models, such as the Adam optimizer and CLIP, but does not provide version numbers for any software dependency. For example, it cites Adam (Kingma & Ba, 2017) but not the specific version of the Adam implementation used. |
| Experiment Setup | Yes | Table 3: Hyperparameters used for the baseline methods. All methods are trained with the Adam (Kingma & Ba, 2017) optimizer, with a meta-batch size of 32 tasks. We use gradient norm clipping for all optimization, with the maximum norm set to 10. Note that when the adaptation algorithm A has a range of possible steps, the number of steps is sampled uniformly from the range for every adaptation. For HVAE + HyperCLIP guidance and HVAE + HyperLDM, we trained a VAE for 2000 epochs... with the Adam (Kingma & Ba, 2017) optimizer and 0.0001 learning rate and batch size 32... To train the HyperCLIP model... We trained our HyperCLIP model for 600 epochs with the Adam (Kingma & Ba, 2017) optimizer, 0.0003 learning rate, and batch size 64 for all our experiments. We parametrize the diffusion process with a linear noise schedule, β starting at 0.0001 and ending at 0.06, and 350 diffusion timesteps. For all our experiments, we train the HyperLDM for 1000 epochs with the Adam optimizer, 0.00025 learning rate, and batch size 128. |
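The diffusion hyperparameters quoted in the Experiment Setup row (linear β schedule from 0.0001 to 0.06 over 350 timesteps) pin down the forward noising process. The sketch below shows what that schedule looks like under the standard DDPM parameterization; the helper `q_sample` and its parameterization are assumptions for illustration, not code from the paper's repository.

```python
import numpy as np

# Linear noise schedule as quoted in the Experiment Setup row:
# beta ramps linearly from 0.0001 to 0.06 over 350 diffusion timesteps.
T = 350
betas = np.linspace(1e-4, 0.06, T)

# Standard DDPM bookkeeping (an assumption; the paper's exact
# parameterization may differ): per-step and cumulative signal retention.
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def q_sample(x0, t, rng=None):
    """Noise a flattened weight/latent vector x0 to timestep t
    (hypothetical helper, not from the paper's codebase)."""
    rng = rng or np.random.default_rng(0)
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
```

With these values, almost all signal is destroyed by the final timestep (the cumulative product ᾱ_T is far below 1%), which is the usual requirement for the reverse diffusion model to start from near-pure noise.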