Meta-Learning Sparse Compression Networks

Authors: Jonathan Schwarz, Yee Whye Teh

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now provide an extensive empirical analysis of the MSCN framework in the two aforementioned instantiations. Our analysis covers a wide range of datasets and modalities, ranging from images to manifolds, voxels, signed distance functions and scenes. While network depth and width vary based on established best practices, in all cases we use SIREN-style (Sitzmann et al., 2020b) sinusoidal activations. Throughout the section, we make frequent use of the peak signal-to-noise ratio (PSNR), a metric commonly used to quantify reconstruction quality.
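PSNR, the reconstruction metric quoted throughout the paper, is 10·log10(MAX²/MSE). A minimal numpy sketch (the function name and the assumption of pixel values in [0, 1] are illustrative, not taken from the paper):

```python
import numpy as np

def psnr(x, y, max_val=1.0):
    """Peak signal-to-noise ratio between two signals scaled to [0, max_val]."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    if mse == 0.0:
        return float("inf")  # identical signals
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, two images in [0, 1] that differ uniformly by 0.1 have MSE 0.01 and therefore a PSNR of 20 dB.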
Researcher Affiliation | Industry | Jonathan Richard Schwarz (DeepMind & University College London); Yee Whye Teh (DeepMind)
Pseudocode | No | The paper describes methods in paragraph text and mathematical equations without explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository.
Open Datasets | Yes | Experiments in this section focus on images, covering the CelebA (Liu et al., 2015) & Imagenette (Howard, 2022) datasets as well as signed distance functions (Tancik et al., 2021)... We evaluate on voxels using the top 10 classes of the ShapeNet dataset (Chang et al., 2015), NeRF scenes using SRN Cars (Sitzmann et al., 2019), manifolds using the ERA5 dataset (Hersbach et al., 2019) and on images using the CelebA-HQ dataset (Karras et al., 2017). We provide a comparison with various compression schemes for CIFAR-10 in Figure 8a and Kodak (for which we pre-train on the DIV2K dataset (Agustsson & Timofte, 2017) as in Strümpler et al. (2021)) in Figure 8b.
Dataset Splits | No | The paper mentions several datasets but does not explicitly state the training, validation, and test splits (e.g., percentages, sample counts, or references to predefined splits for their experiments).
Hardware Specification | No | The paper mentions 'a single GPU' in the context of training time but does not specify the model or other hardware details used for the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) used for the experiments.
Experiment Setup | Yes | Closely following the Meta-SparseINR setup, the network for all tasks is a 4-layer MLP with 256 units, which we meta-train for 2 inner-loop steps using a batch size of 3 examples. We use the entire image to compute gradients in the inner loop. In order to correct for the absence of Meta-SGD (Li et al., 2017) in Meta-SparseINR, we also provide results using a fixed learning rate for SDF & CelebA as a fair comparison... networks in this section are significantly deeper, comprising 15 layers of 512 units each. In accordance with Functa, we report results using Meta-SGD with 3 inner-loop steps and varying batch sizes (see Appendix for details) due to memory restrictions.
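The setup above combines SIREN-style sinusoidal networks with a few inner-loop gradient steps per signal. A minimal numpy sketch of the forward pass only, assuming the standard ω₀ = 30 frequency scaling and initialization from Sitzmann et al. (2020b); the layer sizes mirror the 4-layer, 256-unit configuration quoted above, but all function names are illustrative:

```python
import numpy as np

def siren_init(seed, sizes, omega0=30.0):
    """SIREN initialization: first layer U(-1/n, 1/n), later layers
    U(-sqrt(6/n)/omega0, sqrt(6/n)/omega0), per Sitzmann et al. (2020b)."""
    rng = np.random.default_rng(seed)
    params = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        bound = 1.0 / n_in if i == 0 else np.sqrt(6.0 / n_in) / omega0
        W = rng.uniform(-bound, bound, size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((W, b))
    return params

def siren_forward(params, coords, omega0=30.0):
    """Map 2-D pixel coordinates to RGB values through sine activations."""
    h = coords
    for W, b in params[:-1]:
        h = np.sin(omega0 * (h @ W + b))
    W, b = params[-1]
    return h @ W + b  # linear output layer

# 2-D input coordinates -> 4 hidden layers of 256 units -> 3 output channels
params = siren_init(0, [2, 256, 256, 256, 256, 3])
coords = np.stack(np.meshgrid(np.linspace(-1, 1, 32),
                              np.linspace(-1, 1, 32)), -1).reshape(-1, 2)
out = siren_forward(params, coords)  # one RGB prediction per pixel coordinate
```

In the meta-learning setup described above, `params` would be a meta-learned initialization, adapted to each signal with 2-3 gradient steps on the reconstruction loss over the full image; the autodiff and Meta-SGD per-parameter learning rates are omitted here.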