Restructuring Vector Quantization with the Rotation Trick
Authors: Christopher Fifty, Ronald Junkins, Dennis Duan, Aniketh Iyengar, Jerry Liu, Ehsan Amid, Sebastian Thrun, Christopher Ré
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across 11 different VQ-VAE training paradigms, we find this restructuring improves reconstruction metrics, codebook utilization, and quantization error. ... In this section, we evaluate the effect of the rotation trick across many different VQ-VAE paradigms. ... Tables 1-5 display experimental results with various metrics such as r-FID, r-IS, and codebook usage. |
| Researcher Affiliation | Collaboration | 1Stanford University, 2Google DeepMind |
| Pseudocode | Yes | Algorithm 1: The Rotation Trick. Require: input example x. e ← Encoder(x); q ← nearest codebook vector to e; R ← rotation matrix that aligns e to q; q̃ ← stop-gradient⟨(‖q‖/‖e‖) R⟩ e; x̂ ← Decoder(q̃); loss ← L(x, x̂); return loss |
| Open Source Code | Yes | Our code is available at https://github.com/cfifty/rotation_trick. |
| Open Datasets | Yes | We begin with a straightforward evaluation: training a VQ-VAE to reconstruct examples from ImageNet (Deng et al., 2009). ... VQGANs ... on ImageNet and the combined dataset FFHQ (Karras et al., 2019) and CelebA-HQ (Karras, 2017). ... video reconstructions from the BAIR Robot dataset (Ebert et al., 2017) and from the UCF101 action recognition dataset (Soomro, 2012). |
| Dataset Splits | Yes | We log both training and validation set reconstruction metrics. Of note, we compute reconstruction FID (Heusel et al., 2017) and reconstruction IS (Salimans et al., 2016) on reconstructions from the full ImageNet validation set as a measure of reconstruction quality. |
| Hardware Specification | No | The paper mentions 'Due to GPU VRAM constraints' in Appendix A.10.4, but does not provide specific GPU models, CPU models, or other detailed hardware specifications used for running the experiments. |
| Software Dependencies | No | The paper references specific GitHub repositories for implementations (e.g., 'https://github.com/lucidrains/vector-quantize-pytorch', 'https://github.com/CompVis/taming-transformers') and Hugging Face, but does not list explicit version numbers for general software libraries like Python, PyTorch, CUDA, etc. |
| Experiment Setup | Yes | A complete description of both training settings is provided in Table 9 of the Appendix. ... Table 8 summarizes the hyperparameters used for the experiments in Section 5.1. ... Table 10: Hyperparameters for the experiments in Table 4. ... Table 11: Hyperparameters for the experiments in Table 5. |
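The pseudocode row above summarizes the rotation trick: the encoder output e is mapped to the codebook vector q through a rescaled rotation held under stop-gradient, rather than by copying gradients straight through. The following is a minimal NumPy sketch of that forward computation, not the authors' PyTorch implementation (see their repository for that); the function name and the Householder-style construction of R are illustrative assumptions, and since NumPy has no autograd, the stop-gradient behavior is only noted in comments.

```python
import numpy as np

def rotation_trick_forward(e, q):
    """Forward pass of the rotation trick (illustrative NumPy sketch).

    Maps the encoder output e to lambda * R @ e, where R rotates e's
    direction onto that of the codebook vector q and lambda rescales
    to q's norm. In training, lambda and R are treated as constants
    (stop-gradient), so gradients flow to e through the fixed linear
    map lambda * R; this sketch shows only the forward value.
    """
    e_hat = e / np.linalg.norm(e)
    q_hat = q / np.linalg.norm(q)
    # Householder-style construction: R = I - 2 r r^T + 2 q_hat e_hat^T,
    # with r = (e_hat + q_hat) / ||e_hat + q_hat||, rotates e_hat to q_hat.
    r = (e_hat + q_hat) / np.linalg.norm(e_hat + q_hat)
    d = e.shape[0]
    R = np.eye(d) - 2.0 * np.outer(r, r) + 2.0 * np.outer(q_hat, e_hat)
    lam = np.linalg.norm(q) / np.linalg.norm(e)
    return lam * R @ e

rng = np.random.default_rng(0)
e = rng.standard_normal(8)   # encoder output
q = rng.standard_normal(8)   # nearest codebook vector
q_tilde = rotation_trick_forward(e, q)
# The forward value matches the straight-through value q exactly;
# only the backward pass differs from the straight-through estimator.
print(np.allclose(q_tilde, q))  # True
```

Because R e = ‖e‖ q̂ and λ = ‖q‖/‖e‖, the decoder still receives exactly q in the forward pass; the trick changes only how gradients reach the encoder.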