HyperVQ: MLR-based Vector Quantization in Hyperbolic Space
Authors: Nabarun Goswami, Yusuke Mukuta, Tatsuya Harada
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To assess the effectiveness of our proposed method, we conducted experiments across diverse tasks, encompassing image reconstruction, generative modeling, image classification, and feature disentanglement. The implementation details and results of each experiment are elaborated upon in the following subsections. |
| Researcher Affiliation | Academia | Nabarun Goswami EMAIL The University of Tokyo Yusuke Mukuta EMAIL The University of Tokyo, RIKEN Tatsuya Harada EMAIL The University of Tokyo, RIKEN |
| Pseudocode | Yes | Algorithm 1 HyperVQ Training |
| Open Source Code | No | No explicit statement or link to the paper's source code is provided. The paper mentions using third-party libraries (geoopt) and referencing existing implementations (Hyperbolic Neural Networks++), but does not provide its own. |
| Open Datasets | Yes | To assess the generative modeling capabilities of the proposed HyperVQ method, we conducted experiments on the CIFAR100 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015) datasets. ... we trained a HyperVQ VAE model on the simple MNIST dataset. ... incorporating corruptions from the ImageNet-C dataset (Hendrycks & Dietterich, 2019) for data augmentation. ... conducted pre-training on the LibriSpeech 960h dataset. |
| Dataset Splits | Yes | To assess the generative modeling capabilities of the proposed HyperVQ method, we conducted experiments on the CIFAR100 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015) datasets. ... Table 1: Comparison of reconstruction mean squared error (MSE) for KmeansVQ and HyperVQ on CIFAR100 (test) and ImageNet (validation) datasets for different codebook sizes, K. ... Figure 3a displays some reconstructions achieved with HyperVQ on the ImageNet validation set (128×128) ... Classification accuracy is reported on the clean validation set of ImageNet, along with accuracy on known and unknown corruptions in the ImageNet-C dataset. ... conducted pre-training on the LibriSpeech 960h dataset. |
| Hardware Specification | Yes | Model training was conducted on 4 A100 GPUs utilizing the Adam optimizer (Kingma & Ba, 2014), with a learning rate set at 3e-4 and a batch size of 128 per GPU, unless otherwise specified. |
| Software Dependencies | No | The paper mentions using the PyTorch library and the geoopt library but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | Model training was conducted on 4 A100 GPUs utilizing the Adam optimizer (Kingma & Ba, 2014), with a learning rate set at 3e-4 and a batch size of 128 per GPU, unless otherwise specified. For this experiment, in addition to KmeansVQ and HyperVQ, we also included GumbelVQ for comparison. All models underwent training for 500 epochs, utilizing the same settings as described in Section 5.1. |
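For context on what vector quantization in hyperbolic space involves, the sketch below computes geodesic distances on the Poincaré ball and assigns an encoding to its nearest codeword. This is a minimal NumPy illustration of distance-based hyperbolic quantization, not the paper's MLR-based formulation (which classifies points against hyperbolic decision hyperplanes rather than comparing distances to codewords); the function names and the unit-curvature assumption are ours.

```python
import numpy as np

def poincare_dist(x, y, eps=1e-9):
    """Geodesic distance between points on the Poincare ball (curvature -1).

    d(x, y) = arccosh(1 + 2 * ||x - y||^2 / ((1 - ||x||^2) * (1 - ||y||^2)))
    Inputs must have Euclidean norm < 1; broadcasting over leading axes is supported.
    """
    sq_diff = np.sum((x - y) ** 2, axis=-1)
    nx = np.sum(x ** 2, axis=-1)
    ny = np.sum(y ** 2, axis=-1)
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - nx) * (1.0 - ny) + eps)
    return np.arccosh(arg)

def nearest_codeword(x, codebook):
    """Index of the codebook entry closest to x in hyperbolic distance."""
    dists = poincare_dist(x[None, :], codebook)
    return int(np.argmin(dists))
```

A useful sanity check: the distance from the origin to a point at Euclidean radius r equals 2·artanh(r), so distances grow much faster than Euclidean ones near the boundary of the ball, which is what gives hyperbolic embeddings their capacity for hierarchical structure.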