Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers

Authors: Yiwei Lu, Yaoliang Yu, Xinlin Li, Vahid Partovi Nia

NeurIPS 2023

Reproducibility variables, results, and supporting LLM responses:
Research Type: Experimental. The paper conducts image classification experiments on CNNs and vision transformers, and empirically verifies that BNN++ generally achieves competitive results when binarizing these models.
Researcher Affiliation: Collaboration. Yiwei Lu (School of Computer Science, University of Waterloo; Vector Institute), Yaoliang Yu (School of Computer Science, University of Waterloo; Vector Institute), Xinlin Li (Huawei Noah's Ark Lab), Vahid Partovi Nia (Huawei Noah's Ark Lab).
Pseudocode: No. The paper does not contain any clearly labeled pseudocode or algorithm blocks; Figure 1 visualizes the forward and backward passes, but it is not pseudocode.
Open Source Code: No. The paper does not include an explicit statement or link providing access to open-source code for the proposed methodology.
Open Datasets: Yes. "We perform image classification on CIFAR-10/100 datasets [27] and ImageNet-1K dataset [28]."
Dataset Splits: No. The paper uses standard datasets (CIFAR-10/100, ImageNet-1K), which implies their standard splits (e.g., CIFAR-10 ships with 50,000 training and 10,000 test images), but it does not explicitly state train/validation/test splits as percentages or sample counts.
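Since the splits in question are the datasets' standard ones, a minimal torchvision loading sketch shows how those splits are obtained; the root path, download flag, and normalization constants below are assumptions, not values from the paper.

```python
from torchvision import datasets, transforms

# Standard CIFAR-10 split: train=True gives 50,000 images, train=False gives 10,000.
# The normalization constants are common CIFAR-10 defaults, not the paper's values.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = datasets.CIFAR10(root="data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="data", train=False, download=True, transform=transform)
print(len(train_set), len(test_set))  # 50000 10000
```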
Hardware Specification: Yes. "Hardware and package: All experiments were run on a GPU cluster with NVIDIA V100 GPUs."
Software Dependencies: No. The paper mentions "The platform we use is PyTorch. Specifically, we apply ViT and DeiT models implemented in PyTorch Image Models (timm)", but it does not specify version numbers for PyTorch or timm.
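For reference, this is how the ViT/DeiT backbones could be instantiated from timm; the specific model variants are assumptions, since the paper names the model families but not the exact sizes or library versions.

```python
import timm

# Hypothetical variants: the paper uses ViT and DeiT models from timm,
# but does not state which sizes; these names are illustrative only.
vit = timm.create_model("vit_small_patch16_224", pretrained=True)
deit = timm.create_model("deit_small_patch16_224", pretrained=True)
```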
Experiment Setup: Yes. Hyperparameters: the same training hyperparameters are applied across all models, with fine-tuning/end-to-end training for 100/300 epochs. Binarization methods: (1) PQ (ProxQuant): following Bai et al. [2], the Linear Quantizer (LQ) is applied (see (10) in Appendix A), with ρ_0 = 0.01 linearly increased to ρ_T = 10; (2) rPC (reverse ProxConnect): the same LQ is used; (3) ProxConnect++: for PC, the same LQ is applied; for BNN+, µ = 5 (no need to increase µ, since the forward quantizer is sign); for BNN++, µ_0 = 5 is linearly increased to µ_T = 30 to achieve binarization at the final step.
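To make the quoted schedules concrete, below is a minimal PyTorch sketch of a sign-forward quantizer paired with a clipped-linear backward surrogate and a linearly annealed coefficient. The surrogate LQ_µ(w) = clip(µw, -1, 1) and all names here are assumptions standing in for the paper's LQ (equation (10) in its Appendix A); this is not the authors' implementation.

```python
import torch

class SignWithLQBackward(torch.autograd.Function):
    """Sign forward pass with the gradient of a clipped-linear surrogate
    LQ_mu(w) = clip(mu * w, -1, 1) in the backward pass (an assumption;
    the paper's exact LQ may differ)."""

    @staticmethod
    def forward(ctx, w, mu):
        ctx.save_for_backward(w)
        ctx.mu = mu
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # LQ_mu is linear (slope mu) where |mu * w| <= 1 and flat elsewhere.
        inside = ((ctx.mu * w).abs() <= 1.0).to(grad_output.dtype)
        return grad_output * ctx.mu * inside, None

def linear_anneal(step, total_steps, start, end):
    """Linear schedule, e.g. mu_0 = 5 -> mu_T = 30 (BNN++) or
    rho_0 = 0.01 -> rho_T = 10 (PQ)."""
    return start + (end - start) * step / max(total_steps, 1)

# Example: binarize a weight tensor at training step t of T.
w = torch.randn(8, requires_grad=True)
mu = linear_anneal(step=150, total_steps=300, start=5.0, end=30.0)
w_bin = SignWithLQBackward.apply(w, mu)
w_bin.sum().backward()  # gradients flow through the LQ surrogate
```

As µ grows toward µ_T, clip(µw, -1, 1) approaches sign(w), which is consistent with the stated aim of reaching full binarization at the final step.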