PTMQ: Post-training Multi-Bit Quantization of Neural Networks
Authors: Ke Xu, Zhongcheng Li, Shanshan Wang, Xingyi Zhang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that PTMQ achieves performance comparable to existing state-of-the-art post-training quantization methods, while its optimization is 100× faster than recent multi-bit quantization works. Code is available at https://github.com/xuke225/PTMQ. |
| Researcher Affiliation | Academia | Ke Xu1,2, Zhongcheng Li2, Shanshan Wang1*, Xingyi Zhang1,3* — 1Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University; 2School of Artificial Intelligence, Anhui University, Hefei, China; 3School of Computer Science and Technology, Anhui University, Hefei, China |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/xuke225/PTMQ. |
| Open Datasets | Yes | We assess the performance of the proposed PTMQ scheme on various CNN-based architectures (ResNet (He et al. 2016), MobileNetV2 (Sandler et al. 2018), RegNet (Radosavovic et al. 2020)) and transformer-based architectures (ViT (Dosovitskiy et al. 2021), DeiT (Touvron et al. 2021)) on the ImageNet (Russakovsky et al. 2014) dataset. |
| Dataset Splits | No | The paper mentions using 'calibration data' but does not specify explicit training/validation/test splits (e.g., percentages or sample counts) or reference predefined standard splits for reproducibility. |
| Hardware Specification | Yes | The time measurement is carried out with an NVIDIA 3090 GPU. |
| Software Dependencies | No | The paper mentions using other methods like QDrop and PTQ4ViT, but it does not provide specific version numbers for its own software dependencies such as Python, PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | No | The paper describes the overall optimization process and components like MFM and GD-Loss, but it does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed training configurations in the main text. |