Modular Quantization-Aware Training for 6D Object Pose Estimation
Authors: Saqib Javed, Chengkun Li, Andrew Lawrence Price, Yinlin Hu, Mathieu Salzmann
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first introduce the datasets and metrics used for evaluation. Then, we present ablation studies to explore the properties of MQAT, whose results are directly compared to uniform and mixed-precision QAT methods. Finally, we demonstrate the generality of our method applied to different datasets, architectures, and QAT methods. |
| Researcher Affiliation | Collaboration | Saqib Javed1, Chengkun Li1, Andrew Price1, Yinlin Hu2, Mathieu Salzmann1; 1CVLab, EPFL; 2Magic Leap |
| Pseudocode | Yes | We introduce MQAT in Algorithm 1, where we have also defined the notations. |
| Open Source Code | Yes | Additionally, we observe that MQAT quantized models can achieve an accuracy boost (> 7% ADI-0.1d) over the baseline full-precision network while reducing model size by a factor of 4 or more. https://saqibjaved1.github.io/MQAT_ |
| Open Datasets | Yes | The LINEMOD (LM) and Occluded-LINEMOD (LM-O) datasets are standard BOP benchmark datasets for evaluating 6D object pose estimation methods... Conversely, the Swiss Cube dataset (Hu et al., 2021b) embodies a challenging scenario for 6D object pose estimation in space... Our evaluation was conducted on the comprehensive COCO dataset, a benchmark for object detection. |
| Dataset Splits | Yes | Similar to GDR-Net (Wang et al., 2021), we utilize 15% of the images for training. For both datasets, additional rendered images are used during training (Wang et al., 2021; Peng et al., 2019)... The Next Generation Spacecraft Pose Estimation Dataset (Speed+) addresses the domain gap challenge in spacecraft pose estimation. It encompasses 60,000 synthetic images, divided into an 80:20 train-to-validation ratio. |
| Hardware Specification | Yes | Running WDR on our Intel Core i7-9750H CPU demonstrates a latency of 650ms for full precision and 299ms for our MQAT int8 quantized model. |
| Software Dependencies | No | We use PyTorch to implement our method. For the retraining of our partially quantized pretrained network, we employ an SGD optimizer with a base learning rate of 1e-2. It is common practice for quantization algorithms to start with a pre-trained model (Esser et al., 2020; Dong et al., 2019; Zhou et al., 2017); we similarly do so here. |
| Experiment Setup | Yes | We use PyTorch to implement our method. For the retraining of our partially quantized pretrained network, we employ an SGD optimizer with a base learning rate of 1e-2... We employed 30 epochs to identify the starting module at 2 bits (Alg. 1 Lines 1-7) and then trained 30 epochs per module (Alg. 1 Lines 18-21)... For all experiments, we use a batch size of 8 and employ a hand-crafted learning scheduler which decreases the learning rate at regular intervals by a factor of 10... We use a 512×512 resolution input for the Swiss Cube dataset and 640×480 for LM and LM-O as in Peng et al. (2019). |
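The quoted setup (SGD at a base learning rate of 1e-2, a hand-crafted scheduler that decreases the rate by a factor of 10 at regular intervals, 30 epochs per module) can be sketched in plain Python. The interval length (`step_every`) is an assumption, since the excerpt only says "regular intervals"; in PyTorch this corresponds to `torch.optim.SGD` combined with `torch.optim.lr_scheduler.StepLR(gamma=0.1)`.

```python
def stepped_lr(base_lr, epoch, step_every, factor=10.0):
    """Hand-crafted step schedule: divide the base learning rate by
    `factor` once every `step_every` epochs. `step_every` is an
    assumption; the paper excerpt does not specify the interval."""
    return base_lr / (factor ** (epoch // step_every))

# Mirrors the quoted recipe: base lr 1e-2, 30 epochs per module,
# decay by 10x at regular (here: every 10) epochs.
base_lr = 1e-2
schedule = [stepped_lr(base_lr, e, step_every=10) for e in range(30)]
# Epochs 0-9 train at 1e-2, epochs 10-19 at 1e-3, epochs 20-29 at 1e-4.
```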