Modular Quantization-Aware Training for 6D Object Pose Estimation

Authors: Saqib Javed, Chengkun Li, Andrew Lawrence Price, Yinlin Hu, Mathieu Salzmann

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We first introduce the datasets and metrics used for evaluation. Then, we present ablation studies to explore the properties of MQAT; these results are directly compared to uniform and mixed QAT methods. Finally, we demonstrate the generality of our method applied to different datasets, architectures, and QAT methods.
Researcher Affiliation Collaboration Saqib Javed¹, Chengkun Li¹, Andrew Price¹, Yinlin Hu², Mathieu Salzmann¹ (¹CVLab, EPFL; ²Magic Leap)
Pseudocode Yes We introduce MQAT in Algorithm 1, where we have also defined the notations.
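The quoted Algorithm 1 is not reproduced here, but the module-wise quantization schedule it describes (quantize one network module at a time, then fine-tune before moving to the next, per the 30-epochs-per-module setup quoted below) can be sketched in plain Python. The helper names, module ordering, and the flat per-module epoch budget are illustrative assumptions, not the paper's exact procedure.

```python
def modular_qat_schedule(modules, epochs_per_module=30):
    """Sketch of a module-wise QAT schedule.

    `modules` is an ordered list of module names (e.g. backbone first).
    Each module is quantized and then fine-tuned for a fixed number of
    epochs before the next module is touched, instead of quantizing the
    whole network at once as in uniform QAT. The selection criterion and
    per-module bit-widths from the paper are deliberately omitted here.
    """
    steps = []
    for name in modules:
        steps.append((name, "quantize"))          # freeze-and-quantize this module
        steps.append((name, f"finetune:{epochs_per_module}"))  # retrain with it quantized
    return steps

# Example: a three-module network processed back to front is a common choice.
steps = modular_qat_schedule(["backbone", "neck", "head"])
```

The key design point this illustrates is that quantization noise is introduced incrementally, so each fine-tuning phase only has to absorb the error from one newly quantized module.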
Open Source Code Yes Additionally, we observe that MQAT quantized models can achieve an accuracy boost (> 7% ADI-0.1d) over the baseline full-precision network while reducing model size by a factor of 4 or more. https://saqibjaved1.github.io/MQAT_
Open Datasets Yes The LINEMOD (LM) and Occluded-LINEMOD (LM-O) datasets are standard BOP benchmark datasets for evaluating 6D object pose estimation methods... Conversely, the SwissCube dataset (Hu et al., 2021b) embodies a challenging scenario for 6D object pose estimation in space... Our evaluation was conducted on the comprehensive COCO dataset, a benchmark for object detection.
Dataset Splits Yes Similar to GDR-Net (Wang et al., 2021), we utilize 15% of the images for training. For both datasets, additional rendered images are used during training (Wang et al., 2021; Peng et al., 2019)... The Next Generation Spacecraft Pose Estimation Dataset (Speed+) addresses the domain gap challenge in spacecraft pose estimation. It encompasses 60,000 synthetic images, divided into an 80:20 train-to-validation ratio.
Hardware Specification Yes Running WDR on our Intel Core i7-9750H CPU demonstrates a latency of 650ms for full precision and 299ms for our MQAT int8 quantized model.
Software Dependencies No We use PyTorch to implement our method. For the retraining of our partially quantized pretrained network, we employ an SGD optimizer with a base learning rate of 1e-2. It is common practice for quantization algorithms to start with a pre-trained model (Esser et al., 2020; Dong et al., 2019; Zhou et al., 2017); we similarly do so here.
Experiment Setup Yes We use PyTorch to implement our method. For the retraining of our partially quantized pretrained network, we employ an SGD optimizer with a base learning rate of 1e-2... We employed 30 epochs to identify the starting module at 2 bits (Alg. 1 Lines 1-7) and then trained 30 epochs per module (Alg. 1 Lines 18-21)... For all experiments, we use a batch size of 8 and employ a hand-crafted learning scheduler which decreases the learning rate at regular intervals by a factor of 10... We use a 512×512 resolution input for the SwissCube dataset and 640×480 for LM and LM-O as in Peng et al. (2019).
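The quoted setup (SGD, base learning rate 1e-2, LR dropped by a factor of 10 at regular intervals) describes a standard step-decay schedule. A minimal plain-Python sketch is below; the `step_size` of 10 epochs is an assumed value, since the excerpt only says the drops happen "at regular intervals".

```python
def stepped_lr(base_lr=1e-2, epoch=0, step_size=10, gamma=0.1):
    """Step-decay learning-rate schedule.

    Multiplies `base_lr` by `gamma` once every `step_size` epochs,
    matching a hand-crafted scheduler that cuts the LR by 10x at
    regular intervals. `step_size=10` is an illustrative assumption.
    """
    return base_lr * (gamma ** (epoch // step_size))

# Epochs 0-9 train at 1e-2, epochs 10-19 at 1e-3, and so on.
lr_epoch_0 = stepped_lr(epoch=0)
lr_epoch_12 = stepped_lr(epoch=12)
```

In PyTorch itself this corresponds to pairing `torch.optim.SGD(lr=1e-2)` with `torch.optim.lr_scheduler.StepLR`, but the pure-Python form above makes the decay rule explicit.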