KeyPose: Category-Level 6D Object Pose Estimation with Self-Adaptive Keypoints

Authors: Sheng Yu, Di-Hua Zhai, Yuanqing Xia

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on multiple benchmark datasets demonstrate that our proposed KeyPose method outperforms all existing methods, achieving over 20% improvement in pose accuracy on some critical datasets. We conduct experiments on benchmark datasets for category-level object pose estimation, including the CAMERA25 and REAL275 datasets. Ablation Studies: To validate the effectiveness of the proposed module, we first conduct a series of ablation experiments on the module.
Researcher Affiliation | Academia | (1) School of Automation, Beijing Institute of Technology, Beijing, China; (2) Zhongyuan University of Technology, Zhengzhou, Henan, China. EMAIL, EMAIL, xia EMAIL.
Pseudocode | No | The paper describes the methodology using prose, diagrams, and mathematical equations but does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about the release of source code or a link to a code repository.
Open Datasets | Yes | The benchmark datasets used for category-level object pose estimation are the REAL275 dataset and the CAMERA25 dataset (Wang et al. 2019b). HouseCat6D (Jung et al. 2024) is a comprehensive real-world dataset that includes 194 high-fidelity 3D models of household items across 10 categories.
Dataset Splits | Yes | The CAMERA25 dataset comprises 300K images, with 25K images designated for evaluation. It is created by integrating synthetic objects into real-world scenes. In contrast, the REAL275 dataset consists of 4300 real-world training images from 7 scenes and 2750 real-world evaluation images from 6 scenes.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions several tools and backbones used (e.g., Mask R-CNN, DINOv2, PointNet++, DGCNN) but does not provide specific version numbers for these or any other software dependencies crucial for replication.
Experiment Setup | Yes | The total loss is $L = \lambda_1 L_d + \lambda_2 L_D + \lambda_3 L_{kNOCS} + \lambda_4 L_{Pose}$ (12), where each $\lambda_i$ is a weighting hyperparameter. We set $\lambda_1 = 1$, $\lambda_2 = 5$, $\lambda_3 = 1$, $\lambda_4 = 0.3$. From the experimental results, it can be seen that when we choose 96 keypoints, the overall performance of the network reaches its peak. According to the experimental results, we can see that the performance reaches its peak when we choose 32 groups for sampling.
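
The weighted sum in Eq. 12 can be sketched in a few lines of Python. This is only an illustration of how the reported weights combine the four terms, not the authors' implementation: the names `LOSS_WEIGHTS` and `total_loss` are hypothetical, and the individual loss terms (which this summary does not define) are stood in for by plain scalar values.

```python
# Weights as reported in the Experiment Setup row (Eq. 12).
# The dictionary keys are illustrative labels, not names from the paper.
LOSS_WEIGHTS = {"d": 1.0, "D": 5.0, "k_nocs": 1.0, "pose": 0.3}


def total_loss(l_d, l_D, l_k_nocs, l_pose, weights=LOSS_WEIGHTS):
    """Return the weighted sum of the four loss terms, following Eq. 12.

    Each argument is assumed to be an already-computed scalar loss value;
    how those terms are computed is outside the scope of this sketch.
    """
    return (
        weights["d"] * l_d
        + weights["D"] * l_D
        + weights["k_nocs"] * l_k_nocs
        + weights["pose"] * l_pose
    )


if __name__ == "__main__":
    # Placeholder loss values, chosen only to show the weighting.
    print(total_loss(0.2, 0.1, 0.4, 1.0))
```

With these weights, the second term contributes five times its raw value and the pose term less than a third of its raw value, so equal raw losses do not contribute equally to the total.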