RAGG: Retrieval-Augmented Grasp Generation Model

Authors: Zhenhua Tang, Bin Zhu, Yanbin Hao, Chong-Wah Ngo, Richang Hong

AAAI 2025

Reproducibility
Research Type: Experimental. "Extensive experiments on the OakInk and GRAB benchmarks demonstrate that RAGG achieves superior results compared to the state-of-the-art approach, indicating not only better physical feasibility and controllability, but also strong generalization and interpretability for unseen objects."
Researcher Affiliation: Academia. (1) Hefei University of Technology, Anhui, China; (2) Singapore Management University, Singapore; (3) University of Science and Technology of China, Anhui, China.
Pseudocode: No. The paper describes the methods textually and visually through Figure 2, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code: No. The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository.
Open Datasets: Yes. "Extensive experiments on the OakInk and GRAB benchmarks demonstrate that RAGG achieves superior results compared to the state-of-the-art approach, indicating not only better physical feasibility and controllability, but also strong generalization and interpretability for unseen objects. [...] OakInk is a large-scale knowledge repository for understanding hand-object interactions, which contains 1800 object models of 32 categories. [...] GRAB contains real human grasps for 51 objects from 10 different subjects."
Dataset Splits: Yes. "Building on the previous protocol (Yang et al. 2022), we use 9 categories of objects (i.e., bottle, camera, cylinder bottle, eyeglasses, game controller, lotion pump, mug, pen, and trigger sprayer) with 2 intents (i.e., use and hold) for training and testing, making our setup more challenging. To evaluate the generalization ability, we further select 3 unseen object categories: bowl, headphone, and knife for testing."
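The quoted protocol (9 seen categories with 2 intents, plus 3 held-out categories) can be captured in a small configuration sketch. The category and intent names come from the quote above; the dictionary layout and the `is_unseen` helper are assumptions for illustration only.

```python
# Hypothetical split configuration mirroring the protocol quoted above
# (Yang et al. 2022): 9 seen categories x 2 intents for train/test,
# plus 3 unseen categories held out to probe generalization.
OAKINK_SPLIT = {
    "seen_categories": [
        "bottle", "camera", "cylinder bottle", "eyeglasses",
        "game controller", "lotion pump", "mug", "pen", "trigger sprayer",
    ],
    "intents": ["use", "hold"],
    "unseen_categories": ["bowl", "headphone", "knife"],
}

def is_unseen(category: str) -> bool:
    """Return True if a category is reserved for the generalization test."""
    return category in OAKINK_SPLIT["unseen_categories"]
```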
Hardware Specification: No. The paper does not provide any specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory).
Software Dependencies: No. The paper mentions several components like 'MANO layer', 'GPT-3', 'GPT-4', and 'KAN', but it does not specify any version numbers for these or any other software dependencies.
Experiment Setup: Yes. "$\mathcal{L}_{rec} = \lambda_1 \|\hat{y}_0 - y_0\|^2 + \lambda_2 \|\hat{e}_0 - e_0\|^2$, (7) where $\hat{e}_0$ and $e_0$ denote the edges between pairs of vertices of the MANO grasp, with $\lambda_1 = 34.825$ and $\lambda_2 = 29.85$. The contact losses are given by: $\mathcal{L}_{h2o} = \lambda_3 \|h2o(\hat{y}_0, O) - h2o(y_0, O)\|^2$, (8) $\mathcal{L}_{o2h} = \lambda_4 \|o2h(\hat{y}_0, O) - o2h(y_0, O)\|^2$, (9) where h2o denotes the signed distance from every vertex on the MANO hand to the object mesh, and o2h denotes the signed distance from every vertex on the object mesh to the hand mesh, with $\lambda_3 = 34.825$ and $\lambda_4 = 29.85$. [...] where $C_1$ and $C_2$ are set to 0.0001 and 0.0009, respectively, to avoid numerical instability as the denominator approaches zero. [...] Finally, in line with previous works (Taheri et al. 2020; Yang et al. 2022), we optimize the network by applying contact, reconstruction, and SSIM losses over three iterations. In this paper, we set $v$ to 1."
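The reconstruction and contact losses quoted above (Eqs. 7-9) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the predicted/ground-truth vertices and edges are assumed to be flat arrays, and the signed-distance vectors (h2o, o2h) are assumed to be precomputed and passed in; only the loss weights are taken from the paper.

```python
import numpy as np

# Loss weights as quoted in the paper's experiment setup.
LAMBDA_1, LAMBDA_2 = 34.825, 29.85   # vertex / edge reconstruction terms
LAMBDA_3, LAMBDA_4 = 34.825, 29.85   # h2o / o2h contact terms

def recon_loss(y_hat, y, e_hat, e):
    """L_rec (Eq. 7): squared error on MANO vertices plus mesh edges."""
    return LAMBDA_1 * np.sum((y_hat - y) ** 2) + LAMBDA_2 * np.sum((e_hat - e) ** 2)

def contact_loss(d_hat, d, weight):
    """Shared form of Eqs. 8-9: squared error between predicted and
    ground-truth signed-distance vectors (h2o or o2h), scaled by its weight."""
    return weight * np.sum((d_hat - d) ** 2)
```

The SSIM term is not sketched here; the paper only reports its stabilizing constants ($C_1 = 0.0001$, $C_2 = 0.0009$), which guard against division by near-zero denominators.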