Joint Diffusion for Universal Hand-Object Grasp Generation

Authors: Jinkun Cao, Jingyuan Liu, Kris Kitani, Yi Zhou

TMLR 2025

Reproducibility Variable: Research Type
Result: Experimental
LLM Response: According to both qualitative and quantitative experiments, both conditional and unconditional generation of hand grasps achieve good visual plausibility and diversity.

Reproducibility Variable: Researcher Affiliation
Result: Collaboration
LLM Response: Jinkun Cao* (Carnegie Mellon University), Jingyuan Liu (Adobe), Kris Kitani (Carnegie Mellon University), Yi Zhou* (Roblox)

Reproducibility Variable: Pseudocode
Result: No
LLM Response: The paper describes the methodology using mathematical equations and prose, but does not contain any explicitly labeled pseudocode or algorithm blocks.

Reproducibility Variable: Open Source Code
Result: No
LLM Response: The text discusses using Adobe Firefly as an existing tool for image generation and mentions other third-party tools and models, but does not provide access information (e.g., a repository link or an explicit release statement) for the authors' own method described in the paper.

Reproducibility Variable: Open Datasets
Result: Yes
LLM Response: Datasets. We combine the data from multiple resources to train the model. GRAB (Taheri et al., 2020) contains human full-body poses together with 3D objects. For OakInk (Yang et al., 2022), we use the official training split for training. We also use the contact-adapted synthetic grasps from the OakInk-Shape dataset for training. Besides the hand-object interaction data, we also leverage the rich resources of 3D object data to help train the object part in our model. [...] LION learns from a much larger basis, i.e., more than 50,000 objects in ShapeNet (Chang et al., 2015). [...] We hold the objects from the OakInk-Shape test set and the ARCTIC (Fan et al., 2023) dataset for quantitative evaluations.

Reproducibility Variable: Dataset Splits
Result: Yes
LLM Response: We use the official training split for training. We hold the objects from the OakInk-Shape test set and the ARCTIC (Fan et al., 2023) dataset for quantitative evaluations. We train the models on the GRAB and OakInk-Shape train splits. The results are shown in Table 1.

Reproducibility Variable: Hardware Specification
Result: No
LLM Response: The paper mentions running experiments and training models but does not specify any hardware details such as GPU models, CPU types, or memory used.

Reproducibility Variable: Software Dependencies
Result: No
LLM Response: The paper mentions various software components and models (e.g., MANO, LION, MDM, Adobe Firefly) but does not provide specific version numbers for any of them. For example, it says "we follow MDM (Tevet et al., 2022) to use a transformer encoder-only backbone-based diffusion network", but not "MDM vX.Y".

Reproducibility Variable: Experiment Setup
Result: Yes
LLM Response: For the encoder network that transforms modality features into latent codes, we always use 2-layer MLP networks with a hidden dimension of 1024 and an output dimension of 512. For the denoiser, we follow MDM (Tevet et al., 2022) to use a transformer encoder-only backbone-based diffusion network. In practice, we combine these two training objectives in a 1:1 ratio for a single draw of a training data batch.
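The encoder setup quoted above (a 2-layer MLP with a 1024-d hidden layer and a 512-d output, plus two training objectives combined in a 1:1 ratio per batch) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' code: the input dimension, the ReLU activation, the bias terms, and the loss names are all assumptions not stated in the quoted text.

```python
import numpy as np


class MLPEncoder:
    """2-layer MLP mapping modality features to a latent code.

    Dimensions follow the paper's description (hidden 1024, output 512);
    the input dimension and ReLU activation are assumptions.
    """

    def __init__(self, d_in, d_hidden=1024, d_out=512, seed=0):
        rng = np.random.default_rng(seed)
        # Simple scaled-Gaussian initialization (an assumption).
        self.w1 = rng.standard_normal((d_in, d_hidden)) * d_in ** -0.5
        self.b1 = np.zeros(d_hidden)
        self.w2 = rng.standard_normal((d_hidden, d_out)) * d_hidden ** -0.5
        self.b2 = np.zeros(d_out)

    def __call__(self, x):
        h = np.maximum(x @ self.w1 + self.b1, 0.0)  # hidden layer + ReLU
        return h @ self.w2 + self.b2                # 512-d latent code


def combined_loss(loss_a, loss_b):
    """Combine the two training objectives in a 1:1 ratio, as described."""
    return 1.0 * loss_a + 1.0 * loss_b


# Example: encode a batch of 4 feature vectors. The feature size
# (778 * 3, i.e. flattened MANO-like vertices) is purely illustrative.
rng = np.random.default_rng(1)
x = rng.standard_normal((4, 778 * 3))
encoder = MLPEncoder(d_in=778 * 3)
z = encoder(x)
print(z.shape)  # (4, 512)
```

The point of the sketch is only to make the stated hyperparameters concrete; in the paper these latent codes feed a transformer encoder-only diffusion denoiser following MDM (Tevet et al., 2022), which is not reproduced here.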