HandDiffuse: Generative Controllers for Two-Hand Interactions via Diffusion Models
Authors: Pei Lin
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method outperforms the state-of-the-art techniques in motion generation. We conduct several experiments to evaluate our method HandDiffuse for the task of interacting hands motion generation. We particularly evaluate (1) the comparison of our method against previous state-of-the-art approaches, and (2) the ablation study. |
| Researcher Affiliation | Academia | ShanghaiTech University, School of Information Science and Technology, Shanghai, China |
| Pseudocode | No | The paper describes the methodology in prose and mathematical equations but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Our dataset, related codes will be made publicly available to stimulate further research. |
| Open Datasets | Yes | Dataset page: https://handdiffuse.github.io/ We contribute a large-scale real dataset named HandDiffuse12.5M, which provides accurate and temporally consistent tracking of human hands under diverse and strong two-hand interactions. |
| Dataset Splits | No | The paper introduces a new dataset, HandDiffuse12.5M, and describes its characteristics and use in experiments, but the main text does not explicitly state the training, validation, and test splits used for evaluation. |
| Hardware Specification | Yes | All of them have been trained on a single NVIDIA GeForce RTX 3090 GPU for about two days. |
| Software Dependencies | No | The paper describes the use of various models and frameworks (e.g., DWPose, MANO, diffusion models) but does not provide specific version numbers for software dependencies or programming languages used for implementation. |
| Experiment Setup | Yes | The models have been trained with T = 1000 diffusing steps and a cosine noise schedule. The denoiser has 8 layers, 4 heads and the feedforward dimension is 1024. We set the length of generated motion N to 200 in all experiments. |