HOGSA: Bimanual Hand-Object Interaction Understanding with 3D Gaussian Splatting Based Data Augmentation

Authors: Wentian Qu, Jiahe Li, Jian Cheng, Jian Shi, Chenyu Meng, Cuixia Ma, Hongan Wang, Xiaoming Deng, Yinda Zhang

AAAI 2025

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental. "We perform our data augmentation on two benchmarks, H2O and Arctic, and verify that our method can improve the performance of the baselines. ... We evaluate our method on two main benchmarks H2O (Kwon et al. 2021) and Arctic (Fan et al. 2023), and the baseline performances are improved with our augmented dataset."
Researcher Affiliation: Collaboration. 1) Institute of Software, Chinese Academy of Sciences; 2) University of Chinese Academy of Sciences; 3) Institute of Automation, Chinese Academy of Sciences; 4) Google.
Pseudocode: No. The paper describes its methods with textual descriptions and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code: No. The paper provides a project page (https://iscas3dv.github.io/HOGSA/), but this is not a direct link to a source code repository, and there is no explicit statement about code release.
Open Datasets: Yes. "We evaluate our method on Arctic (Fan et al. 2023) and H2O (Kwon et al. 2021). ... We are inspired by GANerated Hands (Mueller et al. 2018) and add COCO2017 (Lin et al. 2014) images as background."
Dataset Splits: Yes. "For Arctic, we select subjects except s03 and s05 to build the HOGS model to meet the data division of the interaction understanding task. We select the sequences grab_01 and use_01 to train HOGS, and crop the original images to a resolution of 1400×1000. For H2O, we select subjects except subject4 to build the HOGS model. We select sequences except the o2 scene for training HOGS and keep the original image resolution of 1280×720."
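The subject-level selection described in the quote above can be sketched as a simple filter. This is an illustrative sketch only: the subject identifier lists and the function name are assumptions, not taken from the paper or the dataset releases; only the excluded identifiers (s03, s05, subject4) come from the quoted text.

```python
# Illustrative sketch of the HOGS subject selection described in the quote.
# Subject ID lists below are assumptions for demonstration purposes.

ARCTIC_SUBJECTS = [f"s{i:02d}" for i in range(1, 11)]   # assumed s01..s10
H2O_SUBJECTS = [f"subject{i}" for i in range(1, 5)]     # assumed subject1..subject4

def hogs_training_subjects(dataset: str) -> list[str]:
    """Return the subjects used to build the HOGS model for each benchmark."""
    if dataset == "Arctic":
        # All subjects except s03 and s05, matching the interaction-task split.
        return [s for s in ARCTIC_SUBJECTS if s not in {"s03", "s05"}]
    if dataset == "H2O":
        # All subjects except subject4.
        return [s for s in H2O_SUBJECTS if s != "subject4"]
    raise ValueError(f"unknown dataset: {dataset}")
```
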
Hardware Specification: Yes. "Our method is trained on an Nvidia RTX 4090 GPU."
Software Dependencies: No. The paper does not specify any software dependencies with version numbers.
Experiment Setup: Yes. "We train HOGS with 50,000 iterations, which costs an average of 10 GB of memory. We train SRM with a total of 150,000 iterations, which costs 8 GB of memory. For the POM, each initial input needs to be iterated 200 times. We set hyperparameters λ_SSIM, λ_R, λ_C, λ_H, λ_P, λ_1, λ_VGG to 0.2, 0.5, 1, 1, 17, 5, 0.03 respectively."
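The reported training setup can be collected into a single configuration sketch. All numeric values are taken from the quoted text; the dictionary key names are illustrative assumptions, not identifiers from the authors' code.

```python
# Reported HOGSA training configuration, gathered into one dict.
# Values come from the paper's quoted setup; key names are assumed.
HOGSA_CONFIG = {
    "hogs_iterations": 50_000,        # HOGS training, ~10 GB memory
    "srm_iterations": 150_000,        # SRM training, ~8 GB memory
    "pom_iterations_per_input": 200,  # POM optimization steps per initial input
    "loss_weights": {                 # λ hyperparameters, in the reported order
        "lambda_SSIM": 0.2,
        "lambda_R": 0.5,
        "lambda_C": 1.0,
        "lambda_H": 1.0,
        "lambda_P": 17.0,
        "lambda_1": 5.0,
        "lambda_VGG": 0.03,
    },
}
```
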