Capturing the Unseen: Vision-Free Facial Motion Capture Using Inertial Measurement Units
Authors: Youjia Wang, Yiwen Wu, Hengan Zhou, Hongyang Lin, Xingyue Peng, Jingyan Zhang, Yingsheng Zhu, YingWenQi Jiang, Yatu Zhang, Lan Xu, Jingya Wang, Jingyi Yu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that CAPUS reliably captures facial motion in conditions where visual-based methods struggle, including facial occlusions, rapid movements, and low-light environments. |
| Researcher Affiliation | Collaboration | ShanghaiTech University, Lumi Ani Technology, Deemos Technology |
| Pseudocode | No | The paper describes the methodology in text and through a network architecture diagram (Fig. 3) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Both the dataset and the code are released to the community for comprehensive evaluations. |
| Open Datasets | Yes | CAPUS introduces the first facial IMU dataset, encompassing both IMU and visual signals from participants engaged in diverse activities such as multilingual speech, facial expressions, and emotionally intoned auditions. ... Both the dataset and the code are released to the community for comprehensive evaluations. |
| Dataset Splits | No | The paper mentions training and testing phases and refers to a 'test set', but does not give specific split details (e.g., percentages or counts for the training, validation, and test portions of its IMU-ARKit dataset). An ablation study states 'Small Dataset: we train the network using 1/3 of the dataset.', but that describes an ablation condition, not the overall split. |
| Hardware Specification | Yes | We train and evaluate CAPUS on a single NVIDIA RTX3090 GPU. |
| Software Dependencies | No | The paper mentions using Adam as the optimizer but does not specify any programming languages, libraries, or frameworks with their version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions) used for implementation. |
| Experiment Setup | Yes | We use Adam as the optimizer with a learning rate of 2 × 10⁻⁴, β₁ = 0.9, β₂ = 0.999. We train and evaluate CAPUS on a single NVIDIA RTX3090 GPU. The training process takes 1 hour on all identities with paired data. ... T = 120 for both training and testing. To avoid jittering at inference, we set an overlap of 60 frames to ensure the network has sufficient prior information to accurately determine the initial state of the face within the time window. |
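The experiment-setup row quotes a window length of T = 120 frames with a 60-frame overlap at inference, plus Adam with a learning rate of 2 × 10⁻⁴. The paper does not specify a framework (see the Software Dependencies row), so the sketch below is a framework-agnostic, hypothetical illustration of that sliding-window scheme and the reported optimizer hyperparameters, not code from the CAPUS release.

```python
def sliding_windows(num_frames, window=120, overlap=60):
    """Yield (start, end) frame-index pairs for overlapping inference windows.

    The window length (120) and overlap (60) follow the setup quoted above;
    the helper itself is a hypothetical reconstruction for illustration.
    """
    stride = window - overlap
    start, last_end = 0, 0
    while start + window <= num_frames:
        last_end = start + window
        yield (start, last_end)
        start += stride
    # Cover any trailing frames with a final, right-aligned window.
    if window <= last_end < num_frames:
        yield (num_frames - window, num_frames)

# Optimizer hyperparameters as reported: Adam, lr = 2e-4, betas (0.9, 0.999).
ADAM_CONFIG = {"lr": 2e-4, "betas": (0.9, 0.999)}
```

With this scheme each new window shares its first 60 frames with the previous one, which is what gives the network "sufficient prior information" about the face's initial state and suppresses jitter at window boundaries.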