Gaussian-Based Instance-Adaptive Intensity Modeling for Point-Supervised Facial Expression Spotting

Authors: Yicheng Deng, Hideaki Hayashi, Hajime Nagahara

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on the SAMM-LV and CAS(ME)² datasets demonstrate the effectiveness of our proposed framework. Code is available at https://github.com/KinopioIsAllIn/GIM.
Researcher Affiliation | Academia | Yicheng Deng, Hideaki Hayashi, Hajime Nagahara; Osaka University; EMAIL, EMAIL
Pseudocode | No | The paper describes the steps for constructing Gaussian distributions (Step 1, Step 2, Step 3) but does not present them in a formal pseudocode or algorithm block.
Open Source Code | Yes | Code is available at https://github.com/KinopioIsAllIn/GIM.
Open Datasets | Yes | We follow the protocol of MEGC2021 and validate our method on two datasets: SAMM-LV (Yap et al., 2020) and CAS(ME)² (Qu et al., 2017).
Dataset Splits | Yes | We employ a leave-one-subject-out cross-validation strategy in the experiments.
Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments, such as specific GPU or CPU models.
Software Dependencies | No | The model is trained by the Adam optimizer (Kingma, 2015) on both datasets for 100 epochs with a learning rate of 2.0 × 10^-5 and a weight decay of 0.1. This mentions the Adam optimizer but does not specify software versions for libraries or frameworks like PyTorch or TensorFlow.
Experiment Setup | Yes | The model is trained by the Adam optimizer (Kingma, 2015) on both datasets for 100 epochs with a learning rate of 2.0 × 10^-5 and a weight decay of 0.1. The coefficient δ for duration estimation is set to 1.2. For the multi-stage training, the epochs for each stage are 1, 4, and 95, respectively. kc is set to 16 for ME and 32 for MaE, respectively. ks1 is set to 3 for MEs and 5 for MaEs; ks2 is set to 2 for MEs and 4 for MaEs. We set the loss weight λ to 0.1, 0.3, and 2.0 × 10^-5 for SAMM-LV, and to 0.1, 2.5, and 1.4 × 10^-4 for CAS(ME)², respectively. The threshold θ for estimating the rough duration of each expression proposal decreases linearly from 0.8 to 0.5 over 30 epochs and then remains at 0.5 until the end.
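
The threshold schedule and stage lengths quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustrative reading of the quoted text, not the authors' code: the function names theta_at_epoch and stage_for_epoch are hypothetical, epochs are assumed to be 0-indexed, and the decay is assumed to start at the first training epoch, which the quoted text does not specify.

```python
# Hypothetical helpers; names and 0-indexed epochs are assumptions.

STAGE_EPOCHS = (1, 4, 95)  # epochs per training stage, as quoted (100 in total)


def theta_at_epoch(epoch, start=0.8, end=0.5, decay_epochs=30):
    """Linearly decay the threshold from `start` to `end`, then hold at `end`.

    Assumes the decay begins at epoch 0; the source only states the range
    (0.8 to 0.5) and the decay length (30 epochs).
    """
    if epoch >= decay_epochs:
        return end
    return start + (end - start) * epoch / decay_epochs


def stage_for_epoch(epoch):
    """Return the 1-based training stage that a 0-indexed epoch falls into."""
    boundary = 0
    for stage, length in enumerate(STAGE_EPOCHS, start=1):
        boundary += length
        if epoch < boundary:
            return stage
    raise ValueError("epoch is beyond the total number of training epochs")


if __name__ == "__main__":
    for epoch in (0, 1, 5, 15, 30, 99):
        print(epoch, stage_for_epoch(epoch), round(theta_at_epoch(epoch), 3))
```

Under these assumptions the first stage covers only epoch 0, the second covers epochs 1 through 4, and the third covers the remaining 95 epochs, while θ reaches 0.5 at epoch 30 and stays there.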