RhythmMamba: Fast, Lightweight, and Accurate Remote Physiological Measurement

Authors: Bochao Zou, Zizheng Guo, Xiaocheng Hu, Huimin Ma

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that RhythmMamba achieves state-of-the-art performance with 319% throughput and 23% peak GPU memory, as illustrated in Fig. 2. Experiments were conducted on intra-dataset and cross-dataset scenarios, including intra-dataset evaluation on the PURE and UBFC datasets to validate the feasibility of the Mamba architecture.
Researcher Affiliation | Academia | 1 University of Science and Technology Beijing, Beijing, China; 2 China Academy of Electronics and Information Technology, Beijing, China
Pseudocode | No | The paper describes methods in prose and through diagrams (e.g., Figure 3 shows the framework of RhythmMamba), but it does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/zizheng-guo/RhythmMamba
Open Datasets | Yes | The experiments on remote physiological measurement were conducted on four publicly available datasets: PURE (Stricker, Müller, and Gross 2014), UBFC-rPPG (Bobbia et al. 2019), VIPL-HR (Niu et al. 2019), and MMPD (Tang et al. 2023).
Dataset Splits | Yes | For the evaluation of the PURE dataset, we followed the protocols outlined in (Lu, Han, and Zhou 2021), splitting the dataset sequentially into training and testing sets with a ratio of 6:4. Similarly, for the evaluation of the UBFC dataset, we followed the protocols in (Lu, Han, and Zhou 2021), selecting the first 30 samples as the training set and the remaining 12 samples as the testing set. For the VIPL-HR dataset, we followed the subject-exclusive 5-fold cross-validation protocol outlined in (Niu et al. 2019; Yu et al. 2022). For the MMPD dataset, following the protocols outlined in (Zou et al. 2024), the dataset was sequentially split into training, validation, and testing sets with a ratio of 7:1:2. The training dataset was sequentially split into training and validation sets with a ratio of 8:2.
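The sequential splits quoted above can be sketched as follows (a minimal illustration; the helper name and list-of-samples representation are assumptions, not the paper's code):

```python
def sequential_split(samples, ratios):
    """Split an ordered list of samples sequentially by the given ratios."""
    total = len(samples)
    parts, start = [], 0
    # All but the last ratio define cut points; the remainder goes to the last split
    for r in ratios[:-1]:
        end = start + round(total * r)
        parts.append(samples[start:end])
        start = end
    parts.append(samples[start:])
    return parts

# PURE protocol: sequential 6:4 train/test split (60 recordings assumed for illustration)
train, test = sequential_split(list(range(60)), (0.6, 0.4))

# MMPD protocol: sequential 7:1:2 train/val/test split (100 samples assumed)
tr, va, te = sequential_split(list(range(100)), (0.7, 0.1, 0.2))
```

Splitting sequentially (rather than randomly) keeps each subject's temporally adjacent segments in the same partition, which matches the quoted protocols.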
Hardware Specification | Yes | The experiment was conducted on an NVIDIA RTX 3090.
Software Dependencies | No | The proposed RhythmMamba was implemented based on PyTorch, and an open-source rPPG toolbox (Liu et al. 2023b) was utilized to conduct a fair comparison against several state-of-the-art methods. While software components are mentioned, specific version numbers for PyTorch or the rPPG toolbox are not provided.
Experiment Setup | Yes | In pre-processing, video inputs were divided into segments of 160 frames. Facial recognition was applied to the first frame of each segment, followed by cropping and resizing of the facial region; these adjustments were then maintained throughout the subsequent frames. In post-processing, a second-order Butterworth filter (cutoff frequencies: 0.75 and 2.5 Hz) was applied to filter the rPPG waveform, and the power spectral density was computed by the Welch algorithm for further heart rate estimation. Following the protocol outlined in (Yu et al. 2020), random upsampling, downsampling, and horizontal flipping were applied for data augmentation. Loss: a loss function integrating constraints from both the temporal and frequency domains (Yu et al. 2020) was employed. The negative Pearson correlation coefficient is utilized as the temporal constraint L_Time, while the cross-entropy between the power spectral density of the prediction and the HR derived from the power spectral density of the ground truth is employed as the frequency constraint L_Freq: L_Freq = CE(maxIndex(PSD(PPG_gt)), PSD(PPG_pred)), where PSD denotes power spectral density and maxIndex denotes the index of the maximum value. The overall loss is expressed as L_overall = a * L_Time + b * L_Freq.