PaPaGei: Open Foundation Models for Optical Physiological Signals
Authors: Arvind Pillai, Dimitris Spathis, Fahim Kawsar, Mohammad Malekzadeh
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate PAPAGEI against state-of-the-art time-series foundation models and self-supervised learning benchmarks across 20 tasks from 10 diverse datasets, spanning cardiovascular health, sleep disorders, pregnancy monitoring, and wellbeing assessment. Our model demonstrates superior performance, improving classification and regression metrics by 6.3% and 2.9% respectively in at least 14 tasks. Notably, PAPAGEI achieves these results while being more data- and parameter-efficient, outperforming models that are 70× larger. Beyond accuracy, we examine model robustness across different skin tones, establishing a benchmark for bias evaluation in future models. |
| Researcher Affiliation | Collaboration | Arvind Pillai², Dimitris Spathis¹,³, Fahim Kawsar¹,⁴, Mohammad Malekzadeh¹. ¹Nokia Bell Labs, Cambridge, UK; ²Dartmouth College, NH, USA; ³University of Cambridge, UK; ⁴University of Glasgow, Scotland, UK |
| Pseudocode | No | The paper describes its methodology using descriptive text, mathematical equations (1-5), and a flow diagram (Figure 2), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Models, data, and code are available at: github.com/nokia-bell-labs/papagei-foundation-model |
| Open Datasets | Yes | The model is pre-trained on over 57,000 hours of data, comprising 20 million unlabeled PPG segments from publicly available datasets. ...To our knowledge, PAPAGEI is the first open foundation model pre-trained on PPG signals, using 57,000 hours of data from 20 million signals sourced entirely from public datasets. ...Databases used: VitalDB (Lee et al., 2022), MIMIC-III (Johnson et al., 2016), MESA (Zhang et al., 2018; Chen et al., 2015), nuMoM2b (Facco et al., 2015), VV (Skin Tone) (Toye, 2023), PPG-BP (Liang et al., 2018a), SDB (Garde et al., 2014), ECSMP (Gao et al., 2021), WESAD (Schmidt et al., 2018), PPG-DaLiA (Reiss et al., 2019). |
| Dataset Splits | Yes | We initially split the in-domain and out-of-domain datasets into training, validation, and test sets using 80/10/10 and 60/20/20 ratios at the participant-level, respectively. Hyperparameter optimization is performed on the training set using nested cross-validation, thus the validation and test sets are merged for evaluation. |
| Hardware Specification | Yes | We set α = 0.6 and train on eight V100 GPUs for 15,000 steps (lr = 10⁻⁴), with PAPAGEI-P and PAPAGEI-S having 5M and 5.7M parameters, respectively, while previous works use model sizes of 3.3M (Abbaspourazad et al., 2023) (we study scaling in Section 5.2). |
| Software Dependencies | No | For model training, we primarily used PyTorch (Paszke et al., 2019). The NTXentLoss implementation was sourced from the PyTorch Metric Learning package. The paper cites PyTorch and names the PyTorch Metric Learning package, but does not provide version numbers for either. |
| Experiment Setup | Yes | We adopt a ResNet-style CNN encoder, following (Ding et al., 2024). ...Our model has 18 convolutional blocks, starting with a filter size of 32, which doubles every 4 blocks. The projection layer is a single FC layer, generating a 512-d embedding. In the PAPAGEI-S variant, the expert block (M1 & M2) uses three parallel FCNNs, each with two FC layers, resulting in a 128-d embedding. For augmentations, PAPAGEI-P uses cropping (0.50), negation (0.20), flipping (0.20), and scaling (0.40). PAPAGEI-S uses cropping (0.25) and Gaussian noise (0.25). ...We set α = 0.6 and train on eight V100 GPUs for 15,000 steps (lr = 10⁻⁴), with PAPAGEI-P and PAPAGEI-S having 5M and 5.7M parameters, respectively... We use a batch size of 128 for training... |
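The participant-level split quoted in the Dataset Splits row (80/10/10 for in-domain data) can be sketched as below. This helper is a hypothetical illustration of splitting by participant ID so that no participant leaks across partitions; it is not the authors' released code.

```python
import random

def participant_level_split(segments, train=0.8, val=0.1, seed=42):
    """Split (participant_id, segment) pairs at the participant level,
    so no participant appears in more than one partition (a sketch,
    not the PaPaGei authors' code)."""
    participants = sorted({pid for pid, _ in segments})
    rng = random.Random(seed)
    rng.shuffle(participants)
    n = len(participants)
    n_train = int(n * train)
    n_val = int(n * val)
    train_ids = set(participants[:n_train])
    val_ids = set(participants[n_train:n_train + n_val])
    splits = {"train": [], "val": [], "test": []}
    for pid, seg in segments:
        if pid in train_ids:
            splits["train"].append((pid, seg))
        elif pid in val_ids:
            splits["val"].append((pid, seg))
        else:
            splits["test"].append((pid, seg))
    return splits
```

With ten participants and the 80/10/10 ratios, eight participants land in training and one each in validation and test, regardless of how many segments each contributes.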
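The NT-Xent loss that the Software Dependencies row says was sourced from the PyTorch Metric Learning package can be written out compactly. The numpy function below is a minimal sketch of the objective for a batch of positive embedding pairs (z1[i], z2[i]); it illustrates the math only and is not the package's implementation.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) over a
    batch of N positive pairs; a numpy sketch of the objective."""
    z = np.concatenate([z1, z2], axis=0)              # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize
    sim = (z @ z.T) / temperature                     # scaled cosine sims
    np.fill_diagonal(sim, -np.inf)                    # drop self-similarity
    n = len(z1)
    # the positive of row i is row i+n, and vice versa
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim[np.arange(2 * n), pos] - np.log(np.exp(sim).sum(axis=1))
    return -log_prob.mean()
```

As a sanity check, near-identical pairs should score a lower loss than randomly matched ones, since their cosine similarity dominates the softmax denominator.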
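The augmentation probabilities quoted in the Experiment Setup row (cropping 0.50, negation 0.20, flipping 0.20, scaling 0.40 for PAPAGEI-P) can be wired into a simple stochastic pipeline. The sketch below is a hypothetical illustration of such a pipeline, not the released code; the crop-then-zero-pad step and the scaling range are assumptions, since the paper's quoted text does not specify them.

```python
import numpy as np

def augment_ppg(signal, rng, p_crop=0.5, p_negate=0.2,
                p_flip=0.2, p_scale=0.4):
    """Apply each augmentation independently with the probabilities
    reported for PAPAGEI-P (a hypothetical sketch, not released code)."""
    x = signal.copy()
    if rng.random() < p_crop:
        # crop a random half-length window, then zero-pad back to the
        # original length to keep a fixed input size (an assumption)
        start = rng.integers(0, len(x) // 2 + 1)
        cropped = x[start:start + len(x) // 2]
        x = np.pad(cropped, (0, len(x) - len(cropped)))
    if rng.random() < p_negate:
        x = -x                       # negate amplitude
    if rng.random() < p_flip:
        x = x[::-1].copy()           # reverse in time
    if rng.random() < p_scale:
        x = x * rng.uniform(0.5, 2.0)  # assumed scaling range
    return x
```

Applying the augmentations independently, rather than picking exactly one, matches how per-transform probabilities are typically used in contrastive pre-training pipelines.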