Scaling Wearable Foundation Models
Authors: Girish Narayanswamy, Xin Liu, Kumar Ayush, Yuzhe Yang, Xuhai Xu, Shun Liao, Jake Garrison, Shyam Tailor, Jacob Sunshine, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Samy Abdel-Ghaffar, Daniel McDuff
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, accelerometer, electrodermal activity, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM, a multimodal foundation model built on the largest wearable-signals dataset with the most extensive range of sensor modalities to date. Our results establish the scaling laws of LSM for tasks such as imputation, interpolation and extrapolation across both time and sensor modalities. Moreover, we highlight how LSM enables sample-efficient downstream learning for tasks including exercise and activity recognition. |
| Researcher Affiliation | Collaboration | Co-first and corresponding authors are marked; affiliations listed are 1. Google Research, 2. Google DeepMind, 3. University of Washington (EMAIL, EMAIL). |
| Pseudocode | No | The paper describes methods in regular paragraph text and figures, but does not include any structured pseudocode or algorithm blocks labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | The paper mentions building methods upon the Scenic project (Dehghani et al., 2022) and provides its GitHub link: 'github.com/google-research/scenic'. However, this refers to a third-party codebase used by the authors, not the explicit release of the source code for the methodology described in this paper (LSM). |
| Open Datasets | No | We support open science principles and the value of open data for scientific research; however, we have to balance these considerations with the privacy of the participants and protection of their health data. Although the training data could be de-identified, some of the data streams could not be fully anonymized. We recognize that the inability to share data of this kind is a limitation; however we believe that the presented results enable us to share valuable insights with the community. |
| Dataset Splits | Yes | The dataset was split 80-20, based on subjects, into train-test splits (132,072 subjects in training, 33,018 subjects in testing) as described in Table 2(a). |
| Hardware Specification | Yes | We pretrain our models on Google v5e TPUs with a total batch size of 4,096 across 50,000 training steps. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and building methods upon the Scenic project, implemented in JAX with Flax. However, it does not provide specific version numbers for these software components (e.g., JAX version, Flax version, or the specific AdamW implementation version) or any other libraries. |
| Experiment Setup | Yes | We pretrain our models on Google v5e TPUs with a total batch size of 4,096 across 50,000 training steps. The training process uses the AdamW optimizer with a base learning rate of 5e-3 and weight decay set to 1e-4. A linear warm-up schedule is applied for the first 2,500 steps, followed by a cosine learning rate decay to zero. All pretraining experiments use a 0.8 masking ratio (masking out random patches that cover 80% of the total input signals). |
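The split and setup rows above pin down three reproducible details: an 80/20 subject-level train/test split (132,072 / 33,018 of 165,090 subjects), a linear warmup followed by cosine decay to zero, and random patch masking at a 0.8 ratio. A minimal Python sketch of these pieces follows; the function names, the fixed seed, and the use of uniform random sampling are our own illustrative assumptions, not details taken from the paper.

```python
import math
import random

def subject_level_split(subject_ids, train_frac=0.8, seed=0):
    """Split by subject (not by sample), so no individual's data
    appears in both the train and test sets."""
    ids = sorted(subject_ids)
    rng = random.Random(seed)  # hypothetical fixed seed for reproducibility
    rng.shuffle(ids)
    n_train = int(len(ids) * train_frac)
    return ids[:n_train], ids[n_train:]

def lr_schedule(step, base_lr=5e-3, warmup_steps=2_500, total_steps=50_000):
    """Linear warmup to base_lr over the first 2,500 steps,
    then cosine decay to zero at step 50,000."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

def sample_masked_patches(num_patches, mask_ratio=0.8, seed=0):
    """Pick random patch indices to mask (80% of the input patches)."""
    rng = random.Random(seed)
    return set(rng.sample(range(num_patches), int(num_patches * mask_ratio)))

# 132,072 + 33,018 = 165,090 subjects in total, split 80/20 by subject.
train_ids, test_ids = subject_level_split(range(165_090))
```

With these parameters, `subject_level_split` reproduces the reported counts exactly (165,090 × 0.8 = 132,072), and `lr_schedule` rises linearly to 5e-3 at step 2,500 before decaying to zero at step 50,000.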