EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping

Authors: Sam Gijsen, Kerstin Ritter

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our multimodal models significantly improve over EEG-only models across four clinical evaluations and for the first time enable zero-shot classification as well as retrieval of both neural signals and reports."
Researcher Affiliation | Academia | "1Charité Universitätsmedizin Berlin, Department of Psychiatry and Psychotherapy, Berlin, Germany; 2Hertie Institute for AI in Brain Health, University of Tübingen, Germany. Correspondence to: Sam Gijsen <EMAIL>."
Pseudocode | No | The paper describes its methodology in prose and mathematical equations (Equations 1-13) but does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "We provide code and pretrained models at https://github.com/SamGijsen/ELM."
Open Datasets | Yes | "TUEG. The Temple University Hospital (TUH) EEG Corpus is the largest available corpus of hospital EEG data with varying montages, channel counts, and sampling frequencies (n=26846; Obeid & Picone, 2016). ... The data used in this study was provided by the Neural Engineering Data Consortium at Temple University. For further details about this data, please access the following URL: https://isip.piconepress.com/projects/tuh_eeg/html/."
Dataset Splits | Yes | "TUAB. ... Following the literature, we use the provided evaluation set as the hold-out test set. NMT. ... We use the provided train/test split. TUSZ. ... We perform binary classification using 5-fold cross-validation on the provided train and dev sets (n=6491), while testing on the eval set. TUEV. ... We only use the provided train set (5-fold CV). ... For linear evaluation, we train logistic regression models using 10-fold cross-validation for each pretrained model..."
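The split protocol quoted above (e.g., 5-fold CV on the n=6491 TUSZ train/dev pool) can be sketched with a minimal, library-free k-fold splitter. The paper itself uses standard tooling (sklearn); `kfold_indices` is a hypothetical helper name used only for illustration:

```python
def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for contiguous k-fold CV.

    Simplified sketch: no shuffling or stratification, which a real
    evaluation (e.g. sklearn's StratifiedKFold) would typically add.
    """
    # Distribute the remainder so fold sizes differ by at most one sample.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    idx = list(range(n))
    start = 0
    for size in fold_sizes:
        test = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        yield train, test
        start += size
```

Every sample appears in exactly one test fold, so the k held-out folds partition the dataset.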
Hardware Specification | Yes | "Models were trained on either an Nvidia GeForce RTX 3090 or Tesla V100 GPU and require less than 24GB of memory."
Software Dependencies | Yes | "We used CUDA v11.3 and PyTorch v1.12.1. For linear evaluation, we train logistic regression models using sklearn (Pedregosa et al., 2011). EEG data received minimal preprocessing (using MNE; Gramfort et al., 2013)."
Experiment Setup | Yes | "All models are pretrained using the LARS optimizer (You et al., 2017) with a cosine-decay learning-rate schedule over 50 epochs, with a warm-up of 4 epochs. The base learning rate is set to 0.3 for EEG-only, 0.01 for ELMs, and 0.06 for ELM-MIL, scaled with the batch size (Base LR × Batch Size / 256; Grill et al., 2020). ... We use a weight-decay parameter of 1 × 10⁻⁴. ... We set the temperature parameter τ to 0.3 for all further analyses. ... For the supervised learning baseline, we use the identical EEG encoder backbone as used for all other analyses and use 60-second crops. ... The Adam learning rate is set to 0.001 and we use the validation set to select weight decay out of [0.1, 0.01, 0.0001]. We use a batch size of 256 and train using the cross-entropy loss."
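The schedule described above (base LR linearly scaled by batch size / 256, a 4-epoch linear warm-up, then cosine decay over the remaining epochs) can be sketched in plain Python. `scaled_base_lr` and `lr_at_epoch` are our own illustrative names, not functions from the released code:

```python
import math

def scaled_base_lr(base_lr, batch_size):
    # Linear scaling rule quoted in the paper: Base LR * Batch Size / 256.
    return base_lr * batch_size / 256

def lr_at_epoch(epoch, base_lr, batch_size, total_epochs=50, warmup_epochs=4):
    """Per-epoch learning rate: linear warm-up followed by cosine decay.

    Sketch only; the paper applies this with the LARS optimizer, and
    real implementations usually step the schedule per iteration.
    """
    peak = scaled_base_lr(base_lr, batch_size)
    if epoch < warmup_epochs:
        # Ramp linearly from peak/warmup_epochs up to the peak LR.
        return peak * (epoch + 1) / warmup_epochs
    # Cosine decay from the peak toward zero over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * peak * (1 + math.cos(math.pi * progress))
```

For example, with `base_lr=0.3` and `batch_size=256` the schedule peaks at 0.3 at the end of warm-up and decays smoothly toward zero by epoch 50; doubling the batch size to 512 doubles the peak to 0.6.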