Geodesic Optimization for Predictive Shift Adaptation on EEG data

Authors: Apolline Mellot, Antoine Collas, Sylvain Chevallier, Alex Gramfort, Denis A. Engemann

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We performed empirical benchmarks on the cross-site generalization of age-prediction models with resting-state EEG data from a large multi-national dataset (HarMNqEEG), which included 14 recording sites and more than 1500 human participants. Compared to state-of-the-art methods, our results showed that GOPSA achieved significantly higher performance on three regression metrics (R², MAE, and Spearman's ρ) for several source-target site combinations, highlighting its effectiveness in tackling multi-source DA with predictive shifts in EEG data analysis.
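The three regression metrics named above can be computed with standard scientific Python tools. The following is a minimal sketch on synthetic data, not the authors' evaluation code; the helper name `regression_scores` and the simulated ages are hypothetical and serve only to illustrate the calls.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import mean_absolute_error, r2_score


def regression_scores(y_true, y_pred):
    """Return R², MAE, and Spearman's rho for one set of predictions."""
    rho, _ = spearmanr(y_true, y_pred)  # rank correlation; p-value is ignored
    return {
        "r2": r2_score(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
        "spearman_rho": float(rho),
    }


# Synthetic stand-in for true and predicted ages (hypothetical values).
rng = np.random.default_rng(0)
age = rng.uniform(5, 90, size=200)
pred = age + rng.normal(scale=5.0, size=200)

scores = regression_scores(age, pred)
perfect = regression_scores(age, age)
```

R² and MAE are sensitive to the scale of the errors, while Spearman's ρ only depends on the ranking of the predictions, so reporting all three separates calibration problems from ordering problems.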
Researcher Affiliation Collaboration Apolline Mellot, Antoine Collas: Inria, CEA, Université Paris-Saclay, Palaiseau, France, EMAIL EMAIL; Sylvain Chevallier: TAU Inria, LISN-CNRS, University Paris-Saclay, France, sylvain.chevallier@universite-paris-saclay.fr; Alexandre Gramfort: Inria, CEA, Université Paris-Saclay, Palaiseau, France, EMAIL; Denis A. Engemann: Roche Pharma Research and Early Development, Neuroscience and Rare Diseases, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland, EMAIL
Pseudocode Yes Algorithm 1: Train-Time GOPSA; Algorithm 2: Test-Time GOPSA
Open Source Code Yes The dataset HarMNqEEG [33] is in open access. We provide the code to reproduce the experiments from the raw data.
Open Datasets Yes The HarMNqEEG dataset [33] was used for our numerical experiments. This dataset includes EEG recordings collected from 1564 participants across 14 different study sites, distributed across 9 countries. In our analysis, we consider each study site as a distinct domain.
Dataset Splits Yes For each source-target combination we performed a stratified shuffle split approach with 100 repetitions on the target data. Stratification was based on the recording sites to ensure that each split contained a balanced proportion of participants from each site. The regularization parameter λ in Ridge regression was selected with a nested cross-validation (grid search) over a logarithmic grid of values from 10^-1 to 10^5. To evaluate the benefit of GOPSA, we compared it against four baselines.
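The splitting and model-selection scheme described in this row can be sketched with scikit-learn. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the source/target distinction is ignored, and the test fraction (0.5), the inner 5-fold cross-validation, and the 7-point grid are hypothetical choices. The grid bounds assume the logarithmic grid runs from 10^-1 to 10^5.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit

# Synthetic stand-in data: 4 "sites" with 30 participants each (hypothetical).
rng = np.random.default_rng(0)
n_subjects, n_features = 120, 10
X = rng.normal(size=(n_subjects, n_features))
y = X @ rng.normal(size=n_features) + rng.normal(scale=0.1, size=n_subjects)
site = np.repeat(np.arange(4), n_subjects // 4)

# 100 shuffle-split repetitions, stratified on the recording site so that
# every split keeps the same proportion of participants per site.
splitter = StratifiedShuffleSplit(n_splits=100, test_size=0.5, random_state=0)

# Logarithmic grid for the Ridge penalty (assumed bounds: 10^-1 to 10^5).
param_grid = {"alpha": np.logspace(-1, 5, 7)}

scores = []
for train_idx, test_idx in splitter.split(X, site):
    # Nested cross-validation: the inner grid search only sees the train part.
    search = GridSearchCV(Ridge(), param_grid, cv=5)
    search.fit(X[train_idx], y[train_idx])
    scores.append(search.score(X[test_idx], y[test_idx]))  # R² on held-out part

mean_r2 = float(np.mean(scores))
```

Passing the site labels as the stratification variable, rather than the regression target, is what keeps the site proportions balanced across repetitions.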
Hardware Specification Yes Experiments with 100 repetitions and all site combinations have been run on a standard Slurm cluster for 12 hours with 250 CPU cores.
Software Dependencies Yes Numerical computation was enabled by the scientific Python ecosystem: Matplotlib [27], Scikit-learn [42], NumPy [21], SciPy [54], PyTorch [41], PyRiemann [3], MNE [19] and SKADA [18]. Specifically, PyRiemann [3] is cited as "v0.3, July 2022" and SKADA [18] as "7 2024".
Experiment Setup Yes The regularization parameter λ in Ridge regression was selected with a nested cross-validation (grid search) over a logarithmic grid of values from 10^-1 to 10^5. In practice, we use L-BFGS and obtain the gradient using automatic differentiation through the Ridge solution that is plugged into the loss in (8).
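The idea of differentiating through a closed-form Ridge solution and optimizing with L-BFGS can be illustrated with PyTorch. The sketch below is a toy Euclidean stand-in and not GOPSA itself: the per-domain parameters `gamma`, the mean-shift features, the fixed penalty `lam`, and the plain mean squared error loss are all assumptions made for illustration, whereas the paper's loss (8) and its operations on covariance matrices are not reproduced here.

```python
import torch

torch.manual_seed(0)
dtype = torch.float64
n_domains, n_per_domain, n_features = 3, 40, 5
lam = 1.0  # Ridge penalty, fixed to a hypothetical value

# Synthetic data: each domain's features carry a domain-specific offset.
w_true = torch.randn(n_features, dtype=dtype)
offsets = 2.0 * torch.randn(n_domains, n_features, dtype=dtype)
Z = torch.randn(n_domains, n_per_domain, n_features, dtype=dtype)
y = (Z @ w_true).reshape(-1)
X_raw = Z + offsets[:, None, :]
domain_means = X_raw.mean(dim=1, keepdim=True)

# One free parameter per domain; sigmoid keeps the shift fraction in (0, 1).
gamma = torch.zeros(n_domains, dtype=dtype, requires_grad=True)


def loss_fn():
    alpha = torch.sigmoid(gamma)[:, None, None]
    # Partially re-center each domain (toy analog of a per-domain correction).
    X = (X_raw - alpha * domain_means).reshape(-1, n_features)
    # Closed-form Ridge solution, kept inside the autograd graph.
    A = X.T @ X + lam * torch.eye(n_features, dtype=dtype)
    w = torch.linalg.solve(A, X.T @ y)
    return torch.mean((y - X @ w) ** 2)


optimizer = torch.optim.LBFGS([gamma], max_iter=100, line_search_fn="strong_wolfe")


def closure():
    optimizer.zero_grad()
    loss = loss_fn()
    loss.backward()  # gradient flows through torch.linalg.solve
    return loss


initial_loss = loss_fn().item()
optimizer.step(closure)
final_loss = loss_fn().item()
```

Because the Ridge weights are recomputed from the current parameters at every evaluation, the optimizer only searches over `gamma`; the regression weights never appear as separate optimization variables.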