Population Transformer: Learning Population-level Representations of Neural Activity
Authors: Geeling Chau, Christopher Wang, Sabera Talukder, Vighnesh Subramaniam, Saraswati Soedarmadji, Yisong Yue, Boris Katz, Andrei Barbu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we find that our pretrained PopT outperforms commonly used aggregation approaches (Ghosal & Abbasi-Asl, 2021), and is competitive with end-to-end trained methods (Zhang et al., 2024; Yang et al., 2024; You et al., 2019). Moreover, we find that these benefits hold even for subjects not seen during pretraining, indicating its usefulness for new subject decoding. We also show that the pretrained PopT weights themselves reveal interpretable patterns for neuroscientific study. Finally, we demonstrate that our proposed framework is agnostic to the underlying temporal encoder, further allowing it to adapt to other neural recording modalities. Tables 1, 2 and Figures 2, 3, 4, 5, 6, 7 are provided to support these claims. |
| Researcher Affiliation | Academia | 1 California Institute of Technology EMAIL; 2 MIT CSAIL, CBMM EMAIL |
| Pseudocode | Yes | Algorithm 1 Connectivity measurement between channels i and j |
| Open Source Code | Yes | We release our code as well as a pretrained PopT to enable off-the-shelf improvements in multi-channel intracranial data decoding and interpretability. Code is available at https://github.com/czlwang/PopulationTransformer. |
| Open Datasets | Yes | iEEG: We use the publicly available subject data from Wang et al. (2024). EEG: We use the Temple University Hospital EEG and Abnormal datasets, TUEG and TUAB (Obeid & Picone, 2016), for pretraining and task data respectively. |
| Dataset Splits | Yes | We pretrain for 500,000 steps, and record the validation performance every 1,000 steps. Downstream evaluation takes place on the weights with the best validation performance. We use the intermediate representation at the [CLS] token dh = 512 and put a linear layer that outputs to dout = 1 for fine-tuning on downstream tasks. These parameters for pretraining were the same for any PopT that needed to be pretrained (across temporal embeddings, hold-one-out subject, ablation studies). ... For all downstream decoding, we use a fixed train/val/test split of 0.8, 0.1, 0.1 of the data. |
| Hardware Specification | Yes | To run all our experiments (data processing, pretraining, evaluations, interpretability), one only needs 1 NVIDIA Titan RTX (24GB GPU RAM). Pretraining PopT takes 2 days on 1 GPU. Our downstream evaluations take a few minutes to run each. For the purposes of data processing and gathering all the results in the paper, we parallelized the experiments on 8 GPUs. ... Table 5: PopT 1 NVIDIA TITAN RTX (24GB) |
| Software Dependencies | No | The paper mentions several libraries and optimizers by reference (e.g., LAMB optimizer (You et al., 2019), AdamW optimizer (Loshchilov & Hutter, 2017), MNE-Python (Gramfort et al., 2013), scikit-learn (Pedregosa et al., 2011), Nilearn (Nilearn contributors)), and a scheduler by a GitHub user (ildoonet, 2024), but it does not specify concrete version numbers for these software components as used in their experimental setup. |
| Experiment Setup | Yes | The core Population Transformer consists of a transformer encoder stack with 6 layers, 8 heads. All layers in the encoder stack are set with the following parameters: dh = 512, H = 8, and pdropout = 0.1. We pretrain the PopT model with the LAMB optimizer (You et al., 2019) (lr = 5e-4), with a batch size of nbatch = 256, and train/val/test split of 0.89, 0.01, 0.10 of the data. We pretrain for 500,000 steps, and record the validation performance every 1,000 steps. ... For PopT models, we train with these parameters: AdamW optimizer (Loshchilov & Hutter, 2017), lr = 5e-4 where transformer weights are scaled down by a factor of 10 (lr_t = 5e-5), nbatch = 128, a RampUp scheduler (ildoonet, 2024) with warmup 0.025 and StepLR gamma 0.95, reducing 100 times within the 2000 total steps that we train for. |
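The reported experiment setup (6-layer, 8-head encoder with dh = 512 and dropout 0.1; AdamW fine-tuning with transformer weights at one tenth of the base learning rate; StepLR with gamma 0.95) can be sketched in PyTorch. This is a hedged reconstruction, not the authors' released code: the class name `PopTSketch`, the feed-forward width (left at PyTorch's default), and the `step_size=20` (inferred from "reducing 100 times within the 2000 total steps") are assumptions.

```python
# Sketch of the PopT fine-tuning configuration as described in the report.
# NOT the authors' implementation; hyperparameters follow the quoted setup,
# everything else (e.g. feed-forward width) is an assumption.
import torch
import torch.nn as nn

D_H, N_HEADS, N_LAYERS, P_DROP = 512, 8, 6, 0.1  # dh, H, layers, pdropout

class PopTSketch(nn.Module):
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=D_H, nhead=N_HEADS, dropout=P_DROP, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=N_LAYERS)
        # Linear head on the [CLS] representation: dh = 512 -> dout = 1
        self.head = nn.Linear(D_H, 1)

    def forward(self, x):  # x: (batch, 1 + n_channels, 512); index 0 = [CLS]
        h = self.encoder(x)
        return self.head(h[:, 0])

model = PopTSketch()
# Two-tier fine-tuning learning rates: base lr = 5e-4,
# transformer weights scaled down 10x (lr_t = 5e-5).
optimizer = torch.optim.AdamW([
    {"params": model.encoder.parameters(), "lr": 5e-5},
    {"params": model.head.parameters(), "lr": 5e-4},
])
# StepLR with gamma = 0.95; step_size=20 assumed so the rate decays
# 100 times over the 2000 total fine-tuning steps.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.95)
```

The per-parameter-group dictionaries passed to `AdamW` are the standard PyTorch mechanism for assigning different learning rates to the pretrained encoder and the freshly initialized linear head.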