Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings

Authors: Di Wu, Siyuan Li, Chen Feng, Lu Cao, Yue Zhang, Jie Yang, Mohamad Sawan

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that H2DiLR, as a unified decoding paradigm, significantly outperforms the conventional heterogeneous decoding approach. Furthermore, we empirically confirm that H2DiLR effectively captures both homogeneity and heterogeneity during neural representation learning. We comprehensively evaluate the effectiveness of H2DiLR using stereoelectroencephalography (sEEG) data collected from multiple participants reading Mandarin materials comprising 407 syllables.
Researcher Affiliation | Academia | 1Westlake University, Hangzhou, China; 2Zhejiang University, Hangzhou, China. Corresponding authors at EMAIL and EMAIL.
Pseudocode | No | The paper describes methods using mathematical formulations and architectural diagrams (e.g., Figure 2, Figure 3) but does not include any clearly labeled pseudocode or algorithm blocks. For instance, the H2D quantization is formulated mathematically in Eq. (4) and the learning objective in Eq. (8), but these are not presented as pseudocode.
Open Source Code | No | The paper mentions open-source code for baselines: "For all baselines (Woo et al., 2022; Eldele et al., 2021; Wu et al., 2022) with open-source code, we reproduce results using the official code and setups provided by the authors." However, it does not provide any explicit statement or link for the code of the proposed H2DiLR methodology.
Open Datasets | No | For the main lexical tone decoding task, the paper states: "To evaluate H2DiLR, we collected stereoelectroencephalography (sEEG) data from multiple participants reading Mandarin materials comprising 407 syllables..." While an ablation study mentions "The publicly available CHB-MIT (Shoeb & Guttag, 2010) dataset," this dataset is used for a different task (epilepsy seizure prediction) and is not the primary focus of the paper. There is no concrete access information for the Mandarin sEEG dataset collected by the authors.
Dataset Splits | Yes | Data for each participant is divided into an 80% training set and a 20% testing set, with 20% of the training data further held out for validation.
Hardware Specification | Yes | All experiments are implemented in PyTorch and conducted on workstations with NVIDIA A100 GPUs.
Software Dependencies | No | The paper mentions "PyTorch" as the implementation framework and "AdamW" as the optimizer but does not provide specific version numbers for these or any other software libraries or tools used.
Experiment Setup | Yes | For all baselines with no pre-training involved, we train each model for a fixed number of 40 epochs. We use AdamW as the optimizer with a base learning rate of 3e-4. Cosine annealing decay is adopted for learning rate scheduling. We set momentum factors β1, β2 = 0.9, 0.999 with a weight decay of 0.01. The batch size is set to 32. A dropout ratio of 0.1 is adopted. For the pre-training stage of baselines with pre-training... we use the AdamW (Loshchilov & Hutter, 2019) optimizer with a base learning rate of 5e-5, β1, β2 = 0.9, 0.999, and a weight decay of 0.01. A cosine learning rate schedule is also adopted. We train for a fixed number of 1000 epochs during pre-training with batch size 32. For UPaNT and H2D, we use β = 0.25 and ν = 0.5 with a total codebook size of 256, i.e., K = 256 for UPaNT only and K = 32 for H2D. For the fine-tuning stage of all approaches with pre-training, we adopt the same training setup as previously described for baselines with no pre-training, with the only difference being a learning rate of 5e-5.
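The reported per-participant split (80% train / 20% test, with 20% of the training portion held out for validation) can be sketched as below. This is an illustrative reconstruction, not the authors' code: the shuffling scheme, the seed, and the `split_indices` helper name are assumptions.

```python
import numpy as np

def split_indices(n_samples, seed=0):
    """Sketch of the reported split: 80% train / 20% test,
    then 20% of the training portion reserved for validation.
    Shuffling and seed are assumptions, not from the paper."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(0.2 * n_samples))        # 20% test
    test = idx[:n_test]
    train_full = idx[n_test:]                   # remaining 80%
    n_val = int(round(0.2 * len(train_full)))   # 20% of training data
    val = train_full[:n_val]
    train = train_full[n_val:]
    return train, val, test

train, val, test = split_indices(1000)
# For 1000 samples this yields 640 train, 160 validation, 200 test.
```

Note the validation fraction is taken from the training portion, so the effective overall proportions are 64% / 16% / 20%.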