PPT: Patch Order Do Matters In Time Series Pretext Task

Authors: Jaeho Kim, Kwangryeol Park, Sukmin Yun, Seulki Lee

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "5 EXPERIMENTS. This section evaluates PPT as a pretext task in both self-supervised and supervised learning, demonstrating that enhancing patch order awareness improves model performance. In self-supervised scenarios, we employ linear probing, semi-supervised training, and full fine-tuning (Appendix K). In linear probing, the encoder model f is trained, followed by applying logistic regression to the frozen representation. In semi-supervised training, self-supervised training is performed with model f, followed by fine-tuning with a fraction of the dataset and a reduced learning rate (Dong et al., 2024). In the supervised context, we perform an ablation study on the consistency and contrastive learning methods, revealing their individual and combined effects on model performance."
Researcher Affiliation | Academia | 1 Ulsan National Institute of Science and Technology (UNIST); 2 Hanyang University ERICA
Pseudocode | Yes | "Algorithm 1: Pseudocode of Channel-Independent Patch Permutation in PyTorch style."
Open Source Code | Yes | https://github.com/eai-lab/PPT
Open Datasets | Yes | "Datasets. (1) EMOPain (Egede et al., 2020), from the UEA repository (Bagnall et al., 2018), is a pain classification task using a surface electromyography (sEMG) sensor with 30 channels. (2) PTB (Bousseljot et al., 1995) is a cardiogram (ECG) dataset incorporating 15 channels from 198 users. (3) Gilon (GL) is a large-scale smart insole-based human activity recognition (HAR) task (Kim et al., 2023) with 47,647 instances and 14 channels collected from 72 users. (4) SleepEEG (Kemp et al., 2000) is a univariate brainwave (EEG) dataset with 371,055 instances. (5) MS-HAR (Morris et al., 2014) has six sensors in an armband with 14,201 instances collected from 93 users."
Dataset Splits | Yes | "Gilon HAR (Kim et al., 2023). ... We used the original five-fold split given in the original paper. PTB (ECG) (Bousseljot et al., 1995). ... We adopted the same data splits as specified in their study. Setup. ... fine-tuning all parameters using cross-entropy loss (Wang et al., 2024b) with limited labeled training data: 10% and 1%."
Hardware Specification | Yes | "All experiments were conducted on two types of GPU: NVIDIA RTX 3090-24GB and NVIDIA RTX A6000-48GB."
Software Dependencies | Yes | "We used the PyTorch 2.0.1 and PyTorch Lightning 2.1.2 frameworks."
Experiment Setup | Yes | "Implementation of PPT. We apply PPT to two representative patch-based models: Transformer-based PatchTST (Nie et al., 2022) and linear-based PITS (Lee et al., 2023). We set permutation frequency γ = 1, 2, 3 for weak shuffle and γ = 10 or 40 for strong shuffle. As each dataset differs in length, we adjust the patch size s so that each sequence is divided into 10 to 40 non-overlapping patches. The detailed implementations and configurations are in Appendix D and Appendix F. F.5 Hyperparameters for supervised learning: the hyperparameter configurations for supervised training are in Tab. 7. F.6 Hyperparameters for self-supervised learning: the hyperparameter configuration for self-supervised linear probing is in Tab. 8."
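The Pseudocode row refers to the paper's Algorithm 1, channel-independent patch permutation in PyTorch style. The authors' exact pseudocode is in the paper; the NumPy sketch below only illustrates the general idea, under the assumption (ours, not the paper's) that the permutation frequency γ counts random patch-pair swaps performed independently per channel:

```python
import numpy as np

def channel_independent_patch_permutation(x, patch_size, gamma, rng=None):
    """Illustrative sketch, not the authors' Algorithm 1.

    x: array of shape (channels, length). Each channel is split into
    non-overlapping patches and permuted independently; `gamma` is assumed
    here to count random patch-pair swaps per channel.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_channels, length = x.shape
    n_patches = length // patch_size
    # Drop any trailing remainder, then view each channel as a stack of patches.
    out = x[:, : n_patches * patch_size].reshape(n_channels, n_patches, patch_size).copy()
    for c in range(n_channels):          # channel-independent: fresh swaps per channel
        for _ in range(gamma):
            i, j = rng.choice(n_patches, size=2, replace=False)
            out[c, [i, j]] = out[c, [j, i]]   # swap the two patches
    return out.reshape(n_channels, n_patches * patch_size)
```

The swaps reorder whole patches but never alter values inside a patch, which is what makes the permuted sequence usable as a "wrong order" view for the pretext task.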
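The Experiment Setup row says the patch size s is adjusted per dataset so each sequence divides into 10 to 40 non-overlapping patches. One naive way to pick such an s is to search for the largest exact divisor in that range; this is purely our reading of the setup, since the paper's actual per-dataset values are in its Appendix D and Appendix F:

```python
def choose_patch_size(seq_len, min_patches=10, max_patches=40):
    """Pick the largest patch size s such that seq_len splits exactly into
    between min_patches and max_patches non-overlapping patches.

    An assumed heuristic, not the paper's procedure.
    """
    for s in range(seq_len // min_patches, 0, -1):
        n = seq_len // s
        if min_patches <= n <= max_patches and seq_len % s == 0:
            return s
    raise ValueError("no patch size divides seq_len into the requested range")
```

For a sequence of length 300 this yields s = 30 (10 patches); for length 128 it yields s = 8 (16 patches), since no larger size divides 128 into at most 40 equal patches.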
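The Research Type row quotes the paper's linear-probing protocol: the encoder f is trained self-supervised, then logistic regression is fit on the frozen representation. A minimal self-contained sketch, with a plain gradient-descent logistic head and an arbitrary fixed callable `encode` standing in for the pretrained f (none of this is the authors' code):

```python
import numpy as np

def linear_probe(encode, X_train, y_train, X_test, lr=0.5, steps=500):
    """Linear probing sketch: `encode` is frozen; only a binary
    logistic-regression head (w, b) is trained on its representations."""
    Z, Z_test = encode(X_train), encode(X_test)   # encoder is never updated
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(steps):                        # gradient descent on BCE loss
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        grad = p - y_train
        w -= lr * Z.T @ grad / len(y_train)
        b -= lr * grad.mean()
    p_test = 1.0 / (1.0 + np.exp(-(Z_test @ w + b)))
    return (p_test > 0.5).astype(int)
```

Because only w and b receive gradients, probe accuracy directly measures how linearly separable the frozen representations are, which is why the paper uses it to compare pretext tasks.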