NuTime: Numerically Multi-Scaled Embedding for Large-Scale Time-Series Pretraining

Authors: Chenguo Lin, Xumeng Wen, Wei Cao, Congrui Huang, Jiang Bian, Stephen Lin, Zhirong Wu

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We study its transfer performance on a number of univariate and multivariate classification tasks, few shot learning, unsupervised clustering and anomaly detection benchmarks. Our method exhibits remarkable improvement against previous pretraining approaches and establishes the new state of the art, even compared with domain-specific non-learning-based methods.
Researcher Affiliation Collaboration Chenguo Lin (Peking University); Xumeng Wen, Wei Cao, Congrui Huang, Jiang Bian, Stephen Lin, Zhirong Wu (Microsoft Corporation)
Pseudocode No The paper describes methods and equations, such as for the Numerically Multi-scaled Embedding, but does not contain a dedicated pseudocode or algorithm block with structured steps.
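Since the paper gives no pseudocode for the numerically multi-scaled embedding, here is an illustrative sketch of the core idea as described in the setup (9 scales from 10^-4 to 10^4, spaced by factors of 10). The log-space triangular soft assignment below is our own assumption about how a scalar could be distributed over neighbouring scales, not the paper's exact formulation.

```python
import numpy as np

# 9 decade scales from 1e-4 to 1e4, as stated in the experiment setup.
SCALES = 10.0 ** np.arange(-4, 5)

def scale_weights(v, scales=SCALES, eps=1e-12):
    """Soft-assign |v| to its neighbouring decade scales in log10 space.

    The triangular kernel (width of one decade) is an illustrative choice;
    the weights could then mix per-scale learned embeddings of the value.
    """
    logv = np.log10(abs(v) + eps)
    logs = np.log10(scales)
    dist = np.abs(logs - logv)
    w = np.clip(1.0 - dist, 0.0, None)
    return w / (w.sum() + eps)

# A value of 1.0 sits exactly on the 10^0 scale (index 4).
w = scale_weights(1.0)
```

With this scheme a magnitude like 3.0 splits its weight between the 10^0 and 10^1 scales, so nearby magnitudes get similar embeddings regardless of absolute scale.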
Open Source Code Yes Code is available at: https://github.com/chenguolin/NuTime.
Open Datasets Yes To conduct large-scale representation learning, we collect pretraining data by combining existing datasets from various sources, yielding a dataset with over one million time-series sequences. ... (1) The UCR time series archive (Dau et al., 2019), (2) The UEA benchmark (Bagnall et al., 2018) and (3) eight additional datasets used in recent technical papers (Eldele et al., 2021b; Zhang et al., 2022) include: Epilepsy (Andrzejak et al., 2001), Sleep EEG (Kemp et al., 2000), HAR (Anguita et al., 2013), Gesture (Liu et al., 2009), FD-A (Lessmeier et al., 2016), FD-B (Lessmeier et al., 2016), ECG (Clifford et al., 2017) and EMG (Goldberger et al., 2000).
Dataset Splits Yes The original training and testing splits of these datasets are retained, and only the training portions are merged. ... For a fair comparison, we adopt the same training and test split as Zhang et al. (2022), and there are 60 and 13,559 samples in FD-B training and test dataset for classification benchmarking. ... Epilepsy (Andrzejak et al., 2001) ... we use the dataset split by Zhang et al. (2022), having 60 samples for training, 20 samples for validation, and 11,420 samples for testing.
Hardware Specification Yes The pretraining takes 6 hours on 4 V100 GPUs.
Software Dependencies No The paper mentions using a Transformer encoder and AdamW optimizer, but does not provide specific version numbers for any software libraries or dependencies (e.g., PyTorch, TensorFlow, Python version).
Experiment Setup Yes We adopt a 6-layer and 8-head standard Transformer encoder with fixed sinusoidal positional encoding (Vaswani et al., 2017) as the backbone for our experiments. It uses 128-dimensional latent vectors through all of its layers, with 512 dimensions for the MLP hidden layer size. The window size for input patches is 16. For the numerically multi-scaled embedding, we choose to use 9 scales, which range from 10^-4 to 10^4 by factors of 10. ... The learning rate is 2e-3 for a batch size of 2048. The model is trained for a total of 100 epochs with a linear learning rate warm-up in the first 10 epochs of training and a cosine learning rate decay scheduler (Loshchilov & Hutter, 2017) with an end rate of zero. For optimization, we use AdamW (Loshchilov & Hutter, 2018) with β1 = 0.9, β2 = 0.999 and a weight decay of 0.05. For pretraining, we simply choose the data augmentation of random resized crop for the BYOL objective. It randomly crops a sub-sequence from the original data between the range of 80% to 100%, and subsequently resizes the selected sub-sequence to a length of 512 using bilinear interpolation.
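The random-resized-crop augmentation described above (crop 80–100% of the sequence, then resize to length 512) can be sketched in PyTorch as follows; the function name and tensor layout are our own assumptions, since the paper describes the transform but not its implementation.

```python
import torch
import torch.nn.functional as F

def random_resized_crop(x, out_len=512, min_scale=0.8, max_scale=1.0):
    """Crop a random 80-100% sub-sequence and resize it back to out_len.

    x: tensor of shape (batch, channels, length). For 1-D sequences,
    the paper's "bilinear interpolation" corresponds to mode="linear"
    in F.interpolate.
    """
    _, _, length = x.shape
    scale = torch.empty(1).uniform_(min_scale, max_scale).item()
    crop_len = max(1, int(round(length * scale)))
    start = torch.randint(0, length - crop_len + 1, (1,)).item()
    crop = x[:, :, start:start + crop_len]
    return F.interpolate(crop, size=out_len, mode="linear",
                         align_corners=False)

aug = random_resized_crop(torch.randn(2, 1, 512))  # shape (2, 1, 512)
```

Two views produced this way from the same sequence would serve as the augmented pair for the BYOL objective.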