LLM-TS Integrator: Integrating LLM for Enhanced Time Series Modeling
Authors: Can Chen, Gabriel L. Oliveira, Hossein Sharifi-Noghabi, Tristan Sylvain
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across five mainstream TS tasks (short-term and long-term forecasting, imputation, classification, and anomaly detection) demonstrate the effectiveness of our framework. The experimental results are consistent across state-of-the-art traditional TS models and various LLMs, demonstrating the applicability of the LLM-TS Integrator framework regardless of the choice of methods. |
| Researcher Affiliation | Collaboration | Can (Sam) Chen (EMAIL), Mila - Quebec AI Institute; Gabriel L. Oliveira (EMAIL), RBC Borealis; Hossein Sharifi-Noghabi (EMAIL), RBC Borealis; Tristan Sylvain (EMAIL), RBC Borealis |
| Pseudocode | Yes | Algorithm 1: LLM-TS Integrator. Input: the TS dataset D, number of training iterations T. Output: trained TS model parameterized by θ. 1: /* Mutual Information Module */ 2: Train a traditional TS model (e.g., TimesNet) parameterized by θ using D. 3: Generate a text description t for TS sample x via a designed template. 4: Derive hidden representations h^m_θ(x) from the TS model and h^l(t) from the LLM. 5: while τ ≤ T - 1 do 6: Sample x, t, y from D, where y are the labels. 7: Optimize a discriminator model T_β to estimate mutual information as per Eq. (2). 8: /* Sample Reweighting Module */ 9: Process the sample loss l_o with the weighting net to produce dual weights as per Eqs. (3), (4). 10: Adopt bi-level optimization to update the weighting net following Eqs. (6), (7). 11: Re-calculate the dual weights using the updated weighting net per Eqs. (3), (4). 12: Calculate the overall loss to update the TS model as per Eq. (5). 13: end while 14: Return the trained TS model parameterized by θ. |
| Open Source Code | Yes | Our code is available at: https://github.com/BorealisAI/LLM-TS-Integrator |
| Open Datasets | Yes | In the realm of short-term forecasting, we utilize the M4 dataset (Spyros Makridakis, 2018), which aggregates univariate marketing data on a yearly, quarterly, and monthly basis. For long-term forecasting, we examine five datasets following (Zhou et al., 2023): ETT (Zhou et al., 2021a), Electricity (UCI, 2015), Traffic (PeMS, 2024), Weather (Wetterstation, 2024), and ILI (CDC, 2024). ... Specifically, we employ 10 diverse multivariate datasets sourced from the UEA Time Series Classification repository (Bagnall et al., 2018). ... We benchmark our method against five established anomaly detection datasets: SMD (Su et al., 2019), MSL (Hundman et al., 2018), SMAP (Hundman et al., 2018), SWaT (Mathur and Tippenhauer, 2016), and PSM (Abdulaal et al., 2021). |
| Dataset Splits | Yes | In the realm of short-term forecasting, we utilize the M4 dataset (Spyros Makridakis, 2018), which aggregates univariate marketing data on a yearly, quarterly, and monthly basis. For long-term forecasting, we examine five datasets following (Zhou et al., 2023): ETT (Zhou et al., 2021a), Electricity (UCI, 2015), Traffic (PeMS, 2024), Weather (Wetterstation, 2024), and ILI (CDC, 2024). We adhere to the TimesNet setting with an input length of 96. ... To simulate various degrees of missing data, we randomly obscure time points at proportions of {12.5%, 25%, 37.5%, 50%} following (Wu et al., 2023). |
| Hardware Specification | Yes | We detail the time cost of each component for ETTh1 and ETTm1 tasks, using a batch size of 32 on a 32G V100 GPU. |
| Software Dependencies | No | The paper mentions that LLMs such as LLaMA-3b, GPT-2, and BERT were used, but does not provide specific version numbers for programming languages, libraries (e.g., PyTorch, TensorFlow), or other software tools crucial for replication. |
| Experiment Setup | Yes | Following Shu et al. (2019), the weighting network comprises a two-layer MLP with a hidden size of 100, and we set the learning rate η2 for this network at 0.001. The learning rate η0 of the discriminator is set to 0.001 for the first epoch and then decreased to 0.0001 for the remaining epochs. |
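The mutual-information module in Algorithm 1 (step 7) trains a discriminator T_β to estimate the mutual information between the TS model's representation h^m_θ(x) and the LLM's representation h^l(t). The paper's Eq. (2) is not reproduced in this report, so the sketch below assumes a MINE-style Donsker-Varadhan lower bound; the class and dimension names are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn


class MIDiscriminator(nn.Module):
    """Discriminator T_beta estimating a lower bound on I(h_ts; h_llm).

    Assumption: Eq. (2) is a Donsker-Varadhan-style bound (as in MINE);
    the exact form used in the paper may differ.
    """

    def __init__(self, ts_dim: int, llm_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ts_dim + llm_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, h_ts: torch.Tensor, h_llm: torch.Tensor) -> torch.Tensor:
        # Scores on paired (joint) samples.
        joint = self.net(torch.cat([h_ts, h_llm], dim=-1))
        # Shuffle the LLM representations to approximate the product of marginals.
        h_llm_shuffled = h_llm[torch.randperm(h_llm.size(0))]
        marginal = self.net(torch.cat([h_ts, h_llm_shuffled], dim=-1))
        # Donsker-Varadhan lower bound: E_joint[T] - log E_marginal[exp(T)].
        return joint.mean() - torch.log(torch.exp(marginal).mean() + 1e-8)
```

In training, the discriminator's parameters would be updated to maximize this bound (e.g., by minimizing its negative with an optimizer at the learning rates quoted in the Experiment Setup row), while the bound itself serves as the MI term passed to the TS model's overall loss.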
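The sample-reweighting module (steps 9-12) uses a two-layer MLP with hidden size 100, per the Experiment Setup row, to map each sample's loss l_o to dual weights that combine the task loss and the MI term into the overall loss of Eq. (5). Since Eqs. (3)-(5) are not quoted here, the sigmoid split below is an illustrative assumption, and `overall_loss` is a hypothetical helper, not the authors' implementation.

```python
import torch
import torch.nn as nn


class WeightingNet(nn.Module):
    """Weighting net mapping a per-sample loss l_o to dual weights.

    The two-layer MLP with hidden size 100 follows the paper's stated setup;
    producing the pair as (w, 1 - w) via a sigmoid is an assumption about
    Eqs. (3)-(4), not the confirmed parameterization.
    """

    def __init__(self, hidden: int = 100):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),
        )

    def forward(self, sample_loss: torch.Tensor):
        # sample_loss: shape (batch,) of per-sample losses l_o.
        w = self.mlp(sample_loss.unsqueeze(-1)).squeeze(-1)
        return w, 1.0 - w  # dual weights for the two loss terms


def overall_loss(loss_task: torch.Tensor, loss_mi: torch.Tensor,
                 wnet: WeightingNet) -> torch.Tensor:
    """Eq. (5)-style weighted combination (sketch; the true form may differ)."""
    # Detach so the weighting net sees the loss as an input feature,
    # not a gradient path; it is updated separately via bi-level
    # optimization (Eqs. (6)-(7)), which is omitted from this sketch.
    w_task, w_mi = wnet(loss_task.detach())
    return (w_task * loss_task + w_mi * loss_mi).mean()
```

The bi-level step of Algorithm 1 would update the weighting net on held-out performance of a one-step-lookahead TS model, then recompute the dual weights before the final TS-model update.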