ST-FiT: Inductive Spatial-Temporal Forecasting with Limited Training Data

Authors: Zhenyu Lei, Yushun Dong, Jundong Li, Chen Chen

AAAI 2025

Reproducibility (Variable | Result | LLM Response)
Research Type | Experimental | "Extensive experiments verify the effectiveness of ST-FiT in multiple key perspectives. Experimental Evaluations. In this section, we aim to answer the following research questions. RQ1. How well can ST-FiT generalize to nodes with no available temporal data for training, compared to other existing alternatives? RQ2. How does the performance tendency of ST-FiT look like compared with other baselines when these models are trained on varying ratios of nodes with available temporal data? RQ3. How does each module of ST-FiT contribute to the overall performance? RQ4. How does the choice of hyper-parameters influence the performance of ST-FiT? In the following sections, we first present the experimental settings, followed by the answers to the proposed research questions."
Researcher Affiliation | Academia | Zhenyu Lei1, Yushun Dong2, Jundong Li1, Chen Chen1; 1University of Virginia, 2Florida State University; EMAIL, EMAIL
Pseudocode | No | The paper states "We present the complete algorithmic routine in Appendix.", which indicates pseudocode exists, but it is not provided within the main body of the text analyzed.
Open Source Code | Yes | Code: https://github.com/LzyFischer/InductiveST; extended version: https://arxiv.org/abs/2412.10912
Open Datasets | Yes | "Datasets. Following previous works (Li et al. 2023), we conduct experiments on three most commonly used real-world datasets PEMS03, PEMS04, and PEMS08, which are all public transport network datasets released by Caltrans Performance Measurement System (PeMS) (PeMS 2021)."
Dataset Splits | Yes | "Task Settings. For a fair comparison, we follow the dataset division along temporal dimensions in previous works (Jiang et al. 2023a), where datasets are split as 70% training, 20% validation and 10% inference in chronological order."
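The chronological 70%/20%/10% split described above can be sketched as follows; this is a minimal illustration with a toy array, and the function name and ratios-as-arguments are my own, not from the paper's released code:

```python
import numpy as np

def chronological_split(series: np.ndarray, train: float = 0.7, val: float = 0.2):
    """Split a time series into train/val/test chunks in chronological order
    (70% / 20% / 10% by default), rather than shuffling time steps."""
    n = len(series)
    t_end = int(n * train)            # end of the training chunk
    v_end = t_end + int(n * val)      # end of the validation chunk
    return series[:t_end], series[t_end:v_end], series[v_end:]

data = np.arange(100)  # toy stand-in for a sensor's readings
train, val, test = chronological_split(data)
print(len(train), len(val), len(test))  # 70 20 10
```

Splitting chronologically (instead of randomly) avoids leaking future observations into training, which matters for forecasting benchmarks.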
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper mentions various models and frameworks (e.g., STGNNs, VAE, Gumbel-Softmax, FC-LSTM, STGCN, STGODE, TransGTR) but does not provide specific version numbers for any software libraries or tools used.
Experiment Setup | Yes | "Following previous works, we generate training samples through a sliding window of 24 time steps, with the first 12 as model input, and the remaining 12 as ground truth for forecasting outcomes. Accordingly, we compare the average performance on the MAE, RMSE, and MAPE metrics. All experiments have been repeated with 3 different random seeds. For the value of λ, we choose it from the range between 0 and 0.5. With above experiments, we recommend using λ as 0.5, ϵ as 0.9 for optimal performance."
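The sample generation and evaluation metrics quoted above can be sketched as follows; this is a hypothetical reconstruction (function names and the toy series are my own assumptions), not the paper's released implementation:

```python
import numpy as np

def sliding_window_samples(series: np.ndarray, window: int = 24, horizon: int = 12):
    """Slide a 24-step window over the series: the first 12 steps become the
    model input, the remaining 12 the forecasting ground truth."""
    xs, ys = [], []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        xs.append(chunk[:window - horizon])   # input: first 12 steps
        ys.append(chunk[window - horizon:])   # target: last 12 steps
    return np.stack(xs), np.stack(ys)

# Standard forecasting metrics named in the paper.
def mae(pred, true):  return np.mean(np.abs(pred - true))
def rmse(pred, true): return np.sqrt(np.mean((pred - true) ** 2))
def mape(pred, true): return np.mean(np.abs((pred - true) / true)) * 100.0

series = np.arange(1.0, 31.0)  # 30 toy time steps for one node
x, y = sliding_window_samples(series)
print(x.shape, y.shape)  # (7, 12) (7, 12)
```

A 30-step series yields 30 - 24 + 1 = 7 overlapping samples; in the actual datasets the window slides over tens of thousands of 5-minute readings per sensor.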