reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Motif-aware Graph Neural Networks for Networked Time Series Imputation

Authors: Nourhan Ahmed, Vijaya Krishna Yalavarthi, Lars Schmidt-Thieme

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results demonstrate that when compared to state-of-the-art models for time-series imputation tasks, our proposed model can reduce the error by around 19%. Empirical results show that Motif-GNN outperforms baseline models by approximately 19% for time series imputation across various real-world datasets. Evaluation is conducted using Mean Absolute Error (MAE) and Mean Relative Error (MRE) as metrics.
Researcher Affiliation	Collaboration	(1) Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Germany; (2) VWFS Data Analytics Research Center
Pseudocode	Yes	Algorithm 1: The Framework of Motif-GNN.
Open Source Code	No	No information about open-source code is provided in the paper. There are no links to repositories or explicit statements about code release.
Open Datasets	Yes	The air quality dataset comprises observations from 36 monitoring sites (Yi et al. 2016), collected between May 1, 2014, and April 30, 2015, totaling 8,759 timestamps (Yi et al. 2016; Zheng et al. 2015). Traffic Datasets. The traffic datasets used in this paper are PeMS-LA, PeMS-BA, and PeMS-SD, collected from January 1 to March 30, 2022, from highways in Los Angeles County, the Bay Area, and San Diego, respectively (Chen et al. 2001).
Dataset Splits	Yes	Under the MCAR scenario, datasets are split into 70% for training, 10% for validation, and 20% for testing. In the block-missing scenario, we simulate overlapping time- and space-structured (spatio-temporal) missingness by randomly dropping 5% of the data per sensor and simulating failures with a 0.15% probability, varying the duration from 1 to 4 hours for traffic data and 2 to 6 days for air quality data, following the evaluation protocol in (Cini, Marisca, and Alippi 2022). Additionally, to ensure consistency with baseline models, we consider time-structured missingness in the air quality dataset by partitioning the 1-year data into two parts, using the months of March, June, September, and December as the test set and the rest for training (Yi et al. 2016; Cini, Marisca, and Alippi 2022; Wang et al. 2023a). For traffic datasets, we used the first half of February and the last half of March for testing and the rest for training.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments.
Software Dependencies	No	The paper does not provide specific software dependency details with version numbers, such as programming language versions or library versions (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup	No	The paper does not provide specific experimental setup details such as concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or optimizer settings in the main text.
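
The evaluation protocol quoted in the Dataset Splits and Research Type rows can be illustrated with a minimal sketch. This is not the authors' code (none is released, per the Open Source Code row). It assumes the 70/10/20 split is taken chronologically along the time axis, that traffic readings arrive every 5 minutes (so a 1 to 4 hour failure spans 12 to 48 steps), and that MRE is the sum of absolute errors divided by the sum of absolute true values. All function and parameter names are hypothetical.

```python
import random


def chronological_split(n_steps, train=0.7, val=0.1):
    """Split time indices into train/val/test ranges (assumed chronological)."""
    n_train = round(n_steps * train)
    n_val = round(n_steps * val)
    return (range(0, n_train),
            range(n_train, n_train + n_val),
            range(n_train + n_val, n_steps))


def block_missing_mask(n_steps, n_sensors, p_point=0.05, p_fault=0.0015,
                       min_len=12, max_len=48, seed=0):
    """Return mask[t][s] == True where a value is hidden for evaluation.

    p_point: fraction of readings dropped at random per sensor (5%).
    p_fault: per-step probability that a sensor failure starts (0.15%).
    min_len, max_len: failure duration in steps (assumed 5-minute sampling).
    """
    rng = random.Random(seed)
    # Point missingness: each reading is dropped independently.
    mask = [[rng.random() < p_point for _ in range(n_sensors)]
            for _ in range(n_steps)]
    # Block missingness: a failure hides a contiguous run of readings.
    for t in range(n_steps):
        for s in range(n_sensors):
            if rng.random() < p_fault:
                length = rng.randint(min_len, max_len)
                for u in range(t, min(t + length, n_steps)):
                    mask[u][s] = True
    return mask


def masked_mae_mre(y_true, y_pred, mask):
    """Compute (MAE, MRE) over the hidden entries only."""
    abs_err, abs_true, count = 0.0, 0.0, 0
    for row_true, row_pred, row_mask in zip(y_true, y_pred, mask):
        for yt, yp, hidden in zip(row_true, row_pred, row_mask):
            if hidden:
                abs_err += abs(yt - yp)
                abs_true += abs(yt)
                count += 1
    return abs_err / count, abs_err / abs_true
```

For the air quality data, the same mask function could be reused with `min_len` and `max_len` set to the number of steps in 2 and 6 days at that dataset's sampling interval, which the quoted text does not state.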