Topological Zigzag Spaghetti for Diffusion-based Generation and Prediction on Graphs

Authors: Yuzhou Chen, Yulia Gel

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 EXPERIMENTS Datasets, Baselines, and Experimental Setup. We evaluate our ZS-based graph learning models on spatio-temporal prediction tasks on 2 traffic datasets, i.e., PeMSD3 and PeMSD8, which are real-time traffic datasets from California (Guo et al., 2019; Song et al., 2020a), where nodes denote sensors and edges represent the intersections between the nodes. ... Findings. Table 1 compares forecasting performance on spatio-temporal graphs between our ZS-DM and state-of-the-art baselines (6 probabilistic methods) on PeMSD3 and PeMSD8. We find that ZS-DM leads performance under all scenarios, with gains up to 14% in MAPE and up to 7.5% in RMSE.
Researcher Affiliation | Academia | Yuzhou Chen (Department of Statistics, University of California, Riverside); Yulia R. Gel (Department of Statistics, Virginia Tech)
Pseudocode | No | The paper describes methods using mathematical formulations and descriptive text, but it does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and data can be accessed under https://github.com/zigzagspaghetti/zigzag_spaghetti_dm.git.
Open Datasets | Yes | We evaluate our ZS-based graph learning models on spatio-temporal prediction tasks on 2 traffic datasets, i.e., PeMSD3 and PeMSD8, which are real-time traffic datasets from California (Guo et al., 2019; Song et al., 2020a)... In addition, we validate our ZS-DM on unsupervised graph representation learning classification tasks using the following 8 real-world chemical compounds and protein molecules: (i) 3 chemical compound datasets: MUTAG, BZR, and COX2... and (ii) 5 molecular compound datasets: NCI1, D&D, PROTEINS, PTC_MR, and PTC_FM... We also adopt a larger graph dataset ogbg-molhiv from Open Graph Benchmark (OGB) (Hu et al., 2020).
Dataset Splits | Yes | For these 8 graphs, we follow the training principle (You et al., 2020) and use 10-fold cross-validation accuracy as the classification performance (based on a non-linear SVM model, i.e., LIBSVM (Chang & Lin, 2011)) and report the mean and standard deviation. ... For ogbg-molhiv data, the task is to predict a certain molecular property, measured in terms of Receiver Operating Characteristic Area Under Curve (ROC-AUC) scores, and we follow the official scaffold splitting where structurally different molecules are separated into different subsets. ... We also perform experiments on reducing the training set from 90% to 80% and 70% (see Appendix D for further details).
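The 10-fold cross-validation protocol quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: scikit-learn's `SVC` (which wraps LIBSVM) stands in for the paper's LIB-SVM classifier, and random features stand in for the learned graph embeddings.

```python
# Sketch of 10-fold CV with a non-linear SVM, reporting mean and std
# accuracy as described in the review row above. The embeddings and
# labels here are random placeholders, not real data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(188, 64))       # placeholder graph-level embeddings
y = rng.integers(0, 2, size=188)     # placeholder binary class labels

clf = SVC(kernel="rbf", C=1.0)       # non-linear SVM backed by LIBSVM
scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
print(f"accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```

With real embeddings, `X` would come from the trained ZS-DM model and the reported numbers would be the per-dataset mean and standard deviation over the 10 folds.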
Hardware Specification | Yes | We implement our proposed ZS-DM with the PyTorch framework on two NVIDIA RTX A5000 GPUs with 24 GB RAM.
Software Dependencies | No | The paper mentions the 'PyTorch framework' and the 'dionysus2 package in Python' but does not provide specific version numbers for these software components.
Experiment Setup | Yes | For spatio-temporal traffic forecasting, we set the number of epochs trained, batch size, and weight decay ratio to be 200, 16, and 0.9, respectively. We search the initial learning rate of the model among {0.001, 0.003, 0.005, 0.01}, the number of convolution layers among {1, 2, 3}, and the dimension of hidden units among {16, 32, 64, 128, 512}. For the mixed-up graph construction, we set the number of steps of a random walk to be τ = 2.
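The search space quoted above is small enough to enumerate exhaustively. Only the grids and fixed settings come from the paper; the variable names and enumeration style are illustrative.

```python
# Enumerate the hyperparameter grid described in the experiment setup.
from itertools import product

EPOCHS, BATCH_SIZE, WEIGHT_DECAY = 200, 16, 0.9  # fixed settings from the paper
TAU = 2                                          # random-walk steps for mixed-up graphs

grid = {
    "lr": [0.001, 0.003, 0.005, 0.01],           # initial learning rate
    "num_layers": [1, 2, 3],                     # convolution layers
    "hidden_dim": [16, 32, 64, 128, 512],        # hidden-unit dimension
}

# Cartesian product of all grid axes: 4 * 3 * 5 = 60 candidate configurations.
configs = [dict(zip(grid, vals)) for vals in product(*grid.values())]
print(len(configs))  # 60
```

Each entry of `configs` would parameterize one training run of the hypothetical search loop.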