TGB-Seq Benchmark: Challenging Temporal GNNs with Complex Sequential Dynamics

Authors: Lu Yi, Jie Peng, Yanping Zheng, Fengran Mo, Zhewei Wei, Yuhang Ye, Zixuan Yue, Zengfeng Huang

ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Benchmarking experiments reveal that current methods usually suffer significant performance degradation and incur substantial training costs on TGB-Seq, posing new challenges and opportunities for future research. Comprehensive evaluations on TGB-Seq reveal that existing temporal GNNs experience substantial performance declines compared to their impressive results on existing benchmarks.
Researcher Affiliation | Collaboration | 1 Renmin University of China; 2 Université de Montréal; 3 Huawei Poisson Lab, Huawei Technology Ltd.; 4 Fudan University, Shanghai Innovation Institute
Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations, such as for the memory module and aggregation module, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | TGB-Seq datasets, leaderboards, and example codes are available at https://tgb-seq.github.io/. All code is publicly available on the TGB-Seq GitHub repository. We provide a Python package available via pip, enabling seamless dataset downloading, negative sample generation, and evaluation.
Open Datasets | Yes | TGB-Seq comprises large real-world datasets spanning diverse domains, including e-commerce interactions, movie ratings, business reviews, social networks, citation networks and web link networks. TGB-Seq datasets, leaderboards, and example codes are available at https://tgb-seq.github.io/. The TGB-Seq datasets are available at Hugging Face: https://huggingface.co/TGB-Seq.
Dataset Splits | Yes | We split the datasets chronologically into training, validation, and test sets with a ratio of 70%/15%/15%.
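The chronological 70%/15%/15% split described above can be sketched as follows. This is a minimal illustration, not the paper's actual code; the helper name `chronological_split` and the ratio handling are assumptions.

```python
import numpy as np

def chronological_split(timestamps, ratios=(0.70, 0.15, 0.15)):
    """Split event indices into train/val/test sets by time order.

    Hypothetical helper illustrating a chronological split: events are
    sorted by timestamp, then the oldest 70% go to training, the next
    15% to validation, and the most recent 15% to testing.
    """
    order = np.argsort(timestamps, kind="stable")  # oldest events first
    n = len(order)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]
    return train, val, test
```

Because the split is by time rather than random, every validation and test interaction occurs strictly after the training interactions, which matches the temporal link-prediction setting.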
Hardware Specification | Yes | For the ML-20M and the Flickr datasets, experiments are conducted on an Ubuntu machine equipped with Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz. The GPU device is NVIDIA A100 with 80 GB memory. For the Taobao dataset, experiments are conducted on an Ubuntu machine equipped with Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz. The GPU device is NVIDIA A100-SXM4 with 80 GB memory. For the Yelp dataset, experiments are conducted on an Ubuntu machine equipped with Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz. The GPU device is NVIDIA RTX A6000 with 40 GB memory. For the Google Local, the Patent, and the WikiLink datasets, experiments are conducted on an Ubuntu machine equipped with Hygon C86 7390 32-core Processor. The GPU device is NVIDIA A800 with 80 GB memory. For the YouTube dataset, experiments are conducted on an Ubuntu machine equipped with Intel(R) Xeon(R) Platinum 8369B CPU @ 2.90GHz. The GPU device is A100-SXM4 with 80 GB memory.
Software Dependencies | No | The paper mentions using the 'DyGLib (Yu et al., 2023) framework' and 'SBERT' without providing specific version numbers for either. No other software dependencies are listed with version details.
Experiment Setup | Yes | We set the batch size to 200 for the Google Local dataset across all methods... we increase the batch size to 400 for all other datasets... Following DyGFormer, we use a learning rate of 0.0001 across all methods and datasets. A grid search is performed to tune the hyper-parameters of each method on the validation set. Detailed configurations are provided in Appendix D.1.
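The setup above fixes the learning rate (0.0001) and batch size (200 for Google Local, 400 elsewhere) and grid-searches the remaining hyper-parameters on the validation set. A minimal sketch of that procedure, assuming a generic `train_eval(config) -> validation score` callback (the function names and grid contents here are hypothetical, not from the paper):

```python
import itertools

LEARNING_RATE = 1e-4  # fixed across all methods and datasets, per the paper

def batch_size_for(dataset):
    # 200 for Google Local, 400 for all other datasets (per the paper's setup)
    return 200 if dataset == "GoogleLocal" else 400

def grid_search(train_eval, grid):
    """Exhaustively evaluate every hyper-parameter combination in `grid`
    and return the configuration with the best validation score."""
    best_cfg, best_score = None, float("-inf")
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        cfg["lr"] = LEARNING_RATE  # learning rate is not tuned
        score = train_eval(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

The exhaustive product over the grid is what makes the search reproducible: given the same grid and validation set, every run evaluates the same configurations.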