GraphGPT: Generative Pre-trained Graph Eulerian Transformer

Authors: Qifang Zhao, Weidong Ren, Tianyu Li, Hong Liu, Xingsheng He, Xiaoxiao Xu

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on OGB datasets demonstrate GraphGPT's superiority: it achieves SOTA results in graph- and edge-level tasks (e.g., molecular property prediction on PCQM4Mv2 and protein-protein interaction on ogbl-ppa), while delivering competitive performance in node-level tasks. ... 3.5. Ablation Study
Researcher Affiliation Industry Alibaba Inc., Hangzhou, China. Correspondence to: Qifang Zhao <EMAIL>, Xiaoxiao Xu <EMAIL>.
Pseudocode No The paper describes the
Open Source Code Yes To advance research in graph foundation models and facilitate scientific discovery in chemistry, materials science, and related fields, we have released the source code[1] and model checkpoints[2]. [1] https://github.com/alibaba/graph-gpt
Open Datasets Yes To demonstrate its versatility across graph tasks, we select benchmarks for graph-, edge-, and node-level objectives: Graph-level: PCQM4Mv2 (quantum chemistry), ogbgmolpcba (molecular property prediction) and Triangles (triangles counting). Edge-level: ogbl-ppa (protein-protein associations) and ogbl-citation2 (citation networks). Node-level: ogbn-proteins (protein interaction networks) and ogbn-arxiv (paper categorization).
Dataset Splits Yes To evaluate GraphGPT's ability to learn structural patterns through generative pre-training, we use the Triangles dataset with the task of counting triangles. The dataset is split into: 1) Training/Validation: 30k and 5k small graphs (≤ 25 nodes); 2) Testing: 5k small graphs (Test-small) and 5k large graphs (25–100 nodes, Test-large).
Hardware Specification Yes The models are pre-trained and fine-tuned on A800-80G GPU clusters using DeepSpeed's Stage-2 strategy with mixed precision (FP16/FP32) or BF16 (Rasley et al., 2020). We employ the AdamW optimizer (Loshchilov & Hutter, 2019) with a learning rate scheduler. ... Table 18. Computational cost details of the main datasets in the paper. PT means pre-training and FT stands for fine-tuning. Time is measured in hours. The model size is Base as in Tab. 11, with about 110M parameters. The corresponding hyper-parameters can be found in Tab. 13, 14, 15, 16. Example row: dataset ogbl-ppa | model size B | PT time 58.73 h | FT time 112.62 h | GPU-PT 8 Nvidia L20 | GPU-FT 16 V100-32G
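The quoted training setup (DeepSpeed ZeRO Stage-2 with FP16/BF16 mixed precision and an AdamW optimizer) can be expressed as a DeepSpeed-style configuration. The sketch below is illustrative only: the key names follow DeepSpeed's documented JSON config schema, but the numeric values (batch size, learning rate, weight decay) are assumptions, not the paper's actual hyper-parameters (those are in Tab. 13–16).

```python
# Hedged sketch of a DeepSpeed ZeRO Stage-2 mixed-precision config of the
# kind the quoted setup describes. Values are illustrative placeholders.
import json

ds_config = {
    "train_micro_batch_size_per_gpu": 32,   # illustrative, not the paper's
    "zero_optimization": {"stage": 2},      # ZeRO Stage-2 partitioning
    "fp16": {"enabled": True},              # or "bf16": {"enabled": True}
    "optimizer": {
        "type": "AdamW",
        "params": {"lr": 3e-4, "weight_decay": 0.01},  # illustrative
    },
}
print(json.dumps(ds_config, indent=2))
```

Such a dictionary would typically be passed to `deepspeed.initialize` (or written to a JSON file referenced by the launcher) alongside the model.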
Software Dependencies No The implementation uses Py Torch as the primary framework. For graph preprocessing tasks such as subgraph sampling, we utilize torch-geometric (Fey & Lenssen, 2019). When required, we employ Network X (Hagberg et al., 2008) to Eulerize (sub)graphs and identify (semi-)Eulerian paths. ... We employ a transformer architecture based on Llama (Touvron et al., 2023), implemented via the Hugging Face Transformers library (Wolf et al., 2020).
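The Eulerian-path serialization mentioned above is the step that turns a graph into a token sequence a transformer can consume: a (semi-)Eulerian path visits every edge exactly once, so the resulting node sequence encodes the full edge set. The paper uses NetworkX (`nx.eulerize`, `nx.eulerian_path`) for this; the dependency-free Hierholzer-style routine below is an illustrative stand-in for the idea, not the authors' implementation, and assumes a connected graph.

```python
# Minimal sketch: find a (semi-)Eulerian path in an undirected connected
# graph, i.e., a node sequence that traverses every edge exactly once.
from collections import defaultdict

def eulerian_path(edges):
    """Return a node sequence using each undirected edge exactly once,
    or None if no Eulerian path exists (assumes the graph is connected)."""
    adj = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    # An Eulerian path exists iff 0 or 2 nodes have odd degree.
    odd = [n for n in adj if len(adj[n]) % 2 == 1]
    if len(odd) not in (0, 2):
        return None
    start = odd[0] if odd else next(iter(adj))
    used = [False] * len(edges)
    stack, path = [start], []
    while stack:  # Hierholzer's algorithm, iterative form
        u = stack[-1]
        while adj[u] and used[adj[u][-1][1]]:
            adj[u].pop()  # discard the mirror copy of an edge already walked
        if adj[u]:
            v, i = adj[u].pop()
            used[i] = True
            stack.append(v)
        else:
            path.append(stack.pop())
    return path[::-1]

# Path graph 0-1-2-3: two odd-degree endpoints, so semi-Eulerian.
print(eulerian_path([(0, 1), (1, 2), (2, 3)]))  # -> [0, 1, 2, 3]
```

When a graph has more than two odd-degree nodes, Eulerizing it (duplicating a few edges, as `nx.eulerize` does) restores the degree condition so a single traversal still covers every edge.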
Experiment Setup Yes Table 12. Pre-train and fine-tune configurations for the PCQM4M-v2 dataset. LSI means layer-scale-initialization, EMA is exponential moving average, MPE stands for max-position-embedding, and TWE means tie-word-embeddings. ... Table 13. The Pre-training and fine-tuning configurations for the ogbl-ppa dataset. ... Table 14. Pre-train and fine-tune configurations for the ogbl-citation2 dataset. ... Table 15. Configurations of pre-training with SMTP and fine-tuning for the ogbn-proteins dataset. ... Table 16. Configurations of pre-training with SMTP and fine-tuning for the ogbn-arxiv dataset.