Graph Generative Pre-trained Transformer

Authors: Xiaohui Chen, Yinkai Wang, Jiaxing He, Yuanqi Du, Soha Hassoun, Xiaolin Xu, Liping Liu

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Comprehensive experiments on multiple datasets demonstrate G2PT's superior performance in both generic graph generation and molecular generation. Additionally, the experimental results show that G2PT can be effectively applied to goal-oriented molecular design and graph representation learning. [...] We evaluate G2PT on a series of graph generation tasks: generic graph generation, molecule generation, and goal-oriented molecular generation.
Researcher Affiliation Academia 1Tufts University, 2Northeastern University, 3Cornell University. Correspondence to: Xiaohui Chen <EMAIL>, Li-Ping Liu <EMAIL>.
Pseudocode Yes Algorithm 1 Degree-Based Edge Removal Process; Algorithm 2 RFT Dataset Construction; Algorithm 3 SBSk combined with RFT; Algorithm 4 Depth-First Search edge order generation; Algorithm 5 Breadth-First Search edge order generation; Algorithm 6 Uniform edge order generation
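To make the named edge-ordering algorithms concrete, here is a minimal sketch of what a BFS edge ordering (in the spirit of the paper's Algorithm 5) might look like. The function name, the adjacency-list representation, and the tie-breaking by neighbor-list order are assumptions of this sketch, not details confirmed by the paper excerpt.

```python
from collections import deque

def bfs_edge_order(adj, start):
    """Emit every edge of an undirected graph in BFS discovery order.

    `adj` maps node -> list of neighbors; `start` is the root node.
    Each edge is emitted the first time either endpoint reaches it,
    so non-tree edges are included as well. Hypothetical sketch only;
    the paper's Algorithm 5 may differ (tie-breaking, disconnected
    graphs, directedness, etc.).
    """
    visited = {start}
    seen_edges = set()   # frozensets, so (u, v) and (v, u) dedupe
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            e = frozenset((u, v))
            if e not in seen_edges:
                seen_edges.add(e)
                order.append((u, v))
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return order
```

On a triangle graph rooted at node 0, this yields the ordering (0,1), (0,2), (1,2): both edges incident to the root first, then the remaining non-tree edge.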
Open Source Code Yes The code of G2PT is released at https://github.com/tuftsml/G2PT.
Open Datasets Yes We use three molecule datasets: QM9 (Wu et al., 2018b), MOSES (Polykovskiy et al., 2020), and GuacaMol (Brown et al., 2019); and four generic graph datasets: Planar, Tree, Lobster, and stochastic block model (SBM), which are widely used to benchmark graph generative models. In predictive tasks, we fine-tune models pre-trained on the GuacaMol dataset on molecular properties with the benchmark method MoleculeNet (Wu et al., 2018a).
Dataset Splits Yes We divide the generic datasets into training, validation, and test sets with a 6:2:2 splitting ratio. For the molecular datasets, we follow the default settings of the datasets. [...] We adopt the scaffold split, introduced by Wu et al. (2018b), which divides the train, validation, and test sets by scaffold.
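The 6:2:2 split described above can be sketched in a few lines. This is a generic random split under an assumed fixed seed; the scaffold split used for the MoleculeNet tasks instead groups molecules by their Bemis-Murcko scaffold and is not reproduced here.

```python
import random

def split_indices(n, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split n example indices into train/val/test by the paper's 6:2:2 ratio.

    Simple random split sketch (function name and seed are assumptions);
    the molecular benchmarks use a scaffold split instead, which requires
    scaffold computation (e.g. via RDKit) and is omitted here.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # deterministic given the seed
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test
```

For 100 examples this produces disjoint sets of sizes 60, 20, and 20 that together cover every index exactly once.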
Hardware Specification Yes We ran all pre-training tasks and all goal-oriented generation fine-tuning tasks on 8 NVIDIA A100-SXM4-80GB GPUs with distributed training. For PPO training and graph property prediction tasks, we ran experiments on a single A100 GPU.
Software Dependencies No The paper mentions the use of RDKit and the Therapeutics Data Commons (TDC) package, and the AdamW optimizer, but does not provide specific version numbers for these software components or any other libraries.
Experiment Setup Yes Table 7. Hyperparameters for graph generative pre-training. (includes #layers, #heads, d_model, dropout rate, Lr, Optimizer, Lr scheduler, Weight decay, #iterations, Batch size, #Gradient Accumulation, Grad Clipping Value, #Warmup Iterations). Table 8. Hyperparameters used for PPO training. (includes γ, λ, ρ1, ρ2, ρ3, Advantage Normalization and Clipping, Reward Normalization and Clipping, Ratio Clipping (ϵ), Critic Value Clipping, Entropy Regularization, Gradient Clipping Value, Actor Lr, Critic Lr, #Iterations, Batch size).
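The "Ratio Clipping (ϵ)" entry in Table 8 refers to the clipped surrogate objective of PPO (Schulman et al., 2017). A minimal per-sample version of that objective is sketched below; the function name is hypothetical, and the paper excerpt does not give the actual ϵ value used, so eps=0.2 here is only the common default.

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate loss (to be minimized).

    ratio = pi_new(a|s) / pi_old(a|s); `advantage` is the estimated
    advantage for the sampled action. The ratio is clipped to
    [1 - eps, 1 + eps] and the pessimistic (smaller) objective is kept,
    which bounds how far a single update can move the policy.
    """
    clipped = min(max(ratio, 1 - eps), 1 + eps)
    return -min(ratio * advantage, clipped * advantage)
```

With a positive advantage the clipping caps the gain once the ratio exceeds 1 + ϵ; with a negative advantage it caps the penalty reduction once the ratio falls below 1 - ϵ, so the update is conservative in both directions.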