GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?

Authors: Mufei Li, Eleonora Kreačić, Vamsi K. Potluru, Pan Li

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive studies on real-world networks with up to more than 13K nodes, 490K edges, and 1K attributes demonstrate that GraphMaker overall significantly outperforms the baselines in producing graphs with realistic properties and high utility for graph ML model development. For evaluation on graph ML tasks, GraphMaker achieves the best performance for 80% of cases across all datasets. For property evaluation, GraphMaker achieves the best performance for 50% of cases.
Researcher Affiliation | Collaboration | Mufei Li, School of Electrical and Computer Engineering, Georgia Institute of Technology; Eleonora Kreačić, J.P. Morgan AI Research
Pseudocode | No | The paper describes the methodology, including the forward diffusion process, reverse process, and network architecture, in detail through text and mathematical equations, but it does not present any formal pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code for GraphMaker, nor does it provide a link to a code repository.
Open Datasets | Yes | Datasets: We utilize three large attributed networks for evaluation. Cora is a citation network depicting citation relationships among papers (Sen et al., 2008)... Amazon Photo and Amazon Computer are product co-purchase networks (Shchur et al., 2018).
Dataset Splits | Yes | Node classification. We randomly split a generated dataset so that the number of labeled nodes in each class and each subset is the same as that in the original dataset. For discriminative models, we choose three representative GNNs: SGC (Wu et al., 2019), GCN (Kipf & Welling, 2017), and APPNP (Gasteiger et al., 2019). ... Link prediction. To prevent label leakage from dataset splitting, we split the edges corresponding to the upper triangular adjacency matrix into different subsets and then add reverse edges after the split.
Hardware Specification | Yes | Empirically, we find that with a 16-GB GPU, we can use at most a single graph transformer layer... GraphMaker is the only diffusion model that does not encounter the out-of-memory (OOM) error on a 48-GB GPU.
Software Dependencies | No | We implement our work based on PyTorch (Paszke et al., 2019) and DGL (Wang et al., 2019). While the software frameworks are mentioned with their publication years, specific version numbers are not provided.
Experiment Setup | Yes | Table 10: Hyperparameters for GraphMaker-Sync. Table 11: Hyperparameters for GraphMaker-Async.
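
The link-prediction split quoted under Dataset Splits (split the upper-triangular edges first, then add reverse edges) can be illustrated with a short NumPy sketch. This is not the authors' code, which the summary above notes is not released. The function name `split_undirected_edges`, the 80/10/10 fractions, and the seed are assumptions made only for illustration, since the quoted text does not state split ratios.

```python
import numpy as np


def split_undirected_edges(adj, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split the edges of a symmetric adjacency matrix into train/val/test.

    Hypothetical helper: the name, fractions, and seed are illustrative
    assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Keep each undirected edge exactly once by reading only the strictly
    # upper triangular part of the adjacency matrix.
    src, dst = np.nonzero(np.triu(adj, k=1))

    # Shuffle the undirected edges before cutting them into subsets.
    perm = rng.permutation(len(src))
    src, dst = src[perm], dst[perm]

    n = len(src)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    bounds = [0, n_train, n_train + n_val, n]

    splits = {}
    for name, lo, hi in zip(("train", "val", "test"), bounds[:-1], bounds[1:]):
        s, d = src[lo:hi], dst[lo:hi]
        # Add reverse edges only after the split, so (u, v) and (v, u)
        # always land in the same subset.
        splits[name] = np.stack([np.concatenate([s, d]), np.concatenate([d, s])])
    return splits
```

Each returned array has shape (2, 2m) for a subset holding m undirected edges. Splitting the full directed edge list directly could place (u, v) in the training set and (v, u) in the test set, which is the label leakage the quoted passage describes.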