Fitting Autoregressive Graph Generative Models through Maximum Likelihood Estimation
Authors: Xu Han, Xiaohui Chen, Francisco J. R. Ruiz, Li-Ping Liu
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate empirically that fitting autoregressive graph models via variational inference improves their qualitative and quantitative performance, and that the improved model and inference network further boost performance. We also conduct extensive experiments showing the benefits of the generative model and the approximate posterior over the approach of Chen et al. (2021). In Section 7, we analyze the empirical performance of graph generative models trained with VI. |
| Researcher Affiliation | Collaboration | Xu Han EMAIL Department of Computer Science Tufts University Medford, MA 02155, USA; Xiaohui Chen EMAIL Department of Computer Science Tufts University Medford, MA 02155, USA; Francisco J. R. Ruiz EMAIL DeepMind 5 New Street, London, UK; Li-Ping Liu EMAIL Department of Computer Science Tufts University Medford, MA 02155, USA |
| Pseudocode | Yes | Algorithm 1 Autoregressive generation of adjacency matrices Algorithm 2 VI algorithm for training a graph model based on the adjacency matrix A |
| Open Source Code | Yes | The implementation of the proposed model is publicly available at https://github.com/tufts-ml/Graph-Generation-MLE. |
| Open Datasets | Yes | We use 8 datasets that are commonly used for benchmarking graph generative models: (1) Community-small: ... (2) Citeseer-small: ... (3) Enzymes: ... (4) Lung: ... (5) Yeast: ... (6) Cora: ... (7) SBM-assortative: ... (8) MMSBM: ... Graphs in the Lung and Yeast datasets represent structures of chemical compounds. Graphs in the Enzymes dataset represent protein tertiary structures. Since the Citeseer and Cora datasets each contain only a single graph, we sample subgraphs via random walk to form the corresponding datasets. |
| Dataset Splits | Yes | We split all datasets into three parts: the train set (80%), validation set (10%), and test set (10%). |
| Hardware Specification | Yes | Both methods run on an RTX 3080 GPU. |
| Software Dependencies | Yes | For ROS-VI and Rout-VI, we use the Nauty package (McKay and Piperno, 2013) to compute \|Π(A)\| (see Section 4). |
| Experiment Setup | Yes | We compare DAGG against three recent graph generative models: GraphDF (Luo et al., 2021), GraphRNN (You et al., 2018), and GraphGen (Goyal et al., 2020). We use their original training methods with default hyperparameters. ... For each model, we use L = 1,000 samples to estimate the test log-likelihood via importance sampling (Eq. 13), using the variational distribution qφ(π \| G) as the proposal. ... In our experiments, we found that L = 1,000 gives an accurate estimation (see Figure 5). |
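The importance-sampling log-likelihood estimate mentioned in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the paper's implementation: the callables `log_joint` (for log p(G, π)), `log_q` (for log qφ(π | G)), and `sample_pi` (a draw π ~ qφ(π | G)) are hypothetical stand-ins for the model and inference network.

```python
import math

def estimate_log_likelihood(log_joint, log_q, sample_pi, L=1000):
    """Importance-sampling estimate of log p(G) with proposal q(pi | G).

    Averages the importance weights p(G, pi) / q(pi | G) over L samples
    and returns the log of that average, computed via log-sum-exp for
    numerical stability.

    log_joint(pi) -> log p(G, pi)   (hypothetical model callable)
    log_q(pi)     -> log q(pi | G)  (hypothetical proposal density)
    sample_pi()   -> one node ordering drawn from q(pi | G)
    """
    log_weights = []
    for _ in range(L):
        pi = sample_pi()
        log_weights.append(log_joint(pi) - log_q(pi))
    # log-mean-exp: log( (1/L) * sum_l exp(log_weights[l]) )
    m = max(log_weights)
    return m + math.log(sum(math.exp(w - m) for w in log_weights) / L)
```

With a degenerate proposal (a single ordering with q(π | G) = 1) the estimate reduces exactly to log p(G, π), which is a quick sanity check; in practice the variance of the estimate depends on how well qφ(π | G) matches the posterior over orderings, which is why the paper checks that L = 1,000 suffices (Figure 5).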