Directed Graph Transformers
Authors: Qitong Wang, Georgios Kollias, Vasileios Kalantzis, Naoki Abe, Mohammed J Zaki
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on synthetic and real graph datasets show that our approach can have significant accuracy gains over previous graph transformer (GT) and graph neural network (GNN) approaches, providing state-of-the-art (SOTA) results on inherently directed graphs. |
| Researcher Affiliation | Collaboration | Qitong Wang EMAIL Rensselaer Polytechnic Institute Georgios Kollias EMAIL IBM Research Vasileios Kalantzis EMAIL IBM Research Naoki Abe EMAIL IBM Research Mohammed J. Zaki EMAIL Rensselaer Polytechnic Institute |
| Pseudocode | No | The paper describes the architecture and mechanisms using mathematical notation and descriptive text, but it does not include a clearly labeled pseudocode block or algorithm steps formatted like code. |
| Open Source Code | Yes | Our source code is available via github: https://github.com/Qitong-Wang/Directed-Graph-Transformers. |
| Open Datasets | Yes | There are several directed datasets used in previous studies, such as MNIST (LeCun & Cortes, 2005), CIFAR10 (Krizhevsky et al., 2009), Ogbg-Code2 (Hu et al., 2020), and Malnet-tiny (Freitas et al., 2020). Twitter datasets: We use 973 directed ego-networks from Twitter (https://snap.stanford.edu/data/ego-Twitter.html), each corresponding to some user u (ego): the ego-network is between u's friends, also referred to as alters (Leskovec & Mcauley, 2012). If nodes v_i, v_j are in u's ego-network then u follows them, and if v_i follows v_j then there is a directed edge v_i → v_j in the ego-network. |
| Dataset Splits | Yes | Malnet-tiny and Malnet-sub: Malnet-tiny provides a graph classification task for five different types of malicious software. It contains 5,000 graphs, and each graph contains fewer than 5,000 nodes. Its graph size is an obstacle for many graph transformers. We apply a filter to generate the Malnet-sub dataset: we choose graphs with fewer than 500 nodes for the training set, and those with fewer than 2,000 nodes for both the validation and test sets. |
| Hardware Specification | Yes | Our experiments were performed on NVIDIA V100 GPUs, with 32GB memory, using PyTorch. We test our code on a node with NVIDIA V100 GPUs (32GB RAM), a 20-core 2.5 GHz Intel Xeon CPU (768GB RAM), running Linux. |
| Software Dependencies | No | Our experiments were performed on NVIDIA V100 GPUs, with 32GB memory, using PyTorch. We use Python and specifically the PyTorch library for our implementation. However, specific version numbers for PyTorch or Python are not provided. |
| Experiment Setup | Yes | We fix the batch size to 32, the number of maximum epochs to 200, and we employ grid search for tuning the learning rate η ∈ {2^i · u \| i = 0, 1, 2, 3, 4} with u = 5×10⁻⁴, choosing η = 5×10⁻⁴ (or i = 1) for EGT and DiGT models on MNIST and CIFAR10 datasets, and η = 8×10⁻³ (or i = 4) for all other datasets. We employ a grid search for tuning the number of hops k ∈ {1, 3, 5, 10, 25}; we also employ a grid search over normalization ∈ {no normalization, batch normalization, layer normalization}. We apply a similar process for other models. Table 8 contains parameters used for DiGT. |
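The Malnet-sub filtering rule quoted under Dataset Splits (graphs with fewer than 500 nodes for training, fewer than 2,000 for validation and test) can be sketched as follows. This is a minimal illustration with a hypothetical data representation (each graph as a `(num_nodes, label)` pair and a `filter_malnet_sub` helper, neither taken from the paper's code):

```python
def filter_malnet_sub(graphs, split):
    """Keep graphs under the per-split node-count limit.

    graphs: list of (num_nodes, label) pairs (hypothetical representation)
    split: 'train', 'val', or 'test'
    """
    # Training graphs must have < 500 nodes; val/test graphs < 2,000.
    limit = 500 if split == "train" else 2000
    return [g for g in graphs if g[0] < limit]


train = filter_malnet_sub([(100, 0), (600, 1), (1500, 2)], "train")
val = filter_malnet_sub([(100, 0), (600, 1), (2500, 2)], "val")
```

Here `train` keeps only the 100-node graph, while `val` also retains the 600-node and 1,500-node-range graphs under the looser 2,000-node cap.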
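The hyperparameter search described under Experiment Setup can be sketched as an exhaustive sweep over the three grids (learning rate, hops, normalization). This is an assumption-laden sketch: the `2**i * u` learning-rate grid follows the paper's notation with u = 5×10⁻⁴, and the `evaluate` callback standing in for a full training-plus-validation run is hypothetical:

```python
from itertools import product

# Grids quoted in the setup above.
u = 5e-4
learning_rates = [2**i * u for i in range(5)]  # i = 0..4, so 5e-4 .. 8e-3
hops = [1, 3, 5, 10, 25]
norms = ["none", "batch", "layer"]


def grid_search(evaluate):
    """Return the (lr, k, norm) config with the best validation score.

    evaluate(lr, k, norm) -> float is a placeholder for training a model
    with that configuration and scoring it on the validation split.
    """
    return max(product(learning_rates, hops, norms),
               key=lambda cfg: evaluate(*cfg))
```

With 5 × 5 × 3 = 75 configurations, each call to `evaluate` corresponds to one full training run at the fixed batch size of 32 and 200-epoch budget.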