Attentive Walk-Aggregating Graph Neural Networks

Authors: Mehmet F. Demirel, Shengchao Liu, Siddhant Garg, Zhenmei Shi, Yingyu Liang

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose a novel GNN model, called AWARE, that aggregates information about the walks in the graph using attention schemes. This leads to an end-to-end supervised learning method for graph-level prediction tasks... We then perform theoretical, empirical, and interpretability analyses of AWARE. Our theoretical analysis in a simplified setting identifies successful conditions for provable guarantees... Our experiments demonstrate the strong performance of AWARE in graph-level prediction tasks... Lastly, our interpretation study illustrates that AWARE can successfully capture the important substructures of the input graph.
Researcher Affiliation | Collaboration | Mehmet F. Demirel (EMAIL), Department of Computer Sciences, University of Wisconsin-Madison; Shengchao Liu (EMAIL), Quebec AI Institute (Mila); Siddhant Garg (EMAIL), Amazon Alexa AI; Zhenmei Shi (EMAIL), Department of Computer Sciences, University of Wisconsin-Madison; Yingyu Liang (EMAIL), Department of Computer Sciences, University of Wisconsin-Madison
Pseudocode | Yes | Algorithm 1 AWARE(W, Wv, Ww, Wg)
Require: graph G = (V, A), maximum walk length T
1: Compute vertex embeddings F by Eqn (1)
2: F^(1) = σ(Wv F)
3: for each n ∈ [2, T] do
4:   Compute S_n using Eqn (7)
5:   F^(n) = (F^(n−1)(A ⊙ S_n)) ⊙ F^(1)
6: end for
7: Set f^(n) := σ(Wg F^(n)) · 1 for 1 ≤ n ≤ T
8: Set f_[T](G) := [f^(1); . . . ; f^(T)]
Ensure: the graph embedding f_[T](G)
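The pseudocode can be sketched in NumPy as below. Since the row above does not quote Eqn (1) or Eqn (7), the initial embeddings F are taken as given, and the attention score function, the activation `sigma`, and all weight-matrix shapes are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def softmax_cols(x):
    """Column-wise softmax, numerically stabilized."""
    e = np.exp(x - x.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def aware_forward(F, A, Wv, Ww, Wg, T, sigma=np.tanh):
    """Sketch of Algorithm 1 (AWARE).
    F : r x |V| initial vertex embeddings (Eqn 1, assumed precomputed)
    A : |V| x |V| adjacency matrix
    The attention scores stand in for Eqn (7), which is not quoted above;
    a dot-product/softmax form is assumed purely for illustration."""
    F1 = sigma(Wv @ F)                    # line 2: transformed vertex embeddings
    Fn = F1
    parts = []
    for n in range(1, T + 1):
        if n >= 2:
            scores = (Ww @ Fn).T @ F1     # assumed stand-in for Eqn (7)
            Sn = softmax_cols(scores)     # |V| x |V| attention weights
            Fn = (Fn @ (A * Sn)) * F1     # line 5: walk update, masked by A
        parts.append(sigma(Wg @ Fn).sum(axis=1))  # line 7: pool over vertices
    return np.concatenate(parts)          # line 8: [f(1); ...; f(T)]
```

The element-wise products mirror the ⊙ operations in lines 5 and 7 of the pseudocode; multiplying by the adjacency matrix restricts attention to actual edges, so longer walks are built from shorter ones.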
Open Source Code | Yes | The code is available on GitHub at https://github.com/mehmetfdemirel/aware
Open Datasets | Yes | Datasets. We perform experiments on graph-level prediction tasks from two domains: molecular property prediction (61 tasks from 11 benchmarks) and social networks (4 benchmarks)... Table 1 provides details about the datasets used in our experiments. Specifically: IMDB-BINARY (Yanardag & Vishwanathan, 2015), Tox21 (Tox21 Data Challenge, 2014), Mutagenicity (Kazius et al., 2005), etc.
Dataset Splits | Yes | Each dataset is randomly split into training, validation, and test sets with an 8:1:1 ratio. We report the average performance across 5 runs (datasets are split independently for each run).
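The split protocol described above (8:1:1 ratio, with an independent shuffle per run) can be sketched as follows; the function name and per-run seeding scheme are assumptions for illustration, not taken from the paper's code.

```python
import numpy as np

def split_811(n_samples, seed):
    """Randomly split indices into train/valid/test with an 8:1:1 ratio.
    A distinct seed per run reproduces the independent re-splitting
    across the 5 runs described above (seeding scheme assumed)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.8 * n_samples)
    n_valid = int(0.1 * n_samples)
    return (idx[:n_train],
            idx[n_train:n_train + n_valid],
            idx[n_train + n_valid:])

# Five runs, each with its own independent split:
splits = [split_811(1000, seed=run) for run in range(5)]
```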
Hardware Specification | Yes | GPU specifications: an NVIDIA GeForce GTX 1080 (8 GB) GPU was generally used for training to produce the main experimental results; for some of the larger datasets, an NVIDIA A100 (40 GB) GPU was used.
Software Dependencies | No | The paper does not explicitly list software dependencies with version numbers for the AWARE implementation. It mentions using XGBoost (Chen & Guestrin, 2016) for baselines, but without a version number, and this is not part of the proposed model's own software stack.
Experiment Setup | Yes | Hyperparameter tuning: for AWARE, a hyperparameter sweep is performed over the candidate values listed in Table 2: learning rate ∈ {1e-3, 1e-4}; number of linear layers in the predictor L ∈ {1, 2, 3}; maximum walk length T ∈ {3, 6, 9, 12}; vertex embedding dimension r ∈ {100, 300, 500}; optimizer: Adam. The model is trained for 500 epochs with early stopping on the validation set (patience of 50 epochs); no learning rate scheduler is used.
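The training schedule above (up to 500 epochs, early stopping with patience 50) can be sketched as a generic loop; `train_epoch` and `eval_valid` are hypothetical callables standing in for one training epoch and a validation evaluation (higher score is better), not functions from the paper's code.

```python
def train_with_early_stopping(train_epoch, eval_valid,
                              max_epochs=500, patience=50):
    """Sketch of the schedule described above: train for up to `max_epochs`
    epochs, stopping once the validation score has not improved for
    `patience` consecutive epochs. No learning-rate scheduler is used."""
    best_score, best_epoch = float("-inf"), 0
    for epoch in range(max_epochs):
        train_epoch(epoch)
        score = eval_valid(epoch)
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop early
    return best_epoch, best_score
```

In a full sweep, this loop would run once per configuration in the hyperparameter grid, with the best validation score selecting the final model.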