GraphPrivatizer: Improved Structural Differential Privacy for Graph Neural Networks

Authors: Rucha Bhalchandra Joshi, Patrick Indri, Subhankar Mishra

TMLR 2024

Reproducibility assessment. Each entry lists the variable, the result, and the LLM response:
Research Type: Experimental. "We conduct experiments on real-world graph datasets and empirically evaluate the privacy of our models against privacy attacks. In this section, we empirically investigate the performance of GraphPrivatizer and assess the trade-off between edge privacy and GNN accuracy in node classification tasks across several datasets. We use a variety of GNN architectures, including traditional convolutional GNNs, graph attention networks, and transformer networks, and perform experiments on the most commonly used benchmark datasets for node classification, including citation, co-purchase, and social networks."
Researcher Affiliation: Academia. Rucha Bhalchandra Joshi (National Institute of Science Education and Research, Bhubaneswar; Homi Bhabha National Institute, Mumbai); Patrick Indri (Research Unit Machine Learning, TU Wien, Vienna); Subhankar Mishra (National Institute of Science Education and Research, Bhubaneswar; Homi Bhabha National Institute, Mumbai).
Pseudocode: Yes. Algorithm 1 (Perturb neighborhood), Algorithm 2 (Query Similar), Algorithm 3 (Most-similar neighbor), Algorithm 4 (Threshold-based similar neighbors).
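The paper's pseudocode is not reproduced in this report. The sketch below gives one plausible reading of Algorithms 3 and 4, assuming cosine similarity over node features as the similarity measure; the function names and the similarity choice are illustrative assumptions, not the authors' exact procedure.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def most_similar_neighbor(node, neighbors, features):
    """Plausible reading of Algorithm 3: return the neighbor whose
    features are most similar to the node's own features."""
    return max(neighbors, key=lambda n: cosine_similarity(features[node], features[n]))


def threshold_similar_neighbors(node, neighbors, features, delta):
    """Plausible reading of Algorithm 4: keep only neighbors whose
    similarity to the node reaches the threshold delta."""
    return [n for n in neighbors
            if cosine_similarity(features[node], features[n]) >= delta]
```

Under this reading, the δ values swept in the experiments would act as the similarity threshold in `threshold_similar_neighbors`.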
Open Source Code: Yes. Code available at github.com/pindri/gnn-structural-privacy.
Open Datasets: Yes. "We experiment with GCN (Kipf & Welling, 2017), GraphSAGE (Hamilton et al., 2017), GAT (Veličković et al., 2018), GT (a graph transformer adapted from Shi et al. (2021)), GATv2 (Brody et al., 2022), and GraphConv (the graph convolution operator introduced in Morris et al. (2019)) on the Cora (Yang et al., 2016), PubMed (Yang et al., 2016), LastFM (Rozemberczki & Sarkar, 2020), Facebook (Rozemberczki et al., 2021), and Amazon Photo (Shchur et al., 2018) datasets."
Dataset Splits: Yes. "For all experiments, we divide our dataset with 50:25:25 train:validation:test set ratios, similarly to Sajadmanesh & Gatica-Perez (2021)."
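A minimal sketch of the 50:25:25 node split, assuming a uniform random shuffle of node indices (the report states only the ratios, not the exact splitting procedure):

```python
import random


def split_indices(num_nodes, seed=0):
    """Split node indices 50:25:25 into train/validation/test.

    The shuffle-then-slice scheme is an assumption; the paper follows
    Sajadmanesh & Gatica-Perez (2021) but the excerpt gives only the ratios.
    """
    rng = random.Random(seed)
    idx = list(range(num_nodes))
    rng.shuffle(idx)
    n_train = num_nodes // 2
    n_val = num_nodes // 4
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```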
Hardware Specification: No. "The computational results presented have been achieved in part using the Vienna Scientific Cluster (VSC) and NISER:RIN4001."
Software Dependencies: No. The paper mentions various GNN models (GCN, GraphSAGE, GAT, GT, GATv2, GraphConv) and algorithms (the Drop algorithm, KProp hyperparameters), but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup: Yes. "We train all the models for 100 epochs, with learning rate 10⁻², weight decay 10⁻³, and dropout rate 0.5. We run experiments with edge privacy budget ϵ ∈ {0.1, 1, 2, 8, ∞}, α ∈ {0, 0.25, 0.5, 0.75, 1.0}, and δ ∈ {0, 0.1, 0.25, 0.5, 1}. We perform label and feature perturbations as described in Section 4.3, with privacy budgets fixed at ϵx = 3 and ϵy = 3. Additionally, we set the KProp hyperparameters in the Drop algorithm to the best values described in Sajadmanesh & Gatica-Perez (2021): we use Kx = 16, Ky = 2 for Cora, Kx = 4, Ky = 2 for Facebook, and Kx = 16, Ky = 0 for LastFM and PubMed. After a grid-search tuning, we use Kx = 4, Ky = 2 for Amazon Photo."
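The reported setup can be collected into a configuration sketch. This is a hypothetical arrangement, not the authors' code: `float("inf")` stands in for the non-private budget in the ϵ set, the dataset keys are shorthand, and `experiment_grid` only enumerates the reported sweep combinations.

```python
from itertools import product

# Training settings quoted in the excerpt above.
TRAIN_CFG = {"epochs": 100, "lr": 1e-2, "weight_decay": 1e-3, "dropout": 0.5}

# Reported sweep values; the last epsilon reads as the non-private budget.
EPSILONS = [0.1, 1, 2, 8, float("inf")]
ALPHAS = [0, 0.25, 0.5, 0.75, 1.0]
DELTAS = [0, 0.1, 0.25, 0.5, 1]

# Per-dataset KProp hyperparameters (Kx, Ky) as quoted above.
KPROP = {
    "Cora": (16, 2),
    "Facebook": (4, 2),
    "LastFM": (16, 0),
    "PubMed": (16, 0),
    "AmazonPhoto": (4, 2),
}


def experiment_grid():
    """Enumerate every (dataset, epsilon, alpha, delta) combination."""
    return list(product(KPROP, EPSILONS, ALPHAS, DELTAS))
```

With five datasets and three five-value sweeps, this yields 625 configurations per model architecture.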