GraphPrivatizer: Improved Structural Differential Privacy for Graph Neural Networks
Authors: Rucha Bhalchandra Joshi, Patrick Indri, Subhankar Mishra
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on real-world graph datasets and empirically evaluate the privacy of our models against privacy attacks. In this section, we empirically investigate the performance of GraphPrivatizer and assess the trade-off between edge privacy and GNN accuracy in node classification tasks across several datasets. We use a variety of GNN architectures that include traditional convolutional GNNs, graph attention networks, and transformer networks, and perform experiments on the most commonly used benchmark datasets for node classification that include citation, co-purchase, and social networks. |
| Researcher Affiliation | Academia | Rucha Bhalchandra Joshi (National Institute of Science Education and Research, Bhubaneswar; Homi Bhabha National Institute, Mumbai); Patrick Indri (Research Unit Machine Learning, TU Wien, Vienna); Subhankar Mishra (National Institute of Science Education and Research, Bhubaneswar; Homi Bhabha National Institute, Mumbai) |
| Pseudocode | Yes | Algorithm 1: Perturb neighborhood; Algorithm 2: Query Similar; Algorithm 3: Most-similar neighbor; Algorithm 4: Threshold-based similar neighbors |
| Open Source Code | Yes | Code available at github.com/pindri/gnn-structural-privacy. |
| Open Datasets | Yes | We experiment with GCN (Kipf & Welling, 2017), GraphSAGE (Hamilton et al., 2017), GAT (Veličković et al., 2018), GT (a graph transformer adapted from Shi et al. (2021)), GATv2 (Brody et al., 2022) and GraphConv (the graph convolution operator introduced in Morris et al. (2019)) on the Cora (Yang et al., 2016), PubMed (Yang et al., 2016), LastFM (Rozemberczki & Sarkar, 2020), Facebook (Rozemberczki et al., 2021), and Amazon Photo (Shchur et al., 2018) datasets. |
| Dataset Splits | Yes | For all experiments, we divide our dataset with 50:25:25 train:validation:test set ratios, similarly to Sajadmanesh & Gatica-Perez (2021). |
| Hardware Specification | No | The computational results presented have been achieved in part using the Vienna Scientific Cluster (VSC) and NISER:RIN4001. |
| Software Dependencies | No | The paper mentions various GNN models (GCN, Graph SAGE, GAT, GT, GATv2, Graph Conv) and algorithms (Drop algorithm, KProp hyperparameters), but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We train all the models for 100 epochs, with learning rate 10⁻², weight decay 10⁻³, and dropout rate 0.5. We run experiments with edge privacy budget ϵ = {0.1, 1, 2, 8, ∞}, α = {0, 0.25, 0.5, 0.75, 1.0}, and δ = {0, 0.1, 0.25, 0.5, 1}. We perform label and feature perturbations as described in Section 4.3, with privacy budgets fixed at ϵx = 3 and ϵy = 3. Additionally, we set the KProp hyperparameters in the Drop algorithm to the best values described in Sajadmanesh & Gatica-Perez (2021): we use Kx = 16, Ky = 2 for Cora, Kx = 4, Ky = 2 for Facebook, and Kx = 16, Ky = 0 for LastFM and PubMed. After a grid-search tuning, we use Kx = 4, Ky = 2 for Amazon Photo. |
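The 50:25:25 train:validation:test split reported above can be sketched as follows. This is a hypothetical helper for illustration only; `split_nodes` and its seeding are assumptions, and the authors' actual splitting code (following Sajadmanesh & Gatica-Perez, 2021) may differ.

```python
import random

def split_nodes(num_nodes, ratios=(0.5, 0.25, 0.25), seed=0):
    """Partition node indices into train/val/test sets with given ratios.

    Hypothetical sketch of the 50:25:25 node split described in the paper;
    not the authors' implementation.
    """
    idx = list(range(num_nodes))
    random.Random(seed).shuffle(idx)  # deterministic shuffle for a fixed seed
    n_train = int(ratios[0] * num_nodes)
    n_val = int(ratios[1] * num_nodes)
    train = set(idx[:n_train])
    val = set(idx[n_train:n_train + n_val])
    test = set(idx[n_train + n_val:])
    return train, val, test

train, val, test = split_nodes(1000)
print(len(train), len(val), len(test))  # 500 250 250
```

In practice one would build boolean masks over the node dimension from these index sets so the same graph tensor can be reused across all three phases.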
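The experiment grid in the setup row can be enumerated programmatically. A minimal sketch, assuming the hyperparameter values quoted above; the dictionary names (`GRID`, `FIXED`, `KPROP`) are illustrative inventions, and `float("inf")` stands in for the non-private ϵ = ∞ setting.

```python
from itertools import product

# Swept hyperparameters, as quoted in the Experiment Setup row.
GRID = {
    "epsilon": [0.1, 1, 2, 8, float("inf")],
    "alpha": [0.0, 0.25, 0.5, 0.75, 1.0],
    "delta": [0.0, 0.1, 0.25, 0.5, 1.0],
}
# Fixed training settings and feature/label privacy budgets.
FIXED = {"epochs": 100, "lr": 1e-2, "weight_decay": 1e-3,
         "dropout": 0.5, "eps_x": 3, "eps_y": 3}
# KProp hyperparameters (Kx, Ky) per dataset, from the paper.
KPROP = {"Cora": (16, 2), "Facebook": (4, 2), "LastFM": (16, 0),
         "PubMed": (16, 0), "AmazonPhoto": (4, 2)}

# One config dict per (epsilon, alpha, delta) combination.
configs = [dict(zip(GRID, vals), **FIXED) for vals in product(*GRID.values())]
print(len(configs))  # prints 125 (5 x 5 x 5 combinations)
```

Each of the 125 configurations would then be crossed with the six GNN architectures and five datasets listed in the table.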