GraphPrivatizer: Improved Structural Differential Privacy for Graph Neural Networks

Authors: Rucha Bhalchandra Joshi, Patrick Indri, Subhankar Mishra

TMLR 2024

Reproducibility assessment. Each entry lists the variable, the result, and the LLM response:
Research Type: Experimental. "We conduct experiments on real-world graph datasets and empirically evaluate the privacy of our models against privacy attacks. In this section, we empirically investigate the performance of GraphPrivatizer and assess the trade-off between edge privacy and GNN accuracy in node classification tasks across several datasets. We use a variety of GNN architectures, including traditional convolutional GNNs, graph attention networks, and transformer networks, and perform experiments on the most commonly used benchmark datasets for node classification, including citation, co-purchase, and social networks."
Researcher Affiliation: Academia. Rucha Bhalchandra Joshi (National Institute of Science Education and Research, Bhubaneswar; Homi Bhabha National Institute, Mumbai); Patrick Indri (Research Unit Machine Learning, TU Wien, Vienna); Subhankar Mishra (National Institute of Science Education and Research, Bhubaneswar; Homi Bhabha National Institute, Mumbai).
Pseudocode: Yes. Algorithm 1 (Perturb neighborhood), Algorithm 2 (Query Similar), Algorithm 3 (Most-similar neighbor), Algorithm 4 (Threshold-based similar neighbors).
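The paper's pseudocode is not reproduced in this report. The sketch below gives one plausible reading of Algorithms 3 and 4, assuming cosine similarity over node features as the similarity measure; the function names and the similarity choice are illustrative assumptions, not the authors' exact procedure.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def most_similar_neighbor(node, neighbors, features):
    """Plausible reading of Algorithm 3: return the neighbor whose
    features are most similar to the node's own features."""
    return max(neighbors, key=lambda n: cosine_similarity(features[node], features[n]))


def threshold_similar_neighbors(node, neighbors, features, delta):
    """Plausible reading of Algorithm 4: keep only neighbors whose
    similarity to the node reaches the threshold delta."""
    return [n for n in neighbors
            if cosine_similarity(features[node], features[n]) >= delta]
```

Under this reading, the δ values swept in the experiments would act as the similarity threshold in `threshold_similar_neighbors`.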
Open Source Code: Yes. Code available at github.com/pindri/gnn-structural-privacy.
Open Datasets: Yes. "We experiment with GCN (Kipf & Welling, 2017), GraphSAGE (Hamilton et al., 2017), GAT (Veličković et al., 2018), GT (a graph transformer adapted from Shi et al. (2021)), GATv2 (Brody et al., 2022), and GraphConv (the graph convolution operator introduced in Morris et al. (2019)) on the Cora (Yang et al., 2016), PubMed (Yang et al., 2016), LastFM (Rozemberczki & Sarkar, 2020), Facebook (Rozemberczki et al., 2021), and Amazon Photo (Shchur et al., 2018) datasets."
Dataset Splits: Yes. "For all experiments, we divide our dataset with 50:25:25 train:validation:test set ratios, similarly to Sajadmanesh & Gatica-Perez (2021)."
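A minimal sketch of the 50:25:25 node split, assuming a uniform random shuffle of node indices (the report states only the ratios, not the exact splitting procedure):

```python
import random


def split_indices(num_nodes, seed=0):
    """Split node indices 50:25:25 into train/validation/test.

    The shuffle-then-slice scheme is an assumption; the paper follows
    Sajadmanesh & Gatica-Perez (2021) but the excerpt gives only the ratios.
    """
    rng = random.Random(seed)
    idx = list(range(num_nodes))
    rng.shuffle(idx)
    n_train = num_nodes // 2
    n_val = num_nodes // 4
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```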
Hardware Specification: No. "The computational results presented have been achieved in part using the Vienna Scientific Cluster (VSC) and NISER:RIN4001."
Software Dependencies: No. The paper mentions various GNN models (GCN, GraphSAGE, GAT, GT, GATv2, GraphConv) and algorithms (the Drop algorithm, KProp hyperparameters), but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup: Yes. "We train all the models for 100 epochs, with learning rate 10⁻², weight decay 10⁻³, and dropout rate 0.5. We run experiments with edge privacy budget ϵ ∈ {0.1, 1, 2, 8, ∞}, α ∈ {0, 0.25, 0.5, 0.75, 1.0}, and δ ∈ {0, 0.1, 0.25, 0.5, 1}. We perform label and feature perturbations as described in Section 4.3, with privacy budgets fixed at ϵx = 3 and ϵy = 3. Additionally, we set the KProp hyperparameters in the Drop algorithm to the best values described in Sajadmanesh & Gatica-Perez (2021): we use Kx = 16, Ky = 2 for Cora, Kx = 4, Ky = 2 for Facebook, and Kx = 16, Ky = 0 for LastFM and PubMed. After a grid-search tuning, we use Kx = 4, Ky = 2 for Amazon Photo."
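The reported setup can be collected into a configuration sketch. This is a hypothetical arrangement, not the authors' code: `float("inf")` stands in for the non-private budget in the ϵ set, the dataset keys are shorthand, and `experiment_grid` only enumerates the reported sweep combinations.

```python
from itertools import product

# Training settings quoted in the excerpt above.
TRAIN_CFG = {"epochs": 100, "lr": 1e-2, "weight_decay": 1e-3, "dropout": 0.5}

# Reported sweep values; the last epsilon reads as the non-private budget.
EPSILONS = [0.1, 1, 2, 8, float("inf")]
ALPHAS = [0, 0.25, 0.5, 0.75, 1.0]
DELTAS = [0, 0.1, 0.25, 0.5, 1]

# Per-dataset KProp hyperparameters (Kx, Ky) as quoted above.
KPROP = {
    "Cora": (16, 2),
    "Facebook": (4, 2),
    "LastFM": (16, 0),
    "PubMed": (16, 0),
    "AmazonPhoto": (4, 2),
}


def experiment_grid():
    """Enumerate every (dataset, epsilon, alpha, delta) combination."""
    return list(product(KPROP, EPSILONS, ALPHAS, DELTAS))
```

With five datasets and three five-value sweeps, this yields 625 configurations per model architecture.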