Noise-robust Graph Learning by Estimating and Leveraging Pairwise Interactions

Authors: Xuefeng Du, Tian Bian, Yu Rong, Bo Han, Tongliang Liu, Tingyang Xu, Wenbing Huang, Yixuan Li, Junzhou Huang

TMLR 2023

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on different datasets and GNN architectures demonstrate the effectiveness of PI-GNN, yielding a promising improvement over the state-of-the-art methods. Code is publicly available at https://github.com/TianBian95/pi-gnn. ... In this section, we present empirical evidence to validate the effectiveness of PI-GNN on different datasets with different noise types and ratios. ... Table 1: Test accuracy on 5 datasets for PI-GNN with GCN as the backbone. ... Figure 3: Test accuracy of PI-GNN and comparison with PI-GNN w/o pc and vanilla GNN on two additional model architectures under different noisy settings.
Researcher Affiliation | Collaboration | Xuefeng Du (University of Wisconsin-Madison); Tian Bian (The Chinese University of Hong Kong); Yu Rong (Tencent AI Lab); Bo Han (Hong Kong Baptist University); Tongliang Liu (Mohamed bin Zayed University of Artificial Intelligence and The University of Sydney); Tingyang Xu (Tencent AI Lab); Wenbing Huang (Renmin University of China); Yixuan Li (University of Wisconsin-Madison); Junzhou Huang (University of Texas at Arlington)
Pseudocode | Yes | Algorithm 1 PI-GNN: Noise-robust Graph Learning by Estimating and Leveraging Pairwise Interactions
Input: Input graph G = (V, A, X) with noisy training data D_tr = {(A, X_v, y_v)}_{v ∈ V}; randomly initialized GNNs f_e and f_t with parameters θ_e and θ_t; regularization loss weight β; pretraining epochs K for f_e; total training epochs N.
Output: Robust GNN f_t.
for epoch = 0; epoch < N; epoch++ do
    if epoch < K then
        Update the parameters θ_e of the PI label estimation model f_e by Equation 3.
        Set β = 0 in Equation 5 and update the parameters θ_t of the node classification model f_t.
    else
        Update the parameters θ_e of the PI label estimation model f_e by Equation 3.
        Estimate the PI label y^PI by Equation 4 with f_e.
        Update the parameters θ_t of the node classification model f_t by Equation 5.
    end
end
return the node classification model f_t.
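Algorithm 1's two-phase training can be summarized as a schedule: K pretraining epochs where f_t is trained with β = 0, followed by joint epochs where PI labels are estimated and the full objective is used. The sketch below captures only that control flow; the actual GNN updates (Equations 3-5) are abstracted away, and the function name is ours, not the paper's.

```python
def pi_gnn_schedule(total_epochs, pretrain_epochs):
    """Mirror Algorithm 1's two-phase control flow.

    Phase 1 (epoch < K): update f_e (Eq. 3) and train f_t with beta = 0,
    i.e. without the PI regularization term in Eq. 5.
    Phase 2 (epoch >= K): update f_e, estimate PI labels y^PI (Eq. 4),
    then update f_t with the full objective (Eq. 5).
    Only the schedule is modeled; model updates are omitted.
    """
    phases = []
    for epoch in range(total_epochs):
        if epoch < pretrain_epochs:
            phases.append("pretrain")  # beta = 0, no PI labels yet
        else:
            phases.append("joint")     # estimate y^PI, beta > 0
    return phases
```

With the reported N = 400 and K = 50, exactly the first 50 epochs run in the pretraining phase.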
Open Source Code | Yes | Code is publicly available at https://github.com/TianBian95/pi-gnn.
Open Datasets | Yes | We used five datasets to evaluate PI-GNN, including Cora, CiteSeer and PubMed with the default dataset split as in (Kipf & Welling, 2017), the WikiCS dataset (Mernyei & Cangea, 2020), and the OGB-arxiv dataset (Hu et al., 2020).
Dataset Splits | Yes | We used five datasets to evaluate PI-GNN, including Cora, CiteSeer and PubMed with the default dataset split as in (Kipf & Welling, 2017), the WikiCS dataset (Mernyei & Cangea, 2020), and the OGB-arxiv dataset (Hu et al., 2020). For WikiCS, we used the first 20 nodes from each class for training and the next 20 nodes for validation; the remaining nodes of each class are used as the test set. For OGB-arxiv, we use the default split.
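The WikiCS split described above (first 20 nodes per class for training, next 20 for validation, the rest for testing) is easy to get subtly wrong, so a small sketch may help. This is our illustration, not the authors' code; the function name is hypothetical.

```python
from collections import defaultdict

def wikics_style_split(labels, n_train=20, n_val=20):
    """Per-class split as described for WikiCS: the first n_train nodes of
    each class go to training, the next n_val to validation, and the rest
    to test. `labels` is a list of integer class labels indexed by node id.
    """
    seen = defaultdict(int)  # nodes of each class encountered so far
    train, val, test = [], [], []
    for node, y in enumerate(labels):
        if seen[y] < n_train:
            train.append(node)
        elif seen[y] < n_train + n_val:
            val.append(node)
        else:
            test.append(node)
        seen[y] += 1
    return train, val, test
```

Note that "first" here follows the dataset's node ordering, which is one reasonable reading of the text.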
Hardware Specification | Yes | We trained for 400 epochs on a Tesla P40.
Software Dependencies | Yes | We run all experiments with Python 3.8.5 and PyTorch 1.7.0, using NVIDIA Tesla P40 GPUs. ... We used three different GNN architectures, i.e., GCN, GAT and GraphSAGE, which are implemented with torch-geometric (Fey & Lenssen, 2019).
Experiment Setup | Yes | Specifically, the hidden dimension of GCN, GAT and GraphSAGE is set to 16, 8 and 64, respectively. GAT has 8 attention heads in the first layer and 1 head in the second layer. The mean aggregator is used for GraphSAGE. We applied the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.01 for GCN and GraphSAGE and 0.005 for GAT. The weight decay is set to 5e-4. We trained for 400 epochs on a Tesla P40. The loss weight β is set to |V|² / (|V|² − Q)², where |V| is the number of nodes and Q is the sum of all elements of the preprocessed adjacency matrix. The number of pretraining epochs K is set to 50 and the total number of epochs N is set to 400. For subgraph sampling, we sampled 15 and 10 neighbors for each node in the 1st and 2nd GNN layers, respectively, and set the batch size to 1024.
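The reported hyperparameters can be collected into a single configuration for reference. The dictionary below only restates values from the setup text; the names (`HPARAMS`, `loss_weight_beta`, etc.) are ours, and the subtraction in the β formula is a reconstruction of the garbled expression in the text, so it should be checked against the paper before reuse.

```python
# Hyperparameters as reported in the experiment setup. Architecture names are
# labels only; no torch-geometric model is constructed here.
HPARAMS = {
    "GCN":       {"hidden_dim": 16, "lr": 0.01,  "weight_decay": 5e-4},
    "GAT":       {"hidden_dim": 8,  "lr": 0.005, "weight_decay": 5e-4,
                  "heads": (8, 1)},  # 8 heads in layer 1, 1 head in layer 2
    "GraphSAGE": {"hidden_dim": 64, "lr": 0.01,  "weight_decay": 5e-4,
                  "aggregator": "mean"},
}
PRETRAIN_K, TOTAL_EPOCHS = 50, 400
NEIGHBOR_FANOUT = (15, 10)  # neighbors sampled per node in layers 1 and 2
BATCH_SIZE = 1024

def loss_weight_beta(num_nodes, adj_sum):
    """beta = |V|^2 / (|V|^2 - Q)^2, where Q is the sum of all entries of the
    preprocessed adjacency matrix. The minus sign is reconstructed from the
    garbled formula in the text and may differ from the paper's exact form."""
    v2 = num_nodes ** 2
    return v2 / (v2 - adj_sum) ** 2
```

Keeping the configuration in one place like this makes it straightforward to sweep architectures while holding the PI-GNN schedule (K, N, fan-out, batch size) fixed.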