BANGS: Game-theoretic Node Selection for Graph Self-Training

Authors: Fangxin Wang, Kay Liu, Sourav Medya, Philip Yu

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments. Experimental results validate the effectiveness of BANGS across various datasets and base models. By theoretically linking random walk and feature propagation, we enhance the scalability of our approach. Additionally, we demonstrate the effectiveness of BANGS under noisy labels and varying portions of training data.
Researcher Affiliation Academia Fangxin Wang, Kay Liu, Sourav Medya, Philip S. Yu, Department of Computer Science, University of Illinois Chicago
Pseudocode Yes C ALGORITHM FORMULATION In this section, we provide pseudo-code in Algorithm 1 and pipeline figure in Figure 3 for our method, BANGS.
Open Source Code Yes The codebase is available on https://github.com/fangxin-wang/BANGS.
Open Datasets Yes We test baseline methods and our method on five graph datasets: for Cora, Citeseer, and PubMed (Yang et al., 2016), we follow their official split; for LastFM (Rozemberczki & Sarkar, 2020) and Flickr (Zeng et al., 2019), we split them in similar proportions, with training, validation, and test data taking 5%, 15%, and 80%, respectively. The datasets can be found at: Cora, Citeseer, and PubMed (Yang et al., 2016) (https://github.com/kimiyoung/planetoid); LastFM (Rozemberczki & Sarkar, 2020) (https://github.com/benedekrozemberczki/FEATHER); Flickr (Zeng et al., 2019) (https://github.com/GraphSAINT/GraphSAINT). We employ the re-packaged datasets from PyG (Fey & Lenssen, 2019) (https://github.com/pyg-team/pytorch_geometric, version 2.5.2).
Dataset Splits Yes For Cora, Citeseer, and PubMed (Yang et al., 2016), we follow their official split; for LastFM (Rozemberczki & Sarkar, 2020) and Flickr (Zeng et al., 2019), we split them in similar proportions, with training, validation, and test data taking 5%, 15%, and 80%, respectively.
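The 5%/15%/80% split described above can be sketched as follows. This is a minimal illustration using a shuffled index list; the function name, seed, and fractions-as-parameters are assumptions, not the authors' code:

```python
import random

def split_indices(num_nodes, train_frac=0.05, val_frac=0.15, seed=0):
    """Randomly split node indices into train/val/test (5% / 15% / 80%).

    Hypothetical helper: shuffles all node indices once, then slices
    off the training and validation portions; the remainder is test.
    """
    idx = list(range(num_nodes))
    random.Random(seed).shuffle(idx)
    n_train = int(num_nodes * train_frac)
    n_val = int(num_nodes * val_frac)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test
```

In practice the same effect can be achieved with PyG's `RandomNodeSplit` transform, which writes boolean masks onto the data object instead of returning index lists.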
Hardware Specification Yes The experiments are mainly run on a machine with an NVIDIA GeForce GTX 4090 Ti GPU with 24 GB memory and 80 GB main memory. Some experiments on small graphs are conducted on a MacBook Pro with an Apple M1 Pro chip and 16 GB memory.
Software Dependencies Yes We employ the re-packaged datasets from PyG (Fey & Lenssen, 2019) (https://github.com/pyg-team/pytorch_geometric, version 2.5.2).
Experiment Setup Yes The base model is set to Graph Convolutional Network (GCN) (Kipf & Welling, 2016) by default, while we also include results for other GNN models. ... For a fair comparison, we select the suggested hyperparameters for all baseline methods, especially in the node selection criterion. For instance, we use the confidence thresholds suggested by CaGCN, e.g., 0.8 for Cora and 0.9 for Citeseer. We set the maximum iteration number to 40, and use validation data for early stopping. For node selection, we sample 500 times to calculate Banzhaf values. The two varying hyperparameters are the number of candidate nodes K and the number of selected nodes k in each iteration. The value of k is set to 100 for small-scale graphs, i.e., Cora, Citeseer, and PubMed, and 400 for other larger graphs; K = k + 100.