Decoupling the Depth and Scope of Graph Neural Networks
Authors: Hanqing Zeng, Muhan Zhang, Yinglong Xia, Ajitesh Srivastava, Andrey Malevich, Rajgopal Kannan, Viktor Prasanna, Long Jin, Ren Chen
NeurIPS 2021 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, on seven graphs (with up to 110M nodes) and six backbone GNN architectures, our design achieves significant accuracy improvement with orders of magnitude reduction in computation and hardware cost. |
| Researcher Affiliation | Collaboration | Hanqing Zeng (USC); Muhan Zhang (Peking University, BIGAI); Yinglong Xia (Facebook AI); Ajitesh Srivastava (USC); Andrey Malevich (Facebook AI); Rajgopal Kannan (US ARL); Viktor Prasanna (USC); Long Jin (Facebook AI); Ren Chen (Facebook AI) |
| Pseudocode | Yes | See Appendix D and F.3 for algorithm and experiments. |
| Open Source Code | Yes | Our code is available at https://github.com/facebookresearch/shaDow_GNN |
| Open Datasets | Yes | We evaluate SHADOW-GNN on seven graphs. Six of them are for the node classification task: Flickr [55], Reddit [12], Yelp [55], ogbn-arxiv, ogbn-products and ogbn-papers100M [16]. |
| Dataset Splits | Yes | We follow the default data splits for all datasets, which are usually 60% training, 20% validation, and 20% test. For ogbn-papers100M, the training, validation, test splits are 80%, 10%, 10% respectively. (from Appendix E.1) |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] |
| Software Dependencies | No | The paper mentions using W&B [5] but does not provide specific version numbers for software dependencies or libraries in the text. |
| Experiment Setup | Yes | All models on all datasets have uniform hidden dimension of 256. [...] For the model depth, since L = 3 is the standard setting in the literature (e.g., see the benchmarking in OGB [16]), we start from L = 3 and further evaluate a deeper model of L = 5. Hyperparameter tuning and architecture configurations are in Appendix E.4. (Appendix E.4 specifies: 'hidden dimension of 256 for all models', 'learning rate of 0.001', 'Adam optimizer [21] with weight decay of 5e-4', 'number of training epochs is 1000', 'dropout rate set to 0.5') |
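
Below is a minimal sketch of the training configuration quoted from Appendix E.4 (hidden dimension 256, Adam with learning rate 0.001 and weight decay 5e-4, 1000 epochs, dropout 0.5), assuming a PyTorch setup. The `PlaceholderBackbone` class, input dimension, and class count are hypothetical stand-ins; the actual shaDow-GNN models are in the linked repository.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted from Appendix E.4 of the paper.
HIDDEN_DIM = 256      # uniform hidden dimension for all models
LEARNING_RATE = 1e-3  # Adam learning rate
WEIGHT_DECAY = 5e-4   # Adam weight decay
NUM_EPOCHS = 1000     # number of training epochs
DROPOUT = 0.5         # dropout rate

# Hypothetical stand-in for a GNN backbone; the real shaDow-GNN models live at
# https://github.com/facebookresearch/shaDow_GNN
class PlaceholderBackbone(nn.Module):
    def __init__(self, in_dim: int, num_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, HIDDEN_DIM),
            nn.ReLU(),
            nn.Dropout(DROPOUT),
            nn.Linear(HIDDEN_DIM, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Example instantiation with hypothetical feature/class sizes.
model = PlaceholderBackbone(in_dim=128, num_classes=40)
optimizer = torch.optim.Adam(
    model.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY
)
```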