On the Equivalence of Graph Convolution and Mixup

Authors: Xiaotian Han, Hanqing Zeng, Yu Chen, Shaoliang Nie, Jingzhou Liu, Kanika Narang, Zahra Shakeri, Karthik Abinav Sankararaman, Song Jiang, Madian Khabsa, Qifan Wang, Xia Hu

TMLR 2024

Reproducibility assessment (each entry gives the variable, the result, and the supporting LLM response):
Research Type: Experimental. We also empirically verify the equivalence by training an MLP with the two modifications and achieving comparable performance. To verify that the proposed HMLP achieves performance comparable to the original GCN, we train a one-layer GCN (GCN-1), a two-layer GCN (GCN-2), SGC, and HMLP on the training set and report the test accuracy at the epoch with the highest validation accuracy. We experimented with different train/validation/test splits (the training ratio spans from 10% to 90%), and we also ran experiments with the public splits of the Cora, CiteSeer, and PubMed datasets. We present the results in Table 2 and Figure 4.
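The quoted equivalence can be illustrated numerically: one step of mean-aggregation graph convolution at a node is exactly a convex combination (a multi-sample mixup) of its neighbors' features. The toy graph and features below are made up for this sketch and are not taken from the paper's code.

```python
import numpy as np

# Toy feature matrix for nodes 0, 1, 2 (illustrative values only).
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])
neighbors_of_0 = [0, 1, 2]  # node 0's neighborhood, including a self-loop

# Graph-convolution view: row-normalized (mean) aggregation over neighbors.
deg = len(neighbors_of_0)
h0_gcn = sum(X[j] for j in neighbors_of_0) / deg

# Mixup view: the same vector, written as an explicit convex combination
# with uniform mixing weights lam_j = 1/deg (which sum to 1).
lam = np.full(deg, 1.0 / deg)
h0_mixup = sum(l * X[j] for l, j in zip(lam, neighbors_of_0))

assert np.allclose(h0_gcn, h0_mixup)
```

With a general normalized adjacency the weights are non-uniform, but each row still sums to 1, so the aggregation remains a convex (mixup-style) combination.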
Researcher Affiliation: Collaboration. Xiaotian Han¹, Hanqing Zeng², Yu Chen³, Shaoliang Nie², Jingzhou Liu², Kanika Narang², Zahra Shakeri², Karthik Abinav Sankararaman², Song Jiang⁴, Madian Khabsa², Qifan Wang², Xia Hu⁵ (¹Case Western Reserve University, ²Meta AI, ³Anytime AI, ⁴UCLA, ⁵Rice University).
Pseudocode: No. The paper describes methods and equations in paragraph text and mathematical notation, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: Yes. The code is available at https://github.com/ahxt/GraphConv_is_Mixup.
Open Datasets: Yes. The datasets used in this paper are built-in datasets in torch_geometric and are downloaded automatically via the torch_geometric API. In this experiment, we use widely adopted node classification benchmarks involving different types of networks: three citation networks (Cora, CiteSeer, and PubMed).
Dataset Splits: Yes. We experimented with different train/validation/test splits (the training ratio spans from 10% to 90%).
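A minimal sketch of such a ratio-controlled split, assuming a simple random partition with the non-training remainder divided evenly between validation and test; the helper name and seeding scheme are hypothetical, not from the paper's code.

```python
import random

def split_nodes(num_nodes, train_ratio, seed=0):
    """Randomly split node indices into train/val/test given a training ratio.

    Hypothetical helper: the leftover nodes are divided evenly between
    validation and test, with any odd remainder going to test.
    """
    idx = list(range(num_nodes))
    random.Random(seed).shuffle(idx)  # fixed seed for a reproducible split
    n_train = int(train_ratio * num_nodes)
    n_val = (num_nodes - n_train) // 2
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

# Example: a 10% training split over 100 nodes, the low end of the sweep.
train, val, test = split_nodes(100, train_ratio=0.1)
```

Sweeping `train_ratio` from 0.1 to 0.9 reproduces the kind of split schedule the quoted passage describes.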
Hardware Specification: Yes. We run all the experiments on NVIDIA RTX A100 on AWS.
Software Dependencies: No. The paper mentions PyTorch and torch_geometric with general project URLs (https://pytorch.org/, https://pyg.org/) in footnotes, but does not specify exact version numbers for these or any other software components.
Experiment Setup: Yes. We set the learning rate to 0.1 for all methods and datasets, and each model was trained for 400 epochs. Moreover, we did not use weight decay for model training.
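The quoted setup (learning rate 0.1, 400 epochs, no weight decay) can be sketched with plain gradient descent. The toy least-squares problem below is illustrative only and stands in for the paper's actual models and datasets.

```python
import numpy as np

# Toy regression problem: recover a known weight vector from noiseless data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

# Hyperparameters matching the quoted setup; weight_decay = 0.0 encodes
# "no weight decay" explicitly.
lr, epochs, weight_decay = 0.1, 400, 0.0

w = np.zeros(3)
for _ in range(epochs):
    # Mean-squared-error gradient plus the (here zero) L2 penalty term.
    grad = X.T @ (X @ w - y) / len(y) + weight_decay * w
    w -= lr * grad
```

In a PyTorch setting the same configuration would correspond to an optimizer constructed with `lr=0.1` and `weight_decay=0` over 400 training epochs.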