Fully-inductive Node Classification on Arbitrary Graphs
Authors: Jianan Zhao, Zhaocheng Zhu, Mikhail Galkin, Hesham Mostafa, Michael Bronstein, Jian Tang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, GraphAny trained on a single Wisconsin dataset with only 120 labeled nodes can generalize to 30 new graphs with an average accuracy of 67.26%, surpassing not only all inductive baselines, but also strong transductive methods trained separately on each of the 30 test graphs. ... 4 EXPERIMENTS In this section, we evaluate the performance of GraphAny against both transductive and inductive methods on 31 node classification datasets (details in Appendix B). |
| Researcher Affiliation | Collaboration | 1Mila Québec AI Institute, 2Université de Montréal, 3Google Research, 4Intel Labs, 5University of Oxford, 6AITHYRA, 7HEC Montréal, 8CIFAR AI Chair |
| Pseudocode | No | The paper describes methods and processes using mathematical formulations and textual descriptions, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured, code-like steps for any procedure. |
| Open Source Code | Yes | Equal contribution. Code release: https://github.com/DeepGraphLearning/GraphAny |
| Open Datasets | Yes | 4.1 EXPERIMENTAL SETUP Datasets. We have compiled a diverse collection of 31 node classification datasets from three sources: PyG (Fey and Lenssen, 2019), DGL (Wang et al., 2020), and OGB (Hu et al., 2021b). These datasets encompass a wide range of graph types including academic collaboration networks, social networks, e-commerce networks and knowledge graphs, with sizes varying from a few hundred to a few million nodes. The number of classes across these datasets ranges from 2 to 70. Detailed statistics for each dataset are provided in Appendix B. |
| Dataset Splits | Yes | B COMPLETE DATASET INFORMATION We follow the default split if there is a given one, otherwise, we use the standard semi-supervised setting (Kipf and Welling, 2017) where 20 nodes are randomly selected as training nodes for each label. The detailed dataset information is summarized in Table 3. ... Table 3: ...Train/Val/Test Ratios (%) |
| Hardware Specification | Yes | C IMPLEMENTATION DETAILS All experiments were conducted using five different random seeds: {0, 1, 2, 3, 4}. The best hyperparameters were selected based on the validation accuracy. The runtime measurements presented in Table 1 were performed on an NVIDIA Quadro RTX 8000 GPU with CUDA version 12.2, supported by an AMD EPYC 7502 32-Core Processor that features 64 cores and a maximum clock speed of 2.5 GHz. |
| Software Dependencies | Yes | The runtime measurements presented in Table 1 were performed on an NVIDIA Quadro RTX 8000 GPU with CUDA version 12.2, supported by an AMD EPYC 7502 32-Core Processor that features 64 cores and a maximum clock speed of 2.5 GHz. |
| Experiment Setup | Yes | C.1 GRAPHANY IMPLEMENTATION ... The hyperparameter search space for GraphAny is relatively small: we fixed the batch size at 128 and varied the number of training batches with options of 500, 1000, and 1500; explored hidden dimensions of 32, 64, and 128; tested configurations with 1, 2, and 3 MLP layers; and set the fixed entropy value H at 1 and 2. The optimal settings derived from this hyperparameter search space are detailed in Table 4. Table 4 (excerpt): Summary of hyperparameters of GraphAny on different datasets. Columns: Dataset / # Batches / Learning Rate / Hidden Dimension / # MLP Layers / Entropy. Cora: 500 / 0.0002 / 64 / 1 / 2 |
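The search space quoted above is small enough to enumerate exhaustively. A minimal sketch of that grid, assuming only the values stated in the paper (the variable names and helper function here are illustrative, not taken from the released GraphAny code):

```python
from itertools import product

# Grid as described in Appendix C.1 of the paper.
# Names are hypothetical; only the candidate values come from the quoted text.
search_space = {
    "num_batches": [500, 1000, 1500],
    "hidden_dim": [32, 64, 128],
    "num_mlp_layers": [1, 2, 3],
    "entropy": [1, 2],
}
batch_size = 128  # fixed, per the paper


def grid(space):
    """Yield every configuration in the search space as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(grid(search_space))
print(len(configs))  # 3 * 3 * 3 * 2 = 54 configurations
```

In this setup, selecting the best configuration would amount to training each of the 54 candidates and keeping the one with the highest validation accuracy, which matches the selection rule stated under Implementation Details.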