Instance-Aware Graph Prompt Learning

Authors: Jiazheng Li, Jundong Li, Chuxu Zhang

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on multiple datasets and settings showcase the superior performance of IA-GPL compared to state-of-the-art baselines. The paper includes a dedicated '5 Experiments' section, tables presenting ROC-AUC scores (Tables 1, 2, 3, 7, 8, 9, 10), and figures for visualization and ablation studies (Figures 5, 6, 7), all indicative of empirical validation.
Researcher Affiliation | Academia | Jiazheng Li (EMAIL, University of Connecticut); Jundong Li (EMAIL, University of Virginia); Chuxu Zhang (EMAIL, University of Connecticut). All authors are affiliated with universities, and their email addresses use the '.edu' domain.
Pseudocode | No | The paper describes the methodology using mathematical formulations (e.g., Equations 1-5, 7-10, 13-19) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor any structured, code-like procedures.
Open Source Code | Yes | The code is publicly available at https://github.com/lijiazheng0917/IA-GPL.
Open Datasets | Yes | For graph-level tasks, the paper uses eight molecular datasets from MoleculeNet (Wu et al., 2018); for node-level tasks, three citation datasets from Yang et al. (2016). These datasets vary in size, labels, and domains, serving as a comprehensive benchmark, with a full description in Appendix A. For pre-training on molecular data, the authors sample 2 million unlabeled molecules from the ZINC15 (Sterling & Irwin, 2015) database, along with 256K labeled molecules from the preprocessed ChEMBL (Mayr et al., 2018; Gaulton et al., 2011) dataset.
Dataset Splits | Yes | To evaluate IA-GPL in both in-domain and out-of-domain scenarios, the molecular datasets are split in two distinct manners: random split and scaffold split. Results are reported in both full-shot and few-shot settings, using the ROC-AUC score as the metric, over five rounds of experiments with mean and standard deviation. Table 1, for example, shows a '50-shot ROC-AUC (%) performance comparison on molecular prediction benchmarks using random split,' and the paper also varies 'the number of shots within the range of [5,10,20,30]'.
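The reporting protocol described in this row (five independent rounds, ROC-AUC mean and standard deviation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: `run_experiment` is a hypothetical placeholder that would, in the actual setup, train IA-GPL with a given seed and score the test set.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def run_experiment(seed):
    # Placeholder for one training/evaluation round; here we just
    # generate synthetic labels and noisy scores for illustration.
    r = np.random.default_rng(seed)
    y_true = r.integers(0, 2, size=200)
    y_score = 0.5 * y_true + 0.8 * r.random(200)
    return roc_auc_score(y_true, y_score)

# Five rounds, then report mean and standard deviation, as in the paper.
scores = [run_experiment(seed) for seed in range(5)]
mean, std = np.mean(scores), np.std(scores)
print(f"ROC-AUC: {100 * mean:.2f} ± {100 * std:.2f}")
```

Reporting the standard deviation over seeds, as the paper does, distinguishes genuine model differences from run-to-run noise in few-shot settings.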
Hardware Specification | Yes | Training time per epoch and GPU memory consumption are measured on the ToxCast dataset using a single NVIDIA RTX 3090, while the experiments are conducted using NVIDIA V100 graphics cards with 32 GB of memory and the PyTorch framework.
Software Dependencies | No | The paper mentions using the 'PyTorch framework' but provides no version numbers for PyTorch or any other software dependency, which is insufficient to reconstruct the exact software environment.
Experiment Setup | Yes | Table 11 presents the hyperparameter settings used during the adaptation stage of pre-trained GNN models on downstream tasks in IA-GPL. For molecular datasets, the widely used 5-layer GIN (Xu et al., 2018) serves as the underlying architecture; for citation networks, 2-layer Graph Transformers (Yun et al., 2019) are used. Grid search finds the best set of hyperparameters, with all other settings (batch size, dimensions, etc.) kept the same.
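The grid search mentioned in this row can be sketched with the standard library alone. The hyperparameter names, ranges, and the `validate` scoring function below are illustrative placeholders, not the actual values from Table 11.

```python
from itertools import product

# Illustrative search space; the real grid from Table 11 differs.
grid = {
    "lr": [1e-3, 1e-4],
    "prompt_dim": [64, 128],
    "num_prompts": [5, 10],
}

def validate(config):
    # Placeholder for training a model under `config` and returning
    # its validation ROC-AUC; this toy score just prefers lr=1e-3
    # and larger prompt dimensions.
    return -abs(config["lr"] - 1e-3) + 0.001 * config["prompt_dim"]

# Exhaustively evaluate every combination and keep the best one.
best_score, best_config = float("-inf"), None
for values in product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    score = validate(config)
    if score > best_score:
        best_score, best_config = score, config

print(best_config)
```

The cost grows multiplicatively with each added hyperparameter (here 2 × 2 × 2 = 8 runs), which is why papers typically fix most settings, as this one does, and search over only a few.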