When Do LLMs Help With Node Classification? A Comprehensive Analysis

Authors: Xixi Wu, Yifei Shen, Fangzhou Ge, Caihua Shan, Yizhu Jiao, Xiangguo Sun, Hong Cheng

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Subsequently, we conducted extensive experiments, training and evaluating over 2,700 models, to determine the key settings (e.g., learning paradigms and homophily) and components (e.g., model size and prompt) that affect performance.
Researcher Affiliation | Collaboration | (1) Department of Systems Engineering and Engineering Management, and Shun Hing Institute of Advanced Engineering, The Chinese University of Hong Kong; (2) Microsoft Research Asia; (3) University of Illinois Urbana-Champaign.
Pseudocode | No | The paper describes methods and approaches in prose, but there are no structured blocks explicitly labeled as pseudocode or algorithm.
Open Source Code | Yes | Codes and datasets are released at https://llmnodebed.github.io/.
Open Datasets | Yes | Codes and datasets are released at https://llmnodebed.github.io/. ... The processed data is publicly available at https://huggingface.co/datasets/xxwu/LLMNodeBed. ... Cora and Pubmed (He et al., 2024), Citeseer (Chen et al., 2024b), and WikiCS (Liu et al., 2024). The remaining datasets already include text attributes in their official releases, including arXiv (Hu et al., 2020), Instagram and Reddit (Huang et al., 2024), Books, Computer, and Photo (Yan et al., 2023), Cornell, Texas, Wisconsin, and Washington (Wang et al., 2025).
Dataset Splits | Yes | For experimental datasets, we adopt the official splits designed for semi-supervised settings to ensure standardized evaluation. ... Specifically, we use a 60% training, 20% validation, and 20% testing split for most datasets. ... Detailed data splits are provided in Table 10 in the Appendix. ... For heterophilic graphs... For dataset splits, we assign semi-supervised and supervised settings with 1:1:8 and 6:2:2 splits for training, validation, and test sets, respectively.
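The 6:2:2 supervised split quoted above can be sketched as follows. This is an illustrative helper, not code from the released testbed; the function name and seed are assumptions.

```python
import numpy as np

def supervised_split(num_nodes: int, seed: int = 42):
    """Shuffle node indices and split them 60% / 20% / 20% into
    train / validation / test sets (the paper's supervised 6:2:2 setting).
    Illustrative sketch; not the authors' implementation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_nodes)
    n_train = int(0.6 * num_nodes)
    n_val = int(0.2 * num_nodes)
    train_idx = idx[:n_train]
    val_idx = idx[n_train:n_train + n_val]
    test_idx = idx[n_train + n_val:]
    return train_idx, val_idx, test_idx
```

The semi-supervised 1:1:8 split follows the same pattern with fractions 0.1 / 0.1 / 0.8.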
Hardware Specification | Yes | All measurements were conducted on a single NVIDIA H100-80G GPU to ensure consistency. ... All recorded experiment times are based on a single NVIDIA H100-80G GPU. ... GPU Device: 1 NVIDIA A6000-48G ... 2 NVIDIA A6000-48G
Software Dependencies | No | We release LLMNodeBed, a PyG-based testbed designed to facilitate reproducible and rigorous research in LLM-based node classification algorithms. ... Open-source models can be easily loaded via the Transformers library.
Experiment Setup | Yes | For GNNs with arbitrary input embeddings... we perform a grid-search on the hyperparameters as follows: num layers in [2, 3, 4], hidden dimension in [32, 64, 128, 256], and dropout in [0.3, 0.5, 0.7]. ... For both GNNs and MLPs across experimental datasets, the learning rate is consistently set to 1e-2, following previous studies... The total number of epochs is set to 500 with a patience of 100. ... For SenBERT-66M and RoBERTa-355M, we set the training epochs to 10 for semi-supervised settings and 4 for supervised settings. The batch size is set to 32, and the learning rate is set to 2e-5.
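The grid search quoted above can be sketched as below. The search space and fixed learning rate are taken from the paper's setup; the `train_and_eval` callable is a hypothetical stand-in for the testbed's actual training loop.

```python
from itertools import product

# Search space quoted from the paper's experiment setup.
GRID = {
    "num_layers": [2, 3, 4],
    "hidden_dim": [32, 64, 128, 256],
    "dropout": [0.3, 0.5, 0.7],
}
LR = 1e-2          # fixed learning rate for GNNs and MLPs
MAX_EPOCHS = 500   # with an early-stopping patience of 100

def grid_search(train_and_eval):
    """Try every combination in GRID and keep the config with the best
    validation score. `train_and_eval(cfg)` is a user-supplied callable
    (hypothetical here) that trains one model and returns its metric."""
    best_cfg, best_score = None, float("-inf")
    keys = list(GRID)
    for values in product(*(GRID[k] for k in keys)):
        cfg = dict(zip(keys, values), lr=LR, max_epochs=MAX_EPOCHS)
        score = train_and_eval(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

With 3 x 4 x 3 = 36 combinations per dataset, an exhaustive sweep like this is one way the study reaches its reported total of over 2,700 trained models.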