Scale-Free Graph-Language Models

Authors: Jianglin Lu, Yixuan Liu, Yitian Zhang, Yun Fu

ICLR 2025

Reproducibility checklist — Variable | Result | LLM Response
Research Type: Experimental. LLM Response: Extensive experiments on representative datasets validate our findings on the scale-free structural approximation of KNN graphs and demonstrate the effectiveness of integrating graph generation and text embedding with a real structural prior. Our code is available at https://github.com/Jianglin954/SFGL.
Researcher Affiliation: Academia. LLM Response: 1 Department of Electrical and Computer Engineering, Northeastern University; 2 Network Science Institute, Northeastern University; 3 Khoury College of Computer Sciences, Northeastern University.
Pseudocode: Yes. LLM Response: Algorithm 1 Scale-Free Graph-Language Model (SFGL / SFGL+GPT). Require: input sequences X, labels Y_L, the number of neighbors k, query question q, prompt pi. ... Algorithm 2 Scale-Free Graph-Language Model (SFGL+GCN variants). A minimal kNN-graph probe is sketched after this table.
Open Source Code: Yes. LLM Response: Our code is available at https://github.com/Jianglin954/SFGL.
Open Datasets: Yes. LLM Response: We conduct extensive experiments on four citation networks: Cora (Sen et al., 2008), Pubmed (Kipf & Welling, 2016), ogbn-arxiv (Hu et al., 2020), and arxiv23 (He et al., 2024).
Dataset Splits: Yes. LLM Response: Table 1 presents the classification performance of various methods across different datasets and labeling rates. ... Cora: 2.58% (70 labeled nodes), 5.17% (140), 7.75% (210), 15.51% (420), 22.16% (600). The rate arithmetic is checked in a snippet after this table.
Hardware Specification: No. LLM Response: The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments.
Software Dependencies: No. LLM Response: For GNNs, we use a two-layer GCN (see Sec. 4.3 for comparisons with other GNN architectures). ... For LMs, we finetune a pretrained DeBERTa (He et al., 2021) on the target datasets. ... We also employ GPT3.5 (Ouyang et al., 2022) for inference. The paper mentions specific models like GCN, DeBERTa, and GPT3.5 but does not provide version numbers for any underlying software libraries or frameworks used in their implementation. An assumed DeBERTa loading sketch follows the table.
Experiment Setup: Yes. LLM Response: For GNNs, we use a two-layer GCN. The hyper-parameters for hidden dimension, learning rate, dropout ratio, and weight decay are set to 128, 0.001, 0.5, and 0.0005, respectively. For LMs, we finetune a pretrained DeBERTa (He et al., 2021) on the target datasets. The batch size, learning rate, and dropout ratio are set to 20, 2 × 10^-5, and 0.3, respectively. We empirically set k to 25, 15, 25, 20 for Cora, Pubmed, ogbn-arxiv, and arxiv23, respectively. A hedged configuration sketch with these values follows the table.
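
The pseudocode row centers on a kNN graph built over text embeddings. As a hedged, illustrative probe only (not the authors' released code), the sketch below constructs a directed kNN graph with scikit-learn and reports its in-degree distribution, the structural quantity that a scale-free approximation concerns; the random embedding matrix is a stand-in, and k = 25 simply mirrors the Cora setting quoted in the experiment setup row.

    # Hedged sketch: directed kNN graph over (stand-in) text embeddings,
    # followed by its in-degree distribution.
    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    def knn_in_degrees(embeddings: np.ndarray, k: int) -> np.ndarray:
        # Row i has ones at i's k nearest neighbors, i.e. edge i -> j.
        A = kneighbors_graph(embeddings, n_neighbors=k, mode="connectivity")
        # Column sums are in-degrees: how often each node is chosen as a neighbor.
        return np.asarray(A.sum(axis=0)).ravel()

    # Toy usage: random vectors stand in for LM embeddings; k=25 mirrors the Cora setting.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 64))
    deg = knn_in_degrees(X, k=25)
    print("max in-degree:", int(deg.max()), "mean in-degree:", float(deg.mean()))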
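
The labeling rates quoted for Cora in the dataset-splits row are consistent with dividing the labeled-node counts by 2,708 nodes, the size of the standard Cora citation graph (that figure is assumed here, not stated in the excerpt). The snippet makes the arithmetic explicit.

    # Sanity check of the quoted labeling rates, assuming the standard 2708-node Cora graph.
    n_nodes = 2708
    for n_labeled in (70, 140, 210, 420, 600):
        print(f"{n_labeled:>4} labeled nodes -> {100 * n_labeled / n_nodes:.2f}% labeling rate")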
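
Because the paper names DeBERTa but gives neither library versions nor a specific checkpoint, the following is only one plausible way to load an LM backbone with Hugging Face transformers; the microsoft/deberta-base checkpoint and the 7-class head (Cora's label count) are assumptions for illustration, not details taken from the paper.

    # Hedged sketch: load a pretrained DeBERTa for sequence classification.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_name = "microsoft/deberta-base"  # assumed checkpoint; the paper only cites He et al., 2021
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=7)  # 7 = Cora classes (assumed)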
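
Finally, the GCN hyper-parameters in the experiment-setup row map directly onto a two-layer model. The sketch below wires them up with PyTorch Geometric purely to illustrate the reported configuration, not as the released implementation; the 768-dimensional input (matching a DeBERTa-base embedding) and the 7 output classes are assumptions.

    # Hedged sketch: two-layer GCN with the hyper-parameters reported in the paper
    # (hidden 128, dropout 0.5, lr 0.001, weight decay 0.0005).
    import torch
    import torch.nn.functional as F
    from torch_geometric.nn import GCNConv

    class TwoLayerGCN(torch.nn.Module):
        def __init__(self, in_dim: int, num_classes: int, hidden: int = 128, dropout: float = 0.5):
            super().__init__()
            self.conv1 = GCNConv(in_dim, hidden)
            self.conv2 = GCNConv(hidden, num_classes)
            self.dropout = dropout

        def forward(self, x, edge_index):
            x = F.relu(self.conv1(x, edge_index))
            x = F.dropout(x, p=self.dropout, training=self.training)
            return self.conv2(x, edge_index)

    # Optimizer with the reported learning rate and weight decay.
    model = TwoLayerGCN(in_dim=768, num_classes=7)  # 768 and 7 are assumed, not from the paper
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0005)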