TGLsta: Low-resource Textual Graph Learning with Semantic and Topological Awareness via LLMs
Authors: Qin Zhang, Xiaowei Li, Ziqi Liu, Xiaochen Fan, Xiaojun Chen, Shirui Pan
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated the performance of the TGLsta framework by conducting experiments on four publicly available graph-based text corpora: Cora (McCallum et al. 2000), Art, Industrial, and Music Instruments (M.I.) (Ni, Li, and McAuley 2019). In terms of the metrics, we adopt macro-F1 and Accuracy, which are widely used for classification problems. |
| Researcher Affiliation | Academia | 1College of Computer Science and Software Engineering, Shenzhen University, China 2 Institute for Electronics and Information Technology in Tianjin, Tsinghua University, China 3School of Information and Communication Technology, Griffith University, Australia. |
| Pseudocode | No | The paper describes the methodology using mathematical equations and descriptive text, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We evaluated the performance of the TGLsta framework by conducting experiments on four publicly available graph-based text corpora: Cora (McCallum et al. 2000), Art, Industrial, and Music Instruments (M.I.) (Ni, Li, and McAuley 2019). |
| Dataset Splits | Yes | In the few-shot classification setting, we use a 5-way approach, where each task involves selecting five classes from the full set. For each class, we sample S examples, where S ∈ {0, 1, ..., 5}, to form the S-shot support set used for model training. The corresponding validation set is of the same size as the support set, while the remaining examples are assigned to the query set, which is unlabeled and used for evaluation. |
| Hardware Specification | Yes | The experiments were performed on a workstation with an Intel(R) Xeon(R) Gold 6226R CPU and an Nvidia A100 GPU. |
| Software Dependencies | No | For the experiments, we utilize GCN as the core neural network for the graph encoder, which consists of two hidden layers, each with 128 dimensions and LeakyReLU activation. We employ a transformer as the text encoder (Vaswani 2017). In line with CLIP (Radford et al. 2021), our setup features a 63-million-parameter model with 12 layers, each 512 units wide, and equipped with 8 attention heads. It utilizes a lower-cased byte pair encoding (BPE) scheme to represent texts, with a vocabulary size of 49,152 (Sennrich, Haddow, and Birch 2016). We cap the maximum sequence length at 128. The paper mentions software components but does not provide specific version numbers for reproducibility. |
| Experiment Setup | Yes | For the experiments, we utilize GCN as the core neural network for the graph encoder, which consists of two hidden layers, each with 128 dimensions and LeakyReLU activation. We employ a transformer as the text encoder (Vaswani 2017). In line with CLIP (Radford et al. 2021), our setup features a 63-million-parameter model with 12 layers, each 512 units wide, and equipped with 8 attention heads. It utilizes a lower-cased byte pair encoding (BPE) scheme to represent texts, with a vocabulary size of 49,152 (Sennrich, Haddow, and Birch 2016). We cap the maximum sequence length at 128. Subsequently, in our later experiments, we adjust k to 2 and K to 256. λ ∈ ℝ+ is a hyperparameter to balance the contribution from summary-based pairs. Overall, our findings highlight the critical role of graph information in low-resource node classification, given that graph structures encapsulate rich relationships between documents. ... reaching its peak when k = 5 and K = 128. |
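The dataset-split protocol quoted above (5-way episodes, S-shot support set, equal-size validation set, remainder as query set) can be sketched as follows. This is a minimal illustration under assumptions: the paper does not release code, so the function name `sample_5way_sshot`, the `labels` dict representation, and the per-class shuffling are all hypothetical choices, not the authors' implementation.

```python
import random

def sample_5way_sshot(labels, s, num_way=5, seed=0):
    """Sample one few-shot episode per the quoted protocol: pick `num_way`
    classes, take `s` support examples per class, an equally sized validation
    set, and assign the remaining examples to the (unlabeled) query set.
    Names and data layout are illustrative, not from the paper's code."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(set(labels.values())), num_way)
    support, val, query = [], [], []
    for c in classes:
        nodes = [n for n, y in labels.items() if y == c]
        rng.shuffle(nodes)
        support += nodes[:s]       # S-shot support set (training)
        val += nodes[s:2 * s]      # validation set of the same size
        query += nodes[2 * s:]     # remaining examples form the query set
    return support, val, query

# Toy example: 40 nodes over 5 classes (8 per class), S = 2.
labels = {i: i % 5 for i in range(40)}
sup, val, qry = sample_5way_sshot(labels, s=2)
print(len(sup), len(val), len(qry))  # 10 10 20
```

With 8 examples per class, 2 go to support and 2 to validation, leaving 4 per class (20 total) for the query set, matching the paper's description that validation mirrors the support size.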
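For the graph encoder described in the setup (a two-layer GCN with 128-dimensional hidden layers and LeakyReLU activation), a forward pass can be sketched in plain NumPy. This is a generic GCN sketch, not the authors' code: the symmetric normalization, weight initialization, and function names (`gcn_forward`, `normalize_adj`) are standard-GCN assumptions filled in by the reviewer.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """LeakyReLU activation, as named in the paper's encoder description."""
    return np.where(x > 0, x, alpha * x)

def normalize_adj(adj):
    """Standard GCN propagation matrix D^{-1/2} (A + I) D^{-1/2}."""
    adj = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(adj, x, w1, w2):
    """Two GCN layers, each 128-dimensional with LeakyReLU, per the setup above."""
    a_hat = normalize_adj(adj)
    h = leaky_relu(a_hat @ x @ w1)            # layer 1: n x 128
    return leaky_relu(a_hat @ h @ w2)         # layer 2: n x 128

# Tiny example: a 4-node path graph with 16-dim input features.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.standard_normal((4, 16))
w1 = rng.standard_normal((16, 128)) * 0.1
w2 = rng.standard_normal((128, 128)) * 0.1
z = gcn_forward(adj, x, w1, w2)
print(z.shape)  # (4, 128)
```

In practice this encoder would be paired with the CLIP-style transformer text encoder via a contrastive objective; that pairing is outside the scope of this sketch.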