Cross-View Graph Consistency Learning for Invariant Graph Representations

Authors: Jie Chen, Hua Mao, Wai Lok Woo, Chuanbin Liu, Xi Peng

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct extensive experiments to evaluate the link prediction performance of the proposed CGCL method. The source code for CGCL is implemented upon the PyTorch framework and the PyTorch Geometric (PyG) library. All experiments are performed on a Linux workstation with a GeForce RTX 4090 GPU (24 GB memory), an Intel(R) Xeon(R) Platinum 8336C CPU and 128.0 GB of RAM.
Researcher Affiliation | Academia | Jie Chen1,4, Hua Mao2, Wai Lok Woo2, Chuanbin Liu3,*, Xi Peng1. 1College of Computer Science, Sichuan University; 2Department of Computer and Information Sciences, Northumbria University; 3Center for Scientific Research and Development in Higher Education Institutes, Ministry of Education; 4National Key Laboratory of Fundamental Algorithms and Models for Engineering Numerical Simulation, Sichuan University. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | Yes |
Algorithm 1: Optimization Procedure for CGCL
Input: Data matrix X, the adjacency matrix of an incomplete graph structure A, and parameters λ and d_v.
Initialize: epochs = 800
1: for t = 1 to epochs do
2:   Construct two complementary augmented views A1 and A2 from A;
3:   for v = 1 to 2 do
4:     i = 1 if v == 2 else 2;
5:     Compute Z(v) via Eq. (5) using X and A_v;
6:     Construct the set Ẽ_i by sampling the negative edges from the other augmented view A_i;
7:     Compute H(v) via Eq. (6) using three variables: Z(v), the set of edges constructed from A_i, and the set of negative edges Ẽ_i;
8:     Compute Ã_i via Eq. (7);
9:     Update W(1) and W(1,2) by minimizing L_v in Eq. (8) using A_i and Ã_i;
10:   end for
11: end for
12: Compute Z via Eq. (5) using X and A;
13: Compute H via Eq. (6);
14: Compute Ã via Eq. (7);
Output: The graph structure Ã.
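The distinctive part of Algorithm 1 is its cross-view pairing: at every epoch, each view v is trained against edges and negative samples drawn from the other view i. A minimal plain-Python sketch of that schedule (the encoder, negative sampling, and loss of Eqs. (5)-(8) are omitted and would be hypothetical stand-ins here):

```python
def cgcl_training_schedule(epochs):
    """Trace the (epoch, view, complementary-view) triples of Algorithm 1.

    Steps 2 and 5-9 of the algorithm (building augmented views A1/A2,
    encoding Z(v), sampling negatives from A_i, and updating weights)
    are not modeled; only the cross-view index pattern is shown.
    """
    schedule = []
    for t in range(1, epochs + 1):
        for v in (1, 2):
            i = 1 if v == 2 else 2  # step 4: index of the OTHER augmented view
            schedule.append((t, v, i))
    return schedule
```

For example, one epoch yields the pairs (v=1, i=2) then (v=2, i=1), so each view's reconstruction loss is always computed against its complement.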
Open Source Code | No | The source code for CGCL is implemented upon the PyTorch framework and the PyTorch Geometric (PyG) library.
Open Datasets | Yes | Datasets. We select five widely used graph datasets for evaluation, including Cora (Sen et al. 2008), Citeseer (Sen et al. 2008), Pubmed (Namata et al. 2012), Photo (McAuley et al. 2015), and Computers (McAuley et al. 2015), which are publicly available on PyG.
Dataset Splits | Yes | Each graph dataset is divided into three parts: a training set, a validation set and a testing set. We use two different sets of percentages for the validation and testing sets: (1) 5% validation and 10% testing, and (2) 10% validation and 20% testing. The links in the validation and testing sets are masked in the training graph structure. For example, under the first setting we randomly select 5% and 10% of the links, together with the same numbers of disconnected node pairs, as the validation and testing sets, respectively. The remaining links in the graph structure are used for training.
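The split described above can be sketched with the standard library; this is a minimal illustration, not the paper's code, and the sampling of an equal number of disconnected node pairs as negatives is noted but omitted:

```python
import random

def split_edges(edges, val_frac=0.05, test_frac=0.10, seed=0):
    """Randomly hold out validation/test links from an edge list.

    Held-out links are masked from the training graph; in the paper,
    the same numbers of disconnected node pairs are also sampled as
    negative examples (not shown here).
    """
    rng = random.Random(seed)
    edges = list(edges)
    rng.shuffle(edges)
    n_val = int(len(edges) * val_frac)
    n_test = int(len(edges) * test_frac)
    val = edges[:n_val]
    test = edges[n_val:n_val + n_test]
    train = edges[n_val + n_test:]  # remaining links train the model
    return train, val, test
```

With 100 edges and the first setting, this yields 5 validation links, 10 testing links, and 85 training links. PyG provides an equivalent ready-made transform (`RandomLinkSplit`) that also handles negative sampling.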
Hardware Specification | Yes | All experiments are performed on a Linux workstation with a GeForce RTX 4090 GPU (24 GB memory), an Intel(R) Xeon(R) Platinum 8336C CPU and 128.0 GB of RAM.
Software Dependencies | No | The source code for CGCL is implemented upon the PyTorch framework and the PyTorch Geometric (PyG) library.
Experiment Setup | Yes | The proposed network architecture contains 2 hidden layers in the CGCL model. The sizes of the 2 hidden layers are set to [d_v, d_v/2], where d_v is the number of neural units in the first hidden layer. In the experiments, d_v ranges within {512, 256, 128, 64}. The learning rate r of the proposed CGCL method is chosen from {1e-3, 5e-3, 0.01, 0.05}. For all datasets, the number of iterations is set to 800 during the training stage. To conduct a fair comparison, the best link prediction results of the competing methods are obtained by tuning their parameters.
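The reported search space implies a 4 x 4 grid of configurations. A small sketch of how that grid could be enumerated (the helper names are hypothetical; only the ranges come from the setup above):

```python
from itertools import product

def hidden_sizes(d_v):
    """Two hidden layers sized [d_v, d_v/2], per the experiment setup."""
    return [d_v, d_v // 2]

def cgcl_search_grid():
    """Enumerate the reported hyperparameter ranges as configs."""
    d_v_choices = [512, 256, 128, 64]
    lr_choices = [1e-3, 5e-3, 0.01, 0.05]
    return [
        {"hidden": hidden_sizes(d_v), "lr": lr, "epochs": 800}
        for d_v, lr in product(d_v_choices, lr_choices)
    ]
```

This yields 16 candidate configurations, each trained for the fixed 800 iterations; e.g. the first is hidden layers [512, 256] with learning rate 1e-3.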