AGALE: A Graph-Aware Continual Learning Evaluation Framework
Authors: Tianqi Zhao, Alan Hanjalic, Megha Khosla
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments comparing methods from the domains of continual learning, continual graph learning, and dynamic graph learning (DGL). We theoretically analyze AGALE and provide new insights about the role of homophily in the performance of compared methods. We release our framework at https://github.com/Tianqi-py/AGALE. |
| Researcher Affiliation | Academia | Tianqi Zhao EMAIL Delft University of Technology Delft, Netherlands Alan Hanjalic EMAIL Delft University of Technology Delft, Netherlands Megha Khosla EMAIL Delft University of Technology Delft, Netherlands |
| Pseudocode | Yes | Algorithm 1 Task Sequence and Subgraph Sequence Generation Algorithm 2 Train and Test Partition Algorithm Within One Subgraph |
| Open Source Code | Yes | We release our framework at https://github.com/Tianqi-py/AGALE. |
| Open Datasets | Yes | We demonstrate our evaluation framework on 3 multi-label datasets in this work. We also include 1 multi-class dataset, Cora Full, as an example to demonstrate the generalization of our evaluation framework to single-label nodes. We include the description of Cora Full and the results on it in Appendix A.2. Below, we introduce the datasets used in this work: 1. PCG (Zhao et al., 2023), in which nodes are proteins, edges correspond to protein functional interactions, and labels are the phenotypes of the proteins. 2. DBLP (Akujuobi et al., 2019), in which nodes represent authors, edges the co-authorship between authors, and labels indicate the research areas of the authors. 3. Yelp (Zeng et al., 2019), in which nodes correspond to customer reviews and edges to their friendships, with node labels representing the types of businesses. |
| Dataset Splits | Yes | Construction of train/val/test sets. To overcome the current limitations of generating train/val/test sets as discussed in Section 1.1, we employ Algorithm 2 to partition nodes of a given graph snapshot Gt. For the given subgraph Gt, our objective is to maintain the pre-established ratios for training, validation, and test data for both the task as a whole and individual classes within the task. |
| Hardware Specification | No | The paper mentions "Note that the running time of the experiments can be biased due to different splits and how the resources are distributed on the computer." but does not provide specific hardware details like CPU, GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions "The CL methods use Graph Convolutional Network (GCN) (Kipf & Welling, 2016) as the backbone." but does not provide specific software dependencies with version numbers, such as Python, PyTorch, TensorFlow, or CUDA versions. |
| Experiment Setup | No | The paper discusses the models used and general experimental design but lacks specific hyperparameters (e.g., learning rate, batch size, number of epochs) for training these models. For instance, in Section 5, it states, "In this study, we employ P = 3, indicating that we generate three random orders for the classes in each dataset in the experimental section," which is about dataset generation, not specific training hyperparameters. |
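The "Dataset Splits" row quotes the paper's goal of maintaining pre-established train/val/test ratios both for the task as a whole and for individual classes within it (Algorithm 2). A minimal sketch of that idea for multi-label nodes is shown below; the function name, data layout, and ratios are hypothetical illustrations, not the paper's actual implementation.

```python
import random
from collections import defaultdict

def stratified_split(node_labels, ratios=(0.6, 0.2, 0.2), seed=0):
    """Partition nodes into train/val/test so each class keeps
    roughly the target ratios.

    `node_labels` maps node id -> list of class ids (multi-label).
    This is an illustrative sketch, not the paper's Algorithm 2.
    """
    rng = random.Random(seed)

    # Group nodes by class so ratios can be enforced per class.
    by_class = defaultdict(list)
    for node, labels in node_labels.items():
        for c in labels:
            by_class[c].append(node)

    train, val, test, assigned = set(), set(), set(), set()
    for c, nodes in sorted(by_class.items()):
        # Skip nodes already placed via another of their classes,
        # so the three sets stay disjoint.
        nodes = [n for n in nodes if n not in assigned]
        rng.shuffle(nodes)
        n_train = int(len(nodes) * ratios[0])
        n_val = int(len(nodes) * ratios[1])
        train.update(nodes[:n_train])
        val.update(nodes[n_train:n_train + n_val])
        test.update(nodes[n_train + n_val:])
        assigned.update(nodes)
    return train, val, test
```

Note the disjointness bookkeeping: because a multi-label node belongs to several classes, a naive per-class split would place the same node in two different sets, so each node is assigned only once.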