Few-Shot, No Problem: Descriptive Continual Relation Extraction

Authors: Nguyen Xuan Thanh, Anh Duc Le, Quyen Tran, Thanh-Thien Le, Linh Ngo Van, Thien Huu Nguyen

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on multiple datasets demonstrate that our method significantly advances the state-of-the-art by maintaining robust performance across sequential tasks, effectively addressing catastrophic forgetting."
Researcher Affiliation | Collaboration | ¹Oraichain Labs, ²Hanoi University of Science and Technology, ³VinAI Research, ⁴University of Oregon
Pseudocode | Yes | "Algorithm 1 outlines the end-to-end training process at each task T^j, with Φ_{j-1} denoting the model after training on the previous j-1 tasks. In line with memory-based continual learning methods, we maintain a memory buffer M_{j-1} that stores a few representative samples from all previous tasks T^1, ..., T^{j-1}, along with a relation description set E_{j-1} that holds the descriptions of all previously encountered relations."
Open Source Code | No | The paper does not contain any explicit statement about providing access to source code or a link to a code repository.
Open Datasets | Yes | "We conduct experiments using two pre-trained language models, BERT (Devlin et al. 2019) and LLM2Vec (BehnamGhader et al. 2024), on two widely used benchmark datasets for Relation Extraction: FewRel (Han et al. 2018) and TACRED (Zhang et al. 2017)."
Dataset Splits | Yes | "In Few-Shot Continual Relation Extraction (FCRE), a model must continuously assimilate new knowledge from a sequential series of tasks. For each t-th task, the model undergoes training on the dataset D^t = {(x_i^t, y_i^t)}_{i=1}^{N×K}. Here, N represents the number of relations in the task R^t, and K denotes the limited number of samples per relation, reflecting the few-shot learning scenario. This type of task setup is referred to as N-way-K-shot (Chen, Wu, and Shi 2023a)."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running the experiments.
Software Dependencies | No | The paper mentions using BERT and LLM2Vec as pre-trained language models but does not provide specific version numbers for any software dependencies or libraries used in their implementation.
Experiment Setup | Yes | "The soft prompt phrase length is set to 3 tokens, meaning n_0, n_1, n_2, and n_3 correspond to the values of 3, 6, 9, and 12, respectively."
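The Algorithm 1 outline quoted under "Pseudocode" (train on each task while replaying a memory buffer M and accumulating relation descriptions E) can be sketched as a minimal loop. This is an illustrative sketch only, not the paper's implementation; the helper names `train_step`, `select_exemplars`, and `describe_relation` are hypothetical placeholders.

```python
def continual_train(tasks, model, train_step, select_exemplars, describe_relation):
    """Sequentially train on tasks T^1..T^J, replaying a small exemplar
    buffer (M) and carrying forward relation descriptions (E), following
    the memory-based scheme described in the quoted Algorithm 1 outline.
    All helper callables are hypothetical stand-ins."""
    memory = []        # M: representative samples from previous tasks
    descriptions = {}  # E: descriptions of all previously seen relations
    for task in tasks:
        # Train on the current task's data plus replayed exemplars,
        # conditioning on the accumulated relation descriptions.
        model = train_step(model, task["data"] + memory, descriptions)
        # Retain a few representative samples for future replay.
        memory.extend(select_exemplars(task["data"]))
        # Record a description for each newly encountered relation.
        for rel in task["relations"]:
            descriptions.setdefault(rel, describe_relation(rel))
    return model, memory, descriptions
```

With stub helpers (e.g. `train_step` that just counts samples seen), the loop shows how the buffer and description set grow monotonically across tasks while the model is updated once per task.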
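The N-way-K-shot setup quoted under "Dataset Splits" (each task contains N relations with K labeled samples per relation, so |D^t| = N×K) can be illustrated with a small sampling sketch. The `corpus` layout (relation name mapped to a list of examples) is an assumption for illustration; the paper's actual loaders for FewRel and TACRED may differ.

```python
import random

def sample_episode(corpus, n_way, k_shot, seed=None):
    """Draw one N-way-K-shot task: pick n_way relations, then k_shot
    labeled examples for each. `corpus` maps relation name -> list of
    examples (a hypothetical layout, not the paper's data format)."""
    rng = random.Random(seed)
    relations = rng.sample(sorted(corpus), n_way)       # the task's R^t
    return {r: rng.sample(corpus[r], k_shot) for r in relations}

# Toy corpus: 4 relations with 10 examples each.
corpus = {f"rel_{i}": [f"sent_{i}_{j}" for j in range(10)] for i in range(4)}
task = sample_episode(corpus, n_way=2, k_shot=5, seed=0)  # a 2-way-5-shot task
```

Each sampled task then contributes n_way × k_shot training examples, matching the |D^t| = N×K size stated in the quoted excerpt.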