Adversity-aware Few-shot Named Entity Recognition via Augmentation Learning
Authors: Li Huang, Haowen Liu, Qiang Gao, Jiajing Yu, Guisong Liu, Xueqin Chen
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on both certain and uncertain datasets, including few-shot and cross-domain conditions, demonstrate the superiority and robustness of the proposed AAL compared to state-of-the-art baselines. Performance on Certainty Condition. Table 1 and Table 2 present the performance comparison of our AAL against baselines on SNIPS and Cross-Dataset under certain conditions. AAL achieves an average enhancement of 6.27% and 3.04% in overall results for the 1-shot and 5-shot scenarios, outperforming the robust baseline MANNER. Performance on Uncertainty Condition. Table 3 and Table 4 report the performance of our AAL alongside baselines on SNIPS and Cross-Dataset following the application of the BERT-Attack adversarial algorithm to the target domain data. |
| Researcher Affiliation | Academia | 1School of Computing and Artificial Intelligence, Southwestern University of Finance and Economics, Chengdu, China 2Engineering Research Center of Intelligent Finance, Ministry of Education, Chengdu, China 3Kash Institute of Electronics and Information Industry, Kashgar, China EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology using mathematical formulations and textual descriptions of steps, but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | For reproduction, the source code is released at https://github.com/swufe-NiceLab-GeoText/AAL.git. |
| Open Datasets | Yes | To align with previous studies, we conduct experiments on the following datasets: (1) SNIPS (Coucke et al. 2018): It has 7 domains with different label sets and a small number of samples, with a relatively even number of samples per domain per label set, which makes it easy to simulate a small number of samples. (2) Cross-Dataset (Hou et al. 2020): It is constructed from datasets from four different domains: CoNLL-2003 (Tjong Kim Sang 2002), GUM (Zeldes 2017), WNUT-2017 (Derczynski et al. 2017), and OntoNotes (Pradhan et al. 2013). |
| Dataset Splits | Yes | Few-shot NER on Episode Learning. Given the source domain D_s = {(S_s, Q_s)}, the task of few-shot NER should adapt to the target domain D_t = {(S_t, Q_t)}. Under episode learning, each episode consists of a support set S_{s/t} = {(x^{(i)}_{s/t}, y^{(i)}_{s/t})}_{i=1}^{N×K} for adaptation and a query set Q_{s/t} = {(x^{(j)}_{s/t}, y^{(j)}_{s/t})}_{j=1}^{N×K′} for evaluation. Here, N denotes the number of entity types in an episode, and K and K′ denote the number of examples per entity type in the support set and query set, respectively, commonly referred to as the N-way K-shot setting (Ding et al. 2021). Typically, K is very small, often K = 1 or 5. |
| Hardware Specification | Yes | We employ the AdamW optimizer for AAL, accelerated on an NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions using the 'Uncased-BERT base model as the PLM' and the 'AdamW optimizer' but does not specify version numbers for any software libraries or dependencies like Python, PyTorch, TensorFlow, etc. |
| Experiment Setup | Yes | Implementation Details. We exploit the Uncased-BERT base model as the PLM. {d_m, d_z, n_e, α, γ, dropout} are set to {768, 128, 5, 0.5, 0.5, 0.1}. We employ the AdamW optimizer for AAL, accelerated on an NVIDIA A100 GPU. |
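The N-way K-shot episode construction described in the Dataset Splits row can be sketched as follows. This is a minimal illustration of the general episode-sampling scheme, not the paper's released implementation; the function name `sample_episode` and the data layout are assumptions for the example.

```python
import random

def sample_episode(data_by_type, n_way=5, k_shot=1, k_query=1, rng=None):
    """Sample one N-way K-shot episode (illustrative helper).

    data_by_type maps each entity type to its list of labeled sentences.
    Returns (support, query): N*K support pairs and N*K' query pairs,
    drawn without replacement so the two sets never share a sentence.
    """
    rng = rng or random.Random()
    # Pick N entity types for this episode (sorted for determinism under a seed).
    types = rng.sample(sorted(data_by_type), n_way)
    support, query = [], []
    for t in types:
        # Draw K + K' distinct examples, then split into support and query.
        examples = rng.sample(data_by_type[t], k_shot + k_query)
        support += [(x, t) for x in examples[:k_shot]]
        query += [(x, t) for x in examples[k_shot:]]
    return support, query
```

With `n_way=5, k_shot=1, k_query=2`, one episode yields 5 support pairs and 10 query pairs, matching the 1-shot setting the paper evaluates.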