AIA: Autoregression-Based Injection Attacks Against Text2SQL Models
Authors: Deyin Li, Xiang Ling, Changjiang Li, Xiang Chen, Chunming Wu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation demonstrates that AIA can cause Text2SQL models to generate target outputs from adversarial inputs with success rates of over 70% in most scenarios. We evaluate the performance of AIA on a public SQL injection dataset: with 10-20-character payloads, AIA achieves a success rate of more than 70%, surpassing other possible attack methods, and its adversarial inputs show roughly 40% transferability between target models. Additionally, practical experiments show that AIA can make Text2SQL models extract user lists from databases and even delete data in databases directly. |
| Researcher Affiliation | Academia | 1Zhejiang University 2Institute of Software, Chinese Academy of Sciences 3Stony Brook University |
| Pseudocode | Yes | Algorithm 1: Column Selection Input: Target model M, column candidates D, other parts of input X, random target payload token length N Output: Selected column O 1: Randomly generate N tokens: R 2: S = [] 3: for C in D do 4: L = 1 5: Continue = True 6: while Continue do 7: P = R[:L] 8: if P in M(X + C + P) then 9: L ++ 10: else 11: Continue = False 12: S.append(L) 13: end if 14: end while 15: end for 16: Obtain the index of the maximum value in S: I 17: O = D[I] 18: return O |
| Open Source Code | No | The paper does not contain any explicit statement about releasing code, a link to a code repository, or mention of code in supplementary materials for the methodology described. |
| Open Datasets | Yes | We select our database from a public SQL injection dataset from Kaggle1, which includes SQL injection and normal samples. We selected SQL injection samples and removed duplicate samples from its two versions, thereby obtaining a total of 22,226 samples as our dataset. 1https://www.kaggle.com/datasets/syedsaqlainhussain/sqlinjection-dataset |
| Dataset Splits | Yes | To evaluate the effectiveness of AIA and compare it with Seq2Sick and TAA, we randomly selected 500 samples from our database as target payloads to attack the two target models, and we evaluated the ASR(T), ASR(E), and TPTL, as reported in Table 1. Particularly, Seq2Sick and TAA map their adversarial examples to token space after each gradient backpropagation. ... We divided the selected samples into four 100-sample groups of 10-20, 20-30, 30-40, and 40-50 characters in length. |
| Hardware Specification | No | The paper discusses model architectures (e.g., T5-base, LLAMA3-8B) and their use, but does not provide specific details about the hardware (CPU, GPU models, memory, etc.) on which the experiments were conducted or models were trained. |
| Software Dependencies | No | The paper mentions several models and frameworks such as T5, GPT-2, ChatGPT, and LLAMA3-8B, and implies the use of deep learning libraries for gradient backpropagation. However, it does not specify version numbers for any software dependencies or libraries used in the implementation or experimentation. |
| Experiment Setup | No | The paper defines a loss function and describes the gradient backpropagation process for optimization. However, it does not provide specific hyperparameter values such as learning rate, batch size, number of epochs, specific optimizer used, or details on model initialization or training schedules, which are necessary for exact reproduction of the experimental setup. |
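The Algorithm 1 pseudocode in the Pseudocode row can be sketched as runnable Python. This is a minimal illustration, not the authors' implementation: `model` is assumed to be any callable mapping an input string to the model's generated output string, lowercase letters stand in for real tokenizer tokens, and the `length < n_tokens` guard is added here to make termination explicit when the full payload is reproduced.

```python
import random


def select_column(model, column_candidates, other_input, n_tokens, rng):
    """Greedy column selection (sketch of the paper's Algorithm 1): extend a
    random payload prefix as long as the target model reproduces it in its
    output, and pick the column that carries the longest surviving prefix."""
    # Step 1: randomly generate N payload tokens (lowercase letters stand in
    # for real tokenizer tokens in this sketch).
    random_tokens = "".join(
        rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(n_tokens)
    )
    scores = []
    for column in column_candidates:
        length = 1
        while True:
            prefix = random_tokens[:length]
            # Does the model's output echo the current payload prefix?
            if prefix in model(other_input + column + prefix) and length < n_tokens:
                length += 1
            else:
                scores.append(length)
                break
    # Step 2: select the column with the longest reproduced prefix.
    return column_candidates[scores.index(max(scores))]


# Toy stand-in model: only echoes its input when the "users" column appears.
model = lambda s: s if "users" in s else "SELECT 1"
rng = random.Random(0)
selected = select_column(
    model, ["id", "users", "email"], "SELECT * FROM t WHERE ", 5, rng
)
# selected == "users"
```

In the toy example the echoing column survives all five prefix tokens while the others stop at length 1, so the algorithm selects `"users"` as the injection point.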