Automated Detection of Pre-training Text in Black-box LLMs
Authors: Ruihan Hu, Yu-Ming Shang, Jiankun Peng, Wei Luo, Yazhe Wang, Xi Zhang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluations on three widely used datasets demonstrate that our framework is effective and superior in the black-box setting. Section 5 is titled "Experiments" and describes the experimental setup, datasets, and results. |
| Researcher Affiliation | Academia | ¹Key Laboratory of Trustworthy Distributed Computing and Service (MoE), Beijing University of Posts and Telecommunications, China; ²Zhongguancun Laboratory, China. The email domains 'bupt.edu.cn' and 'zgclab.edu.cn' indicate academic or public research affiliations. |
| Pseudocode | No | The paper describes methods using text and figures, but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Code: github.com/STAIR-BUPT/VeilProbe |
| Open Datasets | Yes | WikiMIA [Shi et al., 2024] consists of Wikipedia event snippets. BookTection [Duarte et al., 2024] is a widely adopted dataset that contains 165 copyrighted books, expanded from BookMIA [Shi et al., 2024]. arXivTection [Duarte et al., 2024] includes classic papers from arXiv. |
| Dataset Splits | Yes | We randomly sample approximately 50 ground-truth samples per dataset to train the prototype-based classifier, with the remaining samples serving as the texts to be detected. |
| Hardware Specification | Yes | In our work, all experiments are implemented on a workstation with five NVIDIA Tesla V100 32G GPUs, and Ubuntu 22.04.4. |
| Software Dependencies | No | The paper mentions 'Ubuntu 22.04.4' as the operating system, but does not provide specific versions for key software components or libraries (e.g., Python, PyTorch, CUDA) used in the experiments. |
| Experiment Setup | Yes | For each text to be detected, three suffixes were generated using the target LLM, with the maximum suffix length set to 512 tokens. The parameter γ is set to 10 for obtaining the perturbed text r. The p-value significance threshold is chosen from {0.001, 0.01, 0.05, 0.1} to select the critical perturbation calibration features. We randomly sample approximately 50 ground-truth samples per dataset to train the prototype-based classifier. |
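The setup above trains a prototype-based classifier on roughly 50 ground-truth samples per dataset. The paper does not specify the classifier's internals in the quoted excerpt, so the following is only a minimal sketch of a generic nearest-prototype classifier (class-mean prototypes, Euclidean distance); the feature values and function names are hypothetical, not the authors' implementation.

```python
import numpy as np


def train_prototypes(features, labels):
    """Compute one prototype (the mean feature vector) per class label."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}


def predict(prototypes, x):
    """Assign x to the class whose prototype is nearest in Euclidean distance."""
    classes = list(prototypes)
    dists = [np.linalg.norm(x - prototypes[c]) for c in classes]
    return classes[int(np.argmin(dists))]


# Hypothetical calibration set: ~50 labeled feature vectors (25 per class),
# mimicking the "member" (1) vs. "non-member" (0) split described above.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0.0, 1.0, (25, 8)),   # class 0 features
                   rng.normal(3.0, 1.0, (25, 8))])  # class 1 features
labels = np.array([0] * 25 + [1] * 25)

protos = train_prototypes(feats, labels)
label = predict(protos, np.full(8, 3.0))  # a point near the class-1 prototype
```

With only ~50 training samples, a parameter-free prototype rule like this avoids overfitting, which is one plausible reason such classifiers are favored in low-label calibration settings.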