Fine-tuning can Help Detect Pretraining Data from Large Language Models
Authors: Hengxiang Zhang, Songxin Zhang, Bingyi Jing, Hongxin Wei
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of our method, significantly improving the AUC score on common benchmark datasets across various models. To validate the effectiveness of our method, we conduct extensive experiments on various datasets, including WikiMIA, BookMIA (Shi et al., 2024), ArXivTection, BookTection (Duarte et al., 2024), and Pile (Maini et al., 2024). The results demonstrate that our method can significantly improve the performance of existing methods based on scoring functions. |
| Researcher Affiliation | Academia | Hengxiang Zhang, Songxin Zhang, Bingyi Jing, Hongxin Wei (Department of Statistics and Data Science, Southern University of Science and Technology) |
| Pseudocode | No | The paper describes the Fine-tuned Score Deviation (FSD) method through prose and mathematical formulations (Equations 4 and 5) but does not include a distinct pseudocode or algorithm block. |
| Open Source Code | Yes | Our code is available at https://github.com/ml-stat-Sustech/Fine-tuned-Score-Deviation. |
| Open Datasets | Yes | To verify the effectiveness of detection methods, we employ common benchmark datasets for evaluations, including WikiMIA (Shi et al., 2024), ArXivTection (Duarte et al., 2024), BookTection (Duarte et al., 2024), BookMIA (Shi et al., 2024), and Pile (Maini et al., 2024). Previous works have demonstrated that model developers commonly use text content among those datasets for pre-training (Shi et al., 2024; Duarte et al., 2024; Ye et al., 2024). The datasets are provided by Hugging Face, and detailed information of datasets is presented in Appendix B. |
| Dataset Splits | Yes | For constructing the non-member dataset, we randomly sample 30% of the data from the entire dataset and select all non-members from this subset as the constructed fine-tuning dataset. The remaining 70% of the dataset is used for testing. For the copyrighted book detection experiments on BookMIA and BookTection, we randomly sample 30% of the dataset and select all non-members from this subset as the fine-tuning dataset. Subsequently, we randomly sample 500 members and 500 non-members from the remaining 70% of the datasets, constructing a balanced validation set of 1,000 examples for evaluation. The detailed information of the constructed dataset is shown in Table 8 and Table 9. |
| Hardware Specification | Yes | We conduct all experiments on NVIDIA L40 GPU and implement all methods with default parameters using PyTorch (Paszke et al., 2019). |
| Software Dependencies | No | We conduct all experiments on NVIDIA L40 GPU and implement all methods with default parameters using PyTorch (Paszke et al., 2019). The paper mentions PyTorch and cites its paper but does not specify a version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | We employ LoRA (Hu et al., 2022) to fine-tune the base model with 3 epochs and a batch size of 8. We set the initial learning rate to 0.001 and decay it with a cosine scheduling strategy. |
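The Fine-tuned Score Deviation idea summarized in the Pseudocode row can be sketched as below. This is an illustration, not a transcription of the paper's Equations 4 and 5: the averaged token log-likelihood scoring function and the sign convention are assumptions, and the paper also supports other scoring functions (e.g. perplexity, Min-K%).

```python
def sequence_score(token_log_probs):
    """One possible scoring function: average token log-likelihood.
    (An assumed choice for illustration; the paper's FSD works on top of
    any existing membership-inference scoring function.)"""
    return sum(token_log_probs) / len(token_log_probs)

def fsd_score(log_probs_base, log_probs_tuned):
    """Sketch of Fine-tuned Score Deviation: the change in an example's
    score between the base model and a copy fine-tuned on known
    non-members. Unseen non-members tend to shift more under this
    fine-tuning than pretraining members, so the magnitude of the
    deviation separates the two groups."""
    return sequence_score(log_probs_tuned) - sequence_score(log_probs_base)
```

For instance, an example whose per-token log-probabilities rise from `[-2.0, -2.0]` under the base model to `[-1.0, -1.0]` after fine-tuning gets an FSD of `1.0`, a larger deviation than a member whose scores barely move.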
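The split procedure quoted in the Dataset Splits row can be sketched as follows. The dict fields (`text`, `is_member`) and the helper name are assumptions for illustration; the 30%/70% split and the balanced 500+500 validation set come from the quoted text.

```python
import random

def build_fsd_splits(dataset, seed=42):
    """Sketch of the reported split: sample 30% of the data, keep only its
    non-members as the fine-tuning set, then draw 500 members and 500
    non-members from the remaining 70% as a balanced validation set.
    Each example is assumed to be {'text': str, 'is_member': bool}."""
    rng = random.Random(seed)
    data = list(dataset)
    rng.shuffle(data)

    # 30% of the data is sampled; only its non-members are used for fine-tuning.
    cut = int(0.3 * len(data))
    sampled, remaining = data[:cut], data[cut:]
    finetune_set = [x for x in sampled if not x["is_member"]]

    # From the remaining 70%, draw 500 members and 500 non-members
    # to form the balanced 1,000-example validation set.
    members = [x for x in remaining if x["is_member"]]
    non_members = [x for x in remaining if not x["is_member"]]
    val_set = (rng.sample(members, min(500, len(members)))
               + rng.sample(non_members, min(500, len(non_members))))
    rng.shuffle(val_set)
    return finetune_set, val_set
```

Keeping the validation set balanced makes the reported AUC directly comparable across datasets with different member/non-member ratios.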
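The fine-tuning setup quoted in the Experiment Setup row maps onto a standard Hugging Face `peft`/`transformers` configuration. This is a sketch under assumptions: the base model name, LoRA rank/alpha, and output directory are placeholders not stated in the quote; only the 3 epochs, batch size 8, initial learning rate 0.001, and cosine decay are reported.

```python
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

# Placeholder base model; the paper evaluates several LLMs.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Assumed LoRA hyperparameters (rank and alpha are not given in the quote).
lora_cfg = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

args = TrainingArguments(
    output_dir="fsd-finetune",       # placeholder path
    num_train_epochs=3,              # as reported
    per_device_train_batch_size=8,   # as reported
    learning_rate=1e-3,              # as reported
    lr_scheduler_type="cosine",      # cosine decay, as reported
)

# Pass the tokenized non-member fine-tuning set (hypothetical variable name):
# trainer = Trainer(model=model, args=args, train_dataset=tokenized_non_members)
# trainer.train()
```

Note that only the LoRA adapter weights are updated, so the fine-tuned scores can be compared against the frozen base model at low cost.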