Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Authors: Aiwei Liu, Sheng Guan, Yiming Liu, Leyi Pan, Yifei Zhang, Liancheng Fang, Lijie Wen, Philip Yu, Xuming Hu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that almost all mainstream watermarking algorithms are easily identified with our well-designed prompts, while Water-Probe demonstrates a minimal false positive rate for non-watermarked LLMs. In our experiments, we demonstrate that the Water-Probe algorithm achieves high accuracy in detecting various types of watermarked LLMs. (Section 4: Experiment on Watermarked LLM Identification) |
| Researcher Affiliation | Academia | (1) Tsinghua University; (2) Beijing University of Posts and Telecommunications; (3) The Chinese University of Hong Kong; (4) University of Illinois at Chicago; (5) Hong Kong University of Science and Technology (Guangzhou) |
| Pseudocode | Yes | We provide the detailed steps of the Water-Probe algorithm in Algorithm 1 in the appendix. |
| Open Source Code | Yes | [Official]:https://github.com/THU-BPM/Watermarked_LLM_Identification |
| Open Datasets | Yes | For watermarked text detection, we used OPT-2.7B to generate texts on the C4 dataset (Raffel et al., 2020) |
| Dataset Splits | No | No explicit training/test/validation dataset splits are provided for the Water-Probe algorithm's evaluation or for the main LLM identification task. The C4 dataset is mentioned for generating texts in a separate watermarked text detection context, not for defining splits for the primary experimental setup. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running the experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions using the 'MarkLLM (Pan et al., 2024) framework' but does not specify its version number or other software dependencies with their versions. |
| Experiment Setup | Yes | For all LLMs, the sampling temperature was set to 1, with the number of samples set to 10^4. ... We set µ = 0.1 for our experiments. |
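The core claim quoted above (crafted prompts expose watermarked LLMs through distributional bias) can be illustrated with a toy simulation. This sketch is not the paper's Water-Probe algorithm: the vocabulary size, the fixed "green" token subset, and the bias value are all invented for illustration. It only shows the underlying intuition that a watermarked sampler consistently skews token frequencies toward a keyed subset, which repeated sampling at temperature 1 can reveal.

```python
import random
from collections import Counter

def sample_tokens(vocab, green, bias, n, rng):
    # Draw n tokens; a watermarked model (bias > 0) upweights the fixed
    # "green" subset determined by its (hidden) watermark key.
    weights = [1.0 + (bias if t in green else 0.0) for t in vocab]
    return Counter(rng.choices(vocab, weights=weights, k=n))

def green_fraction(counts, green, n):
    # Fraction of sampled tokens that fall in the green subset.
    return sum(counts[t] for t in green) / n

rng = random.Random(0)
vocab = list(range(100))        # toy vocabulary
green = set(range(50))          # hypothetical keyed green list (half the vocab)
n = 10_000                      # large sample count for stable frequency estimates

plain = green_fraction(sample_tokens(vocab, green, 0.0, n, rng), green, n)
marked = green_fraction(sample_tokens(vocab, green, 1.0, n, rng), green, n)
# An unwatermarked model stays near the 0.5 base rate; the watermarked
# one skews noticeably above it, which a user-side probe can detect.
```

In this toy setup the unwatermarked fraction hovers near 0.5 while the biased sampler lands around 2/3, so a simple threshold separates them; the actual Water-Probe procedure (Algorithm 1 in the paper's appendix) is more involved, but relies on the same observation that watermarking leaves a repeatable statistical fingerprint.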