Task-Agnostic Language Model Watermarking via High Entropy Passthrough Layers
Authors: Vaden Masrani, Mohammad Akbari, David Ming Xuan Yue, Ahmad Rezaei, Yong Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed passthrough layers on a wide range of downstream tasks, and show experimentally that our watermarking method achieves near-perfect watermark extraction accuracy and false-positive rates in most cases without damaging original model performance. |
| Researcher Affiliation | Industry | Huawei Technologies Canada Co. Ltd. |
| Pseudocode | No | The paper describes methods using mathematical equations and descriptive text, but no distinct pseudocode or algorithm blocks are explicitly provided or labeled. |
| Open Source Code | Yes | Code https://developer.huaweicloud.com/develop/aigallery/notebook/detail?id=58b799a0-5cfc-4c2e-8b9b440bb2315264 |
| Open Datasets | Yes | Following (Gu et al. 2023), we validate our method across 4 classification tasks and 7 datasets: SST2 (Socher et al. 2013), IMDB (Maas et al. 2011), SNLI (Bowman et al. 2015), MNLI (Williams, Nangia, and Bowman 2018), AGNews (Zhang, Zhao, and Le Cun 2015), News Group (NG) (Lang 1995), and PAWS (Zhang, Baldridge, and He 2019), covering sentiment, entailment, paraphrase detection, and topic classification tasks. |
| Dataset Splits | No | The paper mentions fine-tuning for a certain number of epochs and using a pruning ratio, but does not provide specific train/test/validation dataset splits (e.g., exact percentages or sample counts) for reproducibility in the main text. It states, "Hyperparameter settings for each stage and additional details about how metrics are calculated are given in the Appendix (Masrani et al. 2024)", suggesting these details might be elsewhere. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only implies that computational resources were used by mentioning 'Wallclock times are reported in Table 1'. |
| Software Dependencies | No | The paper mentions using publicly available PLMs from Hugging Face and specific models like BERT-base-uncased, GPT-2, and Llama2-7B. However, it does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, or Hugging Face Transformers library versions). |
| Experiment Setup | Yes | We add 1 passthrough layer at position {3,5,8} (PTL-358) to the pretrained BERT, and train it for 10K steps. All layers except the passthrough layers, head, and last layer are frozen. [...] we use GPT-2 with 124M parameters. [...] We add passthrough layers at positions {1}, {1,4,7}, and {1,3,5,7,9}, and train for 100k steps on the Open Web Text. [...] We fine-tune BERT described in the Classification Tasks Section for 10 epochs over 5 downstream tasks. [...] with a pruning ratio of 50% [...] followed by a fine-tuning round for 1 epoch |
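The setup quoted above (insert passthrough layers at chosen block positions, then freeze everything except the passthrough layers, head, and last layer) can be sketched as a minimal PyTorch toy. The actual passthrough-layer architecture and initialization are not specified in this summary, so the residual, near-identity form below is an assumption, and `insert_passthrough_layers` / `freeze_all_but_passthrough` are hypothetical helper names, not the authors' code:

```python
import torch
import torch.nn as nn


class PassthroughLayer(nn.Module):
    """Hypothetical extra trainable layer placed between frozen blocks.

    Assumption: a residual projection initialized to zero, so the layer
    initially acts as an identity and does not perturb the host model.
    """

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        nn.init.zeros_(self.proj.weight)  # output == input before training
        nn.init.zeros_(self.proj.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.proj(x)


def insert_passthrough_layers(blocks, positions, hidden_dim):
    """Return a new ModuleList with a passthrough layer inserted before
    each block index in `positions` (e.g. {3, 5, 8} for PTL-358)."""
    out = []
    for i, block in enumerate(blocks):
        if i in positions:
            out.append(PassthroughLayer(hidden_dim))
        out.append(block)
    return nn.ModuleList(out)


def freeze_all_but_passthrough(model: nn.Module) -> None:
    """Freeze every parameter, then re-enable only passthrough layers."""
    for p in model.parameters():
        p.requires_grad = False
    for m in model.modules():
        if isinstance(m, PassthroughLayer):
            for p in m.parameters():
                p.requires_grad = True
```

For a real model one would splice these into the encoder's layer stack (e.g. `model.encoder.layer` for a BERT-style PLM) and also leave the head and last layer trainable, per the quoted setup.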