Discovering Spoofing Attempts on Language Model Watermarks
Authors: Thibaud Gloaguen, Nikola Jovanović, Robin Staab, Martin Vechev
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present the results of our experimental evaluation. In Section 5.1, we validate the normality assumptions from Section 4.1. In Section 5.2, we validate the control of Type 1 error and evaluate the power of the tests from Section 4 on both spoofing techniques introduced in Section 2: Stealing (Jovanović et al., 2024) and Distillation (Gu et al., 2024). |
| Researcher Affiliation | Academia | ETH Zurich. Correspondence to: Thibaud Gloaguen <EMAIL>. |
| Pseudocode | No | The paper describes methods and statistics, but does not include a dedicated pseudocode block or algorithm figure. |
| Open Source Code | No | The paper states 'We make all our code available here.' but does not provide a direct link to the code repository in the provided text. |
| Open Datasets | Yes | We use text continuations of prompts sampled from the news-like C4 dataset (Raffel et al., 2020). We also use prompts sampled from Dolly (Conover et al., 2023), Wikipedia (Wikimedia Foundation), Repliqa (Monteiro et al., 2024), and Math (Fourrier et al., 2023). |
| Dataset Splits | Yes | For each parameter combination, we generate 10,000 continuations, each being between 50 and 400 tokens long. Then, we concatenate continuations (see Section 4.2) to reach the targeted token length T. |
| Hardware Specification | No | The paper mentions using LLAMA2-7B, MISTRAL-7B, and PYTHIA-1.4B models, but does not specify the hardware (GPUs, CPUs, etc.) used to run the experiments. |
| Software Dependencies | No | The paper mentions various LLMs and datasets, but does not provide specific software dependencies or library version numbers (e.g., Python, PyTorch, CUDA versions) used for implementation. |
| Experiment Setup | Yes | We primarily focus on the KGW Sum Hash scheme, using a context size h ∈ {1, 2, 3} and γ = 0.25. For h ∈ {1, 2}, we set δ = 2. For h = 3, we use δ = 4 for Stealing... We use LLAMA2-7B as the watermarked model. For the spoofer LM, we use MISTRAL-7B as the attacker for Stealing and PYTHIA-1.4B as the attacker for Distillation. For each parameter combination, we generate 10,000 continuations, each being between 50 and 400 tokens long. |
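For context on the parameters γ, δ, and h referenced in the experiment setup, the KGW-style green-list watermark (Kirchenbauer et al., 2023) that the paper builds on can be sketched as follows. This is a minimal illustration, not the paper's implementation: the hashing/seeding details (SHA-256 over the last h token ids) are assumptions for the sketch. A fraction γ of the vocabulary is pseudo-randomly marked "green" based on the preceding h tokens, green logits get a δ boost, and detection uses a one-proportion z-test on the observed green-token count.

```python
import hashlib
import math
import numpy as np

def green_list(context, vocab_size, gamma=0.25):
    """Pseudo-randomly select a green list of size gamma * vocab_size,
    seeded by the last h token ids (the 'context'). SHA-256 seeding is
    an illustrative choice, not necessarily the paper's."""
    seed = int(hashlib.sha256(str(tuple(context)).encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    perm = rng.permutation(vocab_size)
    return set(perm[: int(gamma * vocab_size)].tolist())

def watermark_logits(logits, context, gamma=0.25, delta=2.0):
    """Add delta to the logits of green-list tokens before sampling,
    biasing generation toward green tokens."""
    out = np.array(logits, dtype=float)
    for tok in green_list(context, len(out), gamma):
        out[tok] += delta
    return out

def detection_z_score(green_count, total_tokens, gamma=0.25):
    """One-proportion z-test: under the null (no watermark), each token
    is green with probability gamma, so green_count ~ Binomial(T, gamma)."""
    expected = gamma * total_tokens
    return (green_count - expected) / math.sqrt(gamma * (1 - gamma) * total_tokens)

# Example: with gamma = 0.25 and a vocab of 1000, the green list has 250
# tokens and is deterministic given the same context.
green = green_list([5, 17], vocab_size=1000)
boosted = watermark_logits(np.zeros(1000), [5, 17], delta=2.0)
z = detection_z_score(green_count=90, total_tokens=200)
```

A spoofer (via Stealing or Distillation) tries to generate text that also scores a high z, which is why the paper's tests look for statistical artifacts distinguishing spoofed from genuinely watermarked text.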