Undetectable Steganography for Language Models
Authors: Or Zamir
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | While this paper is theoretical in nature and the properties of the suggested scheme are proven rigorously, we also implemented the scheme and provide empirical examples. In Section 7 we discuss our implementation of the scheme and some empirical evaluation of it. In Figure 2, we estimate the number of message bits we can hide in a response of a certain length. For each length of response, we ran our scheme 100 times using the LLM GPT-2 (RWC+19) on a randomly chosen prompt from the list of example prompts provided by OpenAI on their GPT-2 webpage. |
| Researcher Affiliation | Academia | Or Zamir, School of Computer Science, Tel Aviv University |
| Pseudocode | Yes | The pseudocode for generation (Algorithm 1) and detection (Algorithm 2) of the watermark appears in the Appendix. In CGZ, those algorithms are then generalized to also support the detection of the watermark from a substring of the response, and not only from the response in its entirety as sketched above. ... Algorithm 3: One-query steganography algorithm Stegk ... Algorithm 4: One-query retriever Retrk ... Algorithm 5: Steganography algorithm Stegk ... Algorithm 6: Retriever algorithm Retrk |
| Open Source Code | Yes | Code available at: https://github.com/OrZamir/steg |
| Open Datasets | No | The paper uses LLM models (GPT-2, Llama 2) to generate text for evaluation, but does not explicitly use a pre-existing dataset for experiments or provide a dataset for public access. The evaluation relies on generating responses from these models using prompts. |
| Dataset Splits | No | The paper describes experiments where responses are generated using LLMs. It does not involve traditional dataset splitting for training, validation, or testing of a model, as its focus is on embedding information into LLM outputs. |
| Hardware Specification | No | The paper mentions using LLM models such as GPT-2 and Llama 2 for its empirical evaluations, but it does not provide any specific details about the hardware (e.g., GPU, CPU models, memory) on which these experiments were run. |
| Software Dependencies | No | The paper mentions using LLM models (GPT-2, Llama 2) but does not provide specific version numbers for any software libraries, frameworks, or programming languages used in the implementation of their scheme. |
| Experiment Setup | Yes | We ran it with threshold parameter t = 2, which we didn’t optimize. |
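The scheme the table refers to embeds message bits into an LLM's sampling choices using a shared secret key. As a rough intuition only, the core idea (masking message bits with keyed pseudorandom bits so the emitted text remains distributed like ordinary model output) can be illustrated with a toy sketch. This is NOT the paper's construction: the real scheme handles arbitrary token distributions and proves undetectability, while the sketch below assumes a contrived model whose next-token distribution is uniform over exactly two candidates at each step, and all function names are illustrative.

```python
import hmac
import hashlib

# Toy sketch (assumption: NOT the paper's algorithm). At each step the
# "model" is uniform over two candidate tokens, so picking either one
# looks like an honest sample; a keyed PRF bit masks the message bit.

def prf_bit(key: bytes, i: int) -> int:
    """Keyed pseudorandom bit for step i, via HMAC-SHA256."""
    return hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()[0] & 1

def embed(key: bytes, bits: list[int], candidates: list[tuple[str, str]]) -> list[str]:
    """Emit one token per message bit: the bit, XOR-masked by the PRF
    bit, selects which of the two equally likely candidates to output."""
    return [candidates[i][b ^ prf_bit(key, i)] for i, b in enumerate(bits)]

def extract(key: bytes, tokens: list[str], candidates: list[tuple[str, str]]) -> list[int]:
    """Invert embed(): recover each bit from the emitted token's index,
    unmasked with the same keyed PRF bit."""
    return [candidates[i].index(t) ^ prf_bit(key, i) for i, t in enumerate(tokens)]

key = b"shared-secret"           # hypothetical shared key
cands = [("the", "a"), ("cat", "dog"), ("sat", "ran"), ("down", "off")]
message = [1, 0, 1, 1]
stegotext = embed(key, message, cands)
assert extract(key, stegotext, cands) == message
```

Because each candidate pair is (by assumption) equally likely under the model, the masked choice is uniform to anyone without the key, which is the intuition behind why such embeddings can be made undetectable; the paper's algorithms (Stegk/Retrk) achieve this rigorously for real model distributions.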