Undetectable Steganography for Language Models

Authors: Or Zamir

TMLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | While this paper is theoretical in nature and the properties of the suggested scheme are proven rigorously, we also implemented the scheme and provide empirical examples. In Section 7 we discuss our implementation of the scheme and some empirical evaluation of it. In Figure 2, we estimate the number of message bits we can hide in a response of a certain length. For each length of response, we ran our scheme 100 times using the LLM model GPT-2 (RWC+19) on a randomly chosen prompt from the list of example prompts provided by OpenAI on their GPT-2 webpage. |
| Researcher Affiliation | Academia | Or Zamir, School of Computer Science, Tel Aviv University |
| Pseudocode | Yes | The pseudo-codes for generation (Algorithm 1) and detection (Algorithm 2) of the watermark appear in the Appendix. In CGZ, those algorithms are then generalized to also support the detection of the watermark from a substring of the response, and not only from the response in its entirety as sketched above. ... Algorithm 3: One-query steganography algorithm Steg_k ... Algorithm 4: One-query retriever Retr_k ... Algorithm 5: Steganography algorithm Steg_k ... Algorithm 6: Retriever algorithm Retr_k |
| Open Source Code | Yes | Code available at: https://github.com/OrZamir/steg |
| Open Datasets | No | The paper uses LLMs (GPT-2, Llama 2) to generate text for evaluation, but does not use a pre-existing dataset for its experiments or provide a dataset for public access. The evaluation relies on generating responses from these models using prompts. |
| Dataset Splits | No | The paper describes experiments in which responses are generated by LLMs. It does not involve traditional dataset splits for training, validation, or testing, as its focus is on embedding information into LLM outputs. |
| Hardware Specification | No | The paper mentions using LLMs such as GPT-2 and Llama 2 for its empirical evaluations, but provides no details about the hardware (e.g., GPU or CPU models, memory) on which the experiments were run. |
| Software Dependencies | No | The paper mentions LLMs (GPT-2, Llama 2) but does not give version numbers for any software libraries, frameworks, or programming languages used to implement the scheme. |
| Experiment Setup | Yes | We ran it with threshold parameter t = 2, which we didn't optimize. |
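The report only names the paper's steganography and retriever algorithms (Steg_k and Retr_k above) without showing how key-based embedding and retrieval fit together. The toy sketch below illustrates the general shape of such a scheme: a shared key pseudorandomly labels candidate tokens, the encoder picks tokens whose labels spell out the message bits, and the decoder recomputes the labels to read the bits back. Every name here (`bit_label`, `steg_encode`, `steg_decode`, the HMAC-based PRF stand-in, the greedy candidate choice) is a hypothetical illustration, not the paper's actual algorithm: the real scheme encrypts the message and samples from the model's true distribution, which is what makes it undetectable, and this toy does neither.

```python
import hmac
import hashlib


def bit_label(key: bytes, position: int, token: str) -> int:
    """Pseudorandom 0/1 label for a token at a position (PRF stand-in)."""
    digest = hmac.new(key, f"{position}:{token}".encode(), hashlib.sha256).digest()
    return digest[0] & 1


def steg_encode(key: bytes, candidates_per_step: list, message_bits: list) -> list:
    """Toy encoder: at each step, emit a candidate token whose
    pseudorandom label equals the next message bit."""
    response = []
    for pos, (bit, candidates) in enumerate(zip(message_bits, candidates_per_step)):
        matching = [tok for tok in candidates if bit_label(key, pos, tok) == bit]
        if not matching:
            # Rare when candidate sets are large; a real scheme handles this case.
            raise RuntimeError(f"no candidate with label {bit} at position {pos}")
        response.append(matching[0])
    return response


def steg_decode(key: bytes, response: list, n_bits: int) -> list:
    """Toy retriever: recompute the labels of the emitted tokens."""
    return [bit_label(key, pos, tok) for pos, tok in enumerate(response)][:n_bits]
```

A reproduction harness in the spirit of Figure 2 would wrap such an encoder in a loop, varying the response length and counting how many message bits fit per run.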