Continuous Language Model Interpolation for Dynamic and Controllable Text Generation
Authors: Sara Kangaslahti, David Alvarez-Melis
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that varying the interpolation weights yields predictable and consistent change in the model outputs with respect to all of the controlled attributes simultaneously. We evaluate the ability of weight interpolation to control the outputs of LLMs on five commonly used style attributes defined in prior style transfer literature (Jin et al., 2022). |
| Researcher Affiliation | Academia | Sara Kangaslahti EMAIL School of Engineering and Applied Sciences Harvard University David Alvarez-Melis EMAIL Kempner Institute Harvard University |
| Pseudocode | No | The paper includes mathematical equations (e.g., Equation 1, 2, 3) and descriptive text for its methods but does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present procedures in a code-like structured format. |
| Open Source Code | Yes | Code: https://github.com/skangasl/continuous-lm-interpolation |
| Open Datasets | Yes | For simplicity, we use the TinyStories dataset (Eldan & Li, 2023) to fine-tune a simple model and novel chapters from the BookSum dataset (Kryscinski et al., 2021) to fine-tune a complex model. We use the documents classified as formal and informal in Grammarly's Yahoo Answers Formality Corpus (GYAFC) dataset (Rao & Tetreault, 2018) to fine-tune formal and informal models. For the politeness attribute, we use the documents in the highest and lowest politeness class in the work by Madaan et al. (2020) for fine-tuning polite and impolite models, respectively. We fine-tune positive and negative sentiment models using the Stanford Sentiment Treebank (SST-2) dataset (Socher et al., 2013). For humor, we use the FlickrStyle dataset (Gan et al., 2017) to fine-tune humorous and non-humorous models. To evaluate the interpolated models, we use a subset of 1k randomly sampled prompts from the WritingPrompts dataset (Fan et al., 2018) and generate 3 continuations for each prompt. We also compute perplexity on the test split of the WikiText dataset (Merity et al., 2016). |
| Dataset Splits | Yes | Table 3: Fine-tuning splits. We report the number of examples from each attribute dataset used to fine-tune Llama2-7b generation and RoBERTa attribute scoring models. Each split is sampled from the combined train, test, and validation set. Sentiment (Socher et al., 2013): Llama2 25k / 30k (Class 0 / Class 1), RoBERTa 10k; Politeness (Madaan et al., 2020): Llama2 78k / 100k, RoBERTa 20k; Formality (Rao & Tetreault, 2018): Llama2 104k / 104k, RoBERTa 10k; Simplicity (Kryscinski et al., 2021; Eldan & Li, 2023): Llama2 9k / 100k, RoBERTa 10k; Humor (Gan et al., 2017): Llama2 100k / 100k, RoBERTa 20k. |
| Hardware Specification | Yes | All experiments were run on single NVIDIA A100 80GB SXM GPU nodes. |
| Software Dependencies | No | The paper mentions using specific models like Llama2-7b and RoBERTa, and techniques like LoRA. However, it does not provide specific version numbers for ancillary software such as programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch), or other libraries used in the implementation of the experiments. |
| Experiment Setup | Yes | Table 2: Parameters for LoRA fine-tuning. We use 20 epochs for fine-tuning the sentiment attribute models and 1 epoch for the remaining fine-tuned models. Batch size: 64; Learning rate: 5e-5; LoRA r: 32; LoRA α: 16; LoRA dropout: 0.1; Max sequence length: 128; Quantization: 4-bit. |
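The weight-interpolation mechanism described in the Research Type evidence (varying interpolation weights between fine-tuned models to control output attributes) can be sketched as a linear combination of two parameter sets. The snippet below is a toy illustration only, with plain Python lists standing in for model tensors; it is not the paper's implementation, and the `interpolate_params` function and the formal/informal example are assumptions for demonstration:

```python
def interpolate_params(theta_a, theta_b, alpha):
    """Linearly interpolate two parameter dicts:
    theta(alpha) = (1 - alpha) * theta_a + alpha * theta_b.
    """
    assert theta_a.keys() == theta_b.keys(), "parameter names must match"
    return {
        name: [(1 - alpha) * a + alpha * b
               for a, b in zip(theta_a[name], theta_b[name])]
        for name in theta_a
    }

# Toy stand-ins for two attribute-specific fine-tuned parameter sets,
# e.g. a formal and an informal model for the formality attribute.
formal = {"w": [1.0, 2.0]}
informal = {"w": [3.0, 6.0]}

# alpha = 0.5 blends the two models equally.
midpoint = interpolate_params(formal, informal, 0.5)
# midpoint["w"] == [2.0, 4.0]
```

Sweeping `alpha` from 0 to 1 traces a continuous path between the two endpoint models, which is the sense in which the paper's control is "dynamic" rather than discrete.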
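The LoRA hyperparameters reported in the Experiment Setup row (Table 2) can be collected into a configuration dict for reference. The key names below are illustrative choices, not identifiers from the paper or any specific library:

```python
# LoRA fine-tuning hyperparameters as reported in Table 2 of the paper.
# Dict key names are illustrative; only the values come from the source.
lora_hyperparams = {
    "batch_size": 64,
    "learning_rate": 5e-5,
    "lora_r": 32,            # LoRA rank
    "lora_alpha": 16,        # LoRA scaling factor
    "lora_dropout": 0.1,
    "max_seq_length": 128,
    "quantization_bits": 4,  # 4-bit quantization
}
```

Per the paper, sentiment attribute models are fine-tuned for 20 epochs and all other attribute models for 1 epoch with these settings.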