Context Steering: Controllable Personalization at Inference Time
Authors: Zhiyang He, Sashrika Pandey, Mariah Schrum, Anca Dragan
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of CoS on generating personalized recommendations, showing that it offers more reliable control compared to turn-based and prompt-based methods. Additionally, we explore CoS as a Bayesian generative model for inferring the relationship between open-ended texts, which can be applied to tasks such as intent classification. Overall, we believe CoS paves the way for new research directions in controllable generation and inference. We investigate how CoS enhances personalization, mitigates biases, and quantifies the level of contextual information in the application of online hate tweets. For this section, we focus on using Llama-2-7B and Llama-2-7B-chat as our LLMs. We extend to other open models in Sec. 4. Section headings: "3.1 EXPERIMENT: GENERATING PERSONALIZED SUMMARIZATIONS"; "3.2 EXPERIMENT: CLASSIFYING AND QUANTIFYING IMPLICIT HATE IN TWEETS" |
| Researcher Affiliation | Academia | Jerry Zhi-Yang He, Sashrika Pandey*, Mariah L. Schrum & Anca Dragan, UC Berkeley |
| Pseudocode | No | The paper describes the methodology using mathematical equations (Eq. 2, Eq. 3, Eq. 4, Eq. 5) and descriptive text, but it does not include a dedicated section or block explicitly labeled as "Pseudocode" or "Algorithm". |
| Open Source Code | Yes | Our code is available publicly at https://github.com/sashrikap/context-steering. |
| Open Datasets | Yes | We use the Implicit Hate Dataset (ElSherief et al., 2021). To understand how CoS affects the factuality of the LLMs, we use the dataset OpenBookQA (Mihaylov et al., 2018) to evaluate its factuality. We evaluate CoS on LaMP (Salemi et al., 2024), a personalization benchmark which provides two evaluation categories: text classification and text generation. The Bias Benchmark for QA (BBQ) dataset (Parrish et al., 2022) consists of ambiguous multiple-choice questions that capture implicit biases across various demographics, such as age, gender, and religion. |
| Dataset Splits | Yes | We conducted our experiments on a randomly selected subset comprising 75% of the data from each subject in BBQ. |
| Hardware Specification | No | The paper mentions using various language models such as Llama2-7b-Chat, Mistral, T0pp, GPT-J, Olmo-7b, GPT-4, and GPT-3.5, but it does not specify the underlying hardware (e.g., GPU models, CPU types, or memory) used to run the experiments with these models. |
| Software Dependencies | No | The paper mentions using the Hugging Face API for loading models ("Our results can be replicated by loading models via the open-source Hugging Face API (https://huggingface.co/)."), but it does not specify version numbers for Hugging Face, Python, or any other critical software libraries or environments required for replication. |
| Experiment Setup | Yes | Our summarizations are generated with llama-2-7B-chat using default sampling hyperparameters. We used the prompts proposed by Bai et al. (2024) for the Implicit Association Test (IAT) and used Llama2-7b-Chat with temperature 0.7 and default parameters otherwise. We used a temperature of 0.7 and default hyperparameters otherwise for every text model. |
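For readers attempting replication: the paper's control mechanism combines next-token logits computed with and without the user context, scaled by a coefficient λ. The sketch below is a minimal, framework-free illustration of that logit-space interpolation as we understand it from the paper's description; the function names and the toy logit values are ours, not from the authors' released code at https://github.com/sashrikap/context-steering.

```python
import math

def cos_steer(logits_with_ctx, logits_no_ctx, lam):
    """Scale the context's influence on next-token logits by lam.

    lam = 1.0 recovers ordinary context-conditioned decoding;
    lam = 0.0 ignores the context; lam > 1.0 amplifies it.
    (Illustrative linear form, assumed from the paper's description.)
    """
    return [zn + lam * (zc - zn)
            for zc, zn in zip(logits_with_ctx, logits_no_ctx)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: the context shifts probability mass toward token 2,
# and lam = 2.0 amplifies that shift beyond plain conditioning.
no_ctx = [2.0, 1.0, 0.5]
with_ctx = [1.0, 1.0, 3.0]
probs = softmax(cos_steer(with_ctx, no_ctx, lam=2.0))
```

In an actual replication these two logit vectors would come from two forward passes of a model such as Llama-2-7B-chat (prompt with and without the personalization context), loaded via the Hugging Face API as the paper notes, with the steered distribution then sampled at the reported temperature of 0.7.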