Context Steering: Controllable Personalization at Inference Time

Authors: Zhiyang He, Sashrika Pandey, Mariah Schrum, Anca Dragan

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate the effectiveness of Co S on generating personalized recommendations, showing that it offers more reliable control compared to turn-based and prompt-based methods. Additionally, we explore Co S as a Bayesian Generative model for inferring the relationship between open-ended texts, which can be applied to tasks such as intent classification. Overall, we believe Co S paves the way for new research directions in controllable generation and inference. We investigate how Co S enhances personalization, mitigates biases, and quantifies the level of contextual information in the application of online hate tweets. For this section, we focus on using llama-2-7B and llama-2-7B-chat as our LLMs. We extend to other open models in Sec. 4. 3.1 EXPERIMENT: GENERATING PERSONALIZED SUMMARIZATIONS 3.2 EXPERIMENT: CLASSIFYING AND QUANTIFYING IMPLICIT HATE IN TWEETS
Researcher Affiliation Academia Jerry Zhi-Yang He, Sashrika Pandey*, Mariah L. Schrum & Anca Dragan, UC Berkeley
Pseudocode No The paper describes the methodology using mathematical equations (Eq. 2, Eq. 3, Eq. 4, Eq. 5) and descriptive text, but it does not include a dedicated section or block explicitly labeled as "Pseudocode" or "Algorithm".
Open Source Code Yes Our code is available publicly at https://github.com/sashrikap/context-steering.
Open Datasets Yes We use the Implicit Hate Dataset (ElSherief et al., 2021). To understand how CoS affects the factuality of the LLMs, we use the dataset OpenBookQA (Mihaylov et al., 2018) to evaluate its factuality. We evaluate CoS on LaMP (Salemi et al., 2024), a personalization benchmark which provides two evaluation categories: text classification and text generation. The Bias Benchmark for QA (BBQ) dataset (Parrish et al., 2022) consists of ambiguous multiple-choice questions that capture implicit biases across various demographics, such as age, gender, and religion.
Dataset Splits Yes We conducted our experiments on a randomly selected subset comprising 75% of the data from each subject in BBQ.
Hardware Specification No The paper mentions using various language models such as Llama2-7b-Chat, Mistral, T0pp, GPT-J, Olmo-7b, GPT-4, and GPT-3.5, but it does not specify the underlying hardware (e.g., GPU models, CPU types, or memory) used to run the experiments with these models.
Software Dependencies No The paper mentions using the Hugging Face API for loading models ("Our results can be replicated by loading models via the open-source Hugging Face API (https://huggingface.co/)."), but it does not specify version numbers for Hugging Face, Python, or any other critical software libraries or environments required for replication.
Experiment Setup Yes Our summarizations are generated with llama-2-7B-chat using default sampling hyperparameters. We used the prompts proposed by Bai et al. (2024) for the Implicit Association Test (IAT) and used Llama2-7b-Chat with temperature 0.7 and default parameters otherwise. We used a temperature of 0.7 and default hyperparameters otherwise for every text model.
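The Software Dependencies row flags that no library versions are pinned. Replication attempts typically record the environment explicitly, e.g. in a requirements file; the version numbers below are illustrative placeholders only, not the authors' verified environment:

```
# requirements.txt -- placeholder versions for illustration;
# the paper does not state which versions were used.
transformers==4.40.0
torch==2.2.0
```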
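The Research Type row's evidence describes CoS as a contrastive, inference-time steering method with controllable influence. As a minimal sketch of that general idea (the λ-weighted logit blend below is an assumption of this illustration, standing in for the paper's Eqs. 2–5, not the authors' verbatim formula): next-token logits computed with and without the user's context are blended, so λ = 0 ignores the context, λ = 1 recovers ordinary conditioning, and λ > 1 amplifies the context's influence.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def context_steered_logits(logits_base, logits_ctx, lam):
    """Blend next-token logits computed without context (logits_base)
    and with context (logits_ctx). lam=0 ignores the context, lam=1 is
    ordinary conditioning, lam>1 amplifies the context's influence.
    (Hypothetical helper for illustration; not the authors' code.)"""
    return [b + lam * (c - b) for b, c in zip(logits_base, logits_ctx)]

# Toy 3-token vocabulary: the context shifts mass toward token 2.
base = [2.0, 1.0, 0.0]
ctx = [2.0, 1.0, 3.0]

p0 = softmax(context_steered_logits(base, ctx, 0.0))  # context ignored
p1 = softmax(context_steered_logits(base, ctx, 1.0))  # ordinary conditioning
p2 = softmax(context_steered_logits(base, ctx, 2.0))  # context amplified

# Probability of the context-favored token grows monotonically with lambda.
print(p0[2] < p1[2] < p2[2])
```

Varying λ at decoding time, rather than rewriting the prompt, is what makes the degree of personalization a continuous knob instead of a turn-based or prompt-engineering choice.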
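The Experiment Setup row reports sampling at temperature 0.7 with otherwise-default hyperparameters. For readers unfamiliar with that knob, here is a self-contained sketch of temperature sampling (the function name is hypothetical; real runs would use a model's decoder rather than toy logits):

```python
import math
import random

def sample_with_temperature(logits, temperature=0.7, rng=random):
    # Divide logits by the temperature before softmax: T < 1 sharpens
    # the distribution, T > 1 flattens it, and T -> 0 approaches
    # greedy (argmax) decoding.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token index from the categorical distribution.
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i, probs
    return len(probs) - 1, probs

rng = random.Random(0)  # fixed seed for a reproducible draw
idx, probs = sample_with_temperature([2.0, 1.0, 0.0], temperature=0.7, rng=rng)
print(idx, [round(p, 3) for p in probs])
```

At T = 0.7 the gap between the top logit and the rest widens relative to T = 1.0, which is why 0.7 is a common default for outputs that should stay varied but on-distribution.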