Context Steering: Controllable Personalization at Inference Time

Authors: Zhiyang He, Sashrika Pandey, Mariah Schrum, Anca Dragan

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate the effectiveness of Co S on generating personalized recommendations, showing that it offers more reliable control compared to turn-based and prompt-based methods. Additionally, we explore Co S as a Bayesian Generative model for inferring the relationship between open-ended texts, which can be applied to tasks such as intent classification. Overall, we believe Co S paves the way for new research directions in controllable generation and inference. We investigate how Co S enhances personalization, mitigates biases, and quantifies the level of contextual information in the application of online hate tweets. For this section, we focus on using llama-2-7B and llama-2-7B-chat as our LLMs. We extend to other open models in Sec. 4. 3.1 EXPERIMENT: GENERATING PERSONALIZED SUMMARIZATIONS 3.2 EXPERIMENT: CLASSIFYING AND QUANTIFYING IMPLICIT HATE IN TWEETS
Researcher Affiliation Academia Jerry Zhi-Yang He, Sashrika Pandey*, Mariah L. Schrum & Anca Dragan, UC Berkeley
Pseudocode No The paper describes the methodology using mathematical equations (Eq. 2, Eq. 3, Eq. 4, Eq. 5) and descriptive text, but it does not include a dedicated section or block explicitly labeled as "Pseudocode" or "Algorithm".
Open Source Code Yes Our code is available publicly at https://github.com/sashrikap/context-steering.
Open Datasets Yes We use the Implicit Hate Dataset (ElSherief et al., 2021). To understand how CoS affects the factuality of the LLMs, we use the dataset OpenBookQA (Mihaylov et al., 2018) to evaluate its factuality. We evaluate CoS on LaMP (Salemi et al., 2024), a personalization benchmark which provides two evaluation categories: text classification and text generation. The Bias Benchmark for QA (BBQ) dataset (Parrish et al., 2022) consists of ambiguous multiple-choice questions that capture implicit biases across various demographics, such as age, gender, and religion.
Dataset Splits Yes We conducted our experiments on a randomly selected subset comprising 75% of the data from each subject in BBQ.
Hardware Specification No The paper mentions using various language models such as Llama2-7b-Chat, Mistral, T0pp, GPT-J, Olmo-7b, GPT-4, and GPT-3.5, but it does not specify the underlying hardware (e.g., GPU models, CPU types, or memory) used to run the experiments with these models.
Software Dependencies No The paper mentions using the Hugging Face API for loading models ("Our results can be replicated by loading models via the open-source Hugging Face API (https://huggingface.co/)."), but it does not specify version numbers for Hugging Face, Python, or any other critical software libraries or environments required for replication.
Experiment Setup Yes Our summarizations are generated with llama-2-7B-chat using default sampling hyperparameters. We used the prompts proposed by Bai et al. (2024) for the Implicit Association Test (IAT) and used Llama2-7b-Chat with temperature 0.7 and default parameters otherwise. We used a temperature of 0.7 and default hyperparameters otherwise for every text model.
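The Software Dependencies row flags that no library versions are pinned. Replication attempts typically record the environment explicitly, e.g. in a requirements file; the version numbers below are illustrative placeholders only, not the authors' verified environment:

```
# requirements.txt -- placeholder versions for illustration;
# the paper does not state which versions were used.
transformers==4.40.0
torch==2.2.0
```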
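The Research Type row's evidence describes CoS as a contrastive, inference-time steering method with controllable influence. As a minimal sketch of that general idea (the λ-weighted logit blend below is an assumption of this illustration, standing in for the paper's Eqs. 2–5, not the authors' verbatim formula): next-token logits computed with and without the user's context are blended, so λ = 0 ignores the context, λ = 1 recovers ordinary conditioning, and λ > 1 amplifies the context's influence.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def context_steered_logits(logits_base, logits_ctx, lam):
    """Blend next-token logits computed without context (logits_base)
    and with context (logits_ctx). lam=0 ignores the context, lam=1 is
    ordinary conditioning, lam>1 amplifies the context's influence.
    (Hypothetical helper for illustration; not the authors' code.)"""
    return [b + lam * (c - b) for b, c in zip(logits_base, logits_ctx)]

# Toy 3-token vocabulary: the context shifts mass toward token 2.
base = [2.0, 1.0, 0.0]
ctx = [2.0, 1.0, 3.0]

p0 = softmax(context_steered_logits(base, ctx, 0.0))  # context ignored
p1 = softmax(context_steered_logits(base, ctx, 1.0))  # ordinary conditioning
p2 = softmax(context_steered_logits(base, ctx, 2.0))  # context amplified

# Probability of the context-favored token grows monotonically with lambda.
print(p0[2] < p1[2] < p2[2])
```

Varying λ at decoding time, rather than rewriting the prompt, is what makes the degree of personalization a continuous knob instead of a turn-based or prompt-engineering choice.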
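The Experiment Setup row reports sampling at temperature 0.7 with otherwise-default hyperparameters. For readers unfamiliar with that knob, here is a self-contained sketch of temperature sampling (the function name is hypothetical; real runs would use a model's decoder rather than toy logits):

```python
import math
import random

def sample_with_temperature(logits, temperature=0.7, rng=random):
    # Divide logits by the temperature before softmax: T < 1 sharpens
    # the distribution, T > 1 flattens it, and T -> 0 approaches
    # greedy (argmax) decoding.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token index from the categorical distribution.
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i, probs
    return len(probs) - 1, probs

rng = random.Random(0)  # fixed seed for a reproducible draw
idx, probs = sample_with_temperature([2.0, 1.0, 0.0], temperature=0.7, rng=rng)
print(idx, [round(p, 3) for p in probs])
```

At T = 0.7 the gap between the top logit and the rest widens relative to T = 1.0, which is why 0.7 is a common default for outputs that should stay varied but on-distribution.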