Personalization of Large Language Models: A Survey

Authors: Zhehao Zhang, Ryan A. Rossi, Branislav Kveton, Yijia Shao, Diyi Yang, Hamed Zamani, Franck Dernoncourt, Joe Barrow, Tong Yu, Sungchul Kim, Ruiyi Zhang, Jiuxiang Gu, Tyler Derr, Hongjie Chen, Junda Wu, Xiang Chen, Zichao Wang, Subrata Mitra, Nedim Lipka, Nesreen K. Ahmed, Yu Wang

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work, we bridge the gap between these two separate main directions for the first time by introducing a taxonomy for personalized LLM usage and summarizing the key differences and challenges. We provide a formalization of the foundations of personalized LLMs that consolidates and expands notions of personalization of LLMs, defining and discussing novel facets of personalization, usage, and desiderata of personalized LLMs. We then unify the literature across these diverse fields and usage scenarios by proposing systematic taxonomies for the granularity of personalization, personalization techniques, datasets, evaluation methods, and applications of personalized LLMs. Finally, we highlight challenges and important open problems that remain to be addressed. By unifying and surveying recent research using the proposed taxonomies, we aim to provide a clear guide to the existing literature and different facets of personalization in LLMs, empowering both researchers and practitioners.
Researcher Affiliation | Collaboration | 1. Dartmouth College; 2. Adobe Research; 3. Stanford University; 4. University of Massachusetts Amherst; 5. Pattern Data; 6. Vanderbilt University; 7. Dolby Research; 8. University of California San Diego; 9. Cisco Research; 10. University of Oregon
Pseudocode | No | The paper contains no clearly labeled pseudocode or algorithm blocks; it describes methodologies and taxonomies in narrative text.
Open Source Code | No | The paper does not provide access to source code for the methodology described in this survey. It refers to third-party tools and frameworks, e.g., "AutoGPT, 2024. URL https://github.com/Significant-Gravitas/AutoGPT" and "Microsoft, 2024", but not to code implemented by the authors for their own work.
Open Datasets | Yes | The paper extensively discusses and cites publicly available datasets through formal citations, for example the MovieLens-1M recommendation dataset (Harper & Konstan, 2015), Amazon Review Data (Ni et al., 2019), ConvAI2 (Dinan et al., 2020), LaMP (Salemi et al., 2023), and the PRISM Alignment Dataset (Kirk et al., 2024).
Dataset Splits | No | As a survey, the paper conducts no original experiments and therefore defines no dataset splits of its own. Table 5 reports dataset sizes and splits for other works (e.g., "News Headline 13K/2K/2K"), but these are not splits defined or used by the authors for their own experimental reproduction.
Hardware Specification | No | As a survey, the paper describes existing research and reports no new experimental results, so it provides no hardware details.
Software Dependencies | No | The paper provides no ancillary software details with version numbers. It discusses models and frameworks such as T5 (Raffel et al., 2020) and BERT (Devlin et al., 2018) in the context of other research, not as software dependencies for its own work.
Experiment Setup | No | As a survey, the paper describes techniques and findings from other research rather than conducting its own experiments, so it contains no experimental setup details such as hyperparameters or training configurations.
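The Dataset Splits row cites fixed-size partitions such as "News Headline 13K/2K/2K". As a minimal sketch of how such a train/validation/test partition is typically produced (a generic illustration, not code from the surveyed paper; the function name, seed, and record counts here are stand-ins), one might write:

```python
import random


def split_dataset(records, n_train, n_val, n_test, seed=0):
    """Partition records into disjoint train/val/test sets of fixed sizes.

    Shuffles a copy of the input with a fixed seed so the split is
    reproducible, then slices off each subset in order.
    """
    assert n_train + n_val + n_test <= len(records)
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:n_train + n_val + n_test]
    return train, val, test


# Example with the 13K/2K/2K ratio scaled down to 13/2/2 toy records.
records = list(range(17))
train, val, test = split_dataset(records, 13, 2, 2)
print(len(train), len(val), len(test))  # 13 2 2
```

Fixing the random seed is what makes such a split reportable and reusable across works, which is exactly the information the survey notes is given only for third-party datasets, not for any experiment of its own.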