Personalization of Large Language Models: A Survey
Authors: Zhehao Zhang, Ryan A. Rossi, Branislav Kveton, Yijia Shao, Diyi Yang, Hamed Zamani, Franck Dernoncourt, Joe Barrow, Tong Yu, Sungchul Kim, Ruiyi Zhang, Jiuxiang Gu, Tyler Derr, Hongjie Chen, Junda Wu, Xiang Chen, Zichao Wang, Subrata Mitra, Nedim Lipka, Nesreen K. Ahmed, Yu Wang
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we bridge the gap between these two separate main directions for the first time by introducing a taxonomy for personalized LLM usage and summarizing the key differences and challenges. We provide a formalization of the foundations of personalized LLMs that consolidates and expands notions of personalization of LLMs, defining and discussing novel facets of personalization, usage, and desiderata of personalized LLMs. We then unify the literature across these diverse fields and usage scenarios by proposing systematic taxonomies for the granularity of personalization, personalization techniques, datasets, evaluation methods, and applications of personalized LLMs. Finally, we highlight challenges and important open problems that remain to be addressed. By unifying and surveying recent research using the proposed taxonomies, we aim to provide a clear guide to the existing literature and different facets of personalization in LLMs, empowering both researchers and practitioners. |
| Researcher Affiliation | Collaboration | 1Dartmouth College 2Adobe Research 3Stanford University 4University of Massachusetts Amherst 5Pattern Data 6Vanderbilt University 7Dolby Research 8University of California San Diego 9Cisco Research 10University of Oregon |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It describes methodologies and taxonomies in narrative text. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this survey. It refers to third-party tools or frameworks, e.g., "AutoGPT, 2024. URL https://github.com/Significant-Gravitas/AutoGPT." and "Microsoft, 2024", but not code implemented by the authors for their own work. |
| Open Datasets | Yes | The paper extensively discusses and cites various datasets, indicating their public availability through formal citations. For example: "MovieLens-1M recommendation dataset (Harper & Konstan, 2015)", "Amazon Review Data (Ni et al., 2019)", "ConvAI2 (Dinan et al., 2020)", "LaMP (Salemi et al., 2023)", and "PRISM Alignment Dataset (Kirk et al., 2024)". |
| Dataset Splits | No | The paper does not provide dataset split information for its own methodology; as a survey, it conducts no original experiments requiring such splits. While Table 5 lists dataset sizes and splits from *other* works (e.g., "News Headline 13K/2K/2K"), these are not splits defined or used by the authors themselves. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. As a survey paper, it describes existing research and does not report on new experimental results that would require hardware specifications. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. It discusses various models and frameworks (e.g., "T5 (Raffel et al., 2020)", "BERT (Devlin et al., 2018)") in the context of other research, but not as software dependencies for its own work. |
| Experiment Setup | No | The paper does not contain specific experimental setup details such as hyperparameters or training configurations. As a survey, it describes techniques and findings from other research rather than conducting its own experiments with specific settings. |