Personalization of Large Language Models: A Survey

Authors: Zhehao Zhang, Ryan A. Rossi, Branislav Kveton, Yijia Shao, Diyi Yang, Hamed Zamani, Franck Dernoncourt, Joe Barrow, Tong Yu, Sungchul Kim, Ruiyi Zhang, Jiuxiang Gu, Tyler Derr, Hongjie Chen, Junda Wu, Xiang Chen, Zichao Wang, Subrata Mitra, Nedim Lipka, Nesreen K. Ahmed, Yu Wang

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work, we bridge the gap between these two separate main directions for the first time by introducing a taxonomy for personalized LLM usage and summarizing the key differences and challenges. We provide a formalization of the foundations of personalized LLMs that consolidates and expands notions of personalization of LLMs, defining and discussing novel facets of personalization, usage, and desiderata of personalized LLMs. We then unify the literature across these diverse fields and usage scenarios by proposing systematic taxonomies for the granularity of personalization, personalization techniques, datasets, evaluation methods, and applications of personalized LLMs. Finally, we highlight challenges and important open problems that remain to be addressed. By unifying and surveying recent research using the proposed taxonomies, we aim to provide a clear guide to the existing literature and different facets of personalization in LLMs, empowering both researchers and practitioners.
Researcher Affiliation | Collaboration | 1. Dartmouth College; 2. Adobe Research; 3. Stanford University; 4. University of Massachusetts Amherst; 5. Pattern Data; 6. Vanderbilt University; 7. Dolby Research; 8. University of California San Diego; 9. Cisco Research; 10. University of Oregon
Pseudocode | No | The paper contains no clearly labeled pseudocode or algorithm blocks; it describes methodologies and taxonomies in narrative text.
Open Source Code | No | The paper does not provide access to source code for the methodology described in this survey. It refers to third-party tools and frameworks, e.g., "AutoGPT, 2024. URL https://github.com/Significant-Gravitas/AutoGPT" and "Microsoft, 2024", but not to code implemented by the authors for their own work.
Open Datasets | Yes | The paper extensively discusses and cites publicly available datasets through formal citations, for example the MovieLens-1M recommendation dataset (Harper & Konstan, 2015), Amazon Review Data (Ni et al., 2019), ConvAI2 (Dinan et al., 2020), LaMP (Salemi et al., 2023), and the PRISM Alignment Dataset (Kirk et al., 2024).
Dataset Splits | No | As a survey, the paper conducts no original experiments and therefore defines no dataset splits of its own. Table 5 reports dataset sizes and splits for other works (e.g., "News Headline 13K/2K/2K"), but these are not splits defined or used by the authors for their own experimental reproduction.
Hardware Specification | No | As a survey, the paper describes existing research and reports no new experimental results, so it provides no hardware details.
Software Dependencies | No | The paper provides no ancillary software details with version numbers. It discusses models and frameworks such as T5 (Raffel et al., 2020) and BERT (Devlin et al., 2018) in the context of other research, not as software dependencies for its own work.
Experiment Setup | No | As a survey, the paper describes techniques and findings from other research rather than conducting its own experiments, so it contains no experimental setup details such as hyperparameters or training configurations.
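The Dataset Splits row cites fixed-size partitions such as "News Headline 13K/2K/2K". As a minimal sketch of how such a train/validation/test partition is typically produced (a generic illustration, not code from the surveyed paper; the function name, seed, and record counts here are stand-ins), one might write:

```python
import random


def split_dataset(records, n_train, n_val, n_test, seed=0):
    """Partition records into disjoint train/val/test sets of fixed sizes.

    Shuffles a copy of the input with a fixed seed so the split is
    reproducible, then slices off each subset in order.
    """
    assert n_train + n_val + n_test <= len(records)
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:n_train + n_val + n_test]
    return train, val, test


# Example with the 13K/2K/2K ratio scaled down to 13/2/2 toy records.
records = list(range(17))
train, val, test = split_dataset(records, 13, 2, 2)
print(len(train), len(val), len(test))  # 13 2 2
```

Fixing the random seed is what makes such a split reportable and reusable across works, which is exactly the information the survey notes is given only for third-party datasets, not for any experiment of its own.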