Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

Authors: Zhaowei Zhang, Fengshuo Bai, Qizhi Chen, Chengdong Ma, Mingzhi Wang, Haoran Sun, Zilong Zheng, Yaodong Yang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The detailed experimental results demonstrate that Amulet can achieve significant performance improvements in rich settings with combinations of different LLMs, datasets, and user preferences, while maintaining acceptable computational efficiency. ... In this section, we conduct extensive experiments to evaluate Amulet with various combinations of LLMs, datasets, and user preferences. Our results demonstrate that our framework significantly improves LLM alignment performance, indicating its great potential for real-time user preference adaptation.
Researcher Affiliation | Academia | 1 Institute for Artificial Intelligence, Peking University; 2 State Key Laboratory of General Artificial Intelligence, BIGAI; 3 Shanghai Jiao Tong University; 4 Zhongguancun Academy
Pseudocode | Yes | We have further provided the pseudocode for showing the details of the full decoding process with Amulet in Algorithm 1.
Open Source Code | No | The paper does not provide a specific link or explicit statement about releasing the source code for the Amulet framework. The links provided are for external tools/benchmarks used in the evaluation (e.g., Hugging Face, OpenAI GPT-4o, the choix Python library).
Open Datasets | Yes | HelpSteer (Wang et al., 2023)... UltraFeedback (Cui et al., 2023)... TruthfulQA (Lin et al., 2021)... UltraChat (Ding et al., 2023)... Personal Preference Eval (Personal) (Gao et al., 2024)
Dataset Splits | Yes | HelpSteer is a QA dataset... We extracted the question part, focusing on single-sentence questions to create a dataset of 1,236 testing instances. ... TruthfulQA (Lin et al., 2021), which includes 811 testing problems... UltraChat (Ding et al., 2023), from which we applied similar extraction and filtering as with HelpSteer, resulting in 3,845 testing problems. ... Personal Preference Eval (Personal) (Gao et al., 2024) ... containing 548 testing instances.
Hardware Specification | Yes | We conducted experiments on an Ubuntu 20.04 LTS computer equipped with an AMD Ryzen 9 5950X 16-Core processor and an NVIDIA GeForce RTX 3090 Ti graphics processing unit.
Software Dependencies | No | The paper mentions using the transformers library and the Python library choix but does not specify their version numbers.
Experiment Setup | Yes | Iteration Number T. We conduct experiments using 0, 20, 40, 60, 80, and 100 iterations. ... Learning Rate η. We conduct the experiments ranging from 2, 4, ..., 20. ... Parameter α and λ. We conduct experiments of both the parameters ranging from 1, 2, ..., 10.
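The hyperparameter ranges quoted in the Experiment Setup row can be enumerated as a simple sweep. The sketch below is an illustration only, assuming a full grid over the reported values: `evaluate_amulet` is a hypothetical placeholder (the paper's actual decoding procedure is in its Algorithm 1, and its ablations vary one parameter at a time rather than running the full cross product).

```python
# Illustrative sweep over the hyperparameter ranges reported in the paper.
# `evaluate_amulet` is a hypothetical stand-in for one decoding run; it is
# NOT the authors' API.
from itertools import product

iteration_numbers = [0, 20, 40, 60, 80, 100]  # iteration number T
learning_rates = list(range(2, 21, 2))        # learning rate eta: 2, 4, ..., 20
alphas = list(range(1, 11))                   # parameter alpha: 1, 2, ..., 10
lambdas = list(range(1, 11))                  # parameter lambda: 1, 2, ..., 10

def evaluate_amulet(T, eta, alpha, lam):
    """Placeholder for one Amulet decoding run; returns a dummy score."""
    return 0.0

# A full grid would contain 6 * 10 * 10 * 10 = 6000 configurations.
configs = list(product(iteration_numbers, learning_rates, alphas, lambdas))
print(len(configs))  # 6000
```

In practice one would replace `evaluate_amulet` with the actual decoding loop and sweep each parameter independently, as the paper's ablations do, to keep the run count tractable.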