PersonalizedRouter: Personalized LLM Routing via Graph-based User Preference Modeling

Authors: Zhongjie Dai, Tao Feng, Jiaxuan You

TMLR 2025

Reproducibility checklist (Variable: Result — LLM response):
Research Type: Experimental — "Experimental results show that PersonalizedRouter significantly outperforms existing LLM selection methods, surpassing the strongest baselines by margins of 15.38% and 9.83% under two simulation strategies. On PersonaRoute-Bench with 1,000 users, it further surpasses the best methods by 16.19% and 59.69% while maintaining higher efficiency."
Researcher Affiliation: Academia — Zhongjie Dai, Tao Feng, and Jiaxuan You, all at the University of Illinois at Urbana-Champaign.
Pseudocode: No — The paper describes the framework, problem formulation, and GNN updates with mathematical equations, but includes no explicitly labeled pseudocode or algorithm blocks.
Open Source Code: Yes — ulab-uiuc/PersonalizedRouter
Open Datasets: Yes — Alpaca (Taori et al., 2023), GSM8K (Cobbe et al., 2021), SQuAD (Rajpurkar et al., 2016), Multi-News (Fabbri et al., 2019)
Dataset Splits: Yes — "For both settings, the dataset is divided into training, validation, and test sets with a ratio of 70% : 10% : 20%."
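The 70% : 10% : 20% split described above can be sketched as a simple shuffled index partition. This is an illustrative reconstruction only: the paper does not specify the shuffling seed or split implementation, so the helper name and seed below are assumptions.

```python
import random

def split_indices(n, ratios=(0.7, 0.1, 0.2), seed=0):
    """Partition n example indices into train/val/test by the given ratios.

    Hypothetical helper: the paper only states the 70/10/20 ratio, not how
    the split is drawn, so the seeded shuffle here is an assumption.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # deterministic shuffle for reproducibility
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]              # remainder goes to the test set
    return train, val, test

train, val, test = split_indices(1000)        # e.g. 700 / 100 / 200 examples
```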
Hardware Specification: Yes — "Our method is implemented using PyTorch and PyG, and all experiments are conducted on an NVIDIA A6000 48GB Tensor Core GPU."
Software Dependencies: No — The paper states the implementation uses PyTorch and PyG but does not provide their specific versions.
Experiment Setup: Yes — "For router training, we use a two-layer graph attention network with a hidden dimension of 32. The model is trained with a batch size of 32 for up to 400 epochs. We use the Adam optimizer (Kingma & Ba, 2014) and apply a LambdaLR scheduler to gradually decay the learning rate from 1e-3 to 0 during training."
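The learning-rate schedule quoted above can be sketched in plain Python. PyTorch's LambdaLR multiplies the base learning rate by a user-supplied factor each epoch; a linear decay from 1e-3 to 0 over 400 epochs is one natural choice, but the paper does not give the exact lambda, so the linear form below is an assumption.

```python
# Assumed reconstruction of the quoted schedule: base LR 1e-3 decayed to 0
# over 400 epochs. LambdaLR computes lr = base_lr * lr_lambda(epoch); only
# the linear shape of lr_lambda is our assumption.
BASE_LR = 1e-3
MAX_EPOCHS = 400

def lr_lambda(epoch):
    # Multiplicative factor: 1.0 at epoch 0, linearly down to 0.0 at MAX_EPOCHS.
    return max(0.0, 1.0 - epoch / MAX_EPOCHS)

def lr_at(epoch):
    # Effective learning rate the optimizer would see at this epoch.
    return BASE_LR * lr_lambda(epoch)
```

With PyTorch installed, the same schedule would be attached via `torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)` on an Adam optimizer initialized at lr=1e-3.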