An Efficient Private GPT Never Autoregressively Decodes

Authors: Zhengyi Li, Yue Guan, Kang Yang, Yu Feng, Ning Liu, Yu Yu, Jingwen Leng, Minyi Guo

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate a 2.1×–6.0× speedup compared to standard decoding across three pairs of public-private models and different network conditions.
Researcher Affiliation | Academia | Shanghai Jiao Tong University; Shanghai Qizhi Institute; State Key Laboratory of Cryptology. Correspondence to: Jingwen Leng <EMAIL>, Kang Yang <EMAIL>, Yu Yu <EMAIL>.
Pseudocode | Yes | Algorithm 1 (Privately Reject Draft Tokens)
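The paper's Algorithm 1 is not reproduced in this report. For context only, below is a minimal sketch of the standard (non-private) speculative-decoding rejection rule that such an algorithm would evaluate under secure computation, assuming the usual accept probability min(1, p_target/p_draft); function and variable names here are illustrative, not taken from the paper.

```python
import numpy as np

def reject_draft_tokens(draft_tokens, p_draft, p_target, rng):
    """Sketch of the standard speculative-decoding rejection rule.

    Accept draft token t at position i with probability
    min(1, p_target[i][t] / p_draft[i][t]); at the first rejection,
    resample from the residual distribution max(0, p_target - p_draft)
    and stop. p_draft / p_target: (num_drafts, vocab_size) arrays of
    each model's next-token distribution. Returns the accepted prefix
    (plus one corrected token if a rejection occurred).
    """
    accepted = []
    for i, t in enumerate(draft_tokens):
        if rng.random() < min(1.0, p_target[i][t] / p_draft[i][t]):
            accepted.append(t)  # draft token accepted
        else:
            # Resample from the normalized residual distribution.
            residual = np.maximum(p_target[i] - p_draft[i], 0.0)
            residual /= residual.sum()
            accepted.append(int(rng.choice(len(residual), p=residual)))
            break
    return accepted
```

The private variant in the paper performs this accept/reject test without revealing the draft or target distributions to either party; the sketch above only shows the plaintext logic being protected.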
Open Source Code | No | The paper does not provide an explicit statement about releasing their own implementation code or a direct link to a code repository for the proposed approach. It mentions using existing frameworks like SecretFlow-SPU and protocols like BumbleBee and Nimbus, but these are third-party tools or prior works.
Open Datasets | Yes | We evaluate performance across four diverse tasks: Text-to-SQL (Spider) (Yu et al., 2018), graduate school math (GSM8K) (Cobbe et al., 2021), Python code generation (CodeSearchNet-Python) (Husain et al., 2019), and financial question answering (Alpaca-finance) (Gaurang Bharti, 2024).
Dataset Splits | No | The paper discusses different tasks and models but does not explicitly state the training, validation, or test dataset splits (e.g., percentages or specific counts) for these datasets or for the knowledge distillation process.
Hardware Specification | No | Performance evaluations are conducted on two nodes with 64 vCPUs and 128 GB memory.
Software Dependencies | No | The paper mentions using 'BumbleBee (Lu et al., 2025) and Nimbus (Li et al., 2024b)' and 'SecretFlow-SPU (Ma et al., 2023)' as frameworks and protocols, but does not provide specific version numbers for these or other key software components (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | No | The paper mentions simulating network conditions ('(1 Gbps, 10 ms) for LAN and (400 Mbps, 40 ms) for WAN') and using cross-entropy for model alignment, but it does not specify concrete hyperparameters such as learning rates, batch sizes, optimizers, or number of epochs for model training or fine-tuning, which are crucial for reproducing the experimental setup.
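The alignment objective is only named as cross-entropy; as a point of reference, here is a minimal sketch of what such a distillation-style alignment loss typically looks like. The `temperature` parameter is an assumption for illustration, not a hyperparameter reported in the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def alignment_loss(student_logits, teacher_logits, temperature=1.0):
    """Cross-entropy of the student model against the teacher's
    (optionally temperature-softened) next-token distribution, a common
    way to align a small draft model with a large target model.
    NOTE: `temperature` is an illustrative knob; the paper does not
    report its training hyperparameters.
    """
    p_teacher = softmax(teacher_logits / temperature)
    logp_student = np.log(softmax(student_logits / temperature))
    return float(-(p_teacher * logp_student).sum(axis=-1).mean())
```

When the student's logits match the teacher's, this loss reduces to the teacher distribution's entropy, which is its minimum for a fixed teacher.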