SimulPL: Aligning Human Preferences in Simultaneous Machine Translation

Authors: Donglei Yu, Yang Zhao, Jie Zhu, Yangyifan Xu, Yu Zhou, Chengqing Zong

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results indicate that SimulPL exhibits better alignment with human preferences across all latency levels in Zh→En, De→En and En→Zh SiMT tasks. Our data and code will be available at https://github.com/EurekaForNLP/SimulPL.
Researcher Affiliation | Academia | Donglei Yu (1,2), Yang Zhao (1,2), Jie Zhu (3), Yangyifan Xu (1,2), Yu Zhou (1,2), Chengqing Zong (1,2). (1) School of Artificial Intelligence, University of Chinese Academy of Sciences; (2) State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; (3) Graduate School of Translation and Interpretation, Beijing Foreign Studies University.
Pseudocode | Yes | Algorithm 1: Confidence-based Policy in Inference
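The paper's Algorithm 1 itself is not reproduced in this report. As a rough illustration of what a confidence-based inference policy for SiMT typically looks like, here is a minimal sketch of a confidence-thresholded read/write loop; all function names (`confidence_policy`, `model_step`, `toy_model`), the threshold value, and the toy model are hypothetical and not taken from the paper.

```python
def confidence_policy(model_step, source, threshold=0.9, max_len=20):
    """Generic confidence-based read/write loop for simultaneous MT.

    model_step(src_prefix, written) -> (token, confidence) is a stand-in
    for the SiMT model. At each step: WRITE the candidate token if the
    model is confident enough (or the source is exhausted), else READ
    one more source token. Illustrative only, not the paper's Algorithm 1.
    """
    read, written = 1, []               # begin after reading one source token
    while len(written) < max_len:
        token, conf = model_step(source[:read], written)
        if token == "<eos>":
            break
        if conf >= threshold or read == len(source):
            written.append(token)       # WRITE action
        else:
            read += 1                   # READ action
    return written


# Hypothetical toy model: confidently emits the upper-cased source token
# at the current target position once that position has been read.
def toy_model(src_prefix, written):
    i = len(written)
    if i >= 3:                          # toy target length
        return ("<eos>", 1.0)
    if len(src_prefix) > i:
        return (src_prefix[i].upper(), 0.95)
    return ("?", 0.5)


print(confidence_policy(toy_model, ["a", "b", "c"]))  # -> ['A', 'B', 'C']
```

With a lower threshold the policy writes earlier (lower latency, lower quality); with a higher threshold it reads more source first, which is the usual quality/latency trade-off in SiMT.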
Open Source Code | Yes | Our data and code will be available at https://github.com/EurekaForNLP/SimulPL.
Open Datasets | Yes | We validate our method on text-to-text SiMT tasks using our annotated datasets with human-preferred references. For the training data, we select subsets from three datasets, WMT15 De→En, WMT22 Zh→En, and MuST-C En→Zh, for annotation.
Dataset Splits | Yes | Table 1: Statistics of our constructed datasets. We present the reference-free COMET scores of our annotated target sentences with GPT-4/4o and the original target sentences.
Dataset | Train | Test
Zh→En | 13,491 | 2,000
De→En | 15,717 | 2,168
En→Zh | 19,967 | 2,841
Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU/CPU models or processor types.
Software Dependencies | No | The paper mentions software such as Fairseq, Llama2-7B-chat, SimulEval, awesome-align, and Stanza, but does not provide version numbers for these components, which are required for reproducibility.
Experiment Setup | Yes | Table 7: Hyper-parameters of Transformer-based SiMT models in our experiments: encoder layers 6; encoder attention heads 8; encoder embed dim 512; encoder ffn embed dim 1024; decoder layers 6; decoder attention heads 8; decoder embed dim 512; decoder ffn embed dim 1024; dropout 0.1; optimizer adam; adam-β (0.9, 0.98); clip-norm 1e-7; lr 5e-4; lr scheduler inverse sqrt; warmup-updates 4000; warmup-init-lr 1e-7; weight decay 0.0001; label-smoothing 0.1; max tokens 8192. Table 8: Hyper-parameters of SimulPL in our experiments: LoRA (lora_r 64, lora_alpha 16, lora_dropout 0.1); batch size 64, micro batch size 32, learning rate 2e-4, training steps 1000, α 0.1, β 0.1; batch size 64, micro batch size 16, learning rate 2e-6, training steps 400.
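Table 8 lists two groups of batch size / learning rate / step settings, which suggests two training phases. The sketch below consolidates those values into plain Python dicts; the key names, the two-phase labels (`PHASE_1`, `PHASE_2`), and the helper `grad_accum_steps` are assumptions for illustration, not names from the paper's code.

```python
# Hypothetical consolidation of the SimulPL Table 8 hyper-parameters.
# Key names and phase labels are illustrative, not from the paper's repo.
LORA_CFG = {"lora_r": 64, "lora_alpha": 16, "lora_dropout": 0.1}

PHASE_1 = {  # first settings group in Table 8 (alpha/beta included there)
    "batch_size": 64, "micro_batch_size": 32,
    "learning_rate": 2e-4, "training_steps": 1000,
    "alpha": 0.1, "beta": 0.1,
}

PHASE_2 = {  # second settings group in Table 8
    "batch_size": 64, "micro_batch_size": 16,
    "learning_rate": 2e-6, "training_steps": 400,
}

def grad_accum_steps(cfg):
    """Gradient-accumulation steps implied by batch vs. micro-batch size."""
    return cfg["batch_size"] // cfg["micro_batch_size"]

print(grad_accum_steps(PHASE_1), grad_accum_steps(PHASE_2))  # -> 2 4
```

One practical reading of these numbers: both phases keep the same effective batch size (64), but the second phase halves the micro-batch and cuts the learning rate by two orders of magnitude, consistent with a gentler fine-tuning stage.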