EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation

Authors: Yuqiao Wen, Behzad Shayegh, Chenyang Huang, Yanshuai Cao, Lili Mou

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted experiments on IWSLT (Cettolo et al. 2017) and Europarl (Koehn 2005), two popular multilingual translation datasets for zero-shot machine translation. Results show that EBBS can generate high-quality translations and outperform existing ensemble techniques.
Researcher Affiliation | Collaboration | Yuqiao Wen (1,*), Behzad Shayegh (1), Chenyang Huang (1), Yanshuai Cao (2), Lili Mou (1,3). 1: Dept. of Computing Science, Alberta Machine Intelligence Institute (Amii), University of Alberta; 2: RBC Borealis; 3: Canada CIFAR AI Chair, Amii.
Pseudocode | Yes | We provide the detailed pseudocode for EBBS in Algorithm 1 and an illustration in Figure 1.
Open Source Code | Yes | GitHub: https://github.com/MANGA-UOFA/EBBS
Open Datasets | Yes | We evaluated EBBS on two popular benchmark datasets for zero-shot machine translation: IWSLT (Cettolo et al. 2017), which contains 4 languages (with English) and 6 zero-shot directions; and Europarl v7 (Koehn 2005), which contains 9 languages and 56 zero-shot directions.
Dataset Splits | No | The paper mentions using the IWSLT and Europarl datasets and refers to replicating a previous model's training setup (Liu et al. 2021) and standard practice for selecting subsets for distillation (Fan et al. 2021). However, it does not explicitly state the train/validation/test split percentages or sample counts used in this paper.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies | No | The paper mentions the use of a Transformer architecture and a byte pair encoding tokenizer, but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Specifically, the neural architecture in (Liu et al. 2021) is a 5-layer encoder-decoder Transformer for IWSLT, but has 8 layers for Europarl to accommodate more training data and languages. For EBBS, we used a beam size of five for both upper- and lower-level beams. In our experiment, we implemented standard beam search for comparison, where we also used a beam size of five, following common practice (Meister, Cotterell, and Vieira 2020).
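The comparison baseline above is standard beam search with beam size five: at each decoding step, every partial hypothesis is expanded and only the top-k highest-scoring extensions are kept. As a minimal sketch of that baseline (a toy bigram "model" stands in for the paper's Transformer decoder; this is not the authors' EBBS or their bi-level variant):

```python
import math

def beam_search(start, expand, beam_size=5, max_len=4):
    """Standard beam search: keep the `beam_size` highest-scoring
    (by cumulative log-probability) partial sequences each step.

    `expand(seq)` returns (next_token, probability) pairs for a sequence.
    """
    beams = [(0.0, [start])]  # (log-prob, token sequence)
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            for tok, p in expand(seq):
                candidates.append((logp + math.log(p), seq + [tok]))
        if not candidates:
            break
        # Prune to the k best hypotheses (the "beam").
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]
    return beams

# Hypothetical toy bigram model for illustration only:
# the next-token distribution depends on the last token.
bigram = {
    "<s>": [("a", 0.6), ("b", 0.4)],
    "a":   [("a", 0.3), ("b", 0.7)],
    "b":   [("a", 0.5), ("b", 0.5)],
}

hyps = beam_search("<s>", lambda seq: bigram[seq[-1]], beam_size=2, max_len=3)
best_logp, best_seq = hyps[0]
print(best_seq, math.exp(best_logp))  # → ['<s>', 'a', 'b', 'a'] 0.21
```

With beam size two, the greedy prefix `<s> a b` survives pruning and its best continuation wins; a beam of five would simply keep more candidates alive at each step. EBBS differs in that each ensemble component runs its own lower-level beam while an upper-level beam synchronizes them, per Algorithm 1 of the paper.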