Code-switching Mediated Sentence-level Semantic Learning
Authors: Shuai Zhang, Jiangyan Yi, Zhengqi Wen, Jianhua Tao, Feihu Che, Jinyang Wu, Ruibo Fu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we conduct thorough experiments on speech recognition, speech translation, and language modeling tasks. The experimental results fully demonstrate that the proposed method can widely improve the performance of code-switching related tasks. |
| Researcher Affiliation | Academia | Shuai Zhang1, Jiangyan Yi1, Zhengqi Wen1, Jianhua Tao1*, Feihu Che1*, Jinyang Wu1, Ruibo Fu2; 1Department of Automation & BNRist, Tsinghua University; 2Institute of Automation, Chinese Academy of Sciences; EMAIL, EMAIL |
| Pseudocode | No | The paper describes its methodology using textual descriptions, mathematical equations, and architectural diagrams (Figure 2), but does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing code, nor does it include links to a code repository or mention code in supplementary materials. |
| Open Datasets | Yes | We conduct our experiments on three popular publicly available datasets, including the ASRU 2019 Mandarin-English code-switching challenge dataset (Shi, Feng, and Xie 2020), Fisher dataset (Cieri, Miller, and Walker 2004) and TED English-Chinese dataset (Liu et al. 2019). |
| Dataset Splits | Yes | Statistical information on the code-switching dataset is shown in Table 1. ... The Fisher data consists of three evaluation sets (Dev/Dev2/Test) that together contain approximately a thousand instances of code-switching with corresponding translations in monolingual English. We therefore combined all the code-switching data from the three evaluation sets as a test set. |
| Hardware Specification | Yes | We use Adam optimizer with β1 = 0.9, β2 = 0.998, ϵ = 1e-8 on 4 NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer', the 'transformer architecture', and 'Llama 3 70B' for data processing, but does not provide specific version numbers for these or for any other software libraries/frameworks used to implement their models. |
| Experiment Setup | Yes | The attention dimensions of the encoder and decoder are both 512 and the number of the head is 4. The dimension of position-wise feed-forward networks is 1024. The number of acoustic encoder blocks and decoder blocks are 12 and 6 respectively. To avoid over-fitting, the unified label smoothing technique is used, and the parameter is set to 0.1. SpecAugment with frequency masking (F=30, mF=2) and time masking (T=40, mT=2) is used to improve the performance of the models (Park et al. 2019). Meanwhile, we set the residual dropout as 0.1, where the residual dropout is applied to each sub-block before adding the residual information. We use Adam optimizer with β1 = 0.9, β2 = 0.998, ϵ = 1e-8 on 4 NVIDIA A100 GPUs. The batch size is set to 128 during the training process. The learning rate is set by a warm-up strategy. We perform decoding using beam search with a beam size of 10. ... when α is set to 0.7 and β is set to 0.1, both ASR and AST tasks can achieve satisfactory results. All subsequent experiments use these parameter settings. |
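The setup quoted above can be collected into a single configuration sketch. Everything in `CONFIG` is taken directly from the paper's reported settings; the `noam_lr` schedule is an assumption (the paper only says the learning rate "is set by a warm-up strategy"), and `spec_augment` is a simplified illustration of the masking described in Park et al. 2019, not the authors' implementation.

```python
import random

# Hyper-parameters as reported in the paper's experiment setup.
CONFIG = {
    "attention_dim": 512,      # encoder and decoder attention dimension
    "num_heads": 4,
    "ffn_dim": 1024,           # position-wise feed-forward dimension
    "encoder_blocks": 12,      # acoustic encoder blocks
    "decoder_blocks": 6,
    "label_smoothing": 0.1,
    "residual_dropout": 0.1,
    "adam": {"beta1": 0.9, "beta2": 0.998, "eps": 1e-8},
    "batch_size": 128,
    "beam_size": 10,
    "loss_alpha": 0.7,         # weights the paper settles on for ASR/AST
    "loss_beta": 0.1,
}

def noam_lr(step, d_model=512, warmup=25000):
    """Transformer 'Noam' warm-up schedule. ASSUMPTION: the paper does not
    name its schedule or warm-up steps; these values are illustrative."""
    step = max(step, 1)
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

def spec_augment(spec, F=30, mF=2, T=40, mT=2):
    """Zero out mF random frequency bands (width <= F) and mT random time
    bands (width <= T) of a spec[time][freq] spectrogram, matching the
    SpecAugment settings quoted above. Simplified in-place illustration."""
    n_t, n_f = len(spec), len(spec[0])
    for _ in range(mF):                        # frequency masking
        width = random.randint(0, min(F, n_f))
        f0 = random.randint(0, n_f - width)
        for row in spec:
            for j in range(f0, f0 + width):
                row[j] = 0.0
    for _ in range(mT):                        # time masking
        width = random.randint(0, min(T, n_t))
        t0 = random.randint(0, n_t - width)
        for i in range(t0, t0 + width):
            spec[i] = [0.0] * n_f
    return spec
```

Under the Noam schedule the learning rate rises linearly for `warmup` steps and then decays with the inverse square root of the step count, peaking at the warm-up boundary.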