Speed Master: Quick or Slow Play to Attack Speaker Recognition

Authors: Zhe Ye, Wenjie Zhang, Ying Ren, Xiangui Kang, Diqun Yan, Bin Ma, Shiqi Wang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our comprehensive experiments demonstrate that Speed Master can achieve an attack success rate (ASR) over 99% in the digital domain, with only a 0.6% poisoning rate. Additionally, we validate the feasibility of Speed Master in the real world and its resistance to typical defensive measures. Extensive experiments are conducted on two datasets and two models to evaluate our method.
Researcher Affiliation | Academia | 1. Guangdong Key Lab of Information Security, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; 2. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, China; 3. Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China; 4. Department of Computer Science, City University of Hong Kong, Hong Kong, China
Pseudocode | No | The paper describes the methodology and training process with figures and textual explanations, but it does not include a dedicated pseudocode section or algorithm block.
Open Source Code | No | The paper does not contain any explicit statement about releasing code, nor does it provide a link to a code repository.
Open Datasets | Yes | Moreover, our experiments are conducted on two benchmarks in the field: VoxCeleb1 (Nagrani, Chung, and Zisserman 2017) and LibriSpeech (Panayotov et al. 2015).
Dataset Splits | No | The paper mentions selecting a proportion of samples for poisoning based on a poisoning rate ρ% and creating a backdoor dataset Db from poisoned and non-poisoned data. It also states that benign testing samples and poisoned testing samples are used for the metrics (BA and ASR). However, it does not provide specific train/test/validation split ratios or counts for the overall datasets (VoxCeleb1 and LibriSpeech) used in the experiments.
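The ρ% poisoning procedure described above can be sketched as follows. This is a hypothetical illustration, not the authors' code: `poison_fn` stands in for the speed trigger, and the 0.6% rate and target label 100 come from the paper's default setting.

```python
import random

def build_backdoor_dataset(samples, poison_rate, target_label, poison_fn, seed=0):
    """Poison a `poison_rate` fraction of `samples` ((waveform, label) pairs):
    selected samples get the trigger applied and their label flipped to
    `target_label`; the remaining samples stay benign.  A sketch only --
    the paper's pipeline is not released."""
    rng = random.Random(seed)
    n_poison = int(len(samples) * poison_rate)
    poison_idx = set(rng.sample(range(len(samples)), n_poison))
    backdoor = []
    for i, (wav, label) in enumerate(samples):
        if i in poison_idx:
            backdoor.append((poison_fn(wav), target_label))  # triggered + relabeled
        else:
            backdoor.append((wav, label))                    # benign, unchanged
    return backdoor

# Example: 1000 samples at the paper's default 0.6% rate -> 6 poisoned samples.
clean = [(float(i), 0) for i in range(1000)]
backdoor = build_backdoor_dataset(clean, 0.006, target_label=100,
                                  poison_fn=lambda w: w)
```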
Hardware Specification | Yes | We performed all experiments on a server running Ubuntu 20.04, equipped with four NVIDIA GeForce RTX A6000 GPUs, utilizing a single card with 48 GB of VRAM for the experiments.
Software Dependencies | Yes | The experiments were conducted using PyTorch version 1.11.0 and Torchaudio version 0.11.0.
Experiment Setup | Yes | For the default attack setting, we select 100 as the target label and set the poisoning rate to 2% for the tempo method and 0.6% for the other attacks. For our method, we use 0.8 as the default speed rate. During training, we incorporate room impulse response (RIR) and noise for trigger enhancement to ensure robustness in real-world conditions.
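A speed trigger at rate 0.8 amounts to resampling the waveform so it plays back at 0.8× speed (slowed, shifting both tempo and pitch). A minimal NumPy sketch under that assumption, using linear interpolation rather than the Torchaudio/SoX resampler the authors presumably used:

```python
import numpy as np

def apply_speed_trigger(waveform: np.ndarray, speed_rate: float = 0.8) -> np.ndarray:
    """Resample a mono waveform to play at `speed_rate` times the original
    speed: rate < 1 slows playback (more output samples), rate > 1 speeds
    it up.  Linear interpolation stands in for a proper resampler."""
    n_in = len(waveform)
    n_out = int(round(n_in / speed_rate))        # 0.8x speed -> 1.25x length
    src_pos = np.arange(n_out) * speed_rate      # source position of each output sample
    return np.interp(src_pos, np.arange(n_in), waveform)

# Example: a 1000-sample waveform slowed to 0.8x becomes 1250 samples long.
wav = np.sin(np.linspace(0.0, 2.0 * np.pi, 1000))
slowed = apply_speed_trigger(wav, 0.8)
```

In practice the same effect is obtained with `torchaudio.sox_effects` using the `speed` effect (followed by a rate correction), which matches the PyTorch 1.11.0 / Torchaudio 0.11.0 stack reported above.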