Enhancing Transferability of Audio Adversarial Example for Both Frequency- and Time-domain
Authors: Zilin Tian, Yunfei Long, Liguo Zhang, Jiahong Zhao
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluations on diverse datasets consistently demonstrate that AIE outperforms existing methods, establishing its effectiveness in enhancing adversarial transferability across domains. |
| Researcher Affiliation | Academia | ¹Harbin Engineering University, ²University of Southampton |
| Pseudocode | Yes | Algorithm 1: AIE with MI-FGSM attack. Input: surrogate models f_w, f_s; a natural audio example x with label y. Parameters: perturbation magnitude ϵ; number of iterations T; decay factor µ; hyper-parameter k. Output: an adversarial example x^adv. 1: Initialize α = ϵ/T, M_0 = 0, x^adv_0 = x. 2: for t = 1 to T do. 3: Initialize η = 0.5. 4–7: Calculate the individual potential outputs p_s and p_w (Eq. 8) and the ensemble potential output p (Eq. 9), then the discrepancy ratio ρ = cos(p_s, p) / cos(p_w, p). 8–9: Adaptively update the domain weight η based on ρ (Eq. 10). 10: Calculate the inter-domain ensemble loss L(E(x^adv, η), y) with the updated η. 11–12: Update momentum M_{t+1} = µ·M_t + ∇_{x^adv_t} L(x^adv_t, η) / ‖∇_{x^adv_t} L(x^adv_t, η)‖₁. 13–14: Update the adversarial example x^adv_{t+1} = Proj_{B_ϵ(x)}(x^adv_t + α·sign(M_{t+1})). 15: end for. 16: return x^adv = x^adv_T. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | To comprehensively evaluate the effectiveness of the proposed method, we conduct extensive experiments on two widely recognized datasets for audio classification tasks: UrbanSound8K [Salamon et al., 2014] for environmental sound classification and ShipsEar [Santos-Domínguez et al., 2016] for underwater acoustic target identification. |
| Dataset Splits | No | The paper states: "From each dataset, we randomly select 1000 clean audio examples, ensuring that each is correctly classified by all evaluated models and preventing data overlap." This describes the selection of examples for evaluation, not the training/validation/test splits used to train the models themselves. Specific split percentages or counts for training the models are not provided. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | Hyper-parameters. We empirically set the maximum perturbation to ϵ = 0.01, the number of iterations T = 10, and the step size α = 0.002. For MI and NI, we set the decay factor µ = 1.0. For VMI, we set the number of sampled examples N = 20 and the upper bound of the neighborhood size β = 1.5ϵ. For EMI, we set the number of sampled examples to 11 and the sampling interval bound to 7, and adopt linear sampling. The inner update time in SVRE is set to four times the number of models. The tolerance threshold and temperature coefficient in AdaEA are set to 0.3 and 10, respectively. |
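The Algorithm 1 cell and hyper-parameter row above can be sketched as one loop. This is a rough NumPy illustration, not the authors' code: Eqs. 8–10 are not reproduced in the report, so the potential outputs are approximated here by softmax probabilities and the weight update by a sigmoid of the discrepancy ratio (both assumptions), and `f_w`, `f_s`, `grad_w`, `grad_s` are hypothetical stand-ins for the frequency- and time-domain surrogate models and their input-gradient oracles.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def aie_mifgsm(x, y, f_w, f_s, grad_w, grad_s, eps=0.01, T=10, mu=1.0, k=10.0):
    """Sketch of AIE with MI-FGSM (Algorithm 1), under the assumptions above.

    f_w / f_s return logits of the frequency- and time-domain surrogates;
    grad_w / grad_s return the input gradient of each surrogate's loss.
    """
    alpha = eps / T                 # step size α = ϵ/T
    m = np.zeros_like(x)            # momentum M_0 = 0
    x_adv = x.copy()
    for _ in range(T):
        eta = 0.5                   # initial domain weight η
        p_s = softmax(f_s(x_adv))   # individual potential outputs (Eq. 8, approximated)
        p_w = softmax(f_w(x_adv))
        p = eta * p_w + (1 - eta) * p_s          # ensemble potential output (Eq. 9, approximated)
        rho = cos_sim(p_s, p) / cos_sim(p_w, p)  # discrepancy ratio ρ
        eta = 1.0 / (1.0 + np.exp(-k * (rho - 1.0)))  # assumed sigmoid form of Eq. 10
        # gradient of the inter-domain ensemble loss, weighted by η
        g = eta * grad_w(x_adv, y) + (1 - eta) * grad_s(x_adv, y)
        m = mu * m + g / (np.abs(g).sum() + 1e-12)    # L1-normalised momentum update
        # signed step, projected back into the ϵ-ball B_ϵ(x)
        x_adv = np.clip(x_adv + alpha * np.sign(m), x - eps, x + eps)
    return x_adv
```

With the paper's settings (ϵ = 0.01, T = 10), each signed step moves at most α = 0.001 per coordinate, and the final clip enforces the ℓ∞ bound regardless of the surrogates used.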