Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data

Authors: Puyudi Yang, Jianbo Chen, Cho-Jui Hsieh, Jane-Ling Wang, Michael I. Jordan

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of these methods using both quantitative metrics and human evaluation on various state-of-the-art models for text classification, including a word-based CNN, a character-based CNN and an LSTM. As an example of our results, we show that the accuracy of character-based convolutional networks drops to the level of random selection by modifying only five characters through Greedy Attack.
Researcher Affiliation | Academia | Puyudi Yang, Department of Statistics, University of California, Davis, Davis, CA 95616, USA; Jianbo Chen, Department of Statistics, University of California, Berkeley, Berkeley, CA 94720-1776, USA; Cho-Jui Hsieh, Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA; Jane-Ling Wang, Department of Statistics, University of California, Davis, Davis, CA 95616, USA; Michael I. Jordan, Division of Computer Science and Department of Statistics, University of California, Berkeley, Berkeley, CA 94720-1776, USA
Pseudocode | Yes | Algorithm 1 Greedy Attack; Algorithm 2 Gumbel Attack
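The paper's Algorithm 1 is not reproduced here, but the core idea of a greedy substitution attack on discrete inputs can be sketched as follows. This is an illustrative sketch only: `model_prob` is a hypothetical black-box scoring function for the original class, and the exhaustive (position, replacement) search stands in for whatever candidate-selection scheme the paper actually uses.

```python
import itertools

def greedy_attack(tokens, vocab, model_prob, budget=5):
    """Greedily replace up to `budget` tokens so as to minimize the
    model's probability for the originally predicted class."""
    tokens = list(tokens)
    for _ in range(budget):
        best_score = model_prob(tokens)
        best_edit = None
        # Try every (position, replacement) pair; keep the single
        # edit that lowers the class probability the most.
        for i, w in itertools.product(range(len(tokens)), vocab):
            if w == tokens[i]:
                continue
            candidate = tokens[:i] + [w] + tokens[i + 1:]
            score = model_prob(candidate)
            if score < best_score:
                best_score, best_edit = score, (i, w)
        if best_edit is None:  # no single edit helps; stop early
            break
        i, w = best_edit
        tokens[i] = w
    return tokens
```

The `budget=5` default mirrors the paper's headline result that modifying only five characters suffices to break a character-based CNN, though the search strategy here is only a plausible reconstruction.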
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | IMDB Review (Maas et al., 2011); AG's News (Zhang et al., 2015); Yahoo! Answers (Zhang et al., 2015)
Dataset Splits | Yes | IMDB Review (Maas et al., 2011): 25,000 train samples, 25,000 test samples; AG's News (Zhang et al., 2015): 120,000 train samples, 7,600 test samples; Yahoo! Answers (Zhang et al., 2015): 1,400,000 train samples, 60,000 test samples.
Hardware Specification | Yes | All experiments were performed on a single NVIDIA Tesla K80 GPU, coded in TensorFlow.
Software Dependencies | No | The paper mentions that experiments were coded in TensorFlow but does not specify a version number for TensorFlow or any other software libraries.
Experiment Setup | Yes | The model is trained with rmsprop (Hinton et al., 2012) for five epochs. Each review is padded/cut to 400 words. The model achieves accuracy of 90.1% on the test data set. ... The network consists of a 300-dimensional randomly-initialized word embedding, a bidirectional LSTM, each with dimension 256, and a dropout layer as hidden layers. The model is trained with rmsprop (Hinton et al., 2012). ... The identifier and the perturber are trained separately on the training data, by rmsprop (Hinton et al., 2012) with step size 0.001. The temperature τ is fixed to be 0.5 throughout the experiments.
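The temperature τ = 0.5 quoted above enters through the Gumbel-softmax (concrete) relaxation that gives Gumbel Attack its name: it turns discrete sampling into a differentiable operation so the identifier/perturber networks can be trained by gradient descent. A minimal NumPy sketch of the sampling step, with illustrative logits that are not from the paper:

```python
import numpy as np

def gumbel_softmax(logits, tau=0.5, rng=None):
    """Draw a 'soft' one-hot sample from a categorical distribution
    via the Gumbel-softmax trick: add Gumbel(0, 1) noise to the
    logits, then take a temperature-scaled softmax."""
    if rng is None:
        rng = np.random.default_rng()
    # Gumbel(0, 1) noise: -log(-log(U)), U ~ Uniform(0, 1)
    g = -np.log(-np.log(rng.uniform(size=np.shape(logits))))
    y = (np.asarray(logits) + g) / tau
    y = y - y.max()          # subtract max for numerical stability
    e = np.exp(y)
    return e / e.sum()

# Example draw over three candidate positions.
sample = gumbel_softmax(np.array([2.0, 0.5, 0.1]), tau=0.5)
```

Lower temperatures push the output toward a hard one-hot vector, while higher temperatures keep it smooth; τ = 0.5 is the fixed trade-off reported in the paper's experiments.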