Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks

Authors: Hung Quang Nguyen, Hieu Nguyen, Anh Ta, Thanh Nguyen-Tang, Kok-Seng Wong, Thanh Tung Hoang, Khoa Doan

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on multiple benchmark datasets illustrate the effectiveness of our strategies in improving clean-label backdoor attacks. Our implementation is available here.
Researcher Affiliation | Academia | (1) College of Engineering and Computer Science, VinUniversity; (2) VinUni-Illinois Smart Health Center; (3) CSIRO's Data61; (4) Johns Hopkins University; (5) VNU University of Engineering and Technology
Pseudocode | Yes | Algorithm 1: Selection with pre-trained model
Open Source Code | No | Our implementation is available here.
Open Datasets | Yes | Dataset. We consider two widely used benchmark datasets: CIFAR10 Krizhevsky et al. (2010) and GTSRB Stallkamp et al. (2012). For OOD strategy, we train the surrogate model on Tiny ImageNet Le & Yang. We also consider PubFig, a dataset that consists of public figures' faces.
Dataset Splits | Yes | CIFAR10 Krizhevsky et al. (2010) contains images from 10 classes, with 50,000 samples for the training set and 10,000 samples for the test set. We poison class 0, which has 5,000 images. GTSRB Stallkamp et al. (2012) contains images from 43 classes of traffic sign images, including 39,209 samples for training and 12,630 samples for test. We also consider PubFig, a dataset that consists of public figures' faces. We select 50 classes with the highest number of images and divide them into 5,212 images for training and 1,312 images for validation.
Hardware Specification | No | The paper mentions models like ResNet18, VGG19, and ResNet50, but does not provide any specific hardware details such as GPU/CPU models, memory, or cloud infrastructure used for running experiments.
Software Dependencies | No | The paper mentions 'SGD optimizer' and 'cosine scheduler' but does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions) that would be needed to replicate the experiments.
Experiment Setup | Yes | We train ResNet18 and VGG19 for 300 epochs with SGD optimizer, learning rate 0.01, and cosine scheduler. For CIFAR10 and GTSRB, the image size is 32×32. For PubFig, we resize the input to 224×224.
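
The reported training setup (300 epochs, learning rate 0.01, cosine scheduler) can be illustrated with a minimal sketch of a cosine learning-rate schedule. The summary does not state the scheduler's exact form, so the closed-form cosine annealing below, the minimum learning rate of 0, the per-epoch update, and the names cosine_lr, BASE_LR, and EPOCHS are all assumptions made for illustration, not the authors' implementation.

```python
import math

BASE_LR = 0.01  # learning rate reported in the experiment setup
EPOCHS = 300    # number of training epochs reported in the experiment setup


def cosine_lr(epoch, base_lr=BASE_LR, total_epochs=EPOCHS, min_lr=0.0):
    """Return the cosine-annealed learning rate for a 0-indexed epoch.

    Assumption: the rate decays from base_lr to min_lr over total_epochs
    following half a cosine period, with one update per epoch.
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate used in each of the 300 training epochs.
schedule = [cosine_lr(epoch) for epoch in range(EPOCHS)]

if __name__ == "__main__":
    # Print a few sample points along the schedule.
    for epoch in (0, 75, 150, 225, 299):
        print(f"epoch {epoch:3d}: lr = {schedule[epoch]:.6f}")
```

Under these assumptions the rate starts at 0.01, reaches half of that value at epoch 150, and approaches 0 by the end of training. A reproduction would still need the momentum, weight decay, and batch size, none of which are given in this summary.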