3SAT: A Simple Self-Supervised Adversarial Training Framework

Authors: Jiang Fang, Haonan He, Jiyan Sun, Jiadong Fu, Zhaorui Guo, Yinlong Liu, Wei Ma

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments demonstrate that 3SAT surpasses the known SOTA self-AT methods across all evaluation metrics on various datasets. Notably, on CIFAR-10, 3SAT improves the robust accuracy of the SOTA self-AT method by 16.19% and the standard accuracy by 11.41%.
Researcher Affiliation Academia (1) Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; (2) School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Pseudocode No The paper describes methods using narrative text and mathematical equations (e.g., Equation 1, 2, 3, 4, 5, 6) but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our code is at https://github.com/MengNanFang/3SAT
Open Datasets Yes We evaluate the representation performance and robustness of 3SAT on different datasets: CIFAR-10, CIFAR-100 (Krizhevsky 2009), and STL-10 (Coates, Ng, and Lee 2011).
Dataset Splits No The paper mentions evaluating on CIFAR-10, CIFAR-100, and STL-10 datasets but does not explicitly state the training/validation/test splits used for these datasets within the paper's text.
Hardware Specification Yes We evaluated the total pre-training duration of 3SAT versus other competing self-AT methods on a single RTX3090 GPU.
Software Dependencies No 3SAT is built upon the BYOL (Grill et al. 2020) training script implemented by solo-learn (da Costa et al. 2022), and we strictly adhere to the settings in solo-learn for all optimizer configurations, augmentations, and projection head structures. However, specific version numbers for solo-learn or other key software dependencies are not provided.
Experiment Setup Yes We chose 256 as the batch size and performed 1000 epochs of pre-training. On the CIFAR-10 and STL-10 datasets the warm-up parameter W is set to 0, and on the CIFAR-100 dataset the warm-up parameter W is set to 200. To generate adversarial perturbations for adversarial training, we used the ℓ∞ PGD attack (Madry et al. 2017) and followed all hyperparameters used in DynACL (Luo, Wang, and Wang 2023). To speed up convergence, we only ran 5 steps of PGD in the pre-training stage. Under all evaluation methods, we only perform finetuning for 25 epochs.
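The setup above generates perturbations with a 5-step ℓ∞ PGD attack (Madry et al. 2017). As a rough illustration of what such an attack loop looks like, here is a minimal PyTorch sketch; the `pgd_attack` name and the epsilon/step-size values (8/255 and 2/255, common CIFAR defaults) are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=5):
    """Sketch of a 5-step l-inf PGD attack.

    eps/alpha are assumed common CIFAR values, not the paper's
    confirmed hyperparameters (it defers to DynACL's settings).
    """
    # Random start inside the l-inf ball, as in Madry et al.
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        # Ascend the loss, then project back into the eps-ball...
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
        # ...and keep the adversarial image in the valid [0, 1] range.
        delta = (x + delta).clamp(0, 1) - x
    return (x + delta).detach()
```

In self-AT pipelines the supervised cross-entropy above is typically replaced by the contrastive pre-training loss; the projection and clamping steps are unchanged.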