Robustness through Data Augmentation Loss Consistency
Authors: Tianjian Huang, Shaunak Ashish Halbe, Chinnadhurai Sankar, Pooyan Amini, Satwik Kottur, Alborz Geramifard, Meisam Razaviyayn, Ahmad Beirami
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that DAIR consistently outperforms ERM and DA-ERM with little marginal computational cost and sets new state-of-the-art results in several benchmarks involving covariant data augmentation. We apply DAIR to real-world learning problems involving covariant data augmentation: robust neural task-oriented dialog state tracking and robust visual question answering. We also apply DAIR to tasks involving invariant data augmentation: robust regression, robust classification against adversarial attacks, and robust ImageNet classification under distribution shift. |
| Researcher Affiliation | Collaboration | Tianjian Huang EMAIL University of Southern California Shaunak Halbe EMAIL Georgia Institute of Technology Chinnadhurai Sankar EMAIL Meta AI Pooyan Amini EMAIL Meta AI Satwik Kottur EMAIL Meta AI Alborz Geramifard EMAIL Meta AI Meisam Razaviyayn EMAIL University of Southern California Ahmad Beirami EMAIL Google Research |
| Pseudocode | Yes | Algorithm 1 Training Neural Networks with GD — 1: Input: number of steps T, training set S, learning rate η, initialized parameters θ₀; 2: for t = 1, 2, …, T do; 3: compute ∇θ Ê[f_DAIR-R,λ(zᵢ, z̃ᵢ; θₜ)]; 4: set θₜ₊₁ = θₜ − η ∇θ Ê[f_DAIR-R,λ(zᵢ, z̃ᵢ; θₜ)]; 5: end for |
| Open Source Code | Yes | Our code for all experiments is available at: https://github.com/optimization-for-data-driven-science/DAIR. |
| Open Datasets | Yes | We apply DAIR to real-world learning problems involving covariant data augmentation: robust neural task-oriented dialog state tracking and robust visual question answering. We also apply DAIR to tasks involving invariant data augmentation: robust regression, robust classification against adversarial attacks, and robust ImageNet classification under distribution shift. ... Among task-oriented dialog datasets, MultiWOZ (Budzianowski et al., 2018) has gained the most popularity ... In this paper, we focus on the Invariant and Covariant VQA (IV/CV-VQA) dataset which contains semantically edited images corresponding to a subset from VQA v2 (Goyal et al., 2017). ... Colored MNIST (Arjovsky et al., 2019) is a binary classification task built on the MNIST dataset. ... Rotated MNIST (Ghifary et al., 2015) is a dataset where MNIST digits are rotated. ... We conduct our experiments on the CIFAR-10 dataset ... In this experiment, we consider a regression task to minimize the root mean square error (RMSE) of the predicted values on samples from the Drug Discovery dataset. ... The ImageNet-9 Background Challenge (Xiao et al., 2020) was proposed to test the background robustness of image classification models. |
| Dataset Splits | Yes | For the VQA v2 dataset, we use the original VQA v2 train split for training, along with the IV-VQA and CV-VQA train splits for augmentation in the DAIR and DA-ERM (Agarwal et al., 2020) settings. ... We train a model consisting of three convolutional layers and two fully connected layers with 20,000 examples. For each dataset we define several different schemes for how the dataset can be modified: Table 10 (Colored MNIST) and Table 11 (Rotated MNIST). ... We evaluate the performance of each algorithm against PGD attacks as well as on clean (attack-free) accuracy. In our approach, the augmented examples z̃ can be generated by a certain strong attack, such as Projected Gradient Descent (PGD) or CW (Carlini & Wagner, 2017). We conduct our experiments on the CIFAR-10 dataset and compare our approach with several other state-of-the-art baselines. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software like ParlAI (Miller et al., 2017), BART (Lewis et al., 2019), and Torchvision, but does not provide specific version numbers for these components, which are required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | For training we follow a two-stage schedule with a learning rate of 0.005 for the first 20 epochs and a learning rate of 0.0005 for the next 20. We choose a batch size of 64 for all experiments. (Table 9: Training parameters of MNIST experiments). All the methods are trained for 40 epochs with a learning rate of 0.001 and a batch size of 48. (Section I: Setup and additional results for Visual Question Answering). For training the DAIR model, ... We train the model for 120 epochs with an initial step size of 0.0001 and use a cosine annealing scheduler. (Section K.1: Setups for the main results in Section 4.4). We train the model for 175 epochs with batch size 128, an initial learning rate of 0.1, and decay of 0.1 at epochs 30, 70, 110, and 150. (Section L: Details on training ImageNet-9) |
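The Algorithm 1 excerpt above (gradient descent on the regularized DAIR objective) can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: it assumes a DAIR-SQ-style consistency penalty λ(√ℓ(z) − √ℓ(z̃))² on paired clean/augmented samples, uses a 1-D linear regression loss, and substitutes a central finite difference for the analytic gradient ∇θ. The names `pair_objective` and `num_grad` are ours, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired data: clean samples z_i = (x_i, y_i) with y = 2x + noise, and
# "augmented" samples z~_i obtained by perturbing x. The clean/augmented
# pairing is what the DAIR regularizer operates on.
x = rng.uniform(-1.0, 1.0, size=64)
y = 2.0 * x + 0.1 * rng.normal(size=64)
x_aug = x + 0.05 * rng.normal(size=64)

def pair_objective(theta, lam=1.0, eps=1e-12):
    """Mean over pairs of 0.5*(l + l~) + lam*(sqrt(l) - sqrt(l~))^2,
    where l and l~ are squared errors on the clean and augmented samples."""
    l = (theta * x - y) ** 2
    l_aug = (theta * x_aug - y) ** 2
    reg = (np.sqrt(l + eps) - np.sqrt(l_aug + eps)) ** 2
    return np.mean(0.5 * (l + l_aug) + lam * reg)

def num_grad(theta, h=1e-5):
    # Central finite difference stands in for the analytic gradient in step 3.
    return (pair_objective(theta + h) - pair_objective(theta - h)) / (2 * h)

# Steps 2-5 of Algorithm 1: T rounds of theta_{t+1} = theta_t - eta * grad.
theta, eta, T = 0.0, 0.1, 200
history = [pair_objective(theta)]
for _ in range(T):
    theta -= eta * num_grad(theta)
    history.append(pair_objective(theta))
```

Under these assumptions the iterates drive the objective down and `theta` approaches the generating slope of 2; the consistency term additionally penalizes any gap between the clean and augmented losses.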