Domain Invariant Adversarial Learning

Authors: Matan Levi, Idan Attias, Aryeh Kontorovich

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In a comprehensive battery of experiments on MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011), CIFAR-10 (Krizhevsky et al., 2009) and CIFAR-100 (Krizhevsky et al., 2009) datasets, we demonstrate that by enforcing domain-invariant representation learning using DANN simultaneously with adversarial training, we gain a significant and consistent improvement in both robustness and natural accuracy compared to other state-of-the-art adversarial training methods... Section 4 (Experiments): In this section we conduct comprehensive experiments to emphasise the effectiveness of DIAL, including evaluations under white-box and black-box settings, robustness to unforeseen adversaries, robustness to unforeseen corruptions, transfer learning, and ablation studies.
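The PGD adversary used throughout the experiments can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: `pgd_attack` and `grad_fn` are hypothetical names, and the gradient is supplied analytically to stand in for backpropagation through a network.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=0.031, tau=0.007, steps=10):
    """Projected Gradient Descent in the L-infinity ball of radius eps.

    grad_fn(x_adv) returns the loss gradient w.r.t. the input; here it is
    analytic, standing in for backprop through a trained classifier.
    """
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + tau * np.sign(grad_fn(x_adv))  # signed ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)       # project to eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)               # keep valid pixel range
    return x_adv
```

With ten steps of size 0.007 the iterate quickly reaches the boundary of the 0.031-ball, matching the attack budget quoted in the experiment setup.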
Researcher Affiliation | Academia | Matan Levi, Department of Computer Science, Ben-Gurion University of the Negev; Idan Attias, Department of Computer Science, Ben-Gurion University of the Negev; Aryeh Kontorovich, Department of Computer Science, Ben-Gurion University of the Negev
Pseudocode | Yes |
Algorithm 1 Domain Invariant Adversarial Learning
Input: source data S = {(x_i, y_i)}_{i=1}^{n} and network architecture G_f, G_y, G_d
Parameters: batch size m, perturbation size ε, PGD attack step size τ, adversarial trade-off λ, initial reversal ratio r, and step size α
Init: Y_0 and Y_1, source and target domain label vectors filled with 0 and 1 respectively
Output: robust network G = (G_f, G_y, G_d) parameterized by θ̂ = (θ_f, θ_y, θ_d) respectively
repeat
    Fetch mini-batch X_s = {x_j}_{j=1}^{m}, Y_s = {y_j}_{j=1}^{m}
    # Generate adversarial target domain batch X_t
    for j = 1, ..., m (in parallel) do
        x'_j ← PGD(x_j, y_j, ε, τ)
        X_t ← X_t ∪ {x'_j}
    end for
    ℓ^y_s, ℓ^y_t ← CE(G_y(G_f(X_s)), Y_s), CE(G_y(G_f(X_t)), Y_s)
    ℓ^d_s, ℓ^d_t ← CE(G_d(G_f(X_s)), Y_0), CE(G_d(G_f(X_t)), Y_1)
    ℓ ← ℓ^y_s + λ·ℓ^y_t − r·(ℓ^d_s + ℓ^d_t)
    θ̂ ← θ̂ − α·∇_θ̂(ℓ)
until stopping criterion is met
Appendix A (Domain Invariant Adversarial Learning Algorithm): Algorithm 1 describes the pseudo-code of our proposed DIAL_CE variant.
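The combined objective at the heart of Algorithm 1 is easy to sketch in plain NumPy. This is a minimal illustration, not the authors' code: `cross_entropy` and `dial_loss` are hypothetical names, and the per-term losses are passed in as scalars rather than computed through the feature extractor G_f.

```python
import numpy as np

def cross_entropy(logits, labels):
    # Softmax cross-entropy averaged over the batch (the CE(.,.) in Algorithm 1).
    z = logits - logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def dial_loss(ly_s, ly_t, ld_s, ld_t, lam, r):
    # Combined objective from Algorithm 1: minimize the natural and adversarial
    # classification losses while MAXIMIZING the domain losses -- the minus sign
    # on r implements the gradient-reversal effect of DANN.
    return ly_s + lam * ly_t - r * (ld_s + ld_t)
```

The negative sign on the domain terms is what pushes the feature extractor toward representations the domain classifier cannot separate, i.e. domain-invariant features shared by natural and adversarial examples.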
Open Source Code | Yes | Our source code is available at https://github.com/matanle51/DIAL
Open Datasets | Yes | In a comprehensive battery of experiments on MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011), CIFAR-10 (Krizhevsky et al., 2009) and CIFAR-100 (Krizhevsky et al., 2009) datasets, we demonstrate that by enforcing domain-invariant representation learning using DANN simultaneously with adversarial training...
Dataset Splits | Yes | In a comprehensive battery of experiments on MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011), CIFAR-10 (Krizhevsky et al., 2009) and CIFAR-100 (Krizhevsky et al., 2009) datasets, we demonstrate that by enforcing domain-invariant representation learning using DANN simultaneously with adversarial training... Figure 5: t-SNE embedding of model output (logits) into two-dimensional space for DIAL and TRADES using the CIFAR-10 natural test data and the corresponding PGD20-generated adversarial examples.
Hardware Specification | No | No specific hardware details (such as GPU models, CPU types, or cloud instances) are provided in the paper for running the experiments.
Software Dependencies | No | The paper mentions using Foolbox (Rauber et al., 2017) for attacks, but does not provide specific version numbers for the software libraries, frameworks, or programming languages used in the implementation.
Experiment Setup | Yes | We used the PreActResNet-18 (He et al., 2016) architecture, on which we integrate a domain classification layer. The adversarial training is done using a 10-step PGD adversary with a perturbation size of 0.031 and a step size of 0.003 for SVHN and 0.007 for CIFAR-100. The batch size is 128, the weight decay is 7e-4, and the model is trained for 100 epochs. For SVHN, the initial learning rate is set to 0.01 and decays by a factor of 10 after iterations 55, 75 and 90. For CIFAR-100, the initial learning rate is set to 0.1 and decays by a factor of 10 after iterations 75 and 90. For DIAL_KL, we balance the robust loss with λ = 6 and the domain loss with r = 4. For DIAL_CE, we balance the robust loss with λ = 1 and the domain loss with r = 2. Furthermore, all methods are trained using SGD with momentum 0.9.
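The step-decay schedule described for SVHN (initial rate 0.01, decayed by 10x at milestones 55, 75 and 90) can be sketched as a small helper. The function name `lr_at_epoch` is hypothetical; this merely mirrors the schedule quoted above, under the assumption that the milestones are epoch counts.

```python
def lr_at_epoch(epoch, base_lr=0.01, milestones=(55, 75, 90), gamma=0.1):
    # Step decay: multiply the learning rate by gamma at each milestone passed.
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

The CIFAR-100 schedule would use `base_lr=0.1` with `milestones=(75, 90)`; in a PyTorch setup this corresponds to `torch.optim.lr_scheduler.MultiStepLR` on top of `SGD(momentum=0.9, weight_decay=7e-4)`.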