Adversarial Training Helps Transfer Learning via Better Representations
Authors: Zhun Deng, Linjun Zhang, Kailas Vodrahalli, Kenji Kawaguchi, James Y. Zou
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We support our theories with experiments on popular data sets and deep learning architectures. Experiments: We perform an empirical study of image classification. Our source tasks are image classification on ImageNet [45]; our target tasks are image classification on CIFAR-10 [28]. |
| Researcher Affiliation | Academia | Zhun Deng, Harvard University, Cambridge, MA 02138; Linjun Zhang, Rutgers University, Piscataway, NJ 08854; Kailas Vodrahalli, Stanford University, Stanford, CA 94025; Kenji Kawaguchi, Harvard University, Cambridge, MA 02138; James Zou, Stanford University, Stanford, CA 94025 |
| Pseudocode | Yes | Algorithm 1 (Learning for Linear Representations). Input: {S_t}_{t=1}^{T+1}. Step 1: optimize the loss on each individual source task t ∈ [T] and obtain β̂_t = argmin_{‖β_t‖≤1} Σ_i (⟨β_t, x_i^(t)⟩ − y_i^(t))² / n_t. Step 2: Ŵ_1 ← top-r SVD of [β̂_1, β̂_2, …, β̂_T]. Step 3: ŵ_2^(T+1) ∈ argmin_{‖w_2^(T+1)‖≤1} Σ_{i=1}^{n_{T+1}} (y_i^(T+1) − ⟨ŵ_2^(T+1), Ŵ_1 x_i^(T+1)⟩)² / n_{T+1}. Return Ŵ_1, ŵ_2^(T+1). |
| Open Source Code | No | The paper states: 'We use a public library for adversarial training [22].' with the URL 'https://github.com/MadryLab/robustness', but does not provide explicit access to the authors' own implementation code for the methodology described in the paper. |
| Open Datasets | Yes | Our source tasks are image classification on ImageNet [45]; our target tasks are image classification on CIFAR-10 [28]. |
| Dataset Splits | No | The paper references the use of ImageNet and CIFAR-10 datasets and discusses training and testing, but it does not explicitly specify the training/validation/test dataset splits (e.g., percentages or sample counts) beyond using CIFAR-10 as a target task. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'a public library for adversarial training [22]', which is identified as 'Robustness (python library)'. However, no specific version numbers for this library or other key software dependencies (e.g., Python, PyTorch/TensorFlow) are provided. |
| Experiment Setup | Yes | To simulate the pseudo-labeling setup, we sample 10% of ImageNet, train a ResNet-18 model on this sample (without adversarial training), and generate pseudo-labels for the remaining 90%. We then train a new source model using all of the source labeled and pseudo-labeled data with and without adversarial training. We use a public library for adversarial training [22]. The high-level approach for adversarial training is as follows: at each iteration, take a small number of gradient steps to generate adversarial examples from an input batch; then update network weights using the loss gradients from the adversarial batch. |
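The three steps of Algorithm 1 quoted above can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: plain least squares stands in for the norm-constrained estimators, and the function and variable names are our own.

```python
import numpy as np

def learn_linear_representation(source_tasks, target_task, r):
    """Sketch of Algorithm 1: learn a rank-r shared linear representation
    from T source tasks, then fit the target task (T+1) on top of it.

    source_tasks: list of (X_t, y_t) pairs, X_t of shape (n_t, d)
    target_task:  (X, y) for task T+1
    r:            rank of the shared representation W_1
    """
    # Step 1: fit a linear predictor beta_t on each source task
    # (unconstrained least squares stands in for the constrained argmin).
    betas = [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in source_tasks]

    # Step 2: top-r SVD of the stacked coefficient matrix [beta_1 ... beta_T].
    B = np.stack(betas, axis=1)             # shape (d, T)
    U, _, _ = np.linalg.svd(B, full_matrices=False)
    W1 = U[:, :r].T                         # shape (r, d): shared representation

    # Step 3: regress target labels on the r-dimensional features W1 @ x.
    X, y = target_task
    w2 = np.linalg.lstsq(X @ W1.T, y, rcond=None)[0]
    return W1, w2
```

With noiseless synthetic tasks that share a true rank-r structure, the recovered pair (Ŵ_1, ŵ_2) reproduces the target labels exactly, which is a quick sanity check on the three steps.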
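The adversarial-training loop described in the Experiment Setup row (a few gradient steps to craft adversarial examples from a batch, then a weight update on that batch) can be sketched with a PGD-style attack in PyTorch. This is an illustrative sketch, not the robustness library's code; `eps`, `alpha`, and `steps` are placeholder values, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=5):
    """Take a few signed-gradient ascent steps on the loss, projecting
    back into an L-infinity ball of radius eps around the clean batch."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()   # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)       # project into eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                  # keep valid pixel range
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One iteration: craft an adversarial batch, then update the
    network weights using the loss gradients from that batch."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the paper's setting the same loop would be run over ImageNet batches with a ResNet; here any classifier mapping inputs to logits works, which makes the two functions easy to unit-test on a toy linear model.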