SASSL: Enhancing Self-Supervised Learning via Neural Style Transfer
Authors: Renan A. Rojas-Gomez, Karan Singhal, Ali Etemad, Alex Bijamov, Warren Richard Morningstar, Philip Andrew Mansfield
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show improved downstream performance on ImageNet Deng et al. (2009) by incorporating SASSL in methods such as MoCo, BYOL and SimCLR without hyperparameter tuning (Sections 5.1, 5.4). We show SASSL learns stronger representations by measuring their transfer learning capabilities on various datasets. |
| Researcher Affiliation | Collaboration | Renan A. Rojas-Gomez EMAIL University of Illinois at Urbana-Champaign; Karan Singhal EMAIL Google Research; Ali Etemad EMAIL Queen's University, Canada; Alex Bijamov EMAIL Google DeepMind; Warren R. Morningstar EMAIL Google DeepMind; Philip Andrew Mansfield EMAIL Google DeepMind |
| Pseudocode | Yes | Algorithm 1: Style transfer augmentation block. Input: ic, is, F, T, αmin, αmax, βmin, βmax. Output: ics. zc ← F(ic) # style representation of content image; zs ← F(is) # style representation of style image; α ∼ U(αmin, αmax) # blending factor; ẑ ← (1−α)zc + αzs # feature blending; îcs ← T(ic, ẑ); β ∼ U(βmin, βmax) # interpolation factor; ics ← (1−β)ic + β·îcs # stylized image |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We empirically show improved downstream performance on ImageNet Deng et al. (2009) by incorporating SASSL in methods such as MoCo, BYOL and SimCLR without hyperparameter tuning (Sections 5.1, 5.4). We evaluate its transfer learning performance across various tasks... using the following target datasets: ImageNet-1% subset (Chen et al., 2020a), iNaturalist 21 (iNat21) (iNaturalist 2021), Diabetic Retinopathy Detection (Retinopathy) (Kaggle & EyePACS, 2015), Describable Textures Dataset (DTD) (Cimpoi et al., 2014), Food101 (Bossard et al., 2014), CIFAR10/100 (Krizhevsky, 2009), SUN397 (Xiao et al., 2010), Cars (Krause et al., 2013), Caltech-101 (Fei-Fei et al., 2004), and Flowers (Nilsback & Zisserman, 2008). |
| Dataset Splits | Yes | We compare the transfer learning accuracy of ResNet-50 pretrained via MoCo v2 using SASSL against a MoCo v2 baseline with default data augmentation. The evaluated models are pretrained on ImageNet and transferred to eleven target datasets: ImageNet-1% subset (Chen et al., 2020a)... and We compare our ResNet-50 backbone pretrained via SASSL + MoCo v2 against a MoCo v2 baseline in the context of one and ten-shot learning on ImageNet. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments in the main text. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | We pretrain a ResNet-50 encoder on ImageNet for 1,000 epochs via SASSL. SASSL pretraining applies Style Transfer only to the left view (no changes in augmentation are applied to the right view). It is applied with a probability p = 0.8 using blending and interpolation factors drawn from a uniform distribution α, β ∼ U(0.1, 0.3). |
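The pseudocode in Algorithm 1 and the setup above (p = 0.8, α, β ∼ U(0.1, 0.3)) can be sketched as a minimal, framework-agnostic augmentation function. This is a hedged illustration, not the paper's implementation: `style_encoder` (F) and `decoder` (T) are hypothetical callables standing in for the paper's style-transfer network, and images are represented here by plain numeric values for simplicity.

```python
import random

def style_transfer_augment(ic, i_s, style_encoder, decoder,
                           alpha_range=(0.1, 0.3),
                           beta_range=(0.1, 0.3),
                           p=0.8):
    """Sketch of SASSL's style-transfer augmentation block (Algorithm 1).

    ic: content image, i_s: style image.
    style_encoder (F) maps an image to its style representation;
    decoder (T) reconstructs a stylized image from (content, blended style).
    Both are assumptions standing in for the paper's style-transfer model.
    """
    # Apply the augmentation only with probability p (0.8 in the paper).
    if random.random() > p:
        return ic

    zc = style_encoder(ic)                  # style representation of content image
    zs = style_encoder(i_s)                 # style representation of style image
    alpha = random.uniform(*alpha_range)    # blending factor alpha ~ U(0.1, 0.3)
    z_hat = (1 - alpha) * zc + alpha * zs   # feature blending
    ics_hat = decoder(ic, z_hat)            # stylized reconstruction
    beta = random.uniform(*beta_range)      # interpolation factor beta ~ U(0.1, 0.3)
    return (1 - beta) * ic + beta * ics_hat # pixel-space interpolation with original
```

In a real pipeline the arithmetic would operate on image tensors and this block would replace (or precede) one view's default augmentation, matching the paper's choice of stylizing only the left view.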