Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift
Authors: Saurabh Garg, Amrith Setlur, Zachary Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we first undertake a systematic empirical investigation of this combination, finding (i) that in domain adaptation settings, self-training and contrastive learning offer significant complementary gains; and (ii) that in semi-supervised learning settings, surprisingly, the benefits are not synergistic. Across eight distribution shift datasets (e.g., BREEDs, WILDS), we demonstrate that the combined method obtains 3-8% higher accuracy than either approach independently. Finally, we theoretically analyze these techniques in a simplified model of distribution shift, demonstrating scenarios under which the features produced by contrastive learning can yield a good initialization for self-training to further amplify gains and achieve optimal performance, even when either method alone would fail. |
| Researcher Affiliation | Academia | Saurabh Garg Carnegie Mellon University EMAIL Amrith Setlur Carnegie Mellon University EMAIL Zachary C. Lipton Carnegie Mellon University EMAIL Sivaraman Balakrishnan Carnegie Mellon University EMAIL Virginia Smith Carnegie Mellon University EMAIL Aditi Raghunathan Carnegie Mellon University EMAIL |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using existing open-source libraries ("WILDs [70] and RLSbench [30] open source libraries") and refers to an 'official library released with the paper' for ResNet on CIFAR, linking to a third-party GitHub repository (https://github.com/kuangliu/pytorch-cifar). It does not explicitly state that the authors released their own implementation of the proposed methodology. |
| Open Datasets | Yes | For both UDA and SSL, we conduct experiments across eight benchmark datasets: four BREEDs datasets [72] Entity13, Entity30, Nonliving26, Living17; FMoW [47, 18] from the WILDS benchmark; Officehome [85]; Visda [64, 63]; and CIFAR-10 [48]. |
| Dataset Splits | Yes | We partition each source and target dataset into 80% and 20% i.i.d. splits. We use 80% splits for training and 20% splits for evaluation (or validation). |
| Hardware Specification | Yes | Our experiments were performed across a combination of Nvidia T4, A6000, and V100 GPUs. |
| Software Dependencies | No | The paper mentions 'pytorch implementation' but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | We summarize the learning rate, batch size, number of epochs, and ℓ2 regularization parameter used in our study in Table 7. |
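The dataset-splits row above describes partitioning each source and target dataset into 80%/20% i.i.d. splits for training and evaluation. A minimal sketch of such a split is shown below; the function name `iid_split` and the seed are hypothetical, as the authors' actual splitting code is not released.

```python
import numpy as np

def iid_split(n_examples, train_frac=0.8, seed=0):
    """Partition dataset indices into i.i.d. train/eval splits.

    Hypothetical helper illustrating the 80%/20% i.i.d. partition
    described in the paper; not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_examples)      # shuffle indices uniformly
    n_train = int(train_frac * n_examples)  # size of the 80% split
    return perm[:n_train], perm[n_train:]   # train indices, eval indices

# Example: split a 1000-example dataset.
train_idx, eval_idx = iid_split(1000)
```

The same routine would be applied independently to the source and target datasets, with the 20% target split serving for evaluation (or validation).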