On the Out-of-Distribution Generalization of Self-Supervised Learning

Authors: Wenwen Qiang, Jingyao Wang, Zeen Song, Jiangmeng Li, Changwen Zheng

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we first introduce the datasets used in experiments. Next, we evaluate our method on multiple tasks, including unsupervised learning, semi-supervised learning, transfer learning, and few-shot learning. We introduce the experimental setups in the corresponding sections. Finally, we perform ablation studies. All results reported are the averages of five runs performed on NVIDIA RTX 4090 GPUs."
Researcher Affiliation | Academia | "1Institute of Software, Chinese Academy of Sciences, Beijing, China; 2University of the Chinese Academy of Sciences, Beijing, China. Correspondence to: Jiangmeng Li <EMAIL>."
Pseudocode | Yes | "Algorithm 1: Proposed Mini-Batch Sampling Strategy"
Open Source Code | Yes | "Code for the proposed sampling strategy can be found at https://github.com/ML-TASA/PID-SSL"
Open Datasets | Yes | "For unsupervised learning, we select ImageNet-100 (Tian et al., 2020) and ImageNet (Deng et al., 2009). For semi-supervised learning, we select ImageNet (Deng et al., 2009) for evaluation. For transfer learning, we select PASCAL VOC (Everingham et al., 2010) and COCO (Lin et al., 2014). For few-shot learning, we evaluate the proposed method on Omniglot (Lake et al., 2019), miniImageNet (Vinyals et al., 2016), and CIFAR-FS (Bertinetto et al., 2018)."
Dataset Splits | Yes | "In accordance with the standard protocol (Zbontar et al., 2021), we create two balanced subsets by sampling 1% and 10% of the training dataset. Specifically, we use the ImageNet dataset, a large-scale benchmark for visual recognition tasks, comprising 1.2 million images in 1,000 categories. The subsets contain 1% and 10% of the labeled training data, which are used for fine-tuning the model."
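The balanced-subset protocol quoted above (equal label fractions per class, in the style of Zbontar et al., 2021) can be sketched as follows. The function name and dataset representation are illustrative assumptions, not the authors' code:

```python
# Hypothetical sketch of balanced 1% / 10% subset sampling: take the same
# fraction of labeled examples from every class, so the subset preserves
# the class balance of the full training set.
import random
from collections import defaultdict

def balanced_subset(labels, fraction, seed=0):
    """Return sorted indices covering `fraction` of the data, balanced per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    subset = []
    for indices in by_class.values():
        k = max(1, int(len(indices) * fraction))  # keep at least one per class
        subset.extend(rng.sample(indices, k))
    return sorted(subset)

# Toy labeled set: 2 classes, 100 samples each.
labels = [0] * 100 + [1] * 100
one_percent = balanced_subset(labels, 0.01)   # 1 index per class
ten_percent = balanced_subset(labels, 0.10)   # 10 indices per class
```

On ImageNet this would yield roughly 12,000 and 120,000 labeled images for the 1% and 10% splits respectively, each with all 1,000 classes represented.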
Hardware Specification | Yes | "All results reported are the averages of five runs performed on NVIDIA RTX 4090 GPUs."
Software Dependencies | No | The paper does not explicitly mention specific software dependencies with version numbers.
Experiment Setup | Yes | "Experimental setup. Our proposed sampling strategy is compatible with any D-SSL or G-SSL model. In the standard training procedure of SSL, a mini-batch is randomly sampled from the training data before each iteration. In contrast, our method replaces this random sampling step with a structured mini-batch construction process defined by Algorithm 1. Specifically, our approach integrates seamlessly into existing SSL frameworks by substituting the mini-batch sampling component with Algorithm 1, while leaving all other aspects of the SSL training pipeline unchanged. As a result, the overall training procedure and hyperparameter settings remain identical to those used in the baseline methods. Therefore, for all our experiments, we retain the original hyperparameter configurations to ensure a fair and consistent comparison."
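The quoted setup says only the mini-batch sampling step changes while the rest of the SSL pipeline stays fixed. A minimal sketch of that plug-in pattern is below; the internals of the paper's Algorithm 1 are not reproduced here, and `structured_batches` is a placeholder assumption standing in for it:

```python
# Sketch of a training loop where the batch-construction step is the only
# pluggable component. Swapping `batch_fn` from random sampling to a
# structured sampler leaves the loop, loss, and hyperparameters untouched,
# mirroring the drop-in substitution described in the quoted setup.
import random

def random_batches(n, batch_size, rng):
    """Baseline SSL sampling: shuffle all indices, cut into mini-batches."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[i:i + batch_size] for i in range(0, n, batch_size)]

def train_epoch(data, batch_size, step_fn, batch_fn=random_batches, seed=0):
    """Run one epoch; `batch_fn` decides how mini-batches are formed."""
    rng = random.Random(seed)
    losses = []
    for batch in batch_fn(len(data), batch_size, rng):
        losses.append(step_fn([data[i] for i in batch]))
    return sum(losses) / len(losses)

# A structured sampler (e.g. the paper's Algorithm 1) would be swapped in as:
# train_epoch(data, 256, ssl_step, batch_fn=structured_batches)
```

This mirrors the `Sampler` abstraction in common deep-learning frameworks, which is presumably why the authors can retain all baseline hyperparameters unchanged.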