Trip-ROMA: Self-Supervised Learning with Triplets and Random Mappings

Authors: Wenbin Li, Xuesong Yang, Meihao Kong, Lei Wang, Jing Huo, Yang Gao, Jiebo Luo

TMLR 2023 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments, including unsupervised representation learning and unsupervised few-shot learning, have been conducted on ImageNet-1K and seven small datasets. They successfully demonstrate the effectiveness of Trip-ROMA and consistently show that ROMA can further effectively boost other SSL methods."
Researcher Affiliation | Academia | "1 State Key Laboratory for Novel Software Technology, Nanjing University, China; 2 University of Wollongong, Australia; 3 University of Rochester, USA"
Pseudocode | Yes | "B Pseudo-code of Trip-ROMA: The pseudo-code of the proposed Trip-ROMA is shown in Algorithm 1."
Open Source Code | Yes | "Code is available at https://github.com/WenbinLee/Trip-ROMA."
Open Datasets | Yes | "The main experiments are conducted on seven small benchmark datasets, i.e., CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), STL-10 (Coates et al., 2011), ImageNet-100 (Tian et al., 2019), miniImageNet (Vinyals et al., 2016), CIFAR-100FS (Bharti et al., 2020) and FC100 (Oreshkin et al., 2018), whose details can be found in the appendix."
Dataset Splits | Yes | "CIFAR-10 consists of 60000 32×32 colour images in 10 classes, with 6000 images per class. Following the original splits, we take 50000 and 10000 images for training (without labels) and test, respectively. CIFAR-100 has a total of 100 classes, containing 500 training images and 100 test images per class. We take 50000 images, ignoring the labels, for training, and the remaining images for test. ... CIFAR-100FS is a version for few-shot learning built on the CIFAR-100 dataset (Krizhevsky et al., 2009). It also contains 100 classes like CIFAR-100, of which 64, 16 and 20 classes are used for training, validation and test, respectively. FC100 is another version built on CIFAR-100 for FSL with a new split that differs from CIFAR-100FS. Specifically, 20 high-level super-classes are divided into 12, 4 and 4 classes, corresponding to 60, 20 and 20 low-level specific classes, for training, validation and test, respectively."
Hardware Specification | No | "Different from the small datasets, ImageNet-1K as a large-scale dataset needs considerably more computing resources to efficiently tune the hyper-parameters and training tricks. Unfortunately, we do not have access to sufficient computing resources."
Software Dependencies | No | "Algorithm 1: Trip-ROMA Pseudocode, PyTorch-like"
Experiment Setup | Yes | "Optimization. We use an SGD optimizer with a momentum of 0.9 and a cosine-decay learning-rate schedule to optimize Trip-ROMA. On ImageNet-100, we train Trip-ROMA for 200 epochs with a base learning rate of 0.2, batch size of 128 and weight decay of 1e-4. For CIFAR-10, STL-10, CIFAR-100 and miniImageNet, the models are trained for 1000 epochs with a base learning rate of 0.03 and weight decay of 5e-4. The batch size is set to 64 for CIFAR-10 and CIFAR-100, but 128 for STL-10, miniImageNet, CIFAR-100FS and FC100. Additionally, each learning rate is linearly scaled with the batch size. The hyper-parameters λ and τ in Eq. (3) are set to 8 and 0.5, respectively. For the random mapping setting, a linear transformation matrix L of size 2048×1024 is randomly sampled from a standard normal distribution. Linear evaluation. ... the linear classifier is trained for 100 epochs using an SGD optimizer with a cosine-decay schedule, where the base learning rate is 30, weight decay is 0, momentum is 0.9 and batch size is 128."
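The quoted setup pins down the random-mapping (ROMA) step concretely: a 2048×1024 matrix L is drawn from a standard normal distribution, embeddings are projected through it, and a triplet of anchor/positive/negative samples is compared with temperature τ = 0.5. The following NumPy sketch illustrates that mechanism under stated assumptions. The softplus surrogate loss, the l2-normalization, and the omission of the λ term from Eq. (3) are illustrative choices for this sketch, not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions taken from the quoted setup: 2048-d embeddings, a 2048x1024
# random linear map L sampled from a standard normal, batch size 64.
d_in, d_out, batch = 2048, 1024, 64
L = rng.standard_normal((d_in, d_out))

def l2_normalize(x, axis=-1):
    """Project rows onto the unit sphere (illustrative choice)."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Toy embeddings standing in for the encoder outputs of an anchor image,
# its augmented positive view, and an unrelated negative sample.
z_anchor = rng.standard_normal((batch, d_in))
z_pos = rng.standard_normal((batch, d_in))
z_neg = rng.standard_normal((batch, d_in))

# ROMA step: push every embedding through the SAME random map, then
# compare similarities in the projected space.
h_a = l2_normalize(z_anchor @ L)
h_p = l2_normalize(z_pos @ L)
h_n = l2_normalize(z_neg @ L)

tau = 0.5  # temperature from the quoted setup
sim_pos = np.sum(h_a * h_p, axis=1) / tau
sim_neg = np.sum(h_a * h_n, axis=1) / tau

# Triplet-style objective: pull the positive pair together and push the
# negative apart; softplus(sim_neg - sim_pos) is a smooth margin surrogate.
loss = np.mean(np.logaddexp(0.0, sim_neg - sim_pos))
```

In the reported setting a fresh L could be redrawn per batch, which is what makes the mapping "random" rather than a learned projection head; the sketch above draws it once only to keep the example deterministic.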