Contrastive Attraction and Contrastive Repulsion for Representation Learning
Authors: Huangjie Zheng, Xu Chen, Jiangchao Yao, Hongxia Yang, Chunyuan Li, Ya Zhang, Hao Zhang, Ivor Tsang, Jingren Zhou, Mingyuan Zhou
TMLR 2023 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With our extensive experiments, CACR not only demonstrates good performance on CL benchmarks, but also shows better robustness when generalized on imbalanced image datasets. Code and pre-trained checkpoints are available at https://github.com/JegZheng/CACR-SSL. Our theoretical analysis reveals that CACR generalizes CL's behavior by positive attraction and negative repulsion, and it further considers the intra-contrastive relation within the positive and negative pairs to narrow the gap between the sampled and true distribution, which is important when datasets are less curated. Our experiments demonstrate the effectiveness of CACR in a variety of standard CL settings, with both convolutional and transformer-based architectures on various benchmark datasets. |
| Researcher Affiliation | Collaboration | Huangjie Zheng EMAIL Department of Statistics and Data Science, The University of Texas at Austin; Xu Chen EMAIL Shanghai Jiao Tong University, Alibaba Group; Jiangchao Yao EMAIL Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai AI Laboratory; Hongxia Yang EMAIL Shanghai Institute for Advanced Study of Zhejiang University (SIAS); Chunyuan Li EMAIL Microsoft Research, Redmond; Ya Zhang EMAIL Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai AI Laboratory; Hao Zhang EMAIL Xidian University; Ivor Tsang EMAIL A*STAR Centre for Frontier AI Research (CFAR); Jingren Zhou EMAIL Alibaba Group; Mingyuan Zhou EMAIL McCombs School of Business, The University of Texas at Austin |
| Pseudocode | Yes | Algorithm 1 PyTorch-like Augmentation Code on CIFAR-10, CIFAR-100 and STL-10; Algorithm 2 PyTorch-like Augmentation Code on ImageNet-100 and ImageNet-1K; Algorithm 3 PyTorch-like style pseudo-code of CACR with MoCo-v2 at each iteration. |
| Open Source Code | Yes | Code and pre-trained checkpoints are available at https://github.com/JegZheng/CACR-SSL. |
| Open Datasets | Yes | In this section, we first study the CACR behaviors with small-scale experiments, where we use CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009) and create two class-imbalanced CIFAR datasets as empirical verification of our theoretical analysis. For large-scale datasets, we use ImageNet-1K (Deng et al., 2009) and compare with the state-of-the-art frameworks (He et al., 2020; Zbontar et al., 2021; Chen et al., 2020a; Caron et al., 2020; Grill et al., 2020; Huynh et al., 2020) on linear probing, where we report the Top-1 validation accuracy on ImageNet-1K data. To further justify our analysis, we also leverage two large-scale but label-imbalanced datasets (WebVision v1 and ImageNet-22K) for linear-probing pretraining. |
| Dataset Splits | Yes | For evaluation we keep the standard validation/testing datasets. The linear classifier is trained on ImageNet-1K on top of fixed representations of the pretrained ResNet50 encoder. The model is tuned on the train2017 set and evaluated on the val2017 set. |
| Hardware Specification | Yes | On small-scale datasets, all experiments are conducted on a single GPU, including NVIDIA 1080 Ti and RTX 3090; on large-scale datasets, all experiments are done on 8 Tesla-V100-32G GPUs. Table 14: GPU time (s) per iteration of CACR w.r.t. different K on CIFAR-10 with AlexNet framework (mini-batch size is 128), tested on Tesla-V100 GPU. Table 17: GPU time (s) per iteration of different loss on MoCo-v2 framework, tested on 32G-V100 GPU. |
| Software Dependencies | No | The paper includes 'PyTorch-like' pseudocode in Algorithms 1, 2, and 3, and mentions 'detectron2 (Wu et al., 2019)' for object detection and segmentation. However, no specific version numbers are provided for PyTorch, torchvision, or detectron2. |
| Experiment Setup | Yes | We apply the mini-batch SGD with 0.9 momentum and 1e-4 weight decay. The learning rate is linearly scaled as 0.12 per 256 batch size (Goyal et al., 2017). The optimization is done over 200 epochs, and the learning rate is decayed by a factor of 0.1 at epoch 155, 170, and 185. Specifically, the temperature parameter of CL is τ = 0.19, the hyper-parameters of AU-CL are t = 2.0, τ = 0.19, and the hyper-parameters of HN-CL are τ = 0.5, β = 1.0, which show the best performance according to our tuning. For CACR, in both single and multi-positive sample settings, we set t+ = 1.0 for all small-scale datasets. As for t−, for CACR (K = 1), t− is 2.0, 3.0, and 3.0 on CIFAR-10, CIFAR-100, and STL-10, respectively. For CACR (K = 4), t− is 0.9, 2.0, and 2.0 on CIFAR-10, CIFAR-100, and STL-10, respectively. |
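The learning-rate recipe quoted in the Experiment Setup row combines the linear scaling rule of Goyal et al. (2017) with a three-milestone step decay. A minimal plain-Python sketch of that arithmetic is below; the helper names `scaled_lr` and `lr_at_epoch` are hypothetical (the paper itself would use PyTorch's `SGD` plus `MultiStepLR`), and only the constants (0.12 per 256 batch size, decay 0.1 at epochs 155/170/185) come from the quoted text.

```python
def scaled_lr(batch_size, base=0.12, ref_batch=256):
    """Linear scaling rule: lr = 0.12 * (batch_size / 256)."""
    return base * batch_size / ref_batch


def lr_at_epoch(epoch, init_lr, milestones=(155, 170, 185), gamma=0.1):
    """Step decay: multiply the lr by gamma at each milestone epoch reached."""
    lr = init_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr


# Example: batch size 512 doubles the base lr; by epoch 190 all three
# decays have been applied, so the lr is init_lr * 0.1**3.
init = scaled_lr(512)          # 0.24
final = lr_at_epoch(190, init)  # 0.24 * 0.001 = 0.00024
```

In PyTorch this corresponds to `torch.optim.SGD(..., lr=scaled_lr(bs), momentum=0.9, weight_decay=1e-4)` wrapped with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[155, 170, 185], gamma=0.1)`.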