On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning

Authors: Bokun Wang, Yunwen Lei, Yiming Ying, Tianbao Yang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically compare our algorithm to representative baselines on the contrastive image-language pretraining task. Experimental results on the CC3M and CC12M datasets demonstrate the superior overall performance of our algorithm.
Researcher Affiliation | Academia | Texas A&M University; University of Hong Kong; University of Sydney
Pseudocode | Yes | Algorithm 1: NUCLR Algorithm for Self-Supervised Representation Learning
Open Source Code | Yes | Our code is available at https://github.com/bokun-wang/NUCLR.
Open Datasets | Yes | In our experiments, we apply our algorithm to bimodal self-supervised representation learning on the Conceptual Captions (CC3M) (Sharma et al., 2018) and Conceptual 12M (CC12M) (Changpinyo et al., 2021) datasets. ... Retrieval performance is evaluated on the test splits of the Flickr30k (Plummer et al., 2015) and MSCOCO (Lin et al., 2014) datasets, ... The top-1 classification accuracy is evaluated on the CIFAR100 (Krizhevsky et al., 2009), ImageNet1k (Russakovsky et al., 2015), and ImageNet-R (Hendrycks et al., 2021) datasets.
Dataset Splits | Yes | Retrieval performance is evaluated on the test splits of the Flickr30k (Plummer et al., 2015) and MSCOCO (Lin et al., 2014) datasets, in terms of the average Recall@1 score of image-to-text and text-to-image retrievals. The top-1 classification accuracy is evaluated on the CIFAR100 (Krizhevsky et al., 2009), ImageNet1k (Russakovsky et al., 2015), and ImageNet-R (Hendrycks et al., 2021) datasets.
Hardware Specification | Yes | All experiments utilize distributed data-parallel (DDP) training on two NVIDIA A100 GPUs with 40GB memory, and the total batch size B in each iteration is 512.
Software Dependencies | No | The paper mentions "AdamW (Loshchilov and Hutter, 2017)" as the optimizer and "ResNet-50 as the vision encoder and DistilBERT as the text encoder". It also refers to adapting implementations from the "OpenCLIP repository". However, it does not provide specific version numbers for any software libraries, programming languages, or environments.
Experiment Setup | Yes | All experiments utilize distributed data-parallel (DDP) training on two NVIDIA A100 GPUs with 40GB memory, and the total batch size B in each iteration is 512. Besides, we use ResNet-50 as the vision encoder and DistilBERT as the text encoder. ... We run each algorithm 3 times with different random seeds and each run contains 30 epochs. Hyperparameters of all algorithms are tuned based on the validation performance. The optimizer for the model parameter w is AdamW (Loshchilov and Hutter, 2017) with a weight decay of 0.02 and a cosine learning rate schedule (Loshchilov and Hutter, 2016). For all algorithms, we choose a fixed temperature parameter τ tuned within {0.01, 0.03, 0.05, 0.07}. For SogCLR and NUCLR, we set γ = 0.8 as in the SogCLR paper (Yuan et al., 2022). For our NUCLR, we select ζ0 = 0.05 on the CC3M dataset and ζ0 = 0 on the CC12M dataset. Besides, we freeze ζ in the first 5 epochs and update ζ by the SGDm optimizer with a cosine learning rate schedule.
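The setup above relies on a cosine learning rate schedule (Loshchilov and Hutter, 2016) for both the AdamW and SGDm optimizers. A minimal sketch of that schedule is below; the base learning rate and the dataset size used to derive `total_steps` are illustrative assumptions, since the report quotes only the batch size (512) and epoch count (30), not the learning rate itself.

```python
import math

def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine-annealed learning rate (Loshchilov & Hutter, 2016).

    Decays smoothly from base_lr at step 0 down to min_lr at total_steps.
    """
    progress = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Illustrative values only: the base LR is not stated in the report, and
# ~2.9M pairs is an approximate size for CC3M.
base_lr = 1e-3
steps_per_epoch = 2_900_000 // 512
total_steps = 30 * steps_per_epoch

lr_start = cosine_lr(0, total_steps, base_lr)           # equals base_lr
lr_mid = cosine_lr(total_steps // 2, total_steps, base_lr)
lr_end = cosine_lr(total_steps, total_steps, base_lr)   # decays to ~0
print(lr_start, lr_mid, lr_end)
```

In practice this per-step value would be assigned to each optimizer parameter group before the update; frameworks such as PyTorch provide an equivalent built-in (`CosineAnnealingLR`).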