Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Global Contrastive Learning for Long-Tailed Classification

Authors: Thong Bach, Anh Tong, Truong Son Hy, Vu Nguyen, Thanh Nguyen-Tang

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct experiments on CIFAR-10-LT and CIFAR-100-LT. The original CIFAR-10 and CIFAR-100 contain 50,000 32x32 images for training and 10,000 32x32 images for testing... The result in Table 5.2 shows that our model can outperform the current state-of-the-art model on all datasets across a range of imbalance factors by a large margin.
Researcher Affiliation Collaboration Anh Tong, Korea Advanced Institute of Science & Technology, EMAIL; Truong Son Hy, Indiana State University, EMAIL; Vu Nguyen, Amazon, EMAIL; Thanh Nguyen-Tang, Johns Hopkins University, EMAIL
Pseudocode No The paper describes methods like k-global positive selection and prototype learning textually (e.g., "Our algorithm is as follows: we first use coreset generation algorithm...") but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code No Our code is available at CoGloAT_ProCo. This statement is ambiguous: it provides neither a direct link to a repository nor a pointer to supplementary materials or appendices. 'CoGloAT_ProCo' appears to be a project name rather than a concrete access point.
Open Datasets Yes CIFAR-10-LT and CIFAR-100-LT For small-scale data, we conduct experiments on CIFAR-10-LT and CIFAR-100-LT. The original CIFAR-10 and CIFAR-100... ImageNet-LT This is a long-tailed version of the ImageNet dataset (Deng et al., 2009)...
Dataset Splits Yes The original CIFAR-10 and CIFAR-100 contain 50,000 32x32 images for training and 10,000 32x32 images for testing... The imbalanced training data is defined through the imbalance ratio p = max_i(n_i)/min_i(n_i) between the sizes of the most frequent and least frequent classes. Long-tailed imbalance follows an exponential decay in sample size across classes. In this paper, we run the experiment under p ∈ {10, 50, 100} ... the test set of ImageNet-LT is the same as its original test version.
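The exponential long-tail construction quoted above can be sketched in a few lines (a minimal sketch; the function and variable names are our own, and n_max = 5,000 matches the per-class size of balanced CIFAR-10):

```python
def long_tailed_counts(num_classes, n_max, imbalance_ratio):
    """Per-class sample counts under exponential decay.

    The most frequent class keeps n_max samples; sizes decay geometrically
    so that max(n_i) / min(n_i) equals the imbalance ratio p (10, 50, or
    100 in the paper's experiments).
    """
    return [int(n_max * imbalance_ratio ** (-i / (num_classes - 1)))
            for i in range(num_classes)]

counts = long_tailed_counts(num_classes=10, n_max=5000, imbalance_ratio=100)
# head class: 5000 samples; tail class: 5000 / 100 = 50 samples
```

With p = 100 on CIFAR-10, the class sizes decay from 5,000 down to 50, reproducing the 100:1 head-to-tail ratio the excerpt describes.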
Hardware Specification No The paper does not provide specific hardware details such as GPU models, CPU types, or other computing specifications used for experiments. It mentions 'Due to our resource constraints' in a footnote, but no actual hardware is listed.
Software Dependencies No The paper mentions using 'MoCo v2' and 'ResNet-32'/'ResNet-50' as backbone models, but does not provide specific version numbers for any software, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow).
Experiment Setup Yes For CIFAR data, we use batch size 256, an initial learning rate of 0.1, and the SGD optimizer with momentum 0.9, and we train ResNet-32 as the backbone model for 1,000 epochs. After pre-training on our framework, we fine-tuned the top classifier layer of the pre-trained model with the LDAM (Cao et al., 2019) loss... Both use the same learning rate of 0.1 and a batch size of 256. ... For ImageNet-LT, the training strategy is similar to that for CIFAR data; we change the backbone from ResNet-32 to ResNet-50, and this model is trained for 200 epochs before fine-tuning the classifier layer for an additional 100 epochs.
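The reported hyperparameters can be collected into a plain configuration sketch (key names are our own choice; the values are taken directly from the setup quoted above):

```python
# Hyperparameters as reported for CIFAR-10-LT / CIFAR-100-LT pre-training.
cifar_config = {
    "batch_size": 256,
    "initial_lr": 0.1,
    "optimizer": "SGD",
    "momentum": 0.9,
    "backbone": "ResNet-32",
    "pretrain_epochs": 1000,
    "finetune_loss": "LDAM",   # used when fine-tuning the classifier layer
}

# ImageNet-LT reuses the CIFAR recipe with a larger backbone and a
# shorter schedule: 200 pre-training epochs plus 100 fine-tuning epochs.
imagenet_config = {
    **cifar_config,
    "backbone": "ResNet-50",
    "pretrain_epochs": 200,
    "finetune_epochs": 100,
}
```

Note that the paper does not report hardware or software versions (see the rows above), so this sketch captures only the training hyperparameters, not a full reproducible environment.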