CLImage: Human-Annotated Datasets for Complementary-Label Learning
Authors: Hsiu-Hsuan Wang, Mai Tan Ha, Nai-Xuan Ye, Wei-I Lin, Hsuan-Tien Lin
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our efforts resulted in the creation of four datasets: CLCIFAR10, CLCIFAR20, CLMicroImageNet10, and CLMicroImageNet20, derived from well-known classification datasets CIFAR10, CIFAR100, and TinyImageNet200. These datasets represent the very first real-world CLL datasets, namely CLImage, which are publicly available at: https://github.com/ntucllab/CLImage_Dataset. Through extensive benchmark experiments, we discovered a notable decrease in performance when transitioning from synthetically labeled datasets to real-world datasets. We investigated the key factors contributing to the decrease with a thorough dataset-level ablation study. |
| Researcher Affiliation | Academia | Hsiu-Hsuan Wang (EMAIL), Department of Computer Science and Information Engineering, National Taiwan University |
| Pseudocode | Yes | An algorithmic description of the protocol is as follows. For each image x: 1. Uniformly sample four labels without replacement from the label set [K]. 2. Ask the annotator to select one complementary label ȳ from the four sampled labels. 3. Add the pair (x, ȳ) to the complementary dataset. |
| Open Source Code | Yes | These datasets represent the very first real-world CLL datasets, namely CLImage, which are publicly available at: https://github.com/ntucllab/CLImage_Dataset. |
| Open Datasets | Yes | Our efforts resulted in the creation of four datasets: CLCIFAR10, CLCIFAR20, CLMicroImageNet10, and CLMicroImageNet20, derived from well-known classification datasets CIFAR10, CIFAR100, and TinyImageNet200. These datasets represent the very first real-world CLL datasets, namely CLImage, which are publicly available at: https://github.com/ntucllab/CLImage_Dataset. |
| Dataset Splits | Yes | The CLCIFAR10 and CLCIFAR20 datasets each contain 50,000 training instances and 10,000 testing instances. For the CLMicroImageNet datasets, CLMicroImageNet10 has 5,000 training instances and 500 testing instances, whereas CLMicroImageNet20 includes 10,000 training instances and 1,000 testing instances. The learning rate was selected from {10⁻³, 5×10⁻⁴, 10⁻⁴, 5×10⁻⁵, 10⁻⁵} using a 10% hold-out validation set. |
| Hardware Specification | Yes | The experiments were run with Tesla V100-SXM2. |
| Software Dependencies | No | Then, we trained a ResNet18 (He et al., 2016) model using the baseline methods mentioned above on the single CLL dataset using the Adam optimizer for 300 epochs without learning rate scheduling. |
| Experiment Setup | Yes | Then, we trained a ResNet18 (He et al., 2016) model using the baseline methods mentioned above on the single CLL dataset using the Adam optimizer for 300 epochs without learning rate scheduling. Detailed results from the ablation study on various neural network architectures, which further justify our choice of ResNet18 as the backbone, are available in Appendix A.6. The training settings included a fixed weight decay of 10⁻⁴ and a batch size of 512. The experiments were run with Tesla V100-SXM2. For better generalization, we applied standard data augmentation techniques (Random Horizontal Flip, Random Crop) and normalization to each image. The learning rate was selected from {10⁻³, 5×10⁻⁴, 10⁻⁴, 5×10⁻⁵, 10⁻⁵} using a 10% hold-out validation set. |
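The three-step annotation protocol quoted in the Pseudocode row can be sketched as follows. This is a minimal illustration, not the authors' collection code; `annotator` is a hypothetical callable standing in for the human worker who picks one label the image does not belong to.

```python
import random

def collect_complementary_label(image, annotator, num_classes=10, num_choices=4):
    """One round of the CLImage annotation protocol (sketch).

    `annotator(image, candidates)` is a hypothetical stand-in for a human
    who returns one label from `candidates` that the image is NOT.
    """
    # Step 1: uniformly sample four labels without replacement from [K].
    candidates = random.sample(range(num_classes), num_choices)
    # Step 2: the annotator selects one complementary label from the four.
    cl = annotator(image, candidates)
    if cl not in candidates:
        raise ValueError("annotator must choose from the sampled candidates")
    # Step 3: the (image, complementary label) pair joins the dataset.
    return image, cl
```

Sampling without replacement guarantees the four candidates are distinct, so the annotator always has four different labels to rule out.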
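The learning-rate selection described in the Dataset Splits and Experiment Setup rows (grid of five rates, 10% hold-out validation set) can be sketched in pure Python. Here `evaluate` is a hypothetical stand-in for training ResNet18 with Adam at a given rate and scoring it on the validation split; only the split and the selection loop are shown.

```python
import random

# Learning-rate grid reported in the paper.
LEARNING_RATES = [1e-3, 5e-4, 1e-4, 5e-5, 1e-5]

def holdout_split(n_samples, holdout_frac=0.1, seed=0):
    """Shuffle indices and carve off a 10% hold-out validation set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_val = int(n_samples * holdout_frac)
    return idx[n_val:], idx[:n_val]  # (train indices, validation indices)

def select_learning_rate(evaluate, n_samples):
    """Return the rate whose model scores best on the hold-out set.

    `evaluate(train_idx, val_idx, lr) -> float` is a hypothetical stand-in
    for one full training run followed by validation scoring.
    """
    train_idx, val_idx = holdout_split(n_samples)
    return max(LEARNING_RATES, key=lambda lr: evaluate(train_idx, val_idx, lr))
```

For CLCIFAR10's 50,000 training instances this reserves 5,000 images for validation and trains on the remaining 45,000 at each candidate rate.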