Coreset-Driven Re-Labeling: Tackling Noisy Annotations with Noise-Free Gradients

Authors: Saumyaranjan Mohanty, Konda Reddy Mopuri

TMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Through extensive evaluation over the CIFAR-100N, WebVision, and ImageNet-1K datasets, we demonstrate that our method outperforms the SOTA coreset selection for re-labeling methods (DivideMix and SOP+). We have provided the codebase at URL.
Researcher Affiliation Academia Saumyaranjan Mohanty EMAIL Department of Artificial Intelligence Indian Institute of Technology Hyderabad Konda Reddy Mopuri EMAIL Department of Artificial Intelligence Indian Institute of Technology Hyderabad
Pseudocode Yes Algorithm 1 Modified Noise-free Gradients for Re-labeling algorithm
Open Source Code Yes We have provided the codebase at URL.
Open Datasets Yes CIFAR-100N (Wei et al., 2022) is the CIFAR-100 dataset with human-annotated real-world noisy labels collected from Amazon Mechanical Turk. It consists of 50,000 colour images of dimension 32 × 32 × 3 from 100 different classes, each class having 500 images. WebVision (Li et al., 2017) contains 2.4M images crawled from the Web using the 1,000 concepts in ImageNet-1K (Deng et al., 2009). Similar to prior works (Chen et al., 2019; Park et al., 2023), we use the mini-WebVision version consisting of the first 50 classes of the Google image subset with approximately 66,000 training images. Following the approach in Park et al. (2023), we introduced 20% asymmetric noise to the ImageNet-1K (Deng et al., 2009) dataset.
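The 20% asymmetric noise applied to ImageNet-1K can be sketched as follows. This is a minimal illustration, not the paper's implementation: asymmetric noise flips each label to one fixed "confusable" class with probability equal to the noise rate, and the circular mapping y → (y + 1) mod num_classes used here is a common illustrative convention; the exact class pairing in Park et al. (2023) may differ.

```python
import numpy as np

def inject_asymmetric_noise(labels, noise_rate=0.2, num_classes=1000, seed=0):
    """Flip each label to one fixed target class with probability noise_rate.

    The target mapping y -> (y + 1) % num_classes is an assumption made
    for illustration; real asymmetric-noise setups pair semantically
    similar classes.
    """
    rng = np.random.default_rng(seed)
    noisy = np.asarray(labels).copy()
    flip = rng.random(len(noisy)) < noise_rate  # ~noise_rate fraction flipped
    noisy[flip] = (noisy[flip] + 1) % num_classes
    return noisy
```

On a large label array, roughly 20% of the labels end up flipped to their paired class while the rest are untouched.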
Dataset Splits Yes Image Net-1K consists of 1000 classes, with 1, 281, 167 training images and 50, 000 validation images.
Hardware Specification No No specific hardware details (GPU models, CPU models, etc.) are provided in the paper.
Software Dependencies No No specific software dependencies with version numbers are mentioned in the paper.
Experiment Setup Yes Table 3: Hyper-parameter values used across multiple datasets.
Settings       CIFAR-100N  WebVision  ILSVRC
Epochs         300         100        50
Optimizer      SGD         SGD        SGD
Momentum       0.9         0.9        0.9
Weight Decay   0.0005      0.0005     0.0005
Batch Size     128         32         64
Learning Rate  0.02        0.02       0.02
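For reproduction, Table 3's per-dataset settings can be collected into plain Python configuration dicts, e.g. to drive a training script. The dict keys and structure are illustrative assumptions; only the values come from the paper's Table 3.

```python
# Hyper-parameters shared across all three datasets (Table 3).
COMMON = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "weight_decay": 5e-4,  # 0.0005
    "lr": 0.02,
}

# Per-dataset epochs and batch sizes (Table 3 columns).
CONFIGS = {
    "CIFAR-100N": {**COMMON, "epochs": 300, "batch_size": 128},
    "WebVision":  {**COMMON, "epochs": 100, "batch_size": 32},
    "ILSVRC":     {**COMMON, "epochs": 50,  "batch_size": 64},
}
```

A training script would then look up `CONFIGS[dataset_name]` and pass the values to its optimizer and data loader.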