Consistency-Aware Robust Learning under Noisy Labels

Authors: Fahad Sarfraz, Bahram Zonooz, Elahe Arani

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical evaluation shows that CARoL achieves high precision in noisy label detection, enhances robustness, and performs reliably under severe noise, highlighting the potential of biologically inspired approaches for robust learning. ... Through comprehensive experiments on CIFAR-10/100, Tiny-ImageNet, and real-world noisy datasets (Web-Aircraft, Web-Bird, Web-Car), we demonstrate that CARoL effectively mitigates memorization and learns under severe noise and in complex real-world scenarios. ... We evaluate our method across diverse simulated and real-world noisy datasets with varying noise levels. ... To gain insights into the learning mechanisms in CARoL, we conduct a stepwise ablation, incrementally adding each objective function to the standard cross-entropy (CE) loss and evaluating its impact on model performance. Table 4 shows that each component contributes positively, with their effect becoming more pronounced at higher noise levels.
Researcher Affiliation | Collaboration | Fahad Sarfraz (TomTom, Netherlands; Eindhoven University of Technology (TU/e), Netherlands); Bahram Zonooz (Eindhoven University of Technology (TU/e), Netherlands); Elahe Arani (Wayve Technologies Ltd, London, United Kingdom; Eindhoven University of Technology (TU/e), Netherlands)
Pseudocode | Yes | Algorithm 1: Consistency-Aware Robust Learning (CARoL)
Open Source Code | Yes | Code is available at https://github.com/NeurAI-Lab/CARoL
Open Datasets | Yes | We evaluate our method across diverse simulated and real-world noisy datasets with varying noise levels. Following prior works (Karim et al., 2022; Li et al., 2020), we introduce symmetric noise on CIFAR-10, CIFAR-100, and Tiny-ImageNet by randomly replacing labels with uniformly sampled incorrect labels. ... To assess real-world applicability, we evaluate on three fine-grained datasets: Web-Aircraft, Web-Bird, and Web-Car (Sun et al., 2021b).
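The symmetric noise protocol quoted above (each corrupted sample's label is replaced by a class drawn uniformly at random from the incorrect classes) can be sketched in a few lines of plain Python. The function name and signature below are illustrative assumptions, not taken from the paper's released code:

```python
import random

def inject_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """Replace a `noise_rate` fraction of the labels with uniformly
    sampled *incorrect* labels (symmetric label noise).
    Illustrative helper, not from the CARoL codebase."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_corrupt = int(noise_rate * len(noisy))
    for i in rng.sample(range(len(noisy)), n_corrupt):
        # Draw uniformly from the num_classes - 1 classes other than
        # the true one, so every corrupted label is genuinely wrong.
        wrong = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(wrong)
    return noisy
```

Because every corrupted label is guaranteed to differ from the original, a noise rate of 0.4 flips exactly 40% of the labels.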
Dataset Splits | Yes | A small clean validation set (5% of the training data) is used to fine-tune the hyperparameters, with the best accuracy as the metric of selection.
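The 5% clean hold-out described above amounts to a simple shuffled index split. A minimal sketch, with names and seed handling assumed rather than taken from the paper:

```python
import random

def clean_validation_split(num_samples, frac=0.05, seed=0):
    """Hold out a `frac` fraction of the training indices as a clean
    validation set for hyperparameter selection; the remaining indices
    are used for training. Illustrative helper."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(frac * num_samples)
    return indices[n_val:], indices[:n_val]  # (train_idx, val_idx)
```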
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., GPU/CPU models) used to run its experiments.
Software Dependencies | No | The paper mentions software components and models such as "PreAct ResNet-18", the "SGD optimizer", the "CIFAR-10 AutoAugment policy", and the "ImageNet policy", but does not provide version numbers for these or other key software dependencies.
Experiment Setup | Yes | Unless otherwise stated, we use PreAct ResNet-18 (He et al., 2016), an SGD optimizer with 0.9 momentum, 5e-4 weight decay, and an initial learning rate of 0.02 with a cosine annealing scheduler, a batch size of 64, and apply random crop and horizontal flip as weak augmentation. ... The selected hyperparameters are listed in Table 5.
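The quoted configuration maps directly onto standard PyTorch components. The sketch below assumes a torchvision-style pipeline; `model` and the epoch count are left to the caller, since the paper defers the remaining hyperparameters to Table 5:

```python
import torch
import torchvision.transforms as T

def build_training_setup(model, num_epochs):
    """Sketch of the stated setup: SGD (momentum 0.9, weight decay 5e-4),
    initial lr 0.02 with cosine annealing, and weak augmentation.
    Assumed helper, not the authors' code."""
    # Weak augmentation: random crop and horizontal flip (CIFAR-style sizes).
    weak_aug = T.Compose([
        T.RandomCrop(32, padding=4),
        T.RandomHorizontalFlip(),
        T.ToTensor(),
    ])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.02,
                                momentum=0.9, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer,
                                                           T_max=num_epochs)
    return weak_aug, optimizer, scheduler
```

The batch size of 64 would be set on the `DataLoader`, and a PreAct ResNet-18 backbone passed in as `model`.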