Consistency-Aware Robust Learning under Noisy Labels

Authors: Fahad Sarfraz, Bahram Zonooz, Elahe Arani

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical evaluation shows that CARoL achieves high precision in noisy label detection, enhances robustness, and performs reliably under severe noise, highlighting the potential of biologically inspired approaches for robust learning. ... Through comprehensive experiments on CIFAR-10/100, Tiny-ImageNet, and real-world noisy datasets (Web-Aircraft, Web-Bird, Web-Car), we demonstrate that CARoL effectively mitigates memorization and learns under severe noise and in complex real-world scenarios. ... We evaluate our method across diverse simulated and real-world noisy datasets with varying noise levels. ... To gain insights into the learning mechanisms in CARoL, we conduct a stepwise ablation, incrementally adding each objective function to the standard cross-entropy (CE) loss and evaluating its impact on model performance. Table 4 shows that each component contributes positively, with their effect becoming more pronounced at higher noise levels.
Researcher Affiliation | Collaboration | Fahad Sarfraz (TomTom, Netherlands; Eindhoven University of Technology (TU/e), Netherlands); Bahram Zonooz (Eindhoven University of Technology (TU/e), Netherlands); Elahe Arani (Wayve Technologies Ltd, London, United Kingdom; Eindhoven University of Technology (TU/e), Netherlands)
Pseudocode | Yes | Algorithm 1: Consistency-Aware Robust Learning (CARoL)
Open Source Code | Yes | Code is available at https://github.com/NeurAI-Lab/CARoL
Open Datasets | Yes | We evaluate our method across diverse simulated and real-world noisy datasets with varying noise levels. Following prior works (Karim et al., 2022; Li et al., 2020), we introduce symmetric noise on CIFAR-10, CIFAR-100, and Tiny-ImageNet by randomly replacing labels with uniformly sampled incorrect labels. ... To assess real-world applicability, we evaluate on three fine-grained datasets: Web-Aircraft, Web-Bird, and Web-Car (Sun et al., 2021b).
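The symmetric noise protocol quoted above (each corrupted sample's label is replaced by a class drawn uniformly at random from the incorrect classes) can be sketched in a few lines of plain Python. The function name and signature below are illustrative assumptions, not taken from the paper's released code:

```python
import random

def inject_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """Replace a `noise_rate` fraction of the labels with uniformly
    sampled *incorrect* labels (symmetric label noise).
    Illustrative helper, not from the CARoL codebase."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_corrupt = int(noise_rate * len(noisy))
    for i in rng.sample(range(len(noisy)), n_corrupt):
        # Draw uniformly from the num_classes - 1 classes other than
        # the true one, so every corrupted label is genuinely wrong.
        wrong = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(wrong)
    return noisy
```

Because every corrupted label is guaranteed to differ from the original, a noise rate of 0.4 flips exactly 40% of the labels.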
Dataset Splits | Yes | A small clean validation set (5% of the training data) is used to fine-tune the hyperparameters, with the best accuracy as the metric of selection.
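The 5% clean hold-out described above amounts to a simple shuffled index split. A minimal sketch, with names and seed handling assumed rather than taken from the paper:

```python
import random

def clean_validation_split(num_samples, frac=0.05, seed=0):
    """Hold out a `frac` fraction of the training indices as a clean
    validation set for hyperparameter selection; the remaining indices
    are used for training. Illustrative helper."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(frac * num_samples)
    return indices[n_val:], indices[:n_val]  # (train_idx, val_idx)
```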
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., GPU/CPU models) used to run its experiments.
Software Dependencies | No | The paper mentions software components and models such as "PreAct ResNet-18", the "SGD optimizer", the "CIFAR-10 AutoAugment policy", and the "ImageNet policy", but does not provide version numbers for these or other key software dependencies.
Experiment Setup | Yes | Unless otherwise stated, we use PreAct ResNet-18 (He et al., 2016), an SGD optimizer with 0.9 momentum, 5e-4 weight decay, and an initial learning rate of 0.02 with a cosine annealing scheduler, a batch size of 64, and apply random crop and horizontal flip as weak augmentation. ... The selected hyperparameters are listed in Table 5.
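The quoted configuration maps directly onto standard PyTorch components. The sketch below assumes a torchvision-style pipeline; `model` and the epoch count are left to the caller, since the paper defers the remaining hyperparameters to Table 5:

```python
import torch
import torchvision.transforms as T

def build_training_setup(model, num_epochs):
    """Sketch of the stated setup: SGD (momentum 0.9, weight decay 5e-4),
    initial lr 0.02 with cosine annealing, and weak augmentation.
    Assumed helper, not the authors' code."""
    # Weak augmentation: random crop and horizontal flip (CIFAR-style sizes).
    weak_aug = T.Compose([
        T.RandomCrop(32, padding=4),
        T.RandomHorizontalFlip(),
        T.ToTensor(),
    ])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.02,
                                momentum=0.9, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer,
                                                           T_max=num_epochs)
    return weak_aug, optimizer, scheduler
```

The batch size of 64 would be set on the `DataLoader`, and a PreAct ResNet-18 backbone passed in as `model`.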