Learning from Noisy Labels via Conditional Distributionally Robust Optimization
Authors: Hui Guo, Grace Y. Yi, Boyu Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results on both synthetic and real-world datasets demonstrate the superiority of our method. |
| Researcher Affiliation | Academia | Hui Guo, Department of Computer Science, University of Western Ontario; Grace Y. Yi, Department of Statistical and Actuarial Sciences and Department of Computer Science, University of Western Ontario; Boyu Wang, Department of Computer Science, University of Western Ontario |
| Pseudocode | Yes | Algorithm 1: Learning from Noisy Labels via Conditional Distributionally Robust True Label Posterior with an Adaptive Lagrange multiplier (Adapt CDRP) |
| Open Source Code | Yes | Code is available at https://github.com/hguo1728/Adapt CDRP. |
| Open Datasets | Yes | We evaluate the performance of the proposed Adapt CDRP on two datasets, CIFAR-10 and CIFAR-100 [21], by generating synthetic noisy labels (details provided below), as well as four datasets, CIFAR-10N [22], CIFAR-100N [22], LabelMe [23, 24], and Animal10N [25], which contain human annotations. |
| Dataset Splits | Yes | For all datasets except LabelMe, we set aside 10% of the original data, together with the corresponding synthetic or human annotated noisy labels, to validate the model selection procedure. |
| Hardware Specification | Yes | Training times are approximately 3 hours on CIFAR-10 and 5.5 hours on CIFAR-100 using an NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions the Adam optimizer but does not specify software dependencies with version numbers (e.g., PyTorch, TensorFlow, or specific library versions). |
| Experiment Setup | Yes | A batch size of 128 is maintained across all datasets. We use the Adam optimizer [43] with a weight decay of 5 × 10⁻⁴ for the CIFAR-10, CIFAR-100, CIFAR-10N, CIFAR-100N, and LabelMe datasets. The initial learning rate for CIFAR-10, CIFAR-100, CIFAR-10N, and CIFAR-100N is set to 10⁻³, with the networks trained for 120, 150, 120, and 150 epochs respectively. The first 30 epochs serve as a warm-up. |
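To make the 10% holdout described in the Dataset Splits row concrete, the following Python sketch reserves a fraction of the (image, noisy label) pairs for model selection. It assumes `features` and `noisy_labels` are NumPy arrays; the function name and the implementation are illustrative assumptions, not the authors' splitting code.

```python
import numpy as np

def holdout_split(features, noisy_labels, holdout_frac=0.1, seed=0):
    """Set aside a fraction of (image, noisy label) pairs for validation.

    Mirrors the paper's description of reserving 10% of the original data,
    together with its synthetic or human-annotated noisy labels, for model
    selection; the exact splitting code in the authors' repository may differ.
    """
    rng = np.random.default_rng(seed)
    n = len(features)
    perm = rng.permutation(n)
    n_val = int(holdout_frac * n)
    val_idx, train_idx = perm[:n_val], perm[n_val:]
    return (features[train_idx], noisy_labels[train_idx],
            features[val_idx], noisy_labels[val_idx])
```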
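The Experiment Setup row can likewise be summarized as a small PyTorch training configuration. The sketch below hard-codes the quoted hyperparameters (batch size 128, Adam with weight decay 5 × 10⁻⁴, initial learning rate 10⁻³, 30 warm-up epochs); `warmup_step` and `robust_step` are hypothetical placeholders for the warm-up loss and the Adapt-CDRP objective (the paper's Algorithm 1), since this report does not reproduce those details.

```python
import torch

# Hyperparameters quoted in the Experiment Setup row. The handover from
# warm-up training to the robust objective after 30 epochs is an assumption
# about how the schedule is wired up.
BATCH_SIZE = 128          # passed to the DataLoader when building train_loader
LEARNING_RATE = 1e-3      # CIFAR-10 / CIFAR-100 / CIFAR-10N / CIFAR-100N
WEIGHT_DECAY = 5e-4
NUM_EPOCHS = 120          # e.g. CIFAR-10; CIFAR-100 and CIFAR-100N use 150
WARMUP_EPOCHS = 30

def build_optimizer(model: torch.nn.Module) -> torch.optim.Adam:
    """Adam optimizer with the weight decay reported in the paper."""
    return torch.optim.Adam(
        model.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY
    )

def train(model, train_loader, warmup_step, robust_step):
    """Train with a warm-up phase followed by the robust objective.

    `warmup_step` and `robust_step` are placeholder callables returning a
    scalar loss for one mini-batch; the actual losses are defined in the
    authors' repository.
    """
    optimizer = build_optimizer(model)
    for epoch in range(NUM_EPOCHS):
        for batch in train_loader:
            optimizer.zero_grad()
            loss = (warmup_step(model, batch) if epoch < WARMUP_EPOCHS
                    else robust_step(model, batch))
            loss.backward()
            optimizer.step()
```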