Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Combating Semantic Contamination in Learning with Label Noise

Authors: Wenxiao Fan, Kan Li

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that our method outperforms existing approaches on both synthetic and real-world noisy datasets, effectively mitigating the impact of label noise and Semantic Contamination." "Experimental results show that our method advances state-of-the-art results on CIFAR with synthetic label noise, as well as on real-world noisy datasets."
Researcher Affiliation | Academia | "Wenxiao Fan, Kan Li / School of Computer Science, Beijing Institute of Technology / EMAIL"
Pseudocode | Yes | "The overall pipeline is shown in Fig. 4 and the algorithm pseudocode is in Appendix."
Open Source Code | No | The paper does not explicitly state that the code is open-source, provide a repository link, or mention code in supplementary materials.
Open Datasets | Yes | "Datasets. To verify the effectiveness of our method, we perform our method on classification tasks with six benchmarks: CIFAR-10 (Krizhevsky, Hinton et al. 2009), CIFAR-100 (Krizhevsky, Hinton et al. 2009), CIFAR-10N (Wei et al. 2022), CIFAR-100N (Wei et al. 2022), Animal-10N (Song, Kim, and Lee 2019) and WebVision (Li et al. 2017)."
Dataset Splits | No | The paper reports experiments on CIFAR-10/100, CIFAR-10N/100N, Animal-10N, and WebVision, and discusses synthetic noise injection and noise rates; it also states "All the results from our runs are the average test accuracy over the last 10 epochs." However, it does not give explicit training/validation/test splits (percentages, sample counts, or split files) for these datasets, nor does it cite specific standard splits by name.
Hardware Specification | No | The paper does not provide hardware details such as GPU or CPU models or memory specifications used for the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | "Following (Li, Socher, and Hoi 2020; Chen et al. 2023; Zhang et al. 2023), we apply the regularization Ldiv to increase the diversity of predictions"; "Following (Li, Socher, and Hoi 2020; Chen et al. 2023), we warm up the models and estimate the label confidence ω using the small-loss criterion and Gaussian Mixture Model (GMM)."; "We vary τ from 0.05 to 0.9 and range c from 0.90 to 0.99."
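The label-confidence estimation quoted above (small-loss criterion plus a Gaussian Mixture Model, following Li, Socher, and Hoi 2020) can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the function name `estimate_label_confidence` and the synthetic loss data are assumptions introduced here for illustration.

```python
# Hedged sketch: estimate per-sample label confidence w by fitting a
# two-component GMM to per-sample training losses (small-loss criterion),
# in the style of DivideMix (Li, Socher, and Hoi 2020).
import numpy as np
from sklearn.mixture import GaussianMixture

def estimate_label_confidence(losses: np.ndarray) -> np.ndarray:
    """Return the posterior probability that each sample belongs to the
    low-loss (presumed clean) GMM component."""
    # Min-max normalize losses so the GMM fit is scale-invariant.
    losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
    gmm = GaussianMixture(n_components=2, max_iter=100, reg_covar=5e-4)
    gmm.fit(losses.reshape(-1, 1))
    clean_component = gmm.means_.argmin()  # small-loss criterion
    return gmm.predict_proba(losses.reshape(-1, 1))[:, clean_component]

# Synthetic example: clean samples cluster at low loss, noisy ones at high loss.
rng = np.random.default_rng(0)
losses = np.concatenate([rng.normal(0.1, 0.05, 800), rng.normal(1.0, 0.2, 200)])
w = estimate_label_confidence(losses)
```

In a full pipeline these confidences would be computed after the warm-up phase and then used to divide or reweight the training set; thresholds such as the τ and c ranges quoted above govern downstream selection, not the GMM fit itself.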