Noise Separation guided Candidate Label Reconstruction for Noisy Partial Label Learning

Authors: Xiaorui Peng, Yuheng Jia, Fuchao Yang, Ran Wang, Min-Ling Zhang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on multiple benchmark datasets confirm the efficacy of the proposed method in addressing NPLL. For example, on the CIFAR100 dataset with severe noise, our method improves the classification accuracy of the state-of-the-art one by 11.57%. Extensive experimental results validate that our method outperforms the current state-of-the-art (SOTA) methods by a large margin, e.g., an 11.57% improvement on CIFAR100 with extreme noise and ambiguity level.
Researcher Affiliation | Academia | Xiaorui Peng1, Yuheng Jia2,3, Fuchao Yang1, Ran Wang4,5, Min-Ling Zhang2,6: 1College of Software Engineering, Southeast University; 2School of Computer Science and Engineering, Southeast University; 3Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University); 4Shenzhen Key Laboratory of Advanced Machine Learning and Applications, School of Mathematical Sciences, Shenzhen University; 5Guangdong Provincial Key Laboratory of Intelligent Information Processing, Shenzhen University; 6Key Laboratory of Computer Network and Information Integration (Southeast University). EMAIL; EMAIL
Pseudocode | Yes | The pseudo-code is summarized in Algorithm 1.
Open Source Code | Yes | The code is available at: https://github.com/pruirui/PLRC.
Open Datasets | Yes | Following the previous works (Wang et al., 2024; Xu et al., 2023; Qiao et al., 2023), we first evaluated our method on two benchmark datasets, CIFAR10 and CIFAR100 (Krizhevsky et al., 2009). We conducted experiments on three fine-grained datasets: CIFAR100H, CUB200 (Welinder et al., 2010), and Flower (Nilsback & Zisserman, 2008). We further evaluated our method on two real-world crowdsourced datasets, Treeversity and Benthic (Schmarje et al., 2022).
Dataset Splits | Yes | Following the previous works (Wang et al., 2024; Xu et al., 2023; Qiao et al., 2023), we first evaluated our method on two benchmark datasets, CIFAR10 and CIFAR100 (Krizhevsky et al., 2009). Following the experimental setup (Xu et al., 2023), we split a clean validation set from the training set to determine hyper-parameters. Then, we transformed the validation set back to its NPLL form and incorporated it into the training set to retrain the model.
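The NPLL form mentioned above refers to candidate label sets that contain the true label with some probability of corruption. As a rough illustration of the common synthetic-generation protocol (the exact rates and generation code are assumptions, not taken from the paper), each incorrect class joins the candidate set with probability q, and with noise rate eta the true label is dropped and replaced by a random incorrect candidate:

```python
import numpy as np

def make_npll_labels(true_labels, num_classes, q=0.3, eta=0.3, seed=0):
    """Synthesize noisy partial labels (illustrative NPLL protocol,
    not the paper's exact generation code).

    q   : probability each incorrect class joins the candidate set
    eta : noise rate, i.e. probability the true label is dropped and
          replaced by a random incorrect candidate
    """
    rng = np.random.default_rng(seed)
    n = len(true_labels)
    # each incorrect class is a candidate with probability q
    Y = (rng.random((n, num_classes)) < q).astype(int)
    Y[np.arange(n), true_labels] = 1  # include the true label
    noisy = rng.random(n) < eta
    for i in np.where(noisy)[0]:
        Y[i, true_labels[i]] = 0           # drop the true label
        wrong = int(rng.integers(num_classes - 1))
        wrong += wrong >= true_labels[i]   # sample a class != true label
        Y[i, wrong] = 1
    return Y
```

Each row of the returned binary matrix is one candidate set; a clean validation split can be converted back to NPLL form by passing its labels through such a generator before retraining.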
Hardware Specification | Yes | All experiments were implemented with PyTorch (Paszke et al., 2019) and carried out with 6 NVIDIA RTX 3090 GPUs and 8 NVIDIA RTX 4090 GPUs.
Software Dependencies | Yes | All experiments were implemented with PyTorch (Paszke et al., 2019). For KNN searching, the number of chosen neighbors was set to 5 for all experiments, and we employed Faiss (Johnson et al., 2019), a library for efficient similarity search and clustering of dense vectors.
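The k-nearest-neighbor search above (k = 5) is an exact search over feature embeddings; the paper delegates it to Faiss for speed. A minimal NumPy sketch of the equivalent brute-force search (the function name and interface are illustrative, not from the released code):

```python
import numpy as np

def knn_search(features, k=5):
    """Exact k-NN over row-vector features, returning neighbor indices.
    Equivalent to a Faiss IndexFlatL2 exact search on small datasets."""
    # pairwise squared Euclidean distances via the expansion
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (features ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(d2, np.inf)  # exclude each point from its own neighbors
    return np.argsort(d2, axis=1)[:, :k]
```

Faiss produces the same neighbor sets for exact (flat) indexes but scales to millions of vectors via GPU batching and approximate index structures.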
Experiment Setup | Yes | For all the methods, we employed the same backbone. For the CIFAR datasets, we utilized ResNet18, while for the fine-grained datasets CUB200 and Flower and the real-world datasets Treeversity and Benthic, we employed ResNet34 and loaded the pre-trained weights from ImageNet for the feature extractor to enhance training efficiency. For all methods, SGD was used as the optimizer with momentum of 0.9 and weight decay of 0.001. We set the initial learning rate to 0.01 and adjusted it using the cosine scheduler.
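The cosine scheduler mentioned in the setup anneals the learning rate from its initial value toward zero over training. A small sketch of the standard cosine-annealing formula with the quoted initial rate of 0.01 (assuming annealing to zero; the paper does not state a minimum rate):

```python
import math

def cosine_lr(step, total_steps, lr_init=0.01, lr_min=0.0):
    """Standard cosine annealing: lr_init at step 0, lr_min at total_steps.
    Matches the formula behind PyTorch's CosineAnnealingLR."""
    return lr_min + 0.5 * (lr_init - lr_min) * (
        1.0 + math.cos(math.pi * step / total_steps)
    )
```

With these hyper-parameters the schedule starts at 0.01, passes through 0.005 at the halfway point, and decays smoothly to the minimum rate at the final step.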