SemiNLL: A Framework of Noisy-Label Learning by Semi-Supervised Learning
Authors: Zhuowei Wang, Jing Jiang, Bo Han, Lei Feng, Bo An, Gang Niu, Guodong Long
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a systematic and detailed analysis of the combinations of possible components based on our framework. Our framework can absorb various SS strategies and SSL backbones, utilizing their power to achieve promising performance. The instantiations of our framework demonstrate substantial improvements over state-of-the-art methods on benchmark-simulated and real-world datasets with noisy labels. ... We conduct extensive experiments on benchmark-simulated and real-world datasets with noisy labels. Our instantiations, DivideMix+ and GPL, outperform other state-of-the-art noisy-label learning methods. We also analyze the effects and efficiencies of different instantiations of our framework. |
| Researcher Affiliation | Academia | ¹University of Technology Sydney, ²Hong Kong Baptist University, ³Chongqing University, ⁴Nanyang Technological University, ⁵RIKEN Center for Advanced Intelligence Project |
| Pseudocode | Yes | Algorithm 1 SemiNLL. Input: network fθ, SS strategy `select`, SSL method `semi`, epochs Tmax, iterations Imax. 1: for t = 1, 2, ..., Tmax do; 2: shuffle training set Dtrain; 3: for n = 1, ..., Imax do; 4: fetch mini-batch Dn from Dtrain; 5: (Xm, Um) ← select(Dn, fθ); 6: fθ ← semi(Xm, Um, fθ); 7: end for; 8: end for. Output: fθ |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own code, a direct link to a code repository, or state that the code is in supplementary materials for the methodology described. |
| Open Datasets | Yes | Datasets. We thoroughly evaluate our proposed DivideMix+ and GPL on six datasets, including MNIST (Lecun et al., 1998), FASHION-MNIST (Xiao et al., 2017), CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), Clothing1M (Xiao et al., 2015), and WebVision (Li et al., 2017). |
| Dataset Splits | Yes | MNIST and FASHION-MNIST contain 60K training images and 10K test images of size 28×28. CIFAR-10 and CIFAR-100 contain 50K training images and 10K test images of size 32×32 with three channels. Clothing1M is a large-scale real-world dataset that consists of one million training images of size 224×224 from online shopping websites, with labels annotated from surrounding texts. The estimated noise ratio is approximately 40% (Xiao et al., 2015). |
| Hardware Specification | Yes | We implement all methods by PyTorch on NVIDIA Tesla V100 GPUs. ... In Table 7, we study the efficiency of these two components by dividing all our instantiations into two parts and calculating their training time on CIFAR-10 sym 20% using an NVIDIA RTX 6000 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number for it or any other software libraries. |
| Experiment Setup | Yes | For FASHION-MNIST, the network is trained using stochastic gradient descent (SGD) with 0.9 momentum and a weight decay of 1×10⁻⁴ for 120 epochs. For MNIST, CIFAR-10, and CIFAR-100, all networks are trained using SGD with 0.9 momentum and a weight decay of 5×10⁻⁴ for 300 epochs. |
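The pseudocode quoted above (Algorithm 1) can be sketched as plain Python. This is a minimal illustration only: the `select` (sample-selection strategy) and `semi` (semi-supervised update) bodies below are toy placeholders, not the paper's actual instantiations such as DivideMix+ or GPL.

```python
# Minimal sketch of Algorithm 1 (SemiNLL). The SS strategy `select` and
# the SSL backbone `semi` are stubbed with toy logic; in the paper they
# would be real selection/SSL components operating on a neural network.
import random

def select(batch, theta):
    # SS strategy: split a mini-batch D_n into a "clean" labeled set X_m
    # and an unlabeled set U_m. Toy rule: even-indexed samples are clean.
    labeled = [ex for i, ex in enumerate(batch) if i % 2 == 0]
    unlabeled = [ex for i, ex in enumerate(batch) if i % 2 == 1]
    return labeled, unlabeled

def semi(labeled, unlabeled, theta, lr=0.1):
    # SSL backbone: update parameters theta using both sets. A dummy
    # scalar step stands in for a real semi-supervised loss gradient.
    grad = len(labeled) - 0.5 * len(unlabeled)
    return theta - lr * grad

def seminll(train_set, theta, t_max, batch_size):
    for _ in range(t_max):                              # line 1: epochs
        random.shuffle(train_set)                       # line 2: shuffle D_train
        for n in range(0, len(train_set), batch_size):  # lines 3-4: mini-batches
            batch = train_set[n:n + batch_size]
            labeled, unlabeled = select(batch, theta)   # line 5
            theta = semi(labeled, unlabeled, theta)     # line 6
    return theta                                        # output: f_theta
```

The point of the framework is that `select` and `semi` are pluggable: any sample-selection strategy and any SSL method with these signatures can be combined without changing the outer loop.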
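The optimizer settings quoted in the Experiment Setup row can be collected into a small config structure. The class and field names here are illustrative assumptions, not taken from the paper's code.

```python
# Hypothetical per-dataset training configs mirroring the quoted setup;
# only the numeric values (momentum, weight decay, epochs) come from the
# paper's text, the structure itself is an assumption.
from dataclasses import dataclass

@dataclass
class SGDConfig:
    momentum: float
    weight_decay: float
    epochs: int

# FASHION-MNIST: SGD, momentum 0.9, weight decay 1e-4, 120 epochs.
fashion_mnist_cfg = SGDConfig(momentum=0.9, weight_decay=1e-4, epochs=120)

# MNIST / CIFAR-10 / CIFAR-100: momentum 0.9, weight decay 5e-4, 300 epochs.
cifar_cfg = SGDConfig(momentum=0.9, weight_decay=5e-4, epochs=300)
```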