FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling

Authors: Bowen Zhang, Yidong Wang, Wenxin Hou, Hao Wu, Jindong Wang, Manabu Okumura, Takahiro Shinozaki

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR-10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26], and extensively investigate the performance under various labeled data amounts. We mainly compare our method with Pseudo-Labeling [4], UDA [11] and FixMatch [14]... The classification error rates on CIFAR-10/100, STL-10 and SVHN datasets are in Table 1...
Researcher Affiliation | Collaboration | Bowen Zhang (Tokyo Institute of Technology), Yidong Wang (Tokyo Institute of Technology), Wenxin Hou (Microsoft), Hao Wu (Tokyo Institute of Technology), Jindong Wang (Microsoft Research Asia), Manabu Okumura (Tokyo Institute of Technology), Takahiro Shinozaki (Tokyo Institute of Technology)
Pseudocode | Yes | Algorithm 1 FlexMatch algorithm. (A sketch of the curriculum pseudo labeling step appears after this table.)
Open Source Code | Yes | We open-source our code at https://github.com/TorchSSL/TorchSSL.
Open Datasets | Yes | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR-10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26]
Dataset Splits | Yes | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR-10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26] and extensively investigate the performance under various labeled data amounts... The classification error rates on CIFAR-10/100, STL-10 and SVHN datasets are in Table 1.
Hardware Specification | Yes | Figure 2 shows the average running time of a single iteration on a single GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions 'PyTorch [28]' and cites its publication, but does not provide a specific version number (e.g., PyTorch 1.x.x) for the software dependencies used in the experiments.
Experiment Setup | Yes | Concretely, the optimizer for all experiments is standard stochastic gradient descent (SGD) with a momentum of 0.9 [29, 30]. For all datasets, we use an initial learning rate of 0.03 with a cosine learning rate decay schedule [31] as η = η₀ · cos(7πk / 16K), where η₀ is the initial learning rate, k is the current training step, and K is the total number of training steps, set to 2^20. We also perform an exponential moving average with a momentum of 0.999. The batch size of labeled data is 64 except for ImageNet. µ is set to 1 for Pseudo-Label and 7 for UDA, FixMatch, and FlexMatch. τ is set to 0.8 for UDA and 0.95 for Pseudo-Label, FixMatch, and FlexMatch. (A minimal PyTorch sketch of this schedule and EMA setup follows the table.)
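
The pseudocode row above refers to the paper's Algorithm 1, whose core step is a per-class flexible threshold derived from each class's estimated learning status. Below is a minimal sketch of that step, assuming PyTorch; it omits the paper's warm-up term and non-linear threshold mapping, and the function name `flexible_thresholds` is illustrative rather than taken from the released TorchSSL code.

```python
import torch

def flexible_thresholds(probs: torch.Tensor, tau: float) -> torch.Tensor:
    """Per-class thresholds tau_t(c) = beta_t(c) * tau (basic CPL form).

    probs: (N, C) softmax predictions on the unlabeled set.
    """
    num_classes = probs.shape[1]
    conf, pred = probs.max(dim=1)
    # sigma_t(c): unlabeled samples predicted as class c with confidence
    # above the fixed threshold tau -- the class's estimated learning effect.
    sigma = torch.zeros(num_classes)
    for c in range(num_classes):
        sigma[c] = ((pred == c) & (conf > tau)).sum()
    # beta_t(c): learning status, normalized by the best-learned class.
    beta = sigma / sigma.max().clamp(min=1.0)
    return beta * tau

# Usage: keep only pseudo labels whose confidence clears their class threshold.
probs = torch.softmax(torch.randn(8, 10), dim=1)
thresholds = flexible_thresholds(probs, tau=0.95)
conf, pseudo_labels = probs.max(dim=1)
mask = conf > thresholds[pseudo_labels]  # selects samples for the unsupervised loss
```

Because beta_t(c) is at most 1, well-learned classes keep a threshold near tau while hard classes get a lower one, letting more of their pseudo labels through early in training.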
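The experiment setup row quotes SGD with momentum 0.9, the 7π/16 cosine decay, and a weight EMA with momentum 0.999. A minimal sketch of how that configuration could be wired up, assuming PyTorch; the linear model is a placeholder for the paper's backbones, and expressing the EMA via `AveragedModel` is an assumption, not necessarily how TorchSSL implements it.

```python
import math
import torch

K = 2 ** 20   # total number of training steps
eta0 = 0.03   # initial learning rate

model = torch.nn.Linear(128, 10)  # placeholder for the actual backbone
optimizer = torch.optim.SGD(model.parameters(), lr=eta0, momentum=0.9)

# Cosine decay eta = eta0 * cos(7*pi*k / (16*K)), expressed as a
# multiplicative factor on the initial learning rate.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda k: math.cos(7.0 * math.pi * k / (16.0 * K))
)

# Exponential moving average of the weights with momentum 0.999.
ema_model = torch.optim.swa_utils.AveragedModel(
    model, avg_fn=lambda ema_p, p, num_avg: 0.999 * ema_p + 0.001 * p
)

# Inside the training loop, after each optimizer.step():
#   scheduler.step()
#   ema_model.update_parameters(model)
```

Note that the schedule never reaches zero: at k = K the factor is cos(7π/16) ≈ 0.195, so the final learning rate stays at roughly a fifth of its initial value.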