FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling

Authors: Bowen Zhang, Yidong Wang, Wenxin Hou, Hao Wu, Jindong Wang, Manabu Okumura, Takahiro Shinozaki

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR-10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26], and extensively investigate the performance under various labeled data amounts. We mainly compare our method with Pseudo-Labeling [4], UDA [11] and FixMatch [14]... The classification error rates on CIFAR-10/100, STL-10 and SVHN datasets are in Table 1...
Researcher Affiliation | Collaboration | Bowen Zhang (Tokyo Institute of Technology), Yidong Wang (Tokyo Institute of Technology), Wenxin Hou (Microsoft), Hao Wu (Tokyo Institute of Technology), Jindong Wang (Microsoft Research Asia), Manabu Okumura (Tokyo Institute of Technology), Takahiro Shinozaki (Tokyo Institute of Technology)
Pseudocode | Yes | Algorithm 1 FlexMatch algorithm. (A sketch of the curriculum pseudo labeling step appears after this table.)
Open Source Code | Yes | We open-source our code at https://github.com/TorchSSL/TorchSSL.
Open Datasets | Yes | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR-10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26]
Dataset Splits | Yes | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR-10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26] and extensively investigate the performance under various labeled data amounts... The classification error rates on CIFAR-10/100, STL-10 and SVHN datasets are in Table 1.
Hardware Specification | Yes | Figure 2 shows the average running time of a single iteration on a single GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions 'PyTorch [28]' and cites its publication, but does not provide a specific version number (e.g., PyTorch 1.x.x) for the software dependencies used in the experiments.
Experiment Setup | Yes | Concretely, the optimizer for all experiments is standard stochastic gradient descent (SGD) with a momentum of 0.9 [29, 30]. For all datasets, we use an initial learning rate of 0.03 with a cosine learning rate decay schedule [31] as η = η₀ · cos(7πk / 16K), where η₀ is the initial learning rate, k is the current training step, and K is the total number of training steps, set to 2^20. We also perform an exponential moving average with a momentum of 0.999. The batch size of labeled data is 64 except for ImageNet. µ is set to 1 for Pseudo-Label and 7 for UDA, FixMatch, and FlexMatch. τ is set to 0.8 for UDA and 0.95 for Pseudo-Label, FixMatch, and FlexMatch. (A minimal PyTorch sketch of this schedule and EMA setup follows the table.)
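
The pseudocode row above refers to the paper's Algorithm 1, whose core step is a per-class flexible threshold derived from each class's estimated learning status. Below is a minimal sketch of that step, assuming PyTorch; it omits the paper's warm-up term and non-linear threshold mapping, and the function name `flexible_thresholds` is illustrative rather than taken from the released TorchSSL code.

```python
import torch

def flexible_thresholds(probs: torch.Tensor, tau: float) -> torch.Tensor:
    """Per-class thresholds tau_t(c) = beta_t(c) * tau (basic CPL form).

    probs: (N, C) softmax predictions on the unlabeled set.
    """
    num_classes = probs.shape[1]
    conf, pred = probs.max(dim=1)
    # sigma_t(c): unlabeled samples predicted as class c with confidence
    # above the fixed threshold tau -- the class's estimated learning effect.
    sigma = torch.zeros(num_classes)
    for c in range(num_classes):
        sigma[c] = ((pred == c) & (conf > tau)).sum()
    # beta_t(c): learning status, normalized by the best-learned class.
    beta = sigma / sigma.max().clamp(min=1.0)
    return beta * tau

# Usage: keep only pseudo labels whose confidence clears their class threshold.
probs = torch.softmax(torch.randn(8, 10), dim=1)
thresholds = flexible_thresholds(probs, tau=0.95)
conf, pseudo_labels = probs.max(dim=1)
mask = conf > thresholds[pseudo_labels]  # selects samples for the unsupervised loss
```

Because beta_t(c) is at most 1, well-learned classes keep a threshold near tau while hard classes get a lower one, letting more of their pseudo labels through early in training.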
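The experiment setup row quotes SGD with momentum 0.9, the 7π/16 cosine decay, and a weight EMA with momentum 0.999. A minimal sketch of how that configuration could be wired up, assuming PyTorch; the linear model is a placeholder for the paper's backbones, and expressing the EMA via `AveragedModel` is an assumption, not necessarily how TorchSSL implements it.

```python
import math
import torch

K = 2 ** 20   # total number of training steps
eta0 = 0.03   # initial learning rate

model = torch.nn.Linear(128, 10)  # placeholder for the actual backbone
optimizer = torch.optim.SGD(model.parameters(), lr=eta0, momentum=0.9)

# Cosine decay eta = eta0 * cos(7*pi*k / (16*K)), expressed as a
# multiplicative factor on the initial learning rate.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda k: math.cos(7.0 * math.pi * k / (16.0 * K))
)

# Exponential moving average of the weights with momentum 0.999.
ema_model = torch.optim.swa_utils.AveragedModel(
    model, avg_fn=lambda ema_p, p, num_avg: 0.999 * ema_p + 0.001 * p
)

# Inside the training loop, after each optimizer.step():
#   scheduler.step()
#   ema_model.update_parameters(model)
```

Note that the schedule never reaches zero: at k = K the factor is cos(7π/16) ≈ 0.195, so the final learning rate stays at roughly a fifth of its initial value.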