Bi-Level Optimization for Semi-Supervised Learning with Pseudo-Labeling

Authors: Marzi Heidari, Yuhong Guo

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the effectiveness of the proposed approach, we conduct extensive experiments on multiple SSL benchmarks. The experimental results show the proposed BOPL outperforms the state-of-the-art SSL techniques.
Researcher Affiliation | Academia | Marzi Heidari¹, Yuhong Guo¹,² (¹School of Computer Science, Carleton University, Ottawa, Canada; ²CIFAR AI Chair, Amii, Canada). EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Training Algorithm for BOPL
Open Source Code | No | The paper does not explicitly provide a link to source code or state that code has been made available in supplementary materials or a public repository.
Open Datasets | Yes | We conducted comprehensive experiments on four commonly used image classification benchmarks: CIFAR-10, CIFAR-100 (Krizhevsky, Hinton et al. 2009), SVHN (Netzer et al. 2011) and STL-10 (Coates, Ng, and Lee 2011).
Dataset Splits | Yes | We conducted experiments on CIFAR-10 with 250, 1,000, 2,000, and 4,000 labeled samples, on CIFAR-100 with 2,500, 4,000, and 10,000 labeled samples, on SVHN with 1,000 and 500 labeled samples, and on STL-10 with 1,000 images as the labeled data.
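The labeled-set sizes quoted above can be collected into a small configuration sketch. The dataset names and counts come from the quoted text; the mapping structure itself is an illustrative assumption, not something the paper provides.

```python
# Labeled-sample counts per benchmark, as reported in the quoted text.
# The dict layout is illustrative only; the paper does not specify a format.
LABELED_SPLITS = {
    "CIFAR-10": [250, 1_000, 2_000, 4_000],
    "CIFAR-100": [2_500, 4_000, 10_000],
    "SVHN": [1_000, 500],
    "STL-10": [1_000],
}

# Total number of (dataset, labeled-set-size) experimental settings implied.
num_settings = sum(len(sizes) for sizes in LABELED_SPLITS.values())
```

Counting the entries gives ten distinct experimental settings across the four benchmarks.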
Hardware Specification | No | The paper mentions different backbone networks (e.g., WRN-28-2, WRN-28-8) and optimizers, but does not specify any particular hardware (e.g., GPU, CPU models) used for the experiments.
Software Dependencies | No | The paper mentions using optimizers like SGD and techniques like cosine learning rate annealing, but does not specify any software libraries or frameworks (e.g., PyTorch, TensorFlow) with their version numbers.
Experiment Setup | Yes | For training CNN-13, we employed the SGD optimizer with a Nesterov momentum (Nesterov 1983) of 0.9, an L2 regularization coefficient of 1e-4 for CIFAR-10 and CIFAR-100 datasets and 5e-5 for SVHN, and an initial learning rate α of 0.1. ... For the WRN-28-2 model, the training configuration includes the SGD optimizer, an L2 regularization coefficient of 5e-4, and an initial learning rate of 0.01. ... Specifically for BOPL, we set the batch size to 128, λ = 1e-2, ϵ = 1e-2, γ = 0.5, β = 0.999, and η = 1. We pre-train the model for 50 epochs using the Mean-Teacher algorithm and then proceed to train BOPL for 400 epochs. Finally, we fine-tune the model for 100 epochs using both the labeled data and the unlabeled data with learned pseudo-labels.
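The reported setup can be sketched as a plain-Python configuration plus the cosine learning rate annealing rule the paper cites. The hyperparameter values below are taken from the quoted text (WRN-28-2 variant); the dict layout and the exact annealing formula are assumptions, since the paper does not give the schedule equation explicitly.

```python
import math

# Reported hyperparameters for the WRN-28-2 BOPL setup (values from the
# quoted text; the dict structure itself is an illustrative assumption).
CONFIG = {
    "optimizer": "SGD",
    "nesterov_momentum": 0.9,
    "weight_decay": 5e-4,    # L2 regularization coefficient
    "init_lr": 0.01,
    "batch_size": 128,
    "lambda": 1e-2,
    "epsilon": 1e-2,
    "gamma": 0.5,
    "beta": 0.999,
    "eta": 1.0,
    "pretrain_epochs": 50,   # Mean-Teacher warm-up
    "train_epochs": 400,     # BOPL bi-level training
    "finetune_epochs": 100,  # labeled + pseudo-labeled fine-tuning
}

def cosine_annealed_lr(epoch: int, total_epochs: int, init_lr: float) -> float:
    """Standard cosine annealing: decay init_lr smoothly toward 0.

    This is the textbook rule; the paper only names the technique, so the
    exact formula here is an assumption.
    """
    return init_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))

# Learning rate trajectory over the full 50 + 400 + 100 = 550 epoch schedule.
total = (CONFIG["pretrain_epochs"] + CONFIG["train_epochs"]
         + CONFIG["finetune_epochs"])
lrs = [cosine_annealed_lr(e, total, CONFIG["init_lr"]) for e in range(total)]
```

Under this schedule the learning rate starts at the reported 0.01 and decays monotonically toward zero by the final fine-tuning epoch.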