Attentional-Biased Stochastic Gradient Descent

Authors: Qi Qi, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 4, "Experimental Results on Data Imbalance Problem. We conduct experiments on multiple imbalanced benchmark datasets..."; Section 5, "Experimental Results on Label Noise Problem. We provide empirical studies on the noisy label datasets in this section."
Researcher Affiliation | Collaboration | Qi Qi (1), Yi Xu (2), Wotao Yin (2), Rong Jin (3), Tianbao Yang (4)*... (1) Department of Computer Science, The University of Iowa; (2) Alibaba Group; (3) Meta Inc.; (4) Department of Computer Science and Engineering, Texas A&M University
Pseudocode | Yes | Algorithm 1 ABSGD (λ, η, γ, β, s0, w0, T)
Open Source Code | No | The paper does not contain an explicit statement releasing its own source code or linking to a code repository. The text "We have implemented the newly proposed structure, named as ConvMixer (Trockman & Kolter, 2022)" implies an external implementation was used or adapted, not that code was released.
Open Datasets | Yes | "We conduct experiments on multiple imbalanced benchmark datasets, including CIFAR-10 (LT), CIFAR-10 (ST), CIFAR-100 (LT), CIFAR-100 (ST), ImageNet-LT (Liu et al., 2019), Places-LT (Zhou et al., 2017), iNaturalist2018 (iNaturalist, 2018), and Clothing1M (Xiao et al., 2015) datasets."
Dataset Splits | Yes | "The original CIFAR-10 and CIFAR-100 datasets contain 50,000 training images and 10,000 validation images, with 10 and 100 classes, respectively. We construct the imbalanced versions of the CIFAR-10 and CIFAR-100 training sets following two strategies, Long-Tailed (LT) imbalance (Cao et al., 2019) and Step (ST) imbalance (Buda et al., 2018), with two imbalance ratios ρ = 10 and ρ = 100, and keep the test set unchanged."
Hardware Specification | Yes | "To show the efficiency of ABSGD, we conduct an experiment on CIFAR-10 data with different networks on an NVIDIA GeForce GTX 1080 Ti."
Software Dependencies | No | The paper mentions models such as ResNets and ConvMixer, but does not list specific software dependencies (libraries, frameworks) with version numbers.
Experiment Setup | Yes | "For a fair comparison, ABSGD is implemented with the same hyperparameters (momentum parameter, initial step size, weight decay, and step-size decay strategy) as the baseline momentum SGD method. For ABSGD, the moving-average parameter γ is tuned in [0.1 : 0.1 : 1] by default. Following the experimental setting in the literature, the initial learning rate is 0.1 and decays by a factor of 100 at the 160-th and 180-th epochs for both ABSGD and SGD. The value of λ in ABSGD is tuned in [1 : 1 : 10]."
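The Pseudocode row lists the signature Algorithm 1 ABSGD (λ, η, γ, β, s0, w0, T). Under a common reading of those parameters (λ a temperature on the loss, γ a moving-average factor for the normalizer, η a step size), one step of such an attention-weighted SGD rule might look like the sketch below; the function name, array shapes, and exact update are our assumptions, not the paper's code:

```python
import numpy as np

def absgd_step(w, losses, grads, s_prev, lam=1.0, gamma=0.5, eta=0.1):
    """One attention-weighted SGD step (illustrative sketch, not the paper's code).

    w:      parameter vector, shape (d,)
    losses: per-example losses for the mini-batch, shape (B,)
    grads:  per-example gradients, shape (B, d)
    s_prev: moving-average normalizer carried over from the previous step
    """
    u = np.exp(losses / lam)                       # higher-loss examples get larger scores
    s = (1.0 - gamma) * s_prev + gamma * u.mean()  # moving-average estimate of the normalizer
    p = u / (len(losses) * s)                      # normalized per-example attention weights
    g = (p[:, None] * grads).sum(axis=0)           # attention-weighted mini-batch gradient
    return w - eta * g, s
```

As λ grows the weights approach the uniform 1/B of plain SGD, while a small λ concentrates weight on hard (high-loss) examples, which is the intuition behind using such a rule on imbalanced data.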
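The Dataset Splits row references the Long-Tailed (Cao et al., 2019) and Step (Buda et al., 2018) imbalance constructions. A sketch of the per-class sample counts under the conventional formulas (the exact formulas and integer truncation are our assumption):

```python
def longtail_counts(n_max, num_classes, rho):
    """Long-Tailed (LT) imbalance: class k keeps n_max * rho**(-k/(K-1))
    examples, so the rarest class has n_max / rho (imbalance ratio rho)."""
    K = num_classes
    return [int(n_max * rho ** (-k / (K - 1))) for k in range(K)]

def step_counts(n_max, num_classes, rho):
    """Step (ST) imbalance: the first half of the classes keep n_max
    examples each, the second half keep n_max / rho each."""
    half = num_classes // 2
    return [n_max if k < half else int(n_max / rho) for k in range(num_classes)]
```

For CIFAR-10 with n_max = 5000 and ρ = 100, the rarest class keeps 50 training images under either strategy; the test set is left balanced.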
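The Experiment Setup row quotes a step-decay schedule and two tuning grids. One plausible reading, in which the learning rate is divided by 100 at each of the two milestone epochs (this interpretation, common in CIFAR-LT training recipes, is ours):

```python
def lr_at(epoch, base_lr=0.1):
    """Step-decay schedule: divide the LR by 100 at epoch 160 and again at 180
    (one reading of the quoted setup, not confirmed by the paper)."""
    if epoch >= 180:
        return base_lr / 100 / 100
    if epoch >= 160:
        return base_lr / 100
    return base_lr

# Tuning grids as quoted: gamma in [0.1 : 0.1 : 1], lambda in [1 : 1 : 10].
gamma_grid = [round(0.1 * i, 1) for i in range(1, 11)]
lambda_grid = list(range(1, 11))
```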