ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Authors: Bencheng Liao, Xinggang Wang, Lianghui Zhu, Qian Zhang, Chang Huang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to validate the effectiveness of our proposed models. We present the main results on ImageNet (Deng et al. 2009). Additionally, we benchmark our model on downstream dense prediction tasks, including object detection on the COCO (Lin et al. 2014) dataset and semantic segmentation on ADE20K (Zhou et al. 2019). |
| Researcher Affiliation | Collaboration | Bencheng Liao1, 2, Xinggang Wang2, *, Lianghui Zhu2, Qian Zhang3, Chang Huang3 1Institute of Artificial Intelligence, Huazhong University of Science & Technology 2School of EIC, Huazhong University of Science & Technology 3Horizon Robotics EMAIL, EMAIL |
| Pseudocode | No | The paper describes the Gated Linear Attention (GLA) and Bidirectional Gated Linear Attention (BiGLA) mechanisms using mathematical formulas and textual descriptions, but it does not include a structured pseudocode block or algorithm. |
| Open Source Code | Yes | Code: https://github.com/hustvl/ViG |
| Open Datasets | Yes | We present the main results on ImageNet (Deng et al. 2009). Additionally, we benchmark our model on downstream dense prediction tasks, including object detection on the COCO (Lin et al. 2014) dataset and semantic segmentation on ADE20K (Zhou et al. 2019). |
| Dataset Splits | No | The paper states: "We train classification experiments on ImageNet-1K dataset... We mainly follow the training and evaluation setting of DeiT and Swin Transformer (Touvron et al. 2021; Liu et al. 2021b). All the models are trained from scratch for 300 epochs. Further details are provided in extended version." While it refers to existing benchmarks and training settings, it does not explicitly provide the specific dataset split percentages or counts within the main text. |
| Hardware Specification | Yes | Tp. (images/s) is measured on a single 4090 GPU with batch size 256 following (Liu et al. 2021b). ... Throughput and memory are tested on a 4090 GPU with batch size 256 and image size 224. |
| Software Dependencies | No | The paper does not explicitly mention any specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | All the models are trained from scratch for 300 epochs. ... Tp. (images/s) is measured on a single 4090 GPU with batch size 256... Training details are the same as VRWKV (Duan et al. 2024) and Vim (Zhu et al. 2024). |
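Since the paper provides no pseudocode for the GLA/BiGLA mechanisms it describes with formulas, the following is a minimal NumPy sketch of the standard gated linear attention recurrence (per-step state update `S_t = diag(α_t) S_{t-1} + k_t v_t^T`, output `o_t = q_t S_t`), plus a hypothetical bidirectional variant that sums a forward and a reversed pass. The gating shapes and the BiGLA fusion rule here are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np

def gated_linear_attention(Q, K, V, alpha):
    """Recurrent form of gated linear attention (GLA).

    Per step t: S_t = diag(alpha_t) @ S_{t-1} + outer(k_t, v_t)
                o_t = q_t @ S_t
    Cost is linear in sequence length L; the state S is (d_k, d_v).
    Q, K: (L, d_k); V: (L, d_v); alpha: (L, d_k) gates in [0, 1].
    """
    L, d_k = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d_k, d_v))
    out = np.zeros((L, d_v))
    for t in range(L):
        # Decay the running state per key dimension, then accumulate.
        S = alpha[t][:, None] * S + np.outer(K[t], V[t])
        out[t] = Q[t] @ S
    return out

def bidirectional_gla(Q, K, V, alpha_f, alpha_b):
    """Hypothetical BiGLA sketch: a forward scan plus a scan over the
    reversed sequence, summed. The paper's actual fusion may differ."""
    fwd = gated_linear_attention(Q, K, V, alpha_f)
    bwd = gated_linear_attention(Q[::-1], K[::-1], V[::-1], alpha_b)[::-1]
    return fwd + bwd
```

With all gates set to 1 this reduces to unnormalized linear attention, `o_t = q_t (sum_{s<=t} k_s v_s^T)`, which is a convenient sanity check for the recurrence.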