Dual-windowed Vision Transformer with Angular Self-Attention

Authors: Weili Shi, Sheng Li

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate DWAViT on multiple computer vision benchmarks, including image classification on ImageNet-1K, object detection on COCO, and semantic segmentation on ADE20K. Our experimental results also suggest that our model can achieve promising performance on these tasks while maintaining computational cost comparable to that of the baseline models (e.g., Swin Transformer).
Researcher Affiliation Academia Weili Shi (EMAIL), School of Data Science, University of Virginia; Sheng Li (EMAIL), School of Data Science, University of Virginia
Pseudocode No The paper includes a 'Proposition 1' and its proof in the Theoretical Analysis section (3.6), which contain mathematical formulas and logical steps. However, the paper does not present a clearly structured 'Pseudocode' or 'Algorithm' block, figure, or section.
Open Source Code Yes The source code is available at https://github.com/DamoSWL/DWAViT.
Open Datasets Yes We evaluate our proposed DWAViT on ImageNet-1K (Deng et al., 2009) classification, COCO (Lin et al., 2014) object detection, and ADE20K (Zhou et al., 2017) semantic segmentation.
Dataset Splits Yes The COCO dataset has 118K images for training and 5K images for validation.
Hardware Specification Yes All experiments are run on NVIDIA A100 GPUs.
Software Dependencies No The paper mentions 'AdamW' for optimization and the 'MMDetection toolbox' and 'MMSegmentation toolbox' as frameworks. However, specific version numbers for these software dependencies, or for other core libraries such as Python or PyTorch, are not provided.
Experiment Setup Yes The total training schedule is 300 epochs, with the first 20 epochs as warm-up. We adopt the AdamW (Kingma & Ba, 2014) algorithm to optimize the model. The initial learning rate is 1.2e-3 and the weight decay is 0.05. The learning rate is adjusted according to a cosine learning rate schedule. The drop path rate is 0.1 and the input image is resized to 224 x 224. The MLP ratio for all DWAViT variants is set to 4. The number of windows in each stage is (100,49), (49,16), (4,1), (1,1). The temperature in the angular self-attention is 0.1 for DWAViT-T and DWAViT-S, and 0.25 for DWAViT-B, respectively. A linear function is adopted to simplify the computation of the quadratic self-attention, and τ is set to 0.4.
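The reported schedule (300 epochs, 20 warm-up epochs, initial learning rate 1.2e-3, cosine decay) can be sketched as a per-epoch learning-rate function. This is a minimal illustration of a standard linear-warm-up-plus-cosine schedule under the paper's stated hyperparameters, not the authors' actual training code; the function name `cosine_lr` and the linear shape of the warm-up are assumptions.

```python
import math

# Hyperparameters as reported in the paper's experiment setup.
TOTAL_EPOCHS = 300
WARMUP_EPOCHS = 20
BASE_LR = 1.2e-3

def cosine_lr(epoch: int) -> float:
    """Learning rate at a given epoch (0-indexed).

    Assumes linear warm-up over the first 20 epochs, then cosine
    decay from the base rate down to zero at epoch 300.
    """
    if epoch < WARMUP_EPOCHS:
        # Ramp linearly from BASE_LR/20 up to BASE_LR.
        return BASE_LR * (epoch + 1) / WARMUP_EPOCHS
    # Fraction of the post-warm-up schedule completed.
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * progress))
```

In a training loop, this value would be written into the optimizer's parameter groups at the start of each epoch (e.g., via `param_group["lr"]` with `torch.optim.AdamW`, using the paper's weight decay of 0.05).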