Dual PatchNorm

Authors: Manoj Kumar, Mostafa Dehghani, Neil Houlsby

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on image classification, contrastive learning, semantic segmentation and transfer on downstream classification datasets show that incorporating this trivial modification often leads to improved accuracy over well-tuned vanilla Vision Transformers and never hurts.
Researcher Affiliation | Industry | Manoj Kumar (EMAIL), Mostafa Dehghani (EMAIL), Neil Houlsby (EMAIL); Google Research, Brain Team
Pseudocode | Yes |

    hp, wp = patch_size[0], patch_size[1]
    x = einops.rearrange(
        x, "b (ht hp) (wt wp) c -> b (ht wt) (hp wp c)", hp=hp, wp=wp)
    x = nn.LayerNorm(name="ln0")(x)
    x = nn.Dense(output_features, name="dense")(x)
    x = nn.LayerNorm(name="ln1")(x)
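The paper's snippet uses einops and Flax modules. As a framework-free illustration of the same computation, the sketch below mirrors it in plain NumPy: patchify, LayerNorm, linear projection, LayerNorm again. The `layer_norm` helper, the random weights, and the 64-dimensional embedding size are stand-ins introduced here for illustration, not values from the paper.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize over the last axis (per token), as flax.linen.LayerNorm does
    # by default (scale/bias parameters omitted for brevity).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def dual_patchnorm_embed(x, w, b, hp=16, wp=16):
    """Patchify images and apply LayerNorm before AND after the projection."""
    bsz, h, wid, c = x.shape
    ht, wt = h // hp, wid // wp
    # Equivalent of einops "b (ht hp) (wt wp) c -> b (ht wt) (hp wp c)".
    x = x.reshape(bsz, ht, hp, wt, wp, c)
    x = x.transpose(0, 1, 3, 2, 4, 5).reshape(bsz, ht * wt, hp * wp * c)
    x = layer_norm(x)      # "ln0": norm on raw patch pixels
    x = x @ w + b          # "dense": linear patch embedding
    return layer_norm(x)   # "ln1": norm on embedded patches

rng = np.random.default_rng(0)
imgs = rng.normal(size=(2, 32, 32, 3))          # two 32x32 RGB images
w = rng.normal(size=(16 * 16 * 3, 64)) * 0.02   # hypothetical 64-dim embedding
tokens = dual_patchnorm_embed(imgs, w, np.zeros(64))
print(tokens.shape)  # (2, 4, 64): 2 images, 4 patches each, 64 dims
```

Because the final LayerNorm runs over the embedding axis, every output token is normalized to zero mean and unit variance regardless of the projection's scale.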
Open Source Code | No | The paper mentions external libraries such as big-vision, Scenic, and einops, and includes a small code snippet in the introduction. However, it does not state that the authors' implementation of the Dual PatchNorm method is publicly available, nor does it link to such a repository.
Open Datasets | Yes | We train ViT architectures (with and without DPN) in a supervised fashion on 3 different datasets with varying number of examples: ImageNet-1k (1M), ImageNet-21k (21M) and JFT (4B) (Zhai et al., 2022a). ... We finetune ImageNet-pretrained B/16 and B/32 with and without DPN on the Visual Task Adaptation benchmark (VTAB) (Zhai et al., 2019). ... We finetune ImageNet-pretrained B/16 with and without DPN on the ADE-20K 512×512 (Zhou et al., 2019) semantic segmentation task.
Dataset Splits | Yes | We split the ImageNet train set into a train and validation split, and use the validation split to arrive at the final DPN recipe. ... We use the VTAB training protocol which defines a standard train split of 800 examples and a validation split of 200 examples per dataset.
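The quoted VTAB protocol fixes per-dataset split sizes of 800 train / 200 validation examples. A minimal sketch of such a split is below; the `vtab_split` helper and the seeded shuffle are illustrative assumptions (the actual VTAB tooling defines canonical, reproducible splits).

```python
import random

def vtab_split(examples, n_train=800, n_val=200, seed=0):
    """Illustrative VTAB-style split: 800 train / 200 validation examples
    per dataset, drawn without overlap after a seeded shuffle."""
    rng = random.Random(seed)
    idx = list(range(len(examples)))
    rng.shuffle(idx)
    train = [examples[i] for i in idx[:n_train]]
    val = [examples[i] for i in idx[n_train:n_train + n_val]]
    return train, val

train, val = vtab_split(list(range(5000)))
print(len(train), len(val))  # 800 200
```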
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU models, CPU types, or other accelerator specifications.
Software Dependencies | No | The paper mentions the big-vision (Beyer et al., 2022c), Scenic (Dehghani et al., 2022), and einops (Rogozhnikov, 2022) libraries, and refers to the jax library. However, it does not provide version numbers for these components, which a reproducible description of ancillary software requires.
Experiment Setup | Yes | We train 5 architectures: Ti/16, S/16, S/32, B/16 and B/32 using the AugReg (Steiner et al., 2022) recipe for 93,000 steps with a batch size of 4096... Our full set of hyperparameters are available in Appendix C and Appendix D. ...

    config.input.batch_size = 4096
    config.total_epochs = 300
    config.lr = 0.001
    config.wd = 0.0001
    config.schedule = dict(warmup_steps=10_000, decay_type='cosine')
    config.optax_name = 'scale_by_adam'
    config.grad_clip_norm = 1.0
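The quoted config specifies `lr=0.001`, `warmup_steps=10_000`, and `decay_type='cosine'` over 93,000 total steps. A common reading of these settings is linear warmup followed by cosine decay to zero, sketched below; big_vision's exact schedule implementation may differ in details such as a final learning-rate floor.

```python
import math

# Hyperparameters quoted from the paper's config excerpt above.
BASE_LR = 0.001
WARMUP_STEPS = 10_000
TOTAL_STEPS = 93_000

def learning_rate(step):
    """Linear warmup to BASE_LR, then cosine decay to zero over the
    remaining steps (an assumed, standard warmup+cosine schedule)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(learning_rate(5_000))   # halfway through warmup: 0.0005
print(learning_rate(10_000))  # peak learning rate: 0.001
print(learning_rate(93_000))  # end of training: 0.0
```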