IFORMER: INTEGRATING CONVNET AND TRANSFORMER FOR MOBILE APPLICATION
Authors: Chuanyang Zheng
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments demonstrating that iFormer outperforms existing lightweight networks across various tasks. Notably, iFormer achieves an impressive Top-1 accuracy of 80.4% on ImageNet-1K with a latency of only 1.10 ms on an iPhone 13, surpassing the recently proposed MobileNetV4 under similar latency constraints. Additionally, our method shows significant improvements in downstream tasks, including COCO object detection, instance segmentation, and ADE20k semantic segmentation, while still maintaining low latency on mobile devices for high-resolution inputs in these scenarios. |
| Researcher Affiliation | Academia | Chuanyang Zheng Independent Researcher chuanyang EMAIL |
| Pseudocode | No | The paper includes equations (1), (2), and (3) to formally describe the modulation mechanism and SHMA, and diagrams in Figure 4 illustrating the architecture. However, it does not contain any sections explicitly labeled "Pseudocode" or "Algorithm", nor structured steps formatted like code or an algorithm. |
| Open Source Code | Yes | Code and models are available at: https://github.com/ChuanyangZheng/iFormer. |
| Open Datasets | Yes | We first evaluate our models on classification on ImageNet-1K (Deng et al., 2009). ... downstream tasks, including COCO object detection, instance segmentation, and ADE20k semantic segmentation. |
| Dataset Splits | Yes | We first evaluate our models on classification on ImageNet-1K (Deng et al., 2009). To ensure a fair comparison with prior studies, we follow the previous training recipe (Touvron et al., 2021a; Liu et al., 2022) and train all models for 300 epochs with a standard image size of 224x224. ... we train Mask R-CNN (He et al., 2017) with iFormer as the backbone for 12 epochs (1×), using the MMDetection toolkit (Chen et al., 2019). ... We conduct experiments on the ADE20K (Zhou et al., 2017) using the Semantic FPN (Kirillov et al., 2019), based on the MMSegmentation toolkit (Contributors, 2020). |
| Hardware Specification | Yes | Notably, iFormer achieves an impressive Top-1 accuracy of 80.4% on ImageNet-1K with a latency of only 1.10 ms on an iPhone 13... The latency is measured on an iPhone 13. ... measured on an actual iPhone 13 and compiled by Core ML Tools (Core ML)... |
| Software Dependencies | No | The paper mentions 'Core ML Tools (Core ML)', 'MMDetection toolkit (Chen et al., 2019)', and 'MMSegmentation toolkit (Contributors, 2020)'. While toolkits are named and cited, no specific version numbers for any of these software dependencies are provided in the main text. |
| Experiment Setup | No | The paper mentions training models for "300 epochs with a standard image size of 224x224" for ImageNet-1K, and "12 epochs" for Mask R-CNN on COCO. It states that it follows "the previous training recipe (Touvron et al., 2021a; Liu et al., 2022)" for ImageNet, and that it adds "drop path and layer scale" for larger models, which are "commonly used". However, it defers detailed hyperparameters such as learning rates, batch sizes, specific optimizers, and other system-level settings to these external references or common practices, without explicitly listing them in the main text. |