Explicit Relational Reasoning Network for Scene Text Detection
Authors: Yuchen Su, Zhineng Chen, Yongkun Du, Zhilong Ji, Kai Hu, Jinfeng Bai, Xieping Gao
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on challenging benchmarks demonstrate the effectiveness of our ERRNet. It consistently achieves state-of-the-art accuracy while holding highly competitive inference speed. |
| Researcher Affiliation | Collaboration | 1 School of Computer Science, Fudan University 2 Tomorrow Advancing Life 3 School of Computer Science, Xiangtan University 4 Laboratory for Artificial Intelligence and International Communication, Hunan Normal University |
| Pseudocode | No | The paper describes the methodology in narrative text and through architectural diagrams (e.g., Figure 3) and mathematical formulas (e.g., B-spline interpolation, loss functions), but does not include any distinct pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | Total-Text (Ch'ng and Chan 2017) includes horizontal, curved, and multi-oriented texts. CTW1500 (Liu et al. 2019) is a challenging dataset for long curved text. ArT (Ch'ng et al. 2019) is a large-scale multi-lingual arbitrary-shaped text detection dataset. MSRA-TD500 (Yao et al. 2012) is a multi-language dataset. Synth150K (Liu et al. 2020) contains 150k synthetic images. |
| Dataset Splits | Yes | Total-Text includes 1255 training images and 300 test images. CTW1500 ... consists of 1000 training images and 500 test images. ArT ... includes 5603 training images and 4563 test images. MSRA-TD500 ... consists of 300 training images and 200 test images. |
| Hardware Specification | Yes | All listed FPS is measured from a single NVIDIA RTX3090 GPU. All experiments are conducted on 4 NVIDIA RTX3090 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used. |
| Experiment Setup | Yes | When training from scratch, we adopt AdamW with 1×10⁻⁴ weight decay as the optimizer, and set 16 as batch size, with 500 training epochs for all datasets. For the ERR decoder, the number of layers is 3, the maximum text instance number n is 100, and the component sequence length t is 6. For the position-supervised loss, the parameters α and γ are set to 0.25 and 2, respectively. For data augmentation, we apply Random Crop, Random Rotate and Color Jitter to input images. In the testing stage, we set a suitable height for each dataset while keeping the original aspect ratio. The evaluation metric for the F-measure is IOU@0.5, following (Ye et al. 2023; Chen et al. 2024). |
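The α = 0.25 and γ = 2 values quoted above match the standard focal-loss parameterization (Lin et al. 2017). The paper's exact position-supervised loss adds its own positional weighting on top, which is not reproduced here; the following is a minimal pure-Python sketch of only the generic focal weighting these two hyperparameters control:

```python
import math

def focal_weighted_bce(p, y, alpha=0.25, gamma=2.0):
    """Focal-style binary cross-entropy with the paper's alpha/gamma values.

    p: predicted probability in (0, 1); y: ground-truth label in {0, 1}.
    alpha balances positive vs. negative examples; gamma down-weights
    easy, well-classified examples.
    NOTE: a generic sketch, not ERRNet's position-supervised loss itself.
    """
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# With gamma = 2, a confident correct positive (p = 0.9) contributes far
# less loss than a hard positive (p = 0.1):
easy = focal_weighted_bce(0.9, 1)
hard = focal_weighted_bce(0.1, 1)
```

The γ = 2 exponent is what makes the weighting "focal": the `(1 - p_t)**gamma` factor shrinks quadratically as the prediction approaches the true label, so training focuses on hard text regions.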