An Exemplar-based Framework for Chinese Text Recognition

Authors: Zhao Zhou, Xiangcheng Du, Yingbin Zheng, Xingjiao Wu, Cheng Jin

AAAI 2025

Reproducibility
Variable | Result | LLM Response
Research Type | Experimental | "Experiments on four scenarios of Chinese texts demonstrate the effectiveness of our proposed framework. We also perform our framework on the Chinese text recognition benchmark with four types of real-world Chinese texts, and the extensive experiments demonstrate the superior performance of the proposed DECTR compared with previous approaches."
Researcher Affiliation | Collaboration | "1 Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, Shanghai, China; 2 Videt Lab, Shanghai, China; 3 East China Normal University, Shanghai, China; 4 Innovation Center of Calligraphy and Painting Creation Technology, MCT, China"
Pseudocode | Yes | "Algorithm 1: Exemplar Retrieval"
Open Source Code | No | The paper cites external tools such as "oh-my-ocr. 2021. text_renderer. https://github.com/oh-my-ocr/text_renderer" and "faiss (Johnson, Douze, and Jégou 2019)", but it provides no repository link and no explicit statement about releasing the source code for the method described in this paper.
Open Datasets | Yes | "We use the datasets from the Chinese text recognition benchmark (Yu et al. 2021) and follow the standard protocols. Images with four types of real-world Chinese texts are used for evaluation, i.e., scene, web, document, and handwriting texts (Figure 5)."
Dataset Splits | Yes | "The scene/web/document datasets contain 636,455/140,589/500,000 text-line images in total, with a proportion of 8:1:1 for training, validation, and testing. The handwriting dataset contains 74,603 samples for training, 18,651 for validation and 23,389 for testing."
Hardware Specification | Yes | "We use 4 NVIDIA Titan Xp GPUs to train the networks and the inference is conducted in a single GPU."
Software Dependencies | No | The paper states "Our framework is implemented using PyTorch (Paszke et al. 2019)", that "the networks are trained from scratch, using the Adam optimizer (Kingma and Ba 2015)", and that it incorporates "the efficient similarity search library faiss (Johnson, Douze, and Jégou 2019)", but it does not provide specific version numbers for these components.
Experiment Setup | Yes | "The networks are trained from scratch, using the Adam optimizer (Kingma and Ba 2015) with initial learning rate 10^-4. The models are trained with batch size of 80 for 40 epochs in total. Fusion(d_g, d_p) = λ_f · d_g + (1 − λ_f) · d_p (Eq. 9), where λ_f is a trade-off parameter between different stages and we set it as 0.7 by experiment on the validation set."
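The fusion rule quoted in the setup row (Eq. 9) is a simple convex combination of two stage distances. A minimal sketch, assuming d_g and d_p are scalar distance scores and using the paper's reported λ_f = 0.7; the example inputs are illustrative, not from the paper:

```python
# Sketch of Eq. (9): fuse the distances produced by the two stages.
# lambda_f = 0.7 is the value the paper reports from validation-set tuning.
def fusion(d_g, d_p, lambda_f=0.7):
    """Convex combination of stage distances: lambda_f*d_g + (1 - lambda_f)*d_p."""
    return lambda_f * d_g + (1.0 - lambda_f) * d_p

score = fusion(0.2, 0.6)  # 0.7*0.2 + 0.3*0.6 ≈ 0.32
```

With λ_f = 0.7 the first-stage distance d_g dominates the fused score, which matches its role as a trade-off parameter between stages.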
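The pseudocode row cites "Algorithm 1: Exemplar Retrieval", and the dependencies row notes that faiss handles the similarity search. The following is a brute-force NumPy stand-in for that nearest-neighbor step, not the paper's implementation; the function name, feature dimensions, and data are all illustrative assumptions:

```python
import numpy as np

# Illustrative stand-in for exemplar retrieval via nearest-neighbor search.
# The paper uses the faiss library for this step; plain NumPy L2 distances
# are used here so the sketch stays self-contained.
def retrieve_exemplars(query_feat, exemplar_feats, k=5):
    """Return indices of the k exemplars closest to the query (L2 distance)."""
    dists = np.linalg.norm(exemplar_feats - query_feat, axis=1)
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)
bank = rng.normal(size=(100, 64))             # exemplar feature bank
query = bank[7] + 0.01 * rng.normal(size=64)  # a query close to exemplar 7
top = retrieve_exemplars(query, bank, k=3)    # top[0] should be index 7
```

In faiss, the same step would typically build an index (e.g. `faiss.IndexFlatL2`) over the exemplar features and call its `search` method, which scales to far larger exemplar banks than the brute-force loop above.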