GlyphSR: A Simple Glyph-Aware Framework for Scene Text Image Super-Resolution

Authors: Baole Wei, Yuxuan Zhou, Liangcai Gao, Zhi Tang

AAAI 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on the TextZoom dataset demonstrate that GlyphSR achieves a new state-of-the-art performance.
Researcher Affiliation | Academia | Wangxuan Institute of Computer Technology, Peking University, Beijing, China. EMAIL, EMAIL
Pseudocode | No | The paper describes methods using structured text and equations, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing its source code, nor does it include a link to a code repository for the described methodology.
Open Datasets | Yes | Following mainstream STISR works, we conduct experiments on the TextZoom (Wang et al. 2020) dataset. TextZoom is captured in real-world scenarios using digital cameras.
Dataset Splits | Yes | It contains 17,367 LR-HR image pairs for training and 4,373 image pairs for testing. The test set is divided into three subsets: the easy subset with 1,619 image pairs, the medium subset with 1,411 image pairs, and the hard subset with 1,343 image pairs.
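The quoted split sizes are internally consistent, which can be sanity-checked with a few lines of arithmetic (the dictionary below is just a restatement of the numbers quoted above, not part of the paper):

```python
# Sanity check of the TextZoom test-set split sizes quoted above.
splits = {"easy": 1619, "medium": 1411, "hard": 1343}
total_test = sum(splits.values())
print(total_test)  # 4373, matching the reported test-set size

train_pairs = 17367
print(train_pairs + total_test)  # total LR-HR pairs across train and test: 21740
```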
Hardware Specification | Yes | All our experiments are conducted with a single NVIDIA Tesla A800 GPU for both training and testing.
Software Dependencies | No | The paper mentions the Adam optimizer and Cosine Annealing LR, but does not specify version numbers for any key software components or libraries.
Experiment Setup | Yes | The number of SRBs N is set to 5, and the channel sizes are C_sr = 64 and C_tp = 32. The maximum text length L is 15. The optimal SRB index N_gly for GPM and GFM is 4... When generating anchor points, the number of step sizes is set to K = 2, with corresponding step sizes {2, 4}. The confidence score threshold τ is set to 0.8 for mask selection. We use the Adam optimizer for training. The batch size is set to 48. The number of training epochs is set to 200. The learning rate is initialized to 0.001, and Cosine Annealing LR with a 5-epoch linear warmup is used to adjust it. ...The λ1, λ2, λ3 are set to 1, 1, 0.1.
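The reported learning-rate schedule (initial rate 0.001, cosine annealing over 200 epochs, 5-epoch linear warmup) can be sketched in a few lines of plain Python. This is a hedged illustration only: the paper names the schedule but not its exact form, so the warmup ramp and the zero minimum rate below are assumptions.

```python
import math

# Sketch of the training schedule quoted above; the warmup shape and
# eta_min = 0 are assumptions, not taken from the paper.
BASE_LR = 1e-3   # initial learning rate
EPOCHS = 200     # total training epochs
WARMUP = 5       # linear-warmup epochs

def lr_at(epoch: int, eta_min: float = 0.0) -> float:
    """Learning rate at a given 0-indexed epoch."""
    if epoch < WARMUP:
        # Linear ramp up to BASE_LR over the warmup epochs (assumed form).
        return BASE_LR * (epoch + 1) / WARMUP
    # Standard cosine annealing over the remaining epochs.
    t = (epoch - WARMUP) / (EPOCHS - WARMUP - 1)
    return eta_min + 0.5 * (BASE_LR - eta_min) * (1 + math.cos(math.pi * t))

print(lr_at(4))    # end of warmup: 0.001
print(lr_at(199))  # final epoch: annealed to ~0.0
```

In a PyTorch setup this would typically be expressed by chaining a warmup scheduler with `CosineAnnealingLR`; the standalone function above just makes the per-epoch values easy to inspect.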