GlyphSR: A Simple Glyph-Aware Framework for Scene Text Image Super-Resolution
Authors: Baole Wei, Yuxuan Zhou, Liangcai Gao, Zhi Tang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the TextZoom dataset demonstrate that GlyphSR achieves a new state-of-the-art performance. |
| Researcher Affiliation | Academia | Wangxuan Institute of Computer Technology, Peking University, Beijing, China EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods using structured text and equations, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code, nor does it include a link to a code repository for the methodology described. |
| Open Datasets | Yes | Following mainstream STISR works, we conduct experiments on the TextZoom (Wang et al. 2020) dataset. TextZoom is captured in real-world scenarios using digital cameras. |
| Dataset Splits | Yes | It contains 17,367 LR-HR image pairs for training and 4,373 image pairs for testing. The test set is divided into three subsets: the easy subset with 1,619 image pairs, the medium subset with 1,411 image pairs, and the hard subset with 1,343 image pairs. |
| Hardware Specification | Yes | All our experiments are conducted with a single NVIDIA Tesla A800 GPU for both training and testing. |
| Software Dependencies | No | The paper mentions using Adam optimizer and Cosine Annealing LR, but does not specify version numbers for any key software components or libraries. |
| Experiment Setup | Yes | The number of SRBs N is set to 5, and the channel sizes are Csr = 64 and Ctp = 32. The maximum text length L is 15. The optimal SRB index for GPM and GFM Ngly is 4... When generating anchor points, the number of step sizes is set to K = 2, with corresponding step sizes {2, 4}. The confidence score threshold τ is set to 0.8 for mask selection. We use Adam optimizer for training. The batch size is set to 48. The number of training epochs is set to 200. The learning rate is initialized to 0.001, and Cosine Annealing LR with 5 epochs linear warmup is used to adjust it. ...The λ1, λ2, λ3 are set to 1, 1, 0.1. |
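The training schedule quoted in the Experiment Setup row (initial learning rate 0.001, 5-epoch linear warmup, cosine annealing over 200 epochs, and loss weights λ1 = 1, λ2 = 1, λ3 = 0.1) can be sketched in plain Python. This is a minimal sketch of one plausible reading of that setup; the helper names `lr_at_epoch` and `total_loss` are hypothetical, and the exact annealing floor and per-step versus per-epoch behavior of the authors' scheduler are not stated in the paper.

```python
import math

BASE_LR = 1e-3      # initial learning rate reported in the paper
WARMUP_EPOCHS = 5   # linear warmup length reported in the paper
TOTAL_EPOCHS = 200  # total training epochs reported in the paper

def lr_at_epoch(epoch: int) -> float:
    """Hypothetical schedule: linear warmup, then cosine annealing to 0."""
    if epoch < WARMUP_EPOCHS:
        # Ramp linearly from BASE_LR/WARMUP_EPOCHS up to BASE_LR.
        return BASE_LR * (epoch + 1) / WARMUP_EPOCHS
    # Cosine decay over the remaining epochs (assumed minimum LR of 0).
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * progress))

def total_loss(l1: float, l2: float, l3: float) -> float:
    """Weighted sum with lambda1=1, lambda2=1, lambda3=0.1 from the paper."""
    return 1.0 * l1 + 1.0 * l2 + 0.1 * l3
```

For example, `lr_at_epoch(0)` gives 0.0002, the warmup reaches the full 0.001 at epoch 4, and the rate then decays smoothly toward zero by epoch 200.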