Predicting the Original Appearance of Damaged Historical Documents
Authors: Zhenhua Yang, Dezhi Peng, Yongxin Shi, Yuyi Zhang, Chongyu Liu, Lianwen Jin
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimental results demonstrate that the proposed DiffHDR trained on HDR28K significantly surpasses existing approaches and exhibits remarkable performance in handling real scenarios." From the Quantitative Comparison section: "The quantitative results are shown in Table 1. DiffHDR achieves state-of-the-art performance, surpassing other methods by a substantial margin in FID, LPIPS and Rec-ACC." |
| Researcher Affiliation | Collaboration | ¹South China University of Technology, ²INTSIG-SCUT Joint Lab on Document Analysis and Recognition |
| Pseudocode | No | The paper describes the framework and training objective using text and a figure (Figure 5) but does not include a specific pseudocode or algorithm block. |
| Open Source Code | Yes | Dataset and code: https://github.com/yeungchenwa/HDR |
| Open Datasets | Yes | "To fill the gap in this field, we propose a large-scale dataset HDR28K and a diffusion-based network DiffHDR for historical document repair. Specifically, HDR28K contains 28,552 damaged-repaired image pairs with character-level annotations and multi-style degradations." Dataset and code: https://github.com/yeungchenwa/HDR |
| Dataset Splits | Yes | The training set in HDR28K comprises 22,848 patch images, while the testing set consists of 5,704 patch images. Note that the patch images from the same historical documents are not assigned to both training and testing sets. |
| Hardware Specification | Yes | The training is conducted on 8 NVIDIA A6000 GPUs. |
| Software Dependencies | No | The paper mentions software components like the AdamW optimizer, VGG19, and DPM-Solver++, but does not provide specific version numbers for these or any underlying programming languages or libraries. |
| Experiment Setup | Yes | "We adopt an AdamW optimizer to train DiffHDR with β1 = 0.95 and β2 = 0.999. The image size is 512×512. During classifier-free training, we set the conditional dropout probability as 8% and we train the model with a batch size of 32 and a total epoch of 165. The learning rate is set as 1×10⁻⁴ with the linear schedule. We set the guidance scales s_d = 1.2 and s_c,m = 1.5 and adopt the DPM-Solver++ (Lu et al. 2022) as our sampler with the inference step of 20." |
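The dataset-split row notes that patches from the same historical document are never assigned to both the training and testing sets. A document-level split of that kind can be sketched as below; the function name, the 80/20 ratio, and the record layout are illustrative assumptions, not taken from the released code.

```python
import random

def document_level_split(patches, train_ratio=0.8, seed=0):
    """Split (document_id, patch) records into train/test sets so that
    all patches from one source document land in exactly one split,
    mirroring the HDR28K constraint. Ratio and seed are illustrative."""
    doc_ids = sorted({doc_id for doc_id, _ in patches})
    rng = random.Random(seed)
    rng.shuffle(doc_ids)                      # randomize documents, not patches
    cut = int(len(doc_ids) * train_ratio)
    train_docs = set(doc_ids[:cut])
    train = [(d, p) for d, p in patches if d in train_docs]
    test = [(d, p) for d, p in patches if d not in train_docs]
    return train, test
```

Splitting at the document level (rather than shuffling patches directly) is what prevents near-duplicate content from the same page leaking between train and test.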
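The experiment-setup row lists the paper's reported hyperparameters, including an 8% conditional dropout for classifier-free training. A minimal sketch of that dropout step is given below; the hyperparameter values are quoted from the paper, while the dict layout, function name, and null-condition convention are assumptions for illustration.

```python
import random

# Hyperparameters as reported in the paper; the dict itself is
# just an illustrative way to collect them.
TRAIN_CONFIG = {
    "optimizer": "AdamW",
    "betas": (0.95, 0.999),
    "lr": 1e-4,
    "lr_schedule": "linear",
    "image_size": (512, 512),
    "batch_size": 32,
    "epochs": 165,
    "cond_dropout_prob": 0.08,   # classifier-free training
    "guidance_scale_d": 1.2,     # s_d
    "guidance_scale_cm": 1.5,    # s_c,m
    "sampler": "DPM-Solver++",
    "inference_steps": 20,
}

def maybe_drop_condition(condition, null_condition,
                         p=TRAIN_CONFIG["cond_dropout_prob"], rng=random):
    """With probability p, replace the conditioning input by a null
    condition, as done when training for classifier-free guidance."""
    return null_condition if rng.random() < p else condition
```

Training with occasionally-nulled conditions is what lets the sampler later interpolate between conditional and unconditional predictions using the guidance scales listed above.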