Predicting the Original Appearance of Damaged Historical Documents

Authors: Zhenhua Yang, Dezhi Peng, Yongxin Shi, Yuyi Zhang, Chongyu Liu, Lianwen Jin

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate that the proposed DiffHDR trained on HDR28K significantly surpasses existing approaches and exhibits remarkable performance in handling real scenarios." "Quantitative Comparison: The quantitative results are shown in Table 1. DiffHDR achieves state-of-the-art performance, surpassing other methods by a substantial margin in FID, LPIPS, and Rec-ACC."
Researcher Affiliation | Collaboration | 1 South China University of Technology; 2 INTSIG-SCUT Joint Lab on Document Analysis and Recognition. EMAIL, EMAIL
Pseudocode | No | The paper describes the framework and training objective in text and a figure (Figure 5), but does not include a pseudocode or algorithm block.
Open Source Code | Yes | Dataset and code: https://github.com/yeungchenwa/HDR
Open Datasets | Yes | "To fill the gap in this field, we propose a large-scale dataset HDR28K and a diffusion-based network DiffHDR for historical document repair. Specifically, HDR28K contains 28,552 damaged-repaired image pairs with character-level annotations and multi-style degradations." Dataset and code: https://github.com/yeungchenwa/HDR
Dataset Splits | Yes | "The training set in HDR28K comprises 22,848 patch images, while the testing set consists of 5,704 patch images. Note that patch images from the same historical documents are not assigned to both training and testing sets."
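The split quoted above is document-level: all patches from one historical document land in a single split, which prevents train/test leakage through near-duplicate patches. A minimal sketch of such a split, assuming a hypothetical `doc_id` field on each patch record (not the released dataset's actual schema):

```python
import random

def split_by_document(patches, test_ratio=0.2, seed=0):
    """Assign whole documents to train or test, so patches from the
    same historical document never appear in both splits."""
    doc_ids = sorted({p["doc_id"] for p in patches})
    rng = random.Random(seed)
    rng.shuffle(doc_ids)
    n_test = max(1, int(len(doc_ids) * test_ratio))
    test_docs = set(doc_ids[:n_test])
    train = [p for p in patches if p["doc_id"] not in test_docs]
    test = [p for p in patches if p["doc_id"] in test_docs]
    return train, test

# Toy example: 3 documents with 2 patches each.
patches = [{"doc_id": d, "patch": i} for d in "ABC" for i in range(2)]
train, test = split_by_document(patches, test_ratio=0.34)
# No document contributes patches to both splits.
assert {p["doc_id"] for p in train}.isdisjoint({p["doc_id"] for p in test})
```

A random patch-level split would not give this guarantee, since two patches cropped from the same page can be nearly identical.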
Hardware Specification | Yes | "The training is conducted on 8 NVIDIA A6000 GPUs."
Software Dependencies | No | The paper mentions software components such as the AdamW optimizer, VGG19, and DPM-Solver++, but does not give version numbers for these or for any underlying programming languages or libraries.
Experiment Setup | Yes | "We adopt an AdamW optimizer to train DiffHDR with β1 = 0.95 and β2 = 0.999. The image size is 512 × 512. During classifier-free training, we set the conditional dropout probability to 8%, and we train the model with a batch size of 32 for a total of 165 epochs. The learning rate is set to 1 × 10⁻⁴ with a linear schedule. We set the guidance scales sd = 1.2 and sc,m = 1.5 and adopt DPM-Solver++ (Lu et al. 2022) as our sampler with 20 inference steps."
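The classifier-free recipe in the setup above (8% conditional dropout during training, guidance scales at sampling) can be sketched in a few lines. The function names and toy values below are illustrative assumptions; the sketch uses the standard single-condition classifier-free guidance formula, not the paper's exact dual-scale (sd, sc,m) combination:

```python
import random

COND_DROP_PROB = 0.08  # paper: 8% conditional dropout during training

def maybe_drop_condition(cond, rng):
    """With probability 8%, replace the condition with None (a null token),
    so the model also learns the unconditional prediction."""
    return None if rng.random() < COND_DROP_PROB else cond

def guided_eps(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance at sampling time: extrapolate
    from the unconditional prediction toward the conditional one."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Empirically, roughly 8% of training conditions are dropped.
rng = random.Random(0)
drops = sum(maybe_drop_condition("glyph", rng) is None for _ in range(10_000))
drop_rate = drops / 10_000

# scale > 1 (e.g. the paper's 1.2 or 1.5) pushes beyond the conditional prediction.
strengthened = guided_eps(0.0, 1.0, 1.2)
```

With scale = 1 the guided prediction reduces to the conditional one; scales above 1, like the paper's 1.2 and 1.5, strengthen conditioning at the cost of some diversity.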