TAMER: Tree-Aware Transformer for Handwritten Mathematical Expression Recognition

Authors: Jianhua Zhu, Wenqi Zhao, Yu Li, Xingjian Hu, Liangcai Gao

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. Experimental results on the CROHME datasets demonstrate that TAMER outperforms traditional sequence decoding and tree decoding models, especially on complex mathematical structures, achieving state-of-the-art (SOTA) performance.
Researcher Affiliation: Academia. Wangxuan Institute of Computer Technology, Peking University, Beijing, China. EMAIL, EMAIL, EMAIL
Pseudocode: No. The paper describes the methods and model architecture using text, mathematical equations, and figures, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: Yes. Code: https://github.com/qingzhenduyu/TAMER/
Open Datasets: Yes. The CROHME dataset, originating from the Competitions on Recognition of Online Handwritten Mathematical Expressions (CROHME) (Mouchère et al. 2014, 2016; Mahdavi et al. 2019) held over multiple years, is the preeminent benchmark for handwritten mathematical expression recognition. The HME100K dataset (Yuan et al. 2022) is a large-scale collection of real-scene handwritten mathematical expressions.
Dataset Splits: Yes. The CROHME training set comprises 8,836 handwritten mathematical expressions (HMEs), while the test sets from CROHME 2014 (Mouchère et al. 2014), 2016 (Mouchère et al. 2016), and 2019 (Mahdavi et al. 2019) contain 986, 1,147, and 1,199 HMEs, respectively. The HME100K dataset (Yuan et al. 2022) contains 74,502 training images and 24,607 testing images.
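The split sizes quoted above can be captured in a small sanity-check script. The dictionary layout and helper function below are illustrative (not from the paper); only the counts come from the reported splits.

```python
# Dataset split sizes as reported for CROHME and HME100K.
# The dict structure is an illustrative convention; the counts are from the text.
SPLITS = {
    "CROHME": {
        "train": 8836,
        "test_2014": 986,
        "test_2016": 1147,
        "test_2019": 1199,
    },
    "HME100K": {
        "train": 74502,
        "test": 24607,
    },
}

def total_images(dataset: str) -> int:
    """Sum the sizes of all splits for one dataset."""
    return sum(SPLITS[dataset].values())

print(total_images("CROHME"))   # 8836 + 986 + 1147 + 1199 = 12168
print(total_images("HME100K"))  # 74502 + 24607 = 99109
```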
Hardware Specification: No. The paper mentions using DenseNet and Transformer models and comparing against other methods, but it does not specify the CPU, GPU, or other hardware used to run the experiments.
Software Dependencies: No. The paper mentions models such as DenseNet and Transformer and refers to the open-source code of the CoMER baseline, but it does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup: No. TAMER uses CoMER (Zhao and Gao 2022) as its baseline, employing a DenseNet (Huang et al. 2017) with the same hyperparameter configuration as the encoder and the same Transformer (Vaswani et al. 2017) as the decoder; further details are deferred to the paper's Appendix. Specific hyperparameter values and training configurations are not detailed in the main text.
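Since the main text only names the architectural components, the reported setup can be summarized as a small configuration record. The record layout and every field name below are hypothetical; only the component choices (CoMER baseline, DenseNet encoder, Transformer decoder) come from the paper.

```python
from dataclasses import dataclass

# Hypothetical configuration record summarizing TAMER's reported setup.
# Component names are from the paper; the dataclass layout and field
# names are assumptions made for illustration only.
@dataclass(frozen=True)
class TamerSetup:
    baseline: str = "CoMER (Zhao and Gao 2022)"
    encoder: str = "DenseNet (Huang et al. 2017)"
    decoder: str = "Transformer (Vaswani et al. 2017)"
    # Hyperparameters follow the CoMER baseline; concrete values are not
    # given in the main text, so none are recorded here.
    hyperparameters_in_main_text: bool = False

setup = TamerSetup()
print(setup.encoder)
```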