TAMER: Tree-Aware Transformer for Handwritten Mathematical Expression Recognition

Authors: Jianhua Zhu, Wenqi Zhao, Yu Li, Xingjian Hu, Liangcai Gao

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. Experimental results on the CROHME datasets demonstrate that TAMER outperforms traditional sequence decoding and tree decoding models, especially on complex mathematical structures, achieving state-of-the-art (SOTA) performance.
Researcher Affiliation: Academia. Wangxuan Institute of Computer Technology, Peking University, Beijing, China. EMAIL, EMAIL, EMAIL
Pseudocode: No. The paper describes the methods and model architecture using text, mathematical equations, and figures, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: Yes. Code: https://github.com/qingzhenduyu/TAMER/
Open Datasets: Yes. The CROHME dataset, originating from the Competitions on Recognition of Online Handwritten Mathematical Expressions (CROHME) (Mouchère et al. 2014, 2016; Mahdavi et al. 2019) held over multiple years, is the preeminent benchmark for handwritten mathematical expression recognition. The HME100K dataset (Yuan et al. 2022) is a large-scale collection of real-scene handwritten mathematical expressions.
Dataset Splits: Yes. The CROHME training set comprises 8,836 handwritten mathematical expressions (HMEs), while the test sets from CROHME 2014 (Mouchère et al. 2014), 2016 (Mouchère et al. 2016), and 2019 (Mahdavi et al. 2019) contain 986, 1,147, and 1,199 HMEs, respectively. The HME100K dataset (Yuan et al. 2022) contains 74,502 training images and 24,607 testing images.
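The split sizes quoted above can be captured in a small sanity-check script. The dictionary layout and helper function below are illustrative (not from the paper); only the counts come from the reported splits.

```python
# Dataset split sizes as reported for CROHME and HME100K.
# The dict structure is an illustrative convention; the counts are from the text.
SPLITS = {
    "CROHME": {
        "train": 8836,
        "test_2014": 986,
        "test_2016": 1147,
        "test_2019": 1199,
    },
    "HME100K": {
        "train": 74502,
        "test": 24607,
    },
}

def total_images(dataset: str) -> int:
    """Sum the sizes of all splits for one dataset."""
    return sum(SPLITS[dataset].values())

print(total_images("CROHME"))   # 8836 + 986 + 1147 + 1199 = 12168
print(total_images("HME100K"))  # 74502 + 24607 = 99109
```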
Hardware Specification: No. The paper mentions using DenseNet and Transformer models and comparing against other methods, but it does not specify the CPU, GPU, or other hardware used to run the experiments.
Software Dependencies: No. The paper mentions models such as DenseNet and Transformer and refers to the open-source code of the CoMER baseline, but it does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup: No. TAMER uses CoMER (Zhao and Gao 2022) as its baseline, employing a DenseNet (Huang et al. 2017) with the same hyperparameter configuration as the encoder and the same Transformer (Vaswani et al. 2017) as the decoder; further details are deferred to the paper's Appendix. Specific hyperparameter values and training configurations are not detailed in the main text.
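Since the main text only names the architectural components, the reported setup can be summarized as a small configuration record. The record layout and every field name below are hypothetical; only the component choices (CoMER baseline, DenseNet encoder, Transformer decoder) come from the paper.

```python
from dataclasses import dataclass

# Hypothetical configuration record summarizing TAMER's reported setup.
# Component names are from the paper; the dataclass layout and field
# names are assumptions made for illustration only.
@dataclass(frozen=True)
class TamerSetup:
    baseline: str = "CoMER (Zhao and Gao 2022)"
    encoder: str = "DenseNet (Huang et al. 2017)"
    decoder: str = "Transformer (Vaswani et al. 2017)"
    # Hyperparameters follow the CoMER baseline; concrete values are not
    # given in the main text, so none are recorded here.
    hyperparameters_in_main_text: bool = False

setup = TamerSetup()
print(setup.encoder)
```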