ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
Authors: Zhengzhuo Xu, Bowen Qu, Yiyan Qi, Sinan Du, Chengjin Xu, Chun Yuan, Jian Guo
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of the MoE connector and our initialization strategy, e.g., ChartMoE improves the accuracy of the previous state-of-the-art from 80.48% to 84.64% on the ChartQA benchmark. Extensive quantitative and qualitative studies demonstrate that ChartMoE significantly outperforms previous state-of-the-art across several benchmarks by a large margin. |
| Researcher Affiliation | Academia | Zhengzhuo Xu¹², Bowen Qu¹³, Yiyan Qi¹, Sinan Du², Chengjin Xu¹, Chun Yuan², Jian Guo¹⁴ — ¹International Digital Economy Academy, ²Tsinghua University, ³Peking University, ⁴Hong Kong University of Science and Technology (Guangzhou) |
| Pseudocode | No | The paper describes the architecture of ChartMoE and its training stages, but it does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | https://github.com/IDEA-FinAI/ChartMoE |
| Open Datasets | Yes | ChartMoE-Align, a dataset with nearly 1 million chart-table-JSON-code quadruples to conduct three alignment tasks (chart-table/JSON/code). ChartQA (Masry et al., 2022), PlotQA (Methani et al., 2020), ChartY provided by OneChart (Chen et al., 2024), MMC (Liu et al., 2023c), ChartGemma (Masry et al., 2024b), ChartBench (Xu et al., 2023), LLaVA-CC3M (Liu et al., 2023d). |
| Dataset Splits | Yes | We conduct instruction tuning using the training sets of ChartQA and ChartGemma to adjust the query styles and answer formats of these benchmarks. The ChartQA (Masry et al., 2022) test split consists of 1,250 questions in both human and augmented parts. |
| Hardware Specification | Yes | All training processes are conducted on 4 A100-40G GPUs. |
| Software Dependencies | No | The paper mentions using 'flash-attention (Dao, 2024)' but does not provide specific version numbers for any software libraries or dependencies, nor does it list multiple key software components with versions. |
| Experiment Setup | Yes | Table 9: Training hyperparameters of ChartMoE for all stages. Configuration: Alignment Pre-training / High-Quality Knowledge Learning / Chart-Specific Annealing Tuning ... Optimizer: AdamW ... Peak Learning Rate: 5e-5 ... Global Batch Size: 256 ... Gradient Acc.: 16 ... |
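The paper itself provides no pseudocode for its MoE connector, but the idea the review summarizes — routing visual tokens through a mixture of expert projectors whose outputs are combined by a learned gate — can be sketched in a few lines. This is a toy NumPy illustration under our own assumptions (soft routing, randomly initialized dense experts); the class name `MoEConnector` and all dimensions are hypothetical, and the real ChartMoE initializes its experts from diversely aligned connectors rather than at random.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class MoEConnector:
    """Toy mixture-of-experts connector: each visual token is projected
    by every expert, and a gating network mixes the expert outputs."""

    def __init__(self, d_vis, d_llm, n_experts=4):
        # Gating network: maps a visual token to per-expert weights.
        self.w_gate = rng.normal(0, 0.02, (d_vis, n_experts))
        # Each expert is a simple linear projector (weight, bias).
        self.experts = [
            (rng.normal(0, 0.02, (d_vis, d_llm)), np.zeros(d_llm))
            for _ in range(n_experts)
        ]

    def __call__(self, tokens):
        # tokens: (seq_len, d_vis) visual features from the encoder.
        gates = softmax(tokens @ self.w_gate)              # (seq, n_experts)
        outs = np.stack(
            [tokens @ W + b for W, b in self.experts], axis=1
        )                                                  # (seq, n_experts, d_llm)
        # Weighted sum of expert outputs per token.
        return (gates[..., None] * outs).sum(axis=1)       # (seq, d_llm)

conn = MoEConnector(d_vis=32, d_llm=64)
y = conn(rng.normal(size=(16, 32)))
print(y.shape)  # (16, 64)
```

The sketch uses soft (dense) routing purely for brevity; production MoE layers typically keep only the top-k gate values per token and renormalize them, which preserves the same output shape while activating fewer experts.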