reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

SketchAgent: Generating Structured Diagrams from Hand-Drawn Sketches

Authors: Cheng Tan, Qi Chen, Jingxuan Wei, Gaowei Wu, Zhangyang Gao, Siyuan Li, Bihui Yu, Ruifeng Guo, Stan Z. Li

IJCAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To evaluate the effectiveness of our approach, we propose the Sketch2Diagram Benchmark, a comprehensive dataset and evaluation framework encompassing eight diverse diagram categories... Extensive experiments demonstrate that SketchAgent outperforms state-of-the-art models across key metrics, achieving superior accuracy and visual coherence. (Table 2 and Table 3 provide detailed performance comparisons and ablation studies.)
Researcher Affiliation	Academia	Cheng Tan (1,2), Qi Chen (3,4), Jingxuan Wei (3,4), Gaowei Wu (3,4), Zhangyang Gao (1,2), Siyuan Li (1,2), Bihui Yu (3,4), Ruifeng Guo (3,4), Stan Z. Li (1). Affiliations: (1) Westlake University; (2) Zhejiang University; (3) University of Chinese Academy of Sciences; (4) Shenyang Institute of Computing Technology, Chinese Academy of Sciences.
Pseudocode	No	The system consists of three modules: the Sketch-to-Code Agent, the Editing Code Agent, and the Check Agent, each responsible for specific tasks. Given a sketch S and a user-specified instruction set Q, SketchAgent generates an initial code representation, refines it based on additional instructions, and verifies the final output before rendering the structured diagram. The overall workflow is illustrated in Figure 2. (The text describes the process and mathematical formulations, e.g., C_k = F_k(S, Q) and L_k = ... log P(...), but does not present a pseudocode block or algorithm steps.)
Open Source Code	No	The paper does not contain an explicit statement about releasing the source code for the SketchAgent methodology, nor does it provide a link to a code repository.
Open Datasets	Yes	To address the lack of standardized resources for sketch-to-diagram research, we introduce the Sketch2Diagram Benchmark, a comprehensive dataset and evaluation framework designed to support the development and assessment of models for this task. The dataset spans eight diverse diagram categories, including flowcharts, directed graphs, and model architectures, and consists of over 6,000 high-quality examples.
Dataset Splits	Yes	Table 1 summarizes token length statistics for the Sketch2Diagram dataset, categorized by sketch-to-code (S2C) and code-editing (C2C) tasks. The dataset contains a total of 4824 training samples and 1206 test samples.
Hardware Specification	Yes	Both agents were finetuned over four epochs on a 4 × 80GB A100 GPU setup.
Software Dependencies	No	The Sketch-to-Code Agent is based on Qwen2-VL-7B [Wang et al., 2024], while the Editing Code Agent utilizes Qwen2.5-Coder-7B [Hui et al., 2024]... The collected .tex files are then compiled into diagram images using standard LaTeX compilers. (No specific versions of programming languages, libraries, or compilers are provided beyond the named models themselves.)
Experiment Setup	Yes	Both agents were finetuned over four epochs on a 4 × 80GB A100 GPU setup. The input token length for both agents is set to 4096 tokens.
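
The three-module workflow quoted in the Pseudocode row (Sketch-to-Code Agent, then Editing Code Agent, then Check Agent) can be read as a simple sequential pipeline. The paper provides no pseudocode or source code, so the following is only a minimal sketch of that reading, not the authors' implementation. Every name in it (`SketchPipeline`, `run`, the three callables, the `demo` stubs) is hypothetical, and splitting the instruction set Q into one initial instruction plus a list of follow-up edits is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical signatures; the paper does not specify any programming interface.
SketchToCode = Callable[[bytes, str], str]  # (sketch image bytes, instruction) -> diagram code
EditCode = Callable[[str, str], str]        # (current code, instruction) -> revised code
CheckCode = Callable[[str], str]            # code -> verified (possibly corrected) code


@dataclass
class SketchPipeline:
    """Chains three stand-in agents in the order the quoted text describes."""
    sketch_to_code: SketchToCode
    edit_code: EditCode
    check: CheckCode

    def run(self, sketch: bytes, instruction: str, edits: List[str]) -> str:
        # Stage 1: generate an initial code representation from the sketch.
        code = self.sketch_to_code(sketch, instruction)
        # Stage 2: refine the code once per additional instruction, in order.
        for edit in edits:
            code = self.edit_code(code, edit)
        # Stage 3: verify the final code before it would be rendered.
        return self.check(code)


def demo() -> str:
    # Trivial stubs stand in for the fine-tuned models; they only build strings.
    pipeline = SketchPipeline(
        sketch_to_code=lambda sketch, q: f"% from sketch ({len(sketch)} bytes): {q}",
        edit_code=lambda code, q: code + f"\n% edit: {q}",
        check=lambda code: code.strip(),
    )
    return pipeline.run(b"\x89PNG", "draw a flowchart", ["make arrows red"])
```

Keeping the agents as plain callables means each stage could be swapped for a real model call without changing the control flow. Rendering the returned code (for example, compiling LaTeX) is deliberately left outside the sketch.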