Graphic Design with Large Multimodal Model
Authors: Yutao Cheng, Zhao Zhang, Maoke Yang, Hui Nie, Chunyuan Li, Xinglong Wu, Jie Shao
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Graphist outperforms prior art and establishes a strong baseline for this field. ... After quantitative and qualitative analysis, it is demonstrated that Graphist is a state-of-the-art solution that not only performs well on the traditional GLG task but also achieves remarkable results on the HLG task. The paper includes sections such as 'Experiment Datasets', 'Evaluation Metrics', 'Comparison with SoTA', and 'Ablation Studies', along with tables of results, indicating an empirical study. |
| Researcher Affiliation | Collaboration | The authors are affiliated with '1 ByteDance Inc.' (an industry entity) and '2 Institute of Computing Technology, Chinese Academy of Sciences' (an academic institution), indicating an industry-academia collaboration. |
| Pseudocode | No | The paper describes the Graphist architecture and training strategy in prose, and includes a pipeline diagram (Figure 2), but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions a 'Graphist web demo' but does not explicitly state that the source code for the methodology described in the paper is publicly available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | The paper explicitly mentions and provides access information for the Crello dataset: 'Crello dataset furnishes an array of graphic compositions derived from a web-based design utility', with a footnote linking to https://huggingface.co/datasets/cyberagent/crello. It also cites 'Flickr30k (Plummer et al. 2015)' as a training dataset. |
| Dataset Splits | Yes | The paper states: 'In Flex-DM (Inoue et al. 2023b), the dataset is partitioned into 19,095 training, 1,951 validation, and 2,375 testing examples. ... we used the intersection of all parts in the two version test sets, a total of 242 graphic compositions as the test set in experiments.' |
| Hardware Specification | No | The paper describes the model architecture, training strategy, and experimental results, but does not specify the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using specific models like ViT-L/14 (initialized with CLIP parameters) and Qwen1.5-0.5B/7B for the LLM foundation, but it does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The paper provides specific experimental setup details in the 'Training Strategy' section and Table 1, including batch size ('BS' 128, 64), sequence length ('Length' 1536, 2048, 3584), training steps (10k for Stage-1, 20k for Stage-2 and Stage-3), and a random shuffling probability (0.75) for input elements. |
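The split bookkeeping quoted in the 'Dataset Splits' row can be sanity-checked with a short script. The split sizes below are the figures reported in the table; the two per-version test-set ID lists are hypothetical placeholders standing in for the real Crello version identifiers, used only to illustrate the intersection step.

```python
# Sketch of the dataset-split arithmetic described in the review table.
# The split sizes are from Flex-DM (Inoue et al. 2023b); the ID sets are
# synthetic stand-ins, since the real Crello IDs live in the dataset itself.

# Reported Flex-DM partition of the Crello dataset.
splits = {"train": 19_095, "val": 1_951, "test": 2_375}
print(sum(splits.values()))  # total compositions across the three splits

# The paper evaluates on the intersection of the two dataset versions'
# test sets, yielding 242 graphic compositions. With hypothetical IDs:
test_v1 = {f"id_{i}" for i in range(2_375)}           # version-1 test set
test_v2 = {f"id_{i}" for i in range(2_133, 2_500)}    # version-2 test set
shared_test = test_v1 & test_v2                       # 2_133..2_374 overlap
print(len(shared_test))  # 242
```

The set intersection is the operative step: only compositions present in both versions' test splits are kept, which is why the evaluation set (242) is much smaller than either version's full test split (2,375).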