GarmentDiffusion: 3D Garment Sewing Pattern Generation with Multimodal Diffusion Transformers
Authors: Xinyu Li, Qi Yao, Yuanda Wang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We achieve new state-of-the-art results on DressCodeData, as well as on the largest sewing pattern dataset, namely GarmentCodeData. The project website is available at https://shenfu-research.github.io/Garment-Diffusion/. 4 Experiments 4.1 Datasets 4.2 Multimodal Data Synthesis 4.3 Evaluation Metrics 4.4 Implementation Details 4.5 Comparison with State-of-the-Art Methods 4.6 Ablation Study |
| Researcher Affiliation | Collaboration | Xinyu Li (1,2), Qi Yao (2), Yuanda Wang (2); 1: Zhejiang University, 2: Shenfu Research; EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology in prose and mathematical formulations but does not include any clearly labeled pseudocode or algorithm blocks. For example, Section 3 'Method' details the sewing pattern representation and generation process, but without a structured pseudocode format. |
| Open Source Code | No | The project website is available at https://shenfu-research.github.io/Garment-Diffusion/. This is a link to a project website/overview page, not an explicit statement of code release or a direct link to a code repository. |
| Open Datasets | Yes | We use SewFactory [Liu et al., 2023], DressCodeData [Korosteleva and Lee, 2021; He et al., 2024] and GarmentCodeData (V2) [Korosteleva et al., 2024] for training and evaluation. For SewFactory, we employ off-the-shelf rendered garments superimposed on diverse human poses as image prompts (without text prompts). For DressCodeData and GarmentCodeData, we designed multimodal data annotation pipelines (depicted in Figure 4) to generate both text and image prompts for sewing patterns. |
| Dataset Splits | Yes | For SewFactory, we use our own split in which 90% of randomly selected data points are used for training, with the remaining 10% evenly divided between validation and testing. For DressCodeData and GarmentCodeData (V2), we adhere strictly to the official splits provided by the authors for training, validation, and testing. |
| Hardware Specification | Yes | Our model is distributedly trained across 8 A10 GPUs (24GB) with the Hugging Face Accelerate library [Gugger et al., 2022]. |
| Software Dependencies | No | The paper mentions several tools and models, such as the 'AdamW optimizer', the 'Hugging Face Accelerate library', 'OpenAI ViT-H/14', 'CLIP', 'Llama-3.1-8B-Instruct', 'MistoLine', and 'Anything-XL fine-tuned from SD-XL'. However, specific version numbers for these software components or libraries are not provided. |
| Experiment Setup | Yes | We adopt a DDPM noise scheduler for diffusion training, with a maximum of 1,000 denoising steps and a linear beta schedule (beta_start = 1e-4, beta_end = 2e-2). We use the AdamW optimizer [Loshchilov and Hutter, 2019] with betas = (0.95, 0.999), a constant learning rate of 1e-4 and a weight decay of 1e-2. The training epoch count is set to 1,000 with an early-stop criterion. We evaluate the model at denoising steps of 50, 200, 500, and 1000 every 10 epochs. Based on the results shown in Figure 5, we select 50 denoising steps for inference. The multimodal training is performed in a round-robin fashion, following the order of image prompts, text prompts, and image-and-text prompts. |
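The 90/5/5 SewFactory split quoted in the Dataset Splits row can be sketched in a few lines of Python. The function name and fixed seed below are illustrative assumptions, not details from the paper:

```python
import random

def split_dataset(items, seed=42):
    """Randomly split data points into train/val/test, mirroring the
    described SewFactory split: 90% for training, with the remaining
    10% evenly divided between validation and testing."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * 0.9)
    n_val = (n - n_train) // 2
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 900 50 50
```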
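The diffusion settings reported in the Experiment Setup row (1,000 training timesteps, linear beta schedule from 1e-4 to 2e-2) correspond to the following minimal plain-Python sketch. It is a stand-in for what a library scheduler such as Hugging Face diffusers' `DDPMScheduler(beta_schedule="linear")` would compute; the endpoint-inclusive interpolation convention is an assumption, and the optimizer dictionary simply restates the quoted AdamW hyperparameters:

```python
def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Linearly interpolate the noise variance (beta) from beta_start
    to beta_end over num_steps diffusion timesteps, matching the
    reported DDPM training configuration."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

betas = linear_beta_schedule()
print(len(betas), betas[0])

# AdamW settings quoted in the table (constant LR, no schedule):
adamw_config = {"lr": 1e-4, "betas": (0.95, 0.999), "weight_decay": 1e-2}
```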