Latent Diffusion-Enhanced Virtual Try-On via Optimized Pseudo-Label Generation
Authors: Chenghu Du, Junyin Wang, Feng Yu, Shengwu Xiong
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extended experiments demonstrate that our proposed method is superior to state-of-the-art methods. Our experiments use VITON-HD (Choi et al. 2021), VITON (Han et al. 2018), which are two challenging datasets in virtual try-on. |
| Researcher Affiliation | Academia | 1 School of Computer Science and Artificial Intelligence, Wuhan University of Technology 2 Shanghai Artificial Intelligence Laboratory 3 Interdisciplinary Artificial Intelligence Research Institute, Wuhan College 4 School of Computer Science and Artificial Intelligence, Wuhan Textile University |
| Pseudocode | No | The paper describes the proposed method in detail using mathematical formulations and textual descriptions, but it does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Our experiments use VITON-HD (Choi et al. 2021), VITON (Han et al. 2018), which are two challenging datasets in virtual try-on. |
| Dataset Splits | Yes | The dataset is divided into a training set with 14,221 groups and a testing set with 2,032 groups. ... It includes 13,679 image groups and is split into a training set with 11,647 groups and a testing set with 2,032 groups. |
| Hardware Specification | Yes | During the training process, we utilize two NVIDIA RTX 4090 GPUs for a duration of 2 days. |
| Software Dependencies | Yes | We employ Stable Diffusion v1.4 (Rombach et al. 2022) as the backbone for our architecture and initialize its denoising U-Net with the weights from the U-Net in PbE (Yang et al. 2023). |
| Experiment Setup | Yes | The AdamW optimizer (Loshchilov and Hutter 2017) is employed with a learning rate of 1 × 10⁻⁴, and the batch size is set to 2 for training over 40 epochs. For inference, we adopt the pseudo linear multi-step (PLMS) sampling method (Liu et al. 2021b), setting the number of sampling steps to 50. The hyperparameter in the loss function is set as follows: λ = 1 × 10⁻⁴. |
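The reported setup can be collected into a single configuration sketch. This is a minimal illustration of the hyperparameters quoted in the table (AdamW at 1 × 10⁻⁴, batch size 2, 40 epochs, λ = 1 × 10⁻⁴, 50 PLMS steps); the `TrainConfig` class and step-count helper are hypothetical conveniences, not code from the paper, and the step count assumes one optimizer update per batch with the final partial batch kept.

```python
import math
from dataclasses import dataclass

@dataclass
class TrainConfig:
    """Hyperparameters as reported in the paper's experiment setup."""
    learning_rate: float = 1e-4   # AdamW learning rate
    batch_size: int = 2
    epochs: int = 40
    lambda_loss: float = 1e-4     # loss-function hyperparameter λ
    plms_steps: int = 50          # PLMS sampling steps at inference

def total_optimizer_steps(num_samples: int, cfg: TrainConfig) -> int:
    # Assumes one optimizer update per batch and drop_last=False,
    # so the final partial batch still counts as a step.
    steps_per_epoch = math.ceil(num_samples / cfg.batch_size)
    return steps_per_epoch * cfg.epochs

cfg = TrainConfig()
# VITON-HD training split: 11,647 groups (per the Dataset Splits row)
print(total_optimizer_steps(11_647, cfg))  # → 232960
```

Under these assumptions, the 2-day run on two RTX 4090s corresponds to roughly 233k optimizer updates on the VITON-HD split.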