MV-VTON: Multi-View Virtual Try-On with Diffusion Models
Authors: Haoyu Wang, Zhilu Zhang, Donglin Di, Shiliang Zhang, Wangmeng Zuo
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that the proposed method not only achieves state-of-the-art results on the MV-VTON task using our MVG dataset, but also has superiority on the frontal-view virtual try-on task using the VITON-HD and Dress Code datasets. Quantitative Evaluation: Table 1 reports the quantitative results in the paired setting, and Table 2 shows the unpaired setting's results. |
| Researcher Affiliation | Collaboration | Haoyu Wang (1,2,*), Zhilu Zhang (2), Donglin Di (3), Shiliang Zhang (1), Wangmeng Zuo (2); 1: State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University; 2: Harbin Institute of Technology; 3: Space AI, Li Auto. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the proposed method using descriptive text and diagrams (e.g., Figure 3: 'Overview of MV-VTON'), but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code https://github.com/hywang2002/MV-VTON |
| Open Datasets | Yes | We conduct extensive experiments not only on MV-VTON task using the MVG dataset, but also on the frontal-view VTON task using VITON-HD (Lee et al. 2022) and Dress Code (Morelli et al. 2022) datasets. |
| Dataset Splits | No | The paper distinguishes a 'paired test setting' and an 'unpaired test setting', and states 'We follow previous work for the use of them' regarding the VITON-HD and Dress Code datasets, but it does not explicitly provide counts or percentages for the training, validation, and test splits of any dataset. |
| Hardware Specification | Yes | We train our model on 2 NVIDIA Tesla A100 GPUs for 40 epochs with a batch size of 4 and a learning rate of 1e-5. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and Paint by Example as a backbone, but does not provide specific version numbers for any software libraries, programming languages, or frameworks (e.g., PyTorch, Python, CUDA versions) used for implementation. |
| Experiment Setup | Yes | The hyper-parameter λ1 is set to 1e-1, and λperc is set to 1e-4. We train our model on 2 NVIDIA Tesla A100 GPUs for 40 epochs with a batch size of 4 and a learning rate of 1e-5. We use the AdamW (Loshchilov and Hutter 2017) optimizer with β1 = 0.9, β2 = 0.999. |
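For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration dict. This is a minimal sketch for reproduction notes only: the key names (`lambda_1`, `lambda_perc`, etc.) are our labels, not identifiers from the authors' released code.

```python
# Training configuration as reported in the MV-VTON paper (AAAI 2025).
# Key names are illustrative; values are taken verbatim from the paper.
train_config = {
    "optimizer": "AdamW",        # Loshchilov & Hutter 2017
    "betas": (0.9, 0.999),       # β1, β2
    "learning_rate": 1e-5,
    "batch_size": 4,
    "epochs": 40,
    "num_gpus": 2,               # NVIDIA Tesla A100
    "lambda_1": 1e-1,            # λ1 loss weight
    "lambda_perc": 1e-4,         # λperc perceptual-loss weight
}

def describe(cfg: dict) -> str:
    """Render a one-line summary of the training setup."""
    return (f"{cfg['optimizer']} lr={cfg['learning_rate']:g} "
            f"bs={cfg['batch_size']} epochs={cfg['epochs']}")
```

In a PyTorch reimplementation these values would map onto `torch.optim.AdamW(params, lr=1e-5, betas=(0.9, 0.999))`; note the paper gives no weight-decay value, so that would have to be left at the library default or tuned.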