Unsupervised Disentanglement of Content and Style via Variance-Invariance Constraints
Authors: Yuxuan Wu, Ziyu Wang, Bhiksha Raj, Gus Xia
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that V3 generalizes across multiple domains and modalities, successfully learning disentangled content and style representations... Experimental results show that our approach achieves more robust content-style disentanglement than unsupervised baselines, and outperforms even supervised methods... We evaluate V3 on both synthetic and real data... |
| Researcher Affiliation | Academia | Mohamed bin Zayed University of Artificial Intelligence; New York University Shanghai; Carnegie Mellon University |
| Pseudocode | No | The paper describes the model architecture and methodology using text, equations, and diagrams (Figure 2), but does not include any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | Code is available at https://github.com/Irislucent/variance-versus-invariance. Demo can be found at https://v3-content-style.github.io/V3-demo/. |
| Open Datasets | Yes | Street View House Numbers (SVHN) (Netzer et al., 2011)... Sprites with Actions Dataset (Sprites) (ope; Yingzhen and Mandt, 2018)... Librispeech Clean 100 Hours (Libri100) (Panayotov et al., 2015) |
| Dataset Splits | Yes | The dataset is split into the train set, validation set, and test set with a ratio of 8:1:1. ... We split the extra partition into additional training, validation and testing sets with a ratio of 8:1:1. ... We use 80% of the characters for training and the rest for validation and testing. |
| Hardware Specification | Yes | All models are trained on a single Nvidia RTX 4090 GPU. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' and refers to several models such as 'ResNet18', but does not specify versions for any programming languages or libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | For all models, we use the Adam optimizer with a learning rate of 0.001 (Kingma and Ba, 2014). The fragment sizes on PhoneNums, InsNotes, SVHN and Sprites are set to 10, 12, 2 and 6, respectively. The relativity r is set to 15, 15, 5, 10 and 5 on PhoneNums, InsNotes, SVHN, Sprites and Libri100, respectively... The V3 loss weight β is set to 1 by default on the InsNotes task, and 0.1 on the other datasets. The commitment loss weight α is set to 0.01. |
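The hyperparameters quoted above can be collected into a single configuration sketch. This is a minimal illustration, not the authors' code: the dataset keys, the function name `total_loss`, and the decomposition of the objective into reconstruction, V3, and commitment terms are assumptions based on the loss weights the paper reports (β for the V3 loss, α for the commitment loss).

```python
# Hedged sketch of the reported training configuration.
# All numbers are taken from the paper's experiment-setup description;
# the structure (dicts, total_loss) is hypothetical.

ADAM_LR = 1e-3   # Adam learning rate, used for all models
ALPHA = 0.01     # commitment loss weight α

# V3 loss weight β: 1 on InsNotes by default, 0.1 on the other datasets
BETA = {
    "PhoneNums": 0.1, "InsNotes": 1.0, "SVHN": 0.1,
    "Sprites": 0.1, "Libri100": 0.1,
}

# Fragment sizes (not reported for Libri100) and relativity r per dataset
FRAGMENT_SIZE = {"PhoneNums": 10, "InsNotes": 12, "SVHN": 2, "Sprites": 6}
RELATIVITY = {"PhoneNums": 15, "InsNotes": 15, "SVHN": 5,
              "Sprites": 10, "Libri100": 5}

def total_loss(recon: float, v3: float, commit: float, dataset: str) -> float:
    """Weighted objective (term names hypothetical):
    reconstruction + β · V3 loss + α · commitment loss."""
    return recon + BETA[dataset] * v3 + ALPHA * commit
```

For example, with toy loss values `recon=1.0, v3=2.0, commit=3.0`, the weighted total is 3.03 on InsNotes (β = 1) but 1.23 on SVHN (β = 0.1), reflecting how much more weight the V3 term receives on InsNotes.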