Blending Two Styles: Generating Inter-domain Images with MiddleGAN
Authors: Collin MacDonald, Zhendong Chu, John Stankovic, Huajie Shao, Gang Zhou, Ashley Gao
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluating MiddleGAN on the CelebA dataset, we demonstrate that it successfully generates images that lie between the distributions of the input sets, both mathematically and visually. An additional experiment verifies the viability of MiddleGAN on handwritten digit datasets (DIDA and MNIST). |
| Researcher Affiliation | Academia | Collin MacDonald, Department of Computer Science, William & Mary; Zhendong Chu, Department of Computer Science, University of Virginia; John Stankovic, Department of Computer Science, University of Virginia; Huajie Shao, Department of Computer Science, William & Mary; Gang Zhou, Department of Computer Science, William & Mary; Ye Gao, Department of Computer Science, William & Mary |
| Pseudocode | No | The paper describes the model architecture and objective functions mathematically (e.g., Equation 2 for min_G max_{D_a, D_b} V(G, D_a, D_b)), and provides theoretical proofs for optimal parameters, but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Our implementation of MiddleGAN builds upon PyTorch's official implementation of DCGAN (Tutorials). This implementation was designed for use with the CelebA dataset, which sped up our initial development time. ... The paper's authors provided a PyTorch implementation of TransGAN (git, b) (which itself was based on their earlier work, AutoGAN (git, a)); we opted for a cleaner re-implementation of TransGAN (Sarıgün). |
| Open Datasets | Yes | We primarily evaluated MiddleGAN using the CelebA dataset (Liu et al., 2015), ... We selected the MNIST dataset (LeCun et al., 2010), published in 2010, and the DIDA dataset (Kusetogullari et al., 2020b;a), published in 2020. |
| Dataset Splits | Yes | To create the two domains we used the boolean Male identifier present in the original dataset annotations. By splitting on the Male identifier we created two smaller datasets, which we designated Male (with approx. 84,000 images) and Female (with approx. 118,000 images). |
| Hardware Specification | Yes | All models were trained on an NVIDIA A100 GPU, with an average per-model training time of around 12 hours. |
| Software Dependencies | No | Our implementation of MiddleGAN builds upon PyTorch's official implementation of DCGAN (Tutorials). ... leveraging the itertools library's `cycle` method ... We elected to use the pre-trained ResNet101 model provided by PyTorch ... Leveraging principal component analysis (PCA), we reduced the length of each extracted vector to 50 (skl, b). After performing PCA, we input the feature vectors into t-SNE ... scikit-learn implementation of t-SNE. ... We elected to use the pre-trained GoogLeNet model provided by PyTorch as the basis of our feature extractor (pyt, b; Szegedy et al., 2014). |
| Experiment Setup | Yes | Table 1: MiddleGAN Hyperparameters. Batch Size (batch size during training): 128, source DCGAN; NZ (latent space for generator): 128; Epochs (epochs during training): 200; LRD (learning rate of discriminators): 0.0001, source TransGAN; LRG (learning rate of generator): 0.0001, source TransGAN; B1 (beta 1 for Adam optimizers): 0.0, source TransGAN; B2 (beta 2 for Adam optimizers): 0.999, source DCGAN; N Critic (discriminator-to-generator training ratio): -; GP Weight (gradient penalty weight, used in loss function): -; Blend Ratio (blend ratio between input domains): 0.25, 0.50, 0.75; Image Size (size of the generated images): 32px, 64px, 128px, source - |
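The Pseudocode row notes that the paper states its objective only as Equation 2, a minimax game min_G max_{D_a, D_b} V(G, D_a, D_b) with one generator and two discriminators. The exact form of Equation 2 is not quoted in the audit; assuming it mirrors the standard GAN value function summed over the two discriminators, a minimal numeric sketch is:

```python
import numpy as np

def middlegan_value(da_real, da_fake, db_real, db_fake, eps=1e-12):
    """Assumed two-discriminator GAN value V(G, D_a, D_b): the standard
    GAN objective against each discriminator, sharing one generator.
    Inputs are discriminator output probabilities in (0, 1)."""
    v_a = np.mean(np.log(da_real + eps)) + np.mean(np.log(1.0 - da_fake + eps))
    v_b = np.mean(np.log(db_real + eps)) + np.mean(np.log(1.0 - db_fake + eps))
    return v_a + v_b

# At the classical GAN optimum each discriminator outputs 0.5 everywhere,
# so V = 2 * (log 0.5 + log 0.5) = -4 log 2.
half = np.full(8, 0.5)
v_opt = middlegan_value(half, half, half, half)
```

Under this assumed form, the generator is pulled toward samples that both discriminators score at 0.5, consistent with the paper's claim that outputs lie between the two input distributions.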
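The Dataset Splits row describes partitioning CelebA into Male and Female subsets using the boolean Male annotation. A minimal sketch of that split, using a hypothetical stand-in DataFrame for CelebA's attribute annotations (the real annotation file encodes each attribute as +1/-1 per image):

```python
import pandas as pd

# Hypothetical stand-in for CelebA's attribute annotations; the real
# list would cover ~202,000 images and 40 attributes.
attrs = pd.DataFrame(
    {"Male": [1, -1, 1, -1, -1]},
    index=["000001.jpg", "000002.jpg", "000003.jpg", "000004.jpg", "000005.jpg"],
)

# Split on the Male attribute to form the two input domains.
male_files = attrs.index[attrs["Male"] == 1].tolist()
female_files = attrs.index[attrs["Male"] == -1].tolist()
```

Applied to the full annotation file, this yields the approx. 84,000-image Male and 118,000-image Female domains reported above.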
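The Software Dependencies row describes an evaluation pipeline: extract features with a pretrained CNN, reduce each vector to length 50 with PCA, then embed with scikit-learn's t-SNE. A sketch on random stand-in features (2048-d, matching ResNet101's penultimate layer; real features would come from the truncated pretrained network):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Stand-in for per-image ResNet101 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2048))

# PCA down to length 50, as in the paper, then a 2-D t-SNE embedding.
reduced = PCA(n_components=50, random_state=0).fit_transform(features)
embedded = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(reduced)
```

Running PCA first is the standard preprocessing scikit-learn recommends for t-SNE: it suppresses noise and keeps the pairwise-distance computation tractable.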