Euler-Lagrange Analysis of Generative Adversarial Networks
Authors: Siddarth Asokan, Chandra Sekhar Seelamantula
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results based on synthesized Gaussian data demonstrate superior convergence behavior of the proposed approach in comparison with the baseline WGAN variants that employ weight-clipping, gradient or Lipschitz penalties on the discriminator on low-dimensional data. We demonstrate applications to real-world images considering latent-space prior matching in Wasserstein autoencoders and present performance comparisons on benchmark datasets such as MNIST, SVHN, CelebA, CIFAR-10, and Ukiyo-E. |
| Researcher Affiliation | Academia | Siddarth Asokan EMAIL Robert Bosch Centre for Cyber-Physical Systems Indian Institute of Science Bengaluru-560012, India Chandra Sekhar Seelamantula EMAIL Department of Electrical Engineering Indian Institute of Science Bengaluru-560012, India |
| Pseudocode | Yes | Algorithm 1 (WAEFR): Training the Wasserstein autoencoder with a Fourier-series discriminator.<br>Inputs: training data x ∼ p_d, prior distribution N(µ_z, Σ_z), batch size N, learning rate η, number of GAN pre-training epochs n_GAN.<br>Models: encoder/generator Enc_θ; decoder Dec_ψ; Fourier-series discriminator D_FS.<br>GAN pre-training: for n_GAN iterations do<br>• Sample a batch of N real samples x ∼ p_d; compute the latent encodings z̃ = Enc_θ(x); sample a batch of N prior samples z ∼ N(µ_z, Σ_z).<br>• Compute the Fourier coefficients α_m and β_m, the discriminator coefficients γ_m, and the optimal Lagrange multiplier λ_FS.<br>• Evaluate the WGAN-FS loss L_G(D_FS(z̃), D_FS(z)) and update the generator: Enc_θ ← Enc_θ − η∇_θ[L_G].<br>end for<br>WAEFR training: while Enc_θ, Dec_ψ not converged do<br>• Sample a batch of N real samples x ∼ p_d; compute the latent encodings z̃ = Enc_θ(x) and the reconstructions x̂ = Dec_ψ(z̃).<br>• Evaluate the autoencoder loss L_AE(x, x̂); update Enc_θ ← Enc_θ − η∇_θ[L_AE] and Dec_ψ ← Dec_ψ − η∇_ψ[L_AE].<br>• Sample a batch of N prior samples z ∼ N(µ_z, Σ_z); compute the Fourier coefficients α_m, β_m, and γ_m, and the optimal Lagrange multiplier λ_FS.<br>• Evaluate the WGAN-FS loss L_G(D_FS(z̃), D_FS(z)); update the generator: Enc_θ ← Enc_θ − η∇_θ[L_G].<br>end while<br>Output: Reconstructed random prior samples Dec_ψ(z). |
| Open Source Code | Yes | The source code for the TensorFlow 2.0 (Abadi et al., 2016) implementations and the comparisons presented in this paper, the pre-trained models, and high-resolution images of the results are available at https://github.com/DarthSid95/ELF_GANs. |
| Open Datasets | Yes | We present results obtained by training the WAE on several datasets such as MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011), CelebA (Liu et al., 2015), Ukiyo-E (Pinkney and Adler, 2020), and CIFAR-10 (Krizhevsky, 2009). |
| Dataset Splits | No | The models are trained on 2×10⁴ batches for MNIST, 5×10⁴ batches for CIFAR-10, 7×10⁴ batches for SVHN, and 10⁵ batches for CelebA and Ukiyo-E. The quality of the autoencoder's reconstructed samples is measured in terms of the average reconstruction error (RE) on unseen test-set images. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware (e.g., GPU models, CPU types) used for the experiments. |
| Software Dependencies | Yes | The implementation was carried out using TensorFlow 2.0 (Abadi et al., 2016). |
| Experiment Setup | Yes | The generator in all GAN variants is a linear transformation of the input: y = wz + b. Gaussian training data is drawn from N(10, 1), while the noise z input to the generator is sampled from the standard Gaussian N(0, 1). While WGAN-FS uses a closed-form Fourier-series discriminator, the baselines use a three-layer fully connected discriminator network with leaky-ReLU activations. The batch size is 500. For the baseline techniques, each training step involves five iterations of the discriminator optimization followed by one iteration of the generator; WGAN-FS, on the other hand, uses a single-shot discriminator in each training step. Based on additional experiments conducted in Appendix E.1, the period T = 2π/ω₀ is set to 15 and the truncation frequency M to 10. The Adam optimizer (Kingma and Ba, 2015) is used with a learning rate η = 0.05, and the exponential decay parameters for the first and second moments are β₁ = 0.5 and β₂ = 0.999, respectively. The convolutional autoencoder model proposed by Tolstikhin et al. (2018) is employed for both the baseline WAEs and WAEFR. The prior distribution is a 16-D Gaussian for MNIST and a 64-D Gaussian for the other datasets. In WAEFR, the Fourier-series period is set to T = 15, and the latent representation is passed through a linear activation with saturation (clipping) of the latent-vector amplitudes beyond [−10, 10] in order to prevent latching on to an aliased Fourier representation. A batch size of 150 is used. The networks are trained using the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 2×10⁻⁴ for all variants. The models are trained on 2×10⁴ batches for MNIST, 5×10⁴ batches for CIFAR-10, 7×10⁴ batches for SVHN, and 10⁵ batches for CelebA and Ukiyo-E. For all WAE baselines, the discriminator is updated thrice for every update of the generator. Pre-training the GAN component (encoder-discriminator pair) for 10 epochs was found to result in faster convergence across all WAE-GAN variants. |
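The pseudocode row above hinges on computing truncated Fourier-series coefficients of the discriminator from batch samples. The following is a minimal illustrative sketch, not the authors' implementation: the helper names (`fourier_features`, `wgan_fs_loss`) are invented, and the squared coefficient-mismatch surrogate stands in for the paper's closed-form WGAN-FS loss and Lagrange multiplier λ_FS, which are not reproduced here. It uses the period T = 15 and truncation frequency M = 10 from the setup row.

```python
import numpy as np

def fourier_features(z, M=10, T=15.0):
    # Truncated Fourier basis (cos and sin terms up to frequency M*w0)
    # evaluated at 1-D samples z; w0 = 2*pi/T as in the paper's setup.
    w0 = 2.0 * np.pi / T
    m = np.arange(1, M + 1)
    return np.concatenate([np.cos(np.outer(z, m * w0)),
                           np.sin(np.outer(z, m * w0))], axis=1)

def wgan_fs_loss(z_enc, z_prior, M=10, T=15.0):
    # Illustrative surrogate only: squared norm of the gap between the
    # batch-averaged Fourier coefficients of encoded latents and prior
    # samples. The actual WGAN-FS loss is derived in closed form in the
    # paper and is not replicated here.
    diff = (fourier_features(z_enc, M, T).mean(axis=0)
            - fourier_features(z_prior, M, T).mean(axis=0))
    return float(diff @ diff)

rng = np.random.default_rng(0)
z_enc = rng.normal(0.0, 1.0, size=500)    # stand-in for Enc_theta(x)
z_prior = rng.normal(0.0, 1.0, size=500)  # prior batch, N(0, 1)
mismatch = wgan_fs_loss(z_enc, z_prior)   # small when distributions agree
```

Since all the basis functions are batch averages, the surrogate is computed in a single pass over the batch, which is the sense in which the discriminator is "single-shot" rather than iteratively optimized.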
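The 1-D Gaussian experiment in the setup row (linear generator y = wz + b, data from N(10, 1), noise from N(0, 1), batch size 500, learning rate 0.05) can be reproduced as a toy script. This is a hedged sketch: the moment-matching loss below is an assumption chosen for simplicity and is not the paper's WGAN-FS objective, and plain gradient descent replaces Adam.

```python
import numpy as np

rng = np.random.default_rng(1)
w, b = 0.1, 0.0           # parameters of the linear generator y = w*z + b
eta, batch = 0.05, 500    # learning rate and batch size from the setup row

for _ in range(3000):
    z = rng.normal(0.0, 1.0, size=batch)   # generator input noise, N(0, 1)
    x = rng.normal(10.0, 1.0, size=batch)  # real Gaussian data, N(10, 1)
    y = w * z + b                          # linear generator output
    # Surrogate loss L = (mean gap)^2 + (variance gap)^2 -- an assumed
    # stand-in for the WGAN-FS objective, which is not reproduced here.
    dm = y.mean() - x.mean()
    dv = y.var() - x.var()
    grad_b = 2.0 * dm                      # dL/db (variance term is b-free)
    grad_w = 2.0 * dm * z.mean() + 4.0 * dv * w * z.var()  # dL/dw
    w -= eta * grad_w
    b -= eta * grad_b

# The optimum matches the data distribution N(10, 1): b ~ 10 and |w| ~ 1
# (the sign of w is not identifiable, since -z and z are equidistributed).
```

The non-identifiable sign of w is one reason the toy problem is a useful sanity check for convergence behavior across GAN variants.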