Euler-Lagrange Analysis of Generative Adversarial Networks

Authors: Siddarth Asokan, Chandra Sekhar Seelamantula

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results based on synthesized Gaussian data demonstrate superior convergence behavior of the proposed approach in comparison with the baseline WGAN variants that employ weight-clipping, gradient or Lipschitz penalties on the discriminator on low-dimensional data. We demonstrate applications to real-world images considering latent-space prior matching in Wasserstein autoencoders and present performance comparisons on benchmark datasets such as MNIST, SVHN, CelebA, CIFAR-10, and Ukiyo-E."
Researcher Affiliation | Academia | "Siddarth Asokan (EMAIL), Robert Bosch Centre for Cyber-Physical Systems, Indian Institute of Science, Bengaluru-560012, India; Chandra Sekhar Seelamantula (EMAIL), Department of Electrical Engineering, Indian Institute of Science, Bengaluru-560012, India"
Pseudocode | Yes | Algorithm 1: WAEFR — Training the Wasserstein autoencoder with a Fourier-series discriminator.
    Inputs: training data x ~ p_d; prior distribution N(µ_z, Σ_z); batch size N; learning rate η; number of GAN pre-training epochs n_GAN.
    Models: encoder/generator Enc_θ; decoder Dec_ψ; Fourier-series discriminator D_FS.
    GAN pre-training:
    for n_GAN iterations do
        Sample x ~ p_d                      (a batch of N real data samples)
        Compute z̃ = Enc_θ(x)                (latent encoding of real data)
        Sample z ~ N(µ_z, Σ_z)              (a batch of N prior-distribution samples)
        Compute Fourier coefficients α_m and β_m
        Compute discriminator coefficients γ_m
        Compute optimal Lagrange multiplier λ*_FS
        Evaluate WGAN-FS loss L_G(D_FS(z̃), D_FS(z))
        Update generator: Enc_θ ← Enc_θ − η∇_θ[L_G]
    end for
    WAEFR training:
    while Enc_θ, Dec_ψ not converged do
        Sample x ~ p_d                      (a batch of N real samples)
        Compute z̃ = Enc_θ(x)                (latent encoding of real samples)
        Compute x̃ = Dec_ψ(z̃)                (reconstructed samples)
        Evaluate autoencoder loss L_AE(x, x̃)
        Update autoencoder: Enc_θ ← Enc_θ − η∇_θ[L_AE]; Dec_ψ ← Dec_ψ − η∇_ψ[L_AE]
        Sample z ~ N(µ_z, Σ_z)              (a batch of N prior-distribution samples)
        Compute Fourier coefficients α_m, β_m, and γ_m
        Compute optimal Lagrange multiplier λ*_FS
        Evaluate WGAN-FS loss L_G(D_FS(z̃), D_FS(z))
        Update generator: Enc_θ ← Enc_θ − η∇_θ[L_G]
    end while
    Output: reconstructed random prior samples Dec_ψ(z).
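The single-shot Fourier-series discriminator step in Algorithm 1 can be illustrated in one dimension. The NumPy sketch below is a deliberate simplification, not the paper's implementation: the 1/(mω₀)² coefficient scaling mimics the Poisson-equation solution underlying WGAN-FS, and the optimal-λ*_FS normalization is omitted; the scaling choice is an assumption made here for illustration.

```python
import numpy as np

def fourier_moments(x, M, w0):
    """Empirical Fourier moments E[cos(m*w0*x)] and E[sin(m*w0*x)], m = 1..M."""
    m = np.arange(1, M + 1)[:, None]            # shape (M, 1)
    return np.cos(m * w0 * x).mean(axis=1), np.sin(m * w0 * x).mean(axis=1)

def fs_discriminator(x_real, x_fake, M=10, T=15.0):
    """Build a truncated Fourier-series discriminator D(x) from one batch pair.

    Assumption: coefficients are the difference of empirical moments scaled by
    1/(m*w0)^2; the paper's lambda_FS normalization is left out of this sketch.
    """
    w0 = 2.0 * np.pi / T
    a_r, b_r = fourier_moments(x_real, M, w0)
    a_f, b_f = fourier_moments(x_fake, M, w0)
    scale = 1.0 / (np.arange(1, M + 1) * w0) ** 2
    p = scale * (a_r - a_f)                     # cosine coefficients
    q = scale * (b_r - b_f)                     # sine coefficients

    def D(x):
        mx = np.arange(1, M + 1)[:, None] * w0 * np.atleast_1d(x)[None, :]
        return p @ np.cos(mx) + q @ np.sin(mx)  # one value per query point

    return D

rng = np.random.default_rng(0)
x_real = rng.normal(10.0, 1.0, 500)   # real data ~ N(10, 1), as in the toy setup
x_fake = rng.normal(0.0, 1.0, 500)    # output of an untrained generator
D = fs_discriminator(x_real, x_fake)
d_real, d_fake = D(10.0)[0], D(0.0)[0]  # D should score real-like points higher
```

Because the discriminator is computed in closed form from the batch moments, no inner optimization loop is needed, which is the "single-shot" property the report quotes.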
Open Source Code | Yes | "The source code for the TensorFlow 2.0 (Abadi et al., 2016) implementations and the comparisons presented in this paper, the pre-trained models, and high-resolution images of the results are available at https://github.com/DarthSid95/ELF_GANs."
Open Datasets | Yes | "We present results obtained by training the WAE on several datasets such as MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011), CelebA (Liu et al., 2015), Ukiyo-E (Pinkney and Adler, 2020), and CIFAR-10 (Krizhevsky, 2009)."
Dataset Splits | No | "The models are trained on 2 × 10^4 batches for MNIST, 5 × 10^4 batches for CIFAR-10, 7 × 10^4 batches for SVHN, and 10^5 batches for CelebA and Ukiyo-E. The quality of the autoencoder's reconstructed samples is measured in terms of the average reconstruction error (RE) on unseen test-set images."
Hardware Specification | No | The paper does not explicitly mention the specific hardware (e.g., GPU models or CPU types) used for the experiments.
Software Dependencies | Yes | "The implementation was carried out using TensorFlow 2.0 (Abadi et al., 2016)."
Experiment Setup | Yes | "The generator in all GAN variants is considered to be a linear transformation of the input: y = wz + b. Gaussian training data is drawn from N(10, 1), while the noise z input to the generator is sampled from the standard Gaussian N(0, 1). While WGAN-FS uses a closed-form Fourier-series discriminator, the baselines use a three-layer fully connected discriminator network with leaky ReLU activation. The batch size is 500. For the baseline techniques, each training step involves five iterations of the discriminator network optimization followed by one iteration of the generator. WGAN-FS, on the other hand, uses a single-shot discriminator during each training step. Based on additional experiments reported in Appendix E.1, we set the period T = 2π/ω₀ to 15 and the truncation frequency M to 10. The Adam optimizer (Kingma and Ba, 2015) is used with a learning rate η = 0.05, and the exponential decay parameters for the first and second moments are β1 = 0.5 and β2 = 0.999, respectively. The convolutional autoencoder model proposed by Tolstikhin et al. (2018) is employed for both the baseline WAEs and WAEFR. The prior distribution is a 16-D Gaussian for MNIST and a 64-D Gaussian for the other datasets. In WAEFR, the Fourier-series period is set to T = 15, and the latent representation is passed through a linear activation with saturation (clipping) of the latent-vector amplitudes beyond [−10, 10] in order to prevent latching on to an aliased Fourier representation. A batch size of 150 is used. The networks are trained using the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 2 × 10^−4 for all the variants. The models are trained on 2 × 10^4 batches for MNIST, 5 × 10^4 batches for CIFAR-10, 7 × 10^4 batches for SVHN, and 10^5 batches for CelebA and Ukiyo-E. For all WAE baselines, the discriminator is updated thrice for every update of the generator. Pre-training the GAN component (encoder-discriminator pair) for 10 epochs was found to result in faster convergence across all WAE-GAN variants."
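The toy Gaussian setup quoted above (linear generator y = wz + b, batch size 500, Adam with η = 0.05, β1 = 0.5, β2 = 0.999) can be sketched end to end. The loss below is a hypothetical moment-matching stand-in for the closed-form WGAN-FS loss, which is not reproduced here; the hand-rolled Adam step follows Kingma and Ba (2015) with the hyperparameters stated in the excerpt.

```python
import numpy as np

# Hypothetical stand-in experiment: the paper trains the linear generator with
# a closed-form WGAN-FS loss; here a simple moment-matching loss against the
# target N(10, 1) is used instead, purely for illustration.
rng = np.random.default_rng(1)
params = np.array([1.0, 0.0])                      # [w, b], generator y = w*z + b
eta, beta1, beta2, eps = 0.05, 0.5, 0.999, 1e-8    # Adam settings from the paper
m_t, v_t = np.zeros(2), np.zeros(2)

for t in range(1, 2001):
    w, b = params
    z = rng.standard_normal(500)                   # batch size 500, z ~ N(0, 1)
    zbar, zvar = z.mean(), z.var()
    dm = w * zbar + b - 10.0                       # batch-mean mismatch vs. target
    dv = w * w * zvar - 1.0                        # batch-variance mismatch
    # Analytic gradient of (mean mismatch)^2 + (variance mismatch)^2 w.r.t. (w, b)
    grad = np.array([2 * dm * zbar + 4 * dv * w * zvar, 2 * dm])
    # Adam update (Kingma and Ba, 2015) with bias correction
    m_t = beta1 * m_t + (1 - beta1) * grad
    v_t = beta2 * v_t + (1 - beta2) * grad ** 2
    m_hat = m_t / (1 - beta1 ** t)
    v_hat = v_t / (1 - beta2 ** t)
    params = params - eta * m_hat / (np.sqrt(v_hat) + eps)

w, b = params   # expect b near the target mean 10 and |w| near 1
```

With the stated learning rate and decay parameters, the generator parameters settle near the target moments within a couple of thousand batches, mirroring the scale of training reported for the toy experiments.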