LEAD: Min-Max Optimization from a Physical Perspective

Authors: Reyhane Askari Hemmat, Amartya Mitra, Guillaume Lajoie, Ioannis Mitliagkas

TMLR 2023

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Finally, in section 7, we empirically demonstrate that LEAD is computationally efficient. Additionally, we demonstrate that LEAD improves the performance of GANs on different tasks such as 8-Gaussians and CIFAR-10 while comparing the performance of our method against other first and second-order methods.
Researcher Affiliation | Academia | Reyhane Askari Hemmat (EMAIL), Department of Computer Science and Operations Research, University of Montreal and Mila, Quebec AI Institute; Amartya Mitra (EMAIL), Department of Physics and Astronomy, University of California, Riverside; Guillaume Lajoie (EMAIL), Department of Mathematics and Statistics, University of Montreal and Mila, Quebec AI Institute, Canada CIFAR AI Chair; Ioannis Mitliagkas (EMAIL), Department of Computer Science and Operations Research, University of Montreal and Mila, Quebec AI Institute, Canada CIFAR AI Chair
Pseudocode | Yes | Algorithm 1: Least Action Dynamics (LEAD)
    Input: learning rate η, momentum β, coupling coefficient α.
    Initialize: x_0 ← x_init, y_0 ← y_init, t ← 0
    while not converged do
        t ← t + 1
        g_x ← ∇_x f(x_t, y_t)
        g_xy^y ← ∇_y(g_x)(y_t − y_{t−1})
        x_{t+1} ← x_t + β(x_t − x_{t−1}) − η g_x − α g_xy^y
        g_y ← ∇_y f(x_t, y_t)
        g_xy^x ← ∇_x(g_y)(x_t − x_{t−1})
        y_{t+1} ← y_t + β(y_t − y_{t−1}) + η g_y + α g_xy^x
    end while
    return (x_{t+1}, y_{t+1})
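The quoted pseudocode can be sketched in plain Python on the toy bilinear game f(x, y) = x·y, where x descends and y ascends. This is a minimal illustration only; the hyperparameter values below are arbitrary choices for the demo, not the paper's tuned settings, and the closed-form gradients are specific to this toy f.

```python
# Minimal sketch of LEAD (Algorithm 1) on the bilinear game f(x, y) = x*y.
# For this f: grad_x f = y, grad_y f = x, and the mixed-derivative terms
# reduce to the finite differences (y_t - y_{t-1}) and (x_t - x_{t-1}).
# Hyperparameters here are illustrative, not the paper's tuned values.

def lead_bilinear(eta=0.2, beta=0.0, alpha=0.3, steps=300, x0=1.0, y0=1.0):
    x_prev, y_prev = x0, y0  # (x_{t-1}, y_{t-1}); differences start at zero
    x, y = x0, y0
    for _ in range(steps):
        gx = y                # grad_x f(x_t, y_t)
        gxy_y = y - y_prev    # grad_y(gx) * (y_t - y_{t-1}), since grad_y(gx) = 1
        gy = x                # grad_y f(x_t, y_t)
        gxy_x = x - x_prev    # grad_x(gy) * (x_t - x_{t-1}), since grad_x(gy) = 1
        # Simultaneous update: both players use iterates from step t.
        x_new = x + beta * (x - x_prev) - eta * gx - alpha * gxy_y
        y_new = y + beta * (y - y_prev) + eta * gy + alpha * gxy_x
        x_prev, y_prev = x, y
        x, y = x_new, y_new
    return x, y

x, y = lead_bilinear()
print(x, y)  # both iterates spiral in toward the equilibrium (0, 0)
```

Note the contrast with plain simultaneous gradient descent-ascent, which diverges on this game; the α-weighted coupling terms are what damp the rotation.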
Open Source Code | Yes | The source code for all the experiments is available at https://github.com/lead-minmax-games/LEAD.
Open Datasets | Yes | Finally, we empirically evaluate our method on synthetic setups and CIFAR-10 image generation to demonstrate improvements in GAN training. CIFAR-10 DCGAN: We evaluate LEAD-Adam on the task of CIFAR-10 (Krizhevsky & Hinton, 2009) image generation with a non-zero-sum formulation (non-saturating) on a DCGAN architecture similar to Gulrajani et al. (2017).
Dataset Splits | No | The paper does not explicitly state the training, validation, and test splits for the datasets used. It mentions 'In our experiments, we compute the FID based on 50,000 samples generated from our model vs 50,000 real samples', which implies a test set for evaluation, but the partitioning for training and validation is not specified.
Hardware Specification | No | The paper mentions that 'Results are reported on an average of 1000 iterations, computed on the same architecture and the same machine with forced synchronous execution.' However, it does not provide any specific details about the hardware used (e.g., CPU, GPU models, memory).
Software Dependencies | No | The paper states, 'All the methods are implemented in PyTorch (Paszke et al., 2017)'. While PyTorch is mentioned, a specific version number for PyTorch or any other software dependency is not provided, making it insufficient for reproducibility.
Experiment Setup | Yes | Algorithm 1 Least Action Dynamics (LEAD). Input: learning rate η, momentum β, coupling coefficient α. J.2 CIFAR-10 DCGAN: For the baseline we use Adam with β1 set to 0.5 and β2 set to 0.99. The generator's learning rate is 0.0002 and the discriminator's learning rate is 0.0001. The same learning rates and momentum were used to train the LEAD model. We also add the mixed-derivative term with α_d = 0.3 and α_g = 0.0. J.3 CIFAR-10 ResNet: For both the baseline SNGAN and LEAD-Adam we use a β1 of 0.0 and β2 of 0.9 for Adam. Baseline SNGAN uses a learning rate of 0.0002 for both the generator and the discriminator. LEAD-Adam also uses a learning rate of 0.0002 for the generator but 0.0001 for the discriminator. LEAD-Adam uses an α of 0.5 and 0.01 for the generator and the discriminator respectively.
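The hyperparameters quoted above (appendix sections J.2 and J.3) can be collected into plain config dictionaries for a side-by-side view. This is a hedged sketch: the key names are illustrative, not from the authors' code (which uses PyTorch Adam), and only the values stated in the quote are filled in.

```python
# Hedged sketch: CIFAR-10 hyperparameters as reported in J.2 / J.3.
# Key names are illustrative; values come from the quoted setup text.

cifar10_dcgan = {
    "optimizer": "Adam",
    "adam_betas": (0.5, 0.99),        # β1, β2, baseline and LEAD alike
    "lr_generator": 2e-4,
    "lr_discriminator": 1e-4,
    "lead_alpha_generator": 0.0,      # α_g: mixed-derivative coupling
    "lead_alpha_discriminator": 0.3,  # α_d
}

cifar10_resnet = {
    "optimizer": "Adam",
    "adam_betas": (0.0, 0.9),
    "baseline_lr": 2e-4,              # SNGAN baseline, both networks
    "lead_lr_generator": 2e-4,
    "lead_lr_discriminator": 1e-4,
    "lead_alpha_generator": 0.5,
    "lead_alpha_discriminator": 0.01,
}

print(cifar10_dcgan["adam_betas"], cifar10_resnet["adam_betas"])
```

The contrast the dicts make explicit: the DCGAN run applies the coupling only on the discriminator side (α_g = 0), while the ResNet run couples both players with different strengths.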