Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
(f,Gamma)-Divergences: Interpolating between f-Divergences and Integral Probability Metrics
Authors: Jeremiah Birrell, Paul Dupuis, Markos A. Katsoulakis, Yannis Pantazis, Luc Rey-Bellet
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We numerically demonstrate these merits in the training of GANs. In Section 6.2, we show that the proposed divergence is capable of adversarial learning of lower dimensional sub-manifold distributions with heavy tails. In this example, both f-GAN (Nowozin et al., 2016) and Wasserstein GAN with gradient penalty (WGAN-GP) (Gulrajani et al., 2017) fail to converge or perform very poorly. Furthermore, in Section 6.3 we present improvements over WGAN-GP and WGAN with spectral-normalization (WGAN-SN) (Miyato et al., 2018), as measured by the inception score (Salimans et al., 2016) and FID score (Heusel et al., 2017) (two standard performance measures), in real data sets and particularly in CIFAR-10 (Krizhevsky, 2009) image generation. |
| Researcher Affiliation | Academia | Jeremiah Birrell EMAIL TRIPODS Institute for Theoretical Foundations of Data Science University of Massachusetts Amherst Amherst, MA 01003, USA Paul Dupuis EMAIL Division of Applied Mathematics Brown University Providence, RI 02912, USA Markos A. Katsoulakis EMAIL Department of Mathematics and Statistics University of Massachusetts Amherst Amherst, MA 01003, USA Yannis Pantazis EMAIL Institute of Applied and Computational Mathematics Foundation for Research and Technology Hellas Heraklion, GR-70013, Greece Luc Rey-Bellet EMAIL Department of Mathematics and Statistics University of Massachusetts Amherst Amherst, MA 01003, USA |
| Pseudocode | No | The paper describes methods textually and mathematically but does not include any explicitly labeled pseudocode or algorithm blocks. It refers to "Algorithm 1 of Gulrajani et al. (2017)" but this is an external reference, not pseudocode presented within this paper. |
| Open Source Code | No | The paper does not contain an explicit statement from the authors about releasing their code for the described methodology, nor does it provide a direct link to a code repository. The acknowledgments mention that "The authors are grateful to Dipjyoti Paul for providing code to build upon and for helping us to run simulations," but this does not constitute a release of their own source code for the work. |
| Open Datasets | Yes | Furthermore, in Section 6.3 we present improvements over WGAN-GP and WGAN with spectral-normalization (WGAN-SN) (Miyato et al., 2018), as measured by the inception score (Salimans et al., 2016) and FID score (Heusel et al., 2017) (two standard performance measures), in real data sets and particularly in CIFAR-10 (Krizhevsky, 2009) image generation. |
| Dataset Splits | Yes | The CIFAR-10 (Krizhevsky, 2009) data set, which consists of 32x32 RGB images from 10 classes, is a widely recognized benchmark dataset with standard, well-defined training and testing splits. The paper implicitly uses these standard splits by referring to the dataset with its citation. |
| Hardware Specification | No | This work was performed in part using high performance computing equipment obtained under a grant from the Collaborative R&D Fund managed by the Massachusetts Technology Collaborative. |
| Software Dependencies | No | Computations were done in TensorFlow and we used the RMSProp SGD algorithm with a learning rate of 2×10⁻⁴. |
| Experiment Setup | Yes | In this example we used gradient-penalty parameter values λ = 10, L = 1; Wasserstein GAN was run with both 1-sided and 2-sided gradient penalties (GP-1 and GP-2 respectively). In all cases the generator and discriminator have three fully connected hidden layers of 64, 32, and 16 nodes respectively, and with ReLU activation functions. The generator uses a 10-dimensional Gaussian noise source. Each SGD iteration was performed with a minibatch size of 1000 and 5 discriminator iterations were performed for every one generator iteration. Computations were done in TensorFlow and we used the RMSProp SGD algorithm with a learning rate of 2×10⁻⁴. ... We employ the adaptive learning rate Adam Optimizer method (Kingma and Ba, 2014) using the hyperparameter values shown in Algorithm 1 of Gulrajani et al. (2017)... In the left panel of Figure 3 we show the results using an initial learning rate of 0.0002; ... In the right panel of Figure 3 we show results using an initial learning rate of 0.001 |
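For readers reconstructing the setup quoted above, the stated architecture (three fully connected hidden layers of 64, 32, and 16 nodes with ReLU activations, fed by a 10-dimensional Gaussian noise source) can be sketched in plain Python. This is not the authors' code, which was not released; the output dimension (2) and weight initialization are assumptions for illustration only.

```python
import random

def relu(v):
    """Elementwise ReLU activation, as specified for the hidden layers."""
    return [max(0.0, x) for x in v]

def dense(in_dim, out_dim):
    """One fully connected layer: (weights, biases).
    Small Gaussian init is an assumption; the paper does not specify it."""
    w = [[random.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    b = [0.0] * out_dim
    return (w, b)

def forward(layers, v):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    for i, (w, b) in enumerate(layers):
        v = [sum(wi * xi for wi, xi in zip(row, v)) + bi
             for row, bi in zip(w, b)]
        if i < len(layers) - 1:
            v = relu(v)
    return v

# Generator per the quoted setup: 10-dim Gaussian noise -> 64 -> 32 -> 16 -> output.
# Output dimension 2 is hypothetical (the paper's sub-manifold example is low-dimensional).
gen_layers = [dense(10, 64), dense(64, 32), dense(32, 16), dense(16, 2)]
noise = [random.gauss(0.0, 1.0) for _ in range(10)]
sample = forward(gen_layers, noise)
```

The discriminator uses the same 64/32/16 hidden structure per the paper; training details quoted above (minibatch 1000, 5 discriminator steps per generator step, RMSProp at learning rate 2×10⁻⁴) would sit on top of such a forward pass.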