Deep Learning Seminar  /  February 18, 2021  -  January 21, 2021, 10:00 – 11:00

Limitations of GANs – A Detailed Look at Two Examples

Speaker: Ricard Durall Lopez (Fraunhofer ITWM, Department High Performance Computing)

Abstract:

Combating Mode Collapse in GAN Training: An Empirical Analysis using Hessian Eigenvalues

Generative adversarial networks (GANs) provide state-of-the-art results in image generation. However, despite being so powerful, they remain very challenging to train. This is caused in particular by their highly non-convex optimization space, which leads to a number of instabilities. Among them, mode collapse stands out as one of the most daunting. This undesirable event occurs when the model can only fit a few modes of the data distribution while ignoring the majority of them. In this work, we combat mode collapse using second-order gradient information. To do so, we analyse the loss surface through its Hessian eigenvalues and show that mode collapse is related to convergence towards sharp minima. In particular, we observe how the eigenvalues of the generator (G) are directly correlated with the occurrence of mode collapse. Finally, motivated by these findings, we design a new optimization algorithm called nudged-Adam (NuGAN) that uses spectral information to overcome mode collapse, leading to empirically more stable convergence properties.
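
To make the notion of a "sharp" minimum concrete: a common proxy for sharpness is the largest eigenvalue of the loss Hessian, which can be estimated without forming the Hessian by running power iteration on Hessian-vector products. The following is a minimal, generic sketch of that idea and not the NuGAN algorithm from the talk. The toy quadratic loss, the function names, and the finite-difference step sizes are all assumptions chosen for illustration; a real GAN setup would obtain Hessian-vector products from automatic differentiation instead.

```python
import math


def grad(f, x, eps=1e-5):
    """Central-difference gradient of f at x (x is a list of floats)."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2 * eps))
    return g


def hvp(f, x, v, eps=1e-4):
    """Hessian-vector product H(x) v, as a central difference of gradients."""
    xp = [xi + eps * vi for xi, vi in zip(x, v)]
    xm = [xi - eps * vi for xi, vi in zip(x, v)]
    gp, gm = grad(f, xp), grad(f, xm)
    return [(a - b) / (2 * eps) for a, b in zip(gp, gm)]


def top_eigenvalue(f, x, iters=100):
    """Power iteration: estimate the Hessian eigenvalue of largest magnitude.

    Assumes the uniform start vector is not orthogonal to the dominant
    eigenvector; a random start would be the safer choice in general.
    """
    n = len(x)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        hv = hvp(f, x, v)
        lam = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient v^T H v
        norm = math.sqrt(sum(a * a for a in hv))
        if norm == 0.0:
            break
        v = [a / norm for a in hv]
    return lam


def toy_loss(w):
    """Hypothetical stand-in loss with Hessian diag(10, 1)."""
    return 0.5 * (10.0 * w[0] ** 2 + 1.0 * w[1] ** 2)


sharpness = top_eigenvalue(toy_loss, [0.3, -0.2])
print(f"estimated largest Hessian eigenvalue: {sharpness:.4f}")
```

In this toy case the steep direction has curvature 10 and the flat direction has curvature 1, so the estimate approaches 10. A larger value indicates a sharper minimum along the dominant direction.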
 

Latent Space Conditioning on Generative Adversarial Networks

Generative adversarial networks are the state-of-the-art approach to learned synthetic image generation. Although early successes were mostly unsupervised, this trend has gradually been superseded by approaches based on labelled data. These supervised methods allow much finer-grained control of the output image, offering more flexibility and stability. Nevertheless, the main drawback of such models is the need for annotated data. In this work, we introduce a novel framework that benefits from two popular learning techniques, adversarial training and representation learning, and takes a step towards unsupervised conditional GANs. In particular, our approach exploits the structure of a latent space (learned by the representation learning) and employs it to condition the generative model. In this way, we break the traditional dependency between condition and label, replacing the latter with unsupervised features coming from the latent space. Finally, we show that this new technique is able to produce samples on demand while keeping the quality of its supervised counterpart.