Skip to main content
  1. Data Science Blog/

What is GAN?

·1016 words·5 mins· loading · ·
Generative AI AI/ML Models Artificial Intelligence Generative AI Deep Learning (DL) Neural Networks Machine Learning Computer Vision AI Models

Partial Dependence Plots

What is GAN?
#

What is GAN (Generative Adversarial Network)?
#

Generative adversarial networks (GANs) are besing used to generate images, videos, text, audio and music. GAN is a class of machine-learning models introduced by Ian Goodfellow and his colleagues in 2014. The GANs became popular among researchers quickly because of their property to generate new data with the same statistics as the input training set. It can be applied to images, videos, textual data, tabular data and more, proving useful for semi-supervised, fully supervised, and reinforcement learning.

There are 2 kinds of models in the context of Supervised Learning, Generative and Discriminative Models. Discriminative Models are primarily used to solve the Classification task where the model usually learns a decision boundary to predict which class a data point belongs to. On the other side, Generative Models are primarily used to generate synthetic data points that follow the same probability distribution as training data distribution. Our topic of discussion, Generative Adversarial Networks(GANs) is an example of the Generative Model.

The most popular GAN Architectures and their Purpose#

  • Transforming an image from one domain to another (CycleGAN),
  • Generating an image from a textual description (text-to-image),
  • Generating very high-resolution images (ProgressiveGAN) and many more.
  • pixelRNN
  • DiscoGAN
  • lsGAN

Image generation projects using GAN
#

Anime Characters
#

GANs are changing the way of generating realistic anime characters and bringing out the potential of complex GAN architecture to build entire anime series with the help of AI. In the paper “Towards the Automatic Anime Characters Creation with Generative Adversarial Networks,” Yanghua Jin and his team trained and used GAN to generate the faces of anime characters or Japanese comic book characters in 2017. The outcome of the project was remarkable, which led people to conduct more experiments on the image generation of faces of anime characters and the generation of pokemon characters. Many GAN models are used to generate cartoon characters, such as DCGAN, StyleGAN, and so on.

Fake Human Faces
#

Facial recognition has many use cases, and the development has been in progress for the last couple of years, where researchers are using different techniques and facial recognition datasets to train models. Researchers need massive datasets of human faces to understand the recognition process, and the generation of fake human faces helps these projects. NVIDIA researchers published a paper, “Progressive Growing of GANs for Improved Quality, Stability, and Variation,” in 2018, where they proposed a new training methodology for GANs operating on the generation of feasible human face photographs. The paper’s outcome is so realistic-looking that it can fool anyone easily. Additionally, the paper presented the generation of objects and scenes as well.

Image Style Transfer
#

Image style transfer is an interesting technique in computer vision that combines two images. This technique consists of a model taking two images, called content and reference images, and the output is a whole new image containing the object of the content image and the style of the reference image. Here by style, it means brush strokes, colors, and textures of the image. Researchers are still trying to find the best ways and use cases of style transfer. This technique is included in the image-to-image translation and is also known as domain adaptation. There are many research work on image style transfer using GANs, and most have produced good results. A remarkable study is the paper “P²-GAN: EFFICIENT STYLE TRANSFER USING SINGLE STYLE IMAGE.” Zhentan Zheng and Jianyi Liu put forth a novel patch permutation GAN (P²-GAN), which proves efficient in learning stroke styles from paintings or single style images. The paper concludes with an effective and precise P²-GAN network simulating the expected stroke style and avoiding the difficulty of collecting image sets with the same style.

Text-to-Image (text-2-image) Synthesis
#

Generating realistic images is challenging, but using GANs makes the process a reality rather than theories. Although we got problems with image-to-image generation or translation, the synthesis of text-to-image realistic images using GANs is more complex and difficult. The process of synthesis needs a strong GAN structure along with the base images provided. There are many papers on the task which have computed impressive outcomes. In 2016, Han Zhang and his team from the Chinese University of Hong Kong presented “StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks,” explaining the use of GANs. Mainly, the StackGAN to generate realistic photographs from textual data of simple objects such as birds and flowers. Also, the paper “Generative Adversarial Text to Image Synthesis” and “TAC-GAN – Text Conditioned Auxiliary Classifier Generative Adversarial Network” present the study of generating realistic images through text-to-image synthesis with the help of GANs.

Image-to-Image Translation
#

Image-to-image translation includes many tasks, such as the translation of semantic images to photographs, satellite photographs to google maps, black and white photographs to color, and more. Many papers have been published demonstrating the use of GANs for image-to-image translation, and one of the popular papers is “Image-to-Image Translation with Conditional Adversarial Networks” by Berkeley AI Research group in 2016. This paper has an investigational approach toward conditional adversarial networks if they serve as a general-purpose solution to image-to-image translation problems. They use a unique pix2pix approach for various image-to-image translation tasks. Additionally, the paper titled “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks” presents the famous CycleGAN and a set of impressive image-to-image translation cases like translation from photograph to artistic painting style, photographs from summer to winter, and translation of horse to zebra.

Applications of GAN
#

  • Realistic image generation,
  • Improving the quality of photographs
  • Audio synthesis,
  • Transfer learning,
  • Deepfake Apps

Vision GAN Application Areas
#

  1. Generate Examples for Image Datasets
  2. Generate Photographs of Human Faces
  3. Generate Realistic Photographs
  4. Generate Cartoon Characters
  5. Image-to-Image Translation
  6. Text-to-Image Translation
  7. Semantic-Image-to-Photo Translation
  8. Face Frontal View Generation
  9. Generate New Human Poses
  10. Photos to Emojis
  11. Photograph Editing
  12. Face Aging
  13. Photo Blending
  14. Super Resolution
  15. Photo Inpainting
  16. Clothing Translation
  17. Video Prediction
  18. 3D Object Generation
  19. Cartoon Face Images
  20. Forged Face Images
  21. Artificial Flower Images
  22. Artificial Bird Images
  23. Drawing Real Objects and Vice-versa
  24. Fake Paintings
  25. Altering Photographs

Anime character generation using IllustrationGAN
#

Related

Quantum Measurement, Randomness, and Everyday Technology
·778 words·4 mins· loading
Interdisciplinary Topics Research & Academia Quantum Physics Quantum Mechanics Quantum Computing Interdisciplinary Topics
Quantum Measurement, Randomness, and Everyday Technology # This is Part 2 of Learning Quantum …
AI Agents as First-Class Citizens: Why Managing the Digital Workforce Is the Next HR Challenge
·2607 words·13 mins· loading
Artificial Intelligence Business & Career Technology Trends & Future AI Integration Future of Work AI Governance Organizational Design Generative AI
AI Agents as First-Class Citizens # Why Managing the Digital Workforce Is the Next HR Challenge …
When Consciousness Becomes Cosmos: Fields, Particles, Matter, and the Emergence of Size
·5741 words·27 mins· loading
Philosophy & Cognitive Science Interdisciplinary Topics Quantum Field Theory Consciousness Physics Advaita Vedanta Philosophy of Mind Emergence Metaphysics
When Consciousness Becomes Cosmos # From Consciousness to Cosmos: Fields, Particles, Matter, and …
Occam's Razor: Why the Simplest Explanation Often Wins
·994 words·5 mins· loading
Philosophy & Cognitive Science Interdisciplinary Topics Data Science Occam's Razor Critical Thinking Scientific Method Simplicity Decision Making Machine Learning Software Development
Occam’s Razor: Why the Simplest Explanation Often Wins # Prefer fewer assumptions until the …
From Claw Code to Clean Room: A Developer's Guide to Re-implementing Software Without Getting Sued
·2854 words·14 mins· loading
AI Ethics & Governance Software Development Technology Trends & Future Clean Room Design Intellectual Property AI Code Generation Software Copyright Trade Secrets Software Development
From Claw Code to Clean Room: A Developer’s Guide to Re-implementing Software Without Getting …