Text-To-Image Diffusion Models in Generative AI: A Survey

By C. Zhang et al.
Published on Aug. 10, 2023

Table of Contents

Abstract
Introduction
Background on Diffusion Model
Development before DDPM
How does DDPM work for image synthesis?
Guidance in diffusion-based image synthesis
Pioneering Text-To-Image Diffusion Models
Frameworks in pixel space
Frameworks in latent space
Improving Text-To-Image Diffusion Models
Improving model architectures

Summary

This survey reviews text-to-image diffusion models against the backdrop of diffusion models becoming the dominant approach for a wide range of generative tasks. It begins with a brief introduction to how a basic diffusion model performs image synthesis and how conditioning or guidance improves learning. It then reviews state-of-the-art methods for text-conditioned image synthesis (text-to-image), text-guided creative generation, and text-guided image editing, and closes with a discussion of open challenges and promising future directions for text-to-image diffusion models.
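To make the "basic diffusion model plus guidance" idea mentioned above concrete, the snippet below is a minimal, illustrative sketch of a DDPM-style reverse (ancestral) sampling loop with classifier-free guidance. It is not code from the survey: the linear beta schedule, step count, guidance scale, and the `denoiser(x_t, t, cond)` interface are placeholder assumptions chosen only to show the shape of the computation.

```python
import numpy as np

def ddpm_guided_sample(denoiser, text_emb, null_emb, shape,
                       num_steps=50, guidance_scale=7.5, seed=0):
    """Toy DDPM-style ancestral sampler with classifier-free guidance.

    Assumes `denoiser(x_t, t, cond)` predicts the noise that was added to
    reach step t. Schedule and guidance scale are illustrative defaults,
    not values prescribed by the survey.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)   # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)               # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        # Classifier-free guidance: blend conditional and unconditional predictions.
        eps_cond = denoiser(x, t, text_emb)
        eps_uncond = denoiser(x, t, null_emb)
        eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)

        # DDPM posterior mean for x_{t-1} given the predicted noise.
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])

        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

if __name__ == "__main__":
    # Dummy denoiser that ignores its inputs, used only to show the call shape.
    dummy = lambda x, t, cond: np.zeros_like(x)
    sample = ddpm_guided_sample(dummy, text_emb=None, null_emb=None, shape=(8, 8))
    print(sample.shape)
```

In practice the denoiser is a text-conditioned U-Net or transformer and the embeddings come from a text encoder; the guidance scale trades off sample fidelity to the prompt against diversity.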