High-Resolution Image Synthesis with Latent Diffusion Models
By Robin Rombach et al.
Table of Contents
1. Introduction
2. Related Work
3. Method
3.1. Perceptual Image Compression
3.2. Latent Diffusion Models
3.3. Conditioning Mechanisms
Summary
This document discusses Latent Diffusion Models (LDMs) for high-resolution image synthesis. By applying the diffusion process in the latent space of a pretrained autoencoder rather than in pixel space, LDMs reach a near-optimal balance between complexity reduction and detail preservation. They achieve strong performance across image synthesis tasks while requiring significantly less compute for training and inference than pixel-based diffusion models. The document also introduces a cross-attention conditioning mechanism that accepts general conditioning inputs such as text or semantic maps.
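To make the core idea concrete, here is a minimal sketch of one LDM training step in PyTorch: the image is first compressed to a latent, noise is added per the diffusion schedule, and a denoising network predicts that noise. The names `encoder`, `unet`, and `ldm_training_step`, the schedule constants, and the module signatures are all illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch of one LDM training step. `encoder` and `unet` are
# hypothetical stand-ins for a pretrained autoencoder's encoder and a
# conditional denoising UNet; the paper's actual modules differ in detail.
import torch
import torch.nn.functional as F

T = 1000  # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)               # assumed linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # cumulative product, \bar{alpha}_t

def ldm_training_step(unet, encoder, images, cond_tokens):
    """One denoising-objective step in latent space.

    images:      (B, 3, H, W) pixel batch
    cond_tokens: (B, L, D) conditioning tokens (e.g. text embeddings),
                 consumed by the UNet's cross-attention layers
    """
    # Key idea: diffuse in the autoencoder's latent space, not pixel space.
    with torch.no_grad():
        z0 = encoder(images)  # compress to latents, shape (B, c, h, w)

    b = z0.shape[0]
    t = torch.randint(0, T, (b,), device=z0.device)  # random timestep per sample
    noise = torch.randn_like(z0)

    # Forward diffusion: z_t = sqrt(abar_t) * z_0 + sqrt(1 - abar_t) * eps
    abar = alphas_cumprod.to(z0.device)[t].view(b, 1, 1, 1)
    zt = abar.sqrt() * z0 + (1.0 - abar).sqrt() * noise

    # The UNet predicts the added noise, conditioned via cross-attention.
    pred = unet(zt, t, cond_tokens)
    return F.mse_loss(pred, noise)
```

Because the latents are much smaller than the original images (e.g. a 512x512 image compressed by a factor of 8 per side), each training and sampling step operates on far fewer elements, which is where the computational savings come from.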