KL Divergence And The VAE Loss

When training a variational autoencoder (VAE), your goal is not just to reconstruct the input data accurately, but also to ensure that the learned latent representations are structured and meaningful. The VAE achieves this through a specialized loss function that combines two key components: the reconstruction loss and the Kullback-Leibler (KL) divergence.

The mathematical form of the VAE loss for a single data point x can be written as:

$$\mathcal{L}_{VAE} = \mathbb{E}_{q(z|x)}[\log p(x|z)] - D_{KL}\big(q(z|x)\,\|\,p(z)\big)$$

Here, q(z|x) is the encoder's approximation of the true posterior over the latent variables, and p(z) is the prior, typically a standard normal distribution. The first term, the expected log-likelihood, corresponds to the reconstruction loss: it measures how well the decoder can reconstruct the input from the sampled latent code. The second term is the KL divergence between the approximate posterior and the prior. Strictly speaking, the expression above is the evidence lower bound (ELBO), which training maximizes; the loss that is actually minimized is its negative, i.e. the reconstruction loss plus the KL term.
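
To make the two terms concrete, here is a minimal sketch of how this loss is often computed in practice, written in PyTorch. It assumes a hypothetical encoder that outputs the mean `mu` and log-variance `logvar` of a diagonal Gaussian q(z|x), and a decoder output `recon_x` with values in [0, 1] so that binary cross-entropy can serve as the reconstruction term; these names are illustrative and do not come from this lesson.

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar):
    """Negative ELBO for a batch: reconstruction term plus KL term."""
    # Reconstruction term: negative expected log-likelihood E_q[log p(x|z)],
    # here a binary cross-entropy summed over pixels and the batch.
    recon_loss = F.binary_cross_entropy(recon_x, x, reduction="sum")

    # KL divergence between the diagonal Gaussian q(z|x) = N(mu, sigma^2)
    # and the standard normal prior p(z) = N(0, I), in closed form.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())

    # Minimizing this sum is equivalent to maximizing the ELBO above.
    return recon_loss + kl
```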

Definition

KL divergence (Kullback-Leibler divergence) is a measure of how one probability distribution diverges from a second, reference probability distribution. In the context of VAEs, it quantifies how much the learned latent distribution q(z|x) differs from the prior p(z). A lower KL divergence means the two distributions are more similar.
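
Formally, for distributions q and p over the latent variable z, the KL divergence is the expected log-ratio between them:

$$D_{KL}\big(q(z)\,\|\,p(z)\big) = \mathbb{E}_{q(z)}\!\left[\log \frac{q(z)}{p(z)}\right]$$

It is always non-negative and equals zero exactly when the two distributions coincide.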

Including the KL divergence in the VAE loss is crucial because it encourages the encoder's output distribution q(z|x) to stay close to the chosen prior p(z). If you use a standard normal prior, the KL term penalizes complex or irregular latent distributions, nudging them to be more like a normal distribution centered at zero with unit variance. This regularization ensures that the latent space is smooth and continuous, making it possible to sample new points and generate realistic data from the decoder.
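
For the common case where the encoder outputs a diagonal Gaussian q(z|x) = N(μ, diag(σ²)) and the prior is a standard normal, the KL term has a closed form, summing over the d latent dimensions:

$$D_{KL}\big(q(z|x)\,\|\,p(z)\big) = \frac{1}{2}\sum_{j=1}^{d}\left(\mu_j^{2} + \sigma_j^{2} - 1 - \log \sigma_j^{2}\right)$$

This is the expression used for the KL term in the code sketch above.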

Reconstruction vs. Regularization
  • Increasing the weight of the reconstruction loss leads to more accurate reconstructions, but the latent space may become irregular or overfit to the training data;
  • Increasing the weight of the KL divergence enforces the latent codes to follow the prior distribution more strictly, but may cause the reconstructions to become blurry or less accurate;
  • The VAE loss balances these two objectives, trading off reconstruction quality for a well-behaved, generative latent space; the weighted-loss sketch after this list shows one common way to control that balance.
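
One common way to explore this trade-off is to scale the KL term with a weight, often written β (as in the β-VAE). The sketch below is illustrative only and reuses the same hypothetical tensor names as the earlier example.

```python
import torch
import torch.nn.functional as F

def weighted_vae_loss(recon_x, x, mu, logvar, beta=1.0):
    """VAE loss with a tunable KL weight; beta = 1 recovers the standard loss."""
    # Reconstruction term: how faithfully the decoder reproduces the input.
    recon_loss = F.binary_cross_entropy(recon_x, x, reduction="sum")
    # KL term: how far q(z|x) drifts from the standard normal prior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # beta < 1 favors sharp reconstructions; beta > 1 favors latent codes
    # that match the prior more closely, at the cost of blurrier outputs.
    return recon_loss + beta * kl
```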

1. What is the purpose of the KL divergence term in the VAE loss?

2. How does the KL term affect the structure of the latent space?

3. Fill in the blank: The VAE loss combines reconstruction loss and ____.

