Forward Process Definition | Mathematical Foundations of Diffusion Models
Diffusion Models and Generative Foundations

Forward Process Definition

To understand the forward diffusion process in diffusion models, you need to formalize how noise is gradually added to data in a controlled, stepwise manner. The process is defined as a Markov chain, where at each time step t, a small amount of Gaussian noise is added to the previous state. This stepwise corruption is governed by a conditional probability distribution, commonly denoted as q(x_t \mid x_{t-1}).

Mathematically, the forward process is defined as:

q(x_t \mid x_{t-1}) = \mathcal{N}\!\left( x_t;\, \sqrt{1 - \beta_t}\, x_{t-1},\; \beta_t I \right)

where:

  • x_t is the noisy sample at time step t;
  • x_{t-1} is the sample from the previous step;
  • \beta_t is the variance schedule value at step t (a small positive scalar controlling the noise added at that step);
  • I is the identity matrix.

This definition means that, given x_{t-1}, the next state x_t is sampled from a normal distribution centered at \sqrt{1 - \beta_t}\, x_{t-1} with covariance \beta_t I. The Markov property ensures that each step depends only on the immediately preceding state, not on the entire history.
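This single-step transition can be sketched in a few lines of NumPy. The reparameterized form x_t = \sqrt{1 - \beta_t}\, x_{t-1} + \sqrt{\beta_t}\, \varepsilon is an equivalent way to draw from this Gaussian; the function name here is illustrative, not part of any library:

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """Draw x_t ~ q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    eps = rng.standard_normal(x_prev.shape)  # epsilon ~ N(0, I)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * eps

rng = np.random.default_rng(0)
x_prev = np.ones(4)
x_t = forward_step(x_prev, 0.02, rng)  # one noising step with beta_t = 0.02
```

Iterating this function T times simulates the full Markov chain from x_0 to x_T.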

The properties of q(x_t \mid x_{t-1}) are:

  • It is a Gaussian distribution for every t;
  • The process is Markovian: the future state depends only on the present state;
  • The variance schedule {\beta_t} determines the rate of noise addition;
  • As t increases, the sample becomes progressively noisier, eventually approaching pure Gaussian noise as t approaches the maximum diffusion step.

You can also derive the marginal distribution of the forward process, which expresses the distribution of x_t given the original, clean data sample x_0 after t steps of noise addition. This is useful because it allows you to sample noisy data at any step directly from x_0 without simulating the entire Markov chain step by step.

The marginal distribution is given by:

q(x_t \mid x_0) = \mathcal{N}\!\left( x_t;\, \sqrt{\bar{\alpha}_t}\, x_0,\; (1 - \bar{\alpha}_t)\, I \right)

where:

  • \alpha_t = 1 - \beta_t;
  • \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s is the cumulative product of the noise schedule up to time t.
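As a concrete sketch, \bar{\alpha}_t can be computed with a single cumulative product. The linear \beta schedule below is one common choice in practice, assumed here for illustration rather than prescribed by the definition:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # assumed linear variance schedule
alphas = 1.0 - betas                 # alpha_t = 1 - beta_t
alpha_bar = np.cumprod(alphas)       # alpha_bar_t = prod_{s<=t} alpha_s

# alpha_bar decays monotonically toward 0, so q(x_T | x_0) approaches N(0, I)
```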

This form shows that, after t steps, the noisy sample x_t is still Gaussian, with its mean scaled by \sqrt{\bar{\alpha}_t} and its variance increased to (1 - \bar{\alpha}_t). This cumulative effect of the noise schedule makes it possible to sample x_t in a single step from x_0:

x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I)

This property is central to efficient training and sampling in diffusion models.
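The closed-form sampling step can be sketched directly. The linear schedule and the function name are illustrative assumptions, not a fixed convention:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # assumed linear variance schedule
alpha_bar = np.cumprod(1.0 - betas)      # cumulative product of alpha_t = 1 - beta_t

def q_sample(x0, t, eps):
    """Draw x_t ~ q(x_t | x_0) in one shot: sqrt(ab_t)*x0 + sqrt(1 - ab_t)*eps."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x0 = np.ones(8)
x_t = q_sample(x0, 500, rng.standard_normal(x0.shape))
```

This is exactly the shortcut used during training: instead of running 500 Markov steps, a noisy sample at step 500 is drawn in a single call.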

Section 2, Chapter 1

Fråga AI

expand

Fråga AI

ChatGPT

Fråga vad du vill eller prova någon av de föreslagna frågorna för att starta vårt samtal

Suggested prompts:

Can you explain how the variance schedule $$βₜ$$ is typically chosen in practice?

What is the intuition behind using a Markov chain for the forward process?

How does the marginal distribution help in training diffusion models?

Awesome!

Completion rate improved to 8.33
