Forward Process Definition | Mathematical Foundations of Diffusion Models
Diffusion Models and Generative Foundations

Forward Process Definition

To understand the forward diffusion process in diffusion models, you need to formalize how noise is gradually added to data in a controlled, stepwise manner. The process is defined as a Markov chain, where at each time step t, a small amount of Gaussian noise is added to the previous state. This stepwise corruption is governed by a conditional probability distribution, commonly denoted as q(x_t \mid x_{t-1}).

Mathematically, the forward process is defined as:

q(x_t \mid x_{t-1}) = \mathcal{N}\!\left( x_t;\, \sqrt{1 - \beta_t}\, x_{t-1},\; \beta_t I \right)

where:

  • x_t is the noisy sample at time step t;
  • x_{t-1} is the sample from the previous step;
  • \beta_t is the variance schedule value at step t (a small positive scalar controlling the noise added at that step);
  • I is the identity matrix.

This definition means that, given x_{t-1}, the next state x_t is sampled from a normal distribution centered at \sqrt{1 - \beta_t}\, x_{t-1} with covariance \beta_t I. The Markov property ensures that each step depends only on the immediately preceding state, not on the entire history.
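This single-step transition can be sketched in a few lines of NumPy. The reparameterized form x_t = \sqrt{1 - \beta_t}\, x_{t-1} + \sqrt{\beta_t}\, \varepsilon is an equivalent way to draw from this Gaussian; the function name here is illustrative, not part of any library:

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """Draw x_t ~ q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    eps = rng.standard_normal(x_prev.shape)  # epsilon ~ N(0, I)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * eps

rng = np.random.default_rng(0)
x_prev = np.ones(4)
x_t = forward_step(x_prev, 0.02, rng)  # one noising step with beta_t = 0.02
```

Iterating this function T times simulates the full Markov chain from x_0 to x_T.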

The properties of q(x_t \mid x_{t-1}) are:

  • It is a Gaussian distribution for every t;
  • The process is Markovian: the future state depends only on the present state;
  • The variance schedule {\beta_t} determines the rate of noise addition;
  • As t increases, the sample becomes progressively noisier, eventually approaching pure Gaussian noise as t approaches the maximum diffusion step.

You can also derive the marginal distribution of the forward process, which expresses the distribution of x_t given the original, clean data sample x_0 after t steps of noise addition. This is useful because it allows you to sample noisy data at any step directly from x_0 without simulating the entire Markov chain step by step.

The marginal distribution is given by:

q(x_t \mid x_0) = \mathcal{N}\!\left( x_t;\, \sqrt{\bar{\alpha}_t}\, x_0,\; (1 - \bar{\alpha}_t)\, I \right)

where:

  • \alpha_t = 1 - \beta_t;
  • \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s is the cumulative product of the noise schedule up to time t.
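As a concrete sketch, \bar{\alpha}_t can be computed with a single cumulative product. The linear \beta schedule below is one common choice in practice, assumed here for illustration rather than prescribed by the definition:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # assumed linear variance schedule
alphas = 1.0 - betas                 # alpha_t = 1 - beta_t
alpha_bar = np.cumprod(alphas)       # alpha_bar_t = prod_{s<=t} alpha_s

# alpha_bar decays monotonically toward 0, so q(x_T | x_0) approaches N(0, I)
```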

This form shows that, after t steps, the noisy sample x_t is still Gaussian, with its mean scaled by \sqrt{\bar{\alpha}_t} and its variance increased to (1 - \bar{\alpha}_t). This cumulative effect of the noise schedule makes it possible to sample x_t in a single step from x_0:

x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I)

This property is central to efficient training and sampling in diffusion models.
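The closed-form sampling step can be sketched directly. The linear schedule and the function name are illustrative assumptions, not a fixed convention:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # assumed linear variance schedule
alpha_bar = np.cumprod(1.0 - betas)      # cumulative product of alpha_t = 1 - beta_t

def q_sample(x0, t, eps):
    """Draw x_t ~ q(x_t | x_0) in one shot: sqrt(ab_t)*x0 + sqrt(1 - ab_t)*eps."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x0 = np.ones(8)
x_t = q_sample(x0, 500, rng.standard_normal(x0.shape))
```

This is exactly the shortcut used during training: instead of running 500 Markov steps, a noisy sample at step 500 is drawn in a single call.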

Section 2, Chapter 1

Fråga AI

expand

Fråga AI

ChatGPT

Fråga vad du vill eller prova någon av de föreslagna frågorna för att starta vårt samtal

Suggested prompts:

Can you explain how the variance schedule $$βₜ$$ is typically chosen in practice?

What is the intuition behind using a Markov chain for the forward process?

How does the marginal distribution help in training diffusion models?

Awesome!

Completion rate improved to 8.33
