Classifier Guidance & Classifier-Free Guidance
Classifier guidance is a technique that allows you to steer the generation process of diffusion models toward samples that are more likely to belong to a desired class. During the reverse diffusion process, you want the model not only to denoise the sample but also to make it more probable under a target class, as determined by a classifier. Mathematically, this is achieved by modifying the reverse process so that the transition kernel incorporates both the learned score (the gradient of the log-probability of the data) and the gradient of the log-probability of the class given the current sample. If s(x, t) is the score function from the diffusion model and p(y∣x) is the probability of class y given sample x produced by a classifier, then the guided score combines both of these components to influence sampling.
The classifier-guided score is built from three ingredients: the base score function s(x, t), the gradient of the log-probability of the target class, and a weight w that scales the guidance term.
- s(x,t): the model's score function
- ∇ₓ log p(y∣x): the gradient of the classifier's log-probability of the target class
- w: the guidance strength
The combined guided score is:
guided score = s(x, t) + w · ∇ₓ log p(y∣x)

This approach uses the classifier's gradient to push the sample toward the target class at each denoising step. In practice, you pretrain a classifier on noisy images at various timesteps and use its gradients during sampling. This allows you to "guide" the diffusion process toward generating images of a specific class without retraining the diffusion model itself.
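As a deliberately toy illustration of the formula above, the sketch below combines a base score with a classifier gradient in NumPy. Both components are hypothetical stand-ins, not trained networks: the "diffusion" score is that of a standard Gaussian, and the classifier is a fixed binary logistic model whose log-probability gradient has a closed form.

```python
import numpy as np

# Toy stand-ins (not trained models): the diffusion score is that of a
# standard Gaussian, and the classifier is a fixed binary logistic model.
def s_model(x, t):
    # Score of N(0, I): grad_x log p(x) = -x (t is unused in this toy).
    return -x

def classifier_grad(x, y, theta):
    # For p(y=1|x) = sigmoid(theta . x), the gradient of the log-probability
    # is grad_x log p(y|x) = (y - sigmoid(theta . x)) * theta.
    p = 1.0 / (1.0 + np.exp(-theta @ x))
    return (y - p) * theta

def guided_score(x, t, y, theta, w=2.0):
    # guided score = s(x, t) + w * grad_x log p(y|x)
    return s_model(x, t) + w * classifier_grad(x, y, theta)

theta = np.array([1.0, -1.0, 0.5])
g = guided_score(np.zeros(3), t=0.5, y=1, theta=theta, w=2.0)
# At x = 0 the base score vanishes, so g is pure class guidance.
```

Increasing w strengthens the pull toward the target class at the cost of sample diversity; w = 0 recovers unconditional sampling.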
Classifier-free guidance offers an alternative that avoids dependence on a separate classifier network. Instead, you train the diffusion model itself to predict both conditional and unconditional outputs. This is done by randomly dropping the conditioning label during training, so the model learns to generate both with and without class information. At sampling time, you can interpolate between the unconditional and conditional outputs to guide the sample toward the desired class. The main advantages of classifier-free guidance are:
- You avoid the cost and complexity of training a separate classifier;
- The approach is more robust because it does not rely on classifier accuracy at high noise levels;
- It allows for flexible, tunable guidance strength without extra models.
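The interpolation between the unconditional and conditional outputs can be sketched as below. The noise-prediction arrays here are made-up placeholders for what a conditioned diffusion model would return at one denoising step (once with the class label, once with a null label).

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, w):
    # Classifier-free guidance: move the unconditional prediction toward
    # (and, for w > 1, past) the conditional one.
    return eps_uncond + w * (eps_cond - eps_uncond)

# Hypothetical noise predictions at one denoising step.
eps_uncond = np.array([0.1, -0.2, 0.3])
eps_cond = np.array([0.4, 0.0, 0.1])

eps_w0 = cfg_combine(eps_uncond, eps_cond, 0.0)  # pure unconditional
eps_w1 = cfg_combine(eps_uncond, eps_cond, 1.0)  # pure conditional
eps_w3 = cfg_combine(eps_uncond, eps_cond, 3.0)  # amplified conditioning
```

Note that w > 1 extrapolates beyond the conditional prediction, which is the usual regime in practice; the label dropout that makes the unconditional prediction possible happens at training time, not here.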
To understand how guided sampling works in diffusion models, imagine you want to generate images of cats using a diffusion model trained on various animals. Without guidance, the model samples from the overall distribution, producing a mix of different animals. With classifier guidance, at each denoising step, you use the gradient from a classifier to nudge the sample toward the "cat" region of the data space. With classifier-free guidance, the model itself provides both a generic and a "cat"-specific prediction at each step, and you blend these using a guidance scale. In both cases, guided sampling allows you to shape the generation process, increasing the likelihood of producing the desired class while still leveraging the generative power of the diffusion model.
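The cat example can be mimicked in one dimension: a two-mode mixture stands in for the "all animals" distribution, the right-hand mode plays the role of "cats", and a logistic classifier supplies the guidance. Everything here (the mixture, the classifier slope, the guidance weight, the step size) is an illustrative assumption, and the noise term of a real sampler is omitted so the run is deterministic.

```python
import numpy as np

def mixture_score(x):
    # Score of the toy data distribution 0.5*N(-2, 1) + 0.5*N(2, 1),
    # computed via the posterior responsibilities of the two modes.
    logp1 = -0.5 * (x + 2.0) ** 2
    logp2 = -0.5 * (x - 2.0) ** 2
    r2 = 1.0 / (1.0 + np.exp(logp1 - logp2))
    r1 = 1.0 - r2
    return r1 * (-(x + 2.0)) + r2 * (-(x - 2.0))

def classifier_grad(x, a=1.5):
    # grad_x log p(y=1|x) for the toy classifier p(y=1|x) = sigmoid(a*x),
    # where y = 1 labels the right-hand ("cat") mode.
    return a * (1.0 - 1.0 / (1.0 + np.exp(-a * x)))

x = -0.5  # start closer to the "wrong" mode
for _ in range(200):
    # Noise-free gradient steps on the guided score with weight w = 4.
    x += 0.05 * (mixture_score(x) + 4.0 * classifier_grad(x))
# x ends up near the guided (right-hand) mode around +2.
```

Without the classifier term the same iteration starting at x = -0.5 would drift into the left-hand mode; the guidance term is what tips the trajectory toward the target class.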