Probability Distributions and Randomness in AI

Probability distributions and randomness are at the core of generative models, allowing AI systems to create diverse and realistic outputs. Rather than developing probability theory formally, this chapter focuses on how probability is used in Generative AI to model uncertainty, sample data, and train generative models.

Role of Probability Distributions in Generative AI

Generative models rely on probability distributions to learn data patterns and generate new samples. The key ideas include:

  • Latent Space Representation: many generative models (e.g., VAEs, GANs) work with a lower-dimensional latent space governed by a probability distribution. Sampling from this distribution generates new data points;
  • Likelihood Estimation: probabilistic models estimate the likelihood of observing a data point given a learned distribution, guiding training;
  • Sampling and Generation: the process of drawing random samples from learned distributions to create new synthetic data.
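
As a toy end-to-end illustration of these ideas (the dataset is synthetic and the "model" is just a one-dimensional Gaussian), the sketch below learns a distribution from data and then samples new points from it:

    import numpy as np

    rng = np.random.default_rng(seed=0)

    # Toy "dataset": 1,000 one-dimensional observations
    data = rng.normal(loc=5.0, scale=2.0, size=1000)

    # "Learn" a distribution: estimate the Gaussian parameters from the data
    mu_hat, sigma_hat = data.mean(), data.std()

    # "Generate": draw new synthetic samples from the learned distribution
    new_samples = rng.normal(loc=mu_hat, scale=sigma_hat, size=5)
    print(f"learned mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}")
    print("synthetic samples:", np.round(new_samples, 2))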

Key Mathematical Concepts:

Generative models are typically trained by maximum likelihood estimation. For a training set x₁, …, x_N and model parameters θ, the likelihood of the data and its logarithm are

L(θ) = ∏ᵢ p_θ(xᵢ),    log L(θ) = Σᵢ log p_θ(xᵢ)

Maximizing this likelihood helps generative models learn patterns from data. In Generative AI, models often assume a specific form of probability distribution, such as Gaussian, Bernoulli, or categorical, to represent data. The choice of distribution affects how a model learns and generates new samples. For example, in text generation, categorical distributions model the probability of each possible next word given the preceding words.
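
As a quick numeric illustration of likelihood (with a made-up one-dimensional dataset and a Gaussian assumed as the model family), the snippet below shows that parameters matching the data receive a higher log-likelihood than parameters that do not:

    import numpy as np
    from scipy.stats import norm

    # Made-up 1-D dataset
    data = np.array([4.1, 5.3, 4.8, 6.0, 5.1])

    def gaussian_log_likelihood(data, mu, sigma):
        """Sum of log p(x_i | mu, sigma) over the dataset."""
        return norm.logpdf(data, loc=mu, scale=sigma).sum()

    print(gaussian_log_likelihood(data, mu=5.0, sigma=1.0))  # higher: good fit
    print(gaussian_log_likelihood(data, mu=0.0, sigma=1.0))  # much lower: poor fit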

Randomness and Noise in Generative Models

Noise plays a crucial role in Generative AI, ensuring diversity and improving robustness:

  • Latent Noise in GANs: in GANs, a noise vector z ~ p(z) (often sampled from a Gaussian or uniform distribution) is transformed into realistic samples by the generator. This randomness ensures variation in the generated images;
  • Variational Inference in VAEs: VAEs introduce Gaussian noise in the latent space, allowing smooth interpolation between generated samples. This ensures that minor changes in latent variables result in meaningful variations in outputs;
  • Diffusion Models and Stochastic Processes: these models learn to reverse a gradual noise-addition process to generate high-quality data. By iteratively refining noisy inputs, they can generate complex, realistic images (a small sketch of the forward noising process follows this list).
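
To make the role of noise concrete, here is a minimal sketch of the forward (noising) half of a diffusion process; the noise schedule and the tiny 4-value "image" are made up for illustration, not taken from any particular published model:

    import numpy as np

    rng = np.random.default_rng(seed=0)

    def forward_noising(x0, betas):
        """Gradually corrupt x0 with Gaussian noise, one small step at a time."""
        x = x0.copy()
        trajectory = [x]
        for beta in betas:
            noise = rng.standard_normal(x.shape)
            # Each step keeps most of the signal and mixes in a little noise
            x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
            trajectory.append(x)
        return trajectory

    x0 = np.ones(4)                     # a tiny "image" of 4 pixels
    betas = np.linspace(0.01, 0.2, 10)  # a simple, hypothetical noise schedule
    traj = forward_noising(x0, betas)
    print("clean input:  ", np.round(traj[0], 2))
    print("after noising:", np.round(traj[-1], 2))  # mostly noise by the last step

A diffusion model is trained to undo these steps one at a time, turning pure noise back into data.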

Example: Gaussian Latent Space in VAEs

In VAEs, the encoder maps an input x to the parameters μ(x) and σ(x) of a Gaussian distribution over the latent variable z:

q(z | x) = N(z; μ(x), σ²(x))

Instead of using a deterministic mapping, VAEs sample from this distribution, introducing controlled randomness that enables diverse generation. This technique allows VAEs to generate new faces by interpolating between different latent space representations.
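
A minimal sketch of this sampling step, with made-up values for μ and σ standing in for the output of a real encoder network:

    import numpy as np

    rng = np.random.default_rng(seed=42)

    # Pretend these came from the encoder for a single input x (values made up)
    mu = np.array([0.5, -1.2, 0.3])      # mean of the latent Gaussian
    sigma = np.array([0.1, 0.4, 0.2])    # standard deviation of the latent Gaussian

    # Each draw z ~ N(mu, sigma^2) is a slightly different latent code,
    # so decoding each z produces a slightly different output
    for _ in range(3):
        z = rng.normal(loc=mu, scale=sigma)
        print(np.round(z, 3))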

Sampling Methods in Generative AI

Sampling techniques are essential for generating new data points from learned distributions:

  • Monte Carlo Sampling: used in probabilistic methods such as Bayesian inference to approximate expectations. Monte Carlo integration estimates an expectation as:

    E[f(x)] ≈ (1/N) Σᵢ f(xᵢ),  where x₁, …, x_N are samples drawn from p(x);

  • Reparameterization Trick: in VAEs, ensures gradient flow through stochastic nodes by expressing z as:

    z = μ + σ ⊙ ε,  where ε ~ N(0, I).

This trick allows efficient backpropagation through stochastic layers.

  • Ancestral Sampling: in autoregressive models (e.g., GPT), samples are generated sequentially from conditional probabilities. For example, when generating text, the model predicts the next word given the previous ones:

    P(x₁, …, x_T) = ∏ₜ P(xₜ | x₁, …, xₜ₋₁)
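
Before the worked text-generation example, here is a minimal sketch of the first two techniques above: a Monte Carlo estimate of an expectation and a reparameterized latent sample. The target expectation and the parameter values are made up purely for illustration.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    # Monte Carlo estimate of E[x^2] for x ~ N(0, 1); the exact value is 1.0
    samples = rng.standard_normal(100_000)
    mc_estimate = np.mean(samples ** 2)
    print(f"Monte Carlo estimate of E[x^2]: {mc_estimate:.3f}")

    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I).
    # The randomness lives in eps, so gradients can flow through mu and sigma.
    mu = np.array([0.5, -1.2])
    sigma = np.array([0.1, 0.4])
    eps = rng.standard_normal(mu.shape)
    z = mu + sigma * eps
    print("reparameterized sample z:", np.round(z, 3))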

Example: Ancestral Sampling in Text Generation

Suppose we train a generative model to generate English sentences. Given the input "The cat", the model samples the next word from a learned probability distribution, producing outputs like:

  • "The cat sleeps."
  • "The cat jumps."
  • "The cat is hungry."

Each next-word prediction depends on previously generated words, creating meaningful sequences.
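
A minimal sketch of this idea, with a tiny hand-made table of conditional word probabilities standing in for a trained model (the vocabulary and probabilities are invented for the example):

    import numpy as np

    rng = np.random.default_rng(seed=7)

    # Hand-made conditional next-word distributions, standing in for a trained model
    next_word_probs = {
        ("The", "cat"): {"sleeps.": 0.4, "jumps.": 0.3, "is": 0.3},
        ("cat", "is"): {"hungry.": 0.7, "asleep.": 0.3},
    }

    def sample_next(context):
        """Draw the next word from P(next word | previous words)."""
        dist = next_word_probs[tuple(context[-2:])]
        words, probs = zip(*dist.items())
        return rng.choice(words, p=probs)

    # Ancestral sampling: build the sentence one word at a time
    sentence = ["The", "cat"]
    while not sentence[-1].endswith("."):
        sentence.append(sample_next(sentence))
    print(" ".join(sentence))  # e.g. "The cat is hungry." (varies with the seed)

Because each draw conditions on the words sampled so far, rerunning the loop yields different but coherent continuations, exactly as in the list above.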

Practical Applications in Generative AI

  • GANs: use noise vectors to generate high-resolution images;
  • VAEs: encode data into a probability distribution for smooth latent space interpolation;
  • Diffusion Models: use stochastic noise removal to iteratively generate images;
  • Bayesian Generative Models: model uncertainty in generative tasks.

Conclusion

Probability and randomness are the foundation of Generative AI, enabling models to learn distributions, generate diverse outputs, and approximate real-world variability. The next chapters will build on these concepts to explore probabilistic modeling, neural networks, and generative architectures.

1. Which of the following is an example of a probability distribution used in Generative AI?

2. In Variational Autoencoders (VAEs), what role does noise play?

3. Which sampling method is commonly used in generative AI models like GPT?
