Generative AI
Bayesian Inference and Markov Processes
Understanding Bayesian Inference in AI
What is Bayesian Inference?
Bayesian inference is a statistical method used to update probabilities based on new evidence. AI systems use Bayesian inference to refine their predictions as they gather more data.
Imagine you’re predicting the weather. If it's usually sunny in your city but you see dark clouds forming, you adjust your expectation and predict rain. This is how Bayesian inference works—starting with an initial belief (prior), incorporating new data, and updating the belief accordingly.
Formally, Bayes' theorem states:

P(H | D) = P(D | H) · P(H) / P(D)

where:
- P( H | D ) is the posterior probability, the updated probability of hypothesis H given data D ;
- P( D | H ) is the likelihood, representing how well hypothesis H explains data D ;
- P( H ) is the prior probability, the initial belief before observing D ;
- P( D ) is the marginal likelihood, acting as a normalizing constant.
Problem Statement: An AI spam filter uses Bayesian classification.
- 20% of emails are spam (P(Spam) = 0.2);
- 80% of emails are not spam (P(Not Spam) = 0.8);
- 90% of spam emails contain the word “urgent” (P(Urgent | Spam) = 0.9);
- 10% of regular emails contain the word “urgent” (P(Urgent | Not Spam) = 0.1).
Question:
If an email contains the word "urgent", what is the probability that it is spam (P(Spam | Urgent))?
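The answer follows directly from Bayes' theorem together with the law of total probability; a minimal sketch in Python, using the numbers from the problem statement:

```python
# Given probabilities from the problem statement
p_spam = 0.2             # P(Spam)
p_not_spam = 0.8         # P(Not Spam)
p_urgent_spam = 0.9      # P(Urgent | Spam)
p_urgent_not_spam = 0.1  # P(Urgent | Not Spam)

# Marginal likelihood P(Urgent) via the law of total probability
p_urgent = p_urgent_spam * p_spam + p_urgent_not_spam * p_not_spam

# Bayes' theorem: P(Spam | Urgent) = P(Urgent | Spam) * P(Spam) / P(Urgent)
p_spam_urgent = p_urgent_spam * p_spam / p_urgent
print(round(p_spam_urgent, 4))  # → 0.6923
```

So even though only 20% of all email is spam, seeing the word "urgent" raises the probability of spam to about 69%.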
Markov Processes: Predicting the Future
What is a Markov Chain?
A Markov chain is a mathematical model where the next state depends only on the current state and not on the previous ones. It is widely used in AI to model sequential data and decision-making processes. Here are the key formulas used in Markov processes:
1. Transition Probability Formula
The probability of a system being in state Sj at time t given its previous state Si at time t - 1:

P(St = Sj | St-1 = Si) = Tij

where Tij is the transition probability from state Si to state Sj.
2. State Probability Update
The probability distribution over states at time t:

Pt = Pt-1 · T
where:
- Pt is the state probability distribution at time t;
- Pt-1 is the state probability distribution at time t - 1;
- T is the transition matrix.
3. Steady-State Probability (Long-term Behavior)
For a long-running Markov process, the steady-state probability Ps satisfies:

Ps = Ps · T
This equation is solved to find the equilibrium distribution where probabilities do not change over time.
Problem Statement: In a certain city, the weather transitions between Sunny and Rainy days. The transition probabilities are given by the following matrix (rows are the current state, columns the next state, in the order Sunny, Rainy):

T = | 0.7  0.3 |
    | 0.6  0.4 |
Where:
- 0.7 is the probability that a Sunny day is followed by another Sunny day;
- 0.3 is the probability that a Sunny day turns Rainy;
- 0.6 is the probability that a Rainy day turns Sunny;
- 0.4 is the probability that a Rainy day is followed by another Rainy day.
If today’s weather is Sunny, what is the probability that it will be Rainy in two days?
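The question can be answered by applying the state-update rule Pt = Pt-1 · T twice; a short Python sketch (state order and matrix taken from the text):

```python
# Transition matrix from the text (rows: current state, columns: next state)
# State order: [Sunny, Rainy]
T = [[0.7, 0.3],
     [0.6, 0.4]]

def step(p, T):
    """One Markov step: p_t = p_{t-1} · T (row vector times matrix)."""
    return [sum(p[i] * T[i][j] for i in range(len(p))) for j in range(len(T[0]))]

# Today is Sunny with certainty; advance two days
p = [1.0, 0.0]
for _ in range(2):
    p = step(p, T)
print([round(x, 4) for x in p])  # → [0.67, 0.33], so P(Rainy in two days) = 0.33

# Iterating further approaches the steady state Ps = Ps · T
for _ in range(100):
    p = step(p, T)
print([round(x, 4) for x in p])  # → [0.6667, 0.3333]
```

Note that the long-run distribution (2/3 Sunny, 1/3 Rainy) no longer depends on today's weather, which is exactly the steady-state behavior described above.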
Markov Decision Processes (MDPs): Teaching AI to Make Decisions
MDPs extend Markov chains by introducing actions and rewards, allowing AI to make optimal decisions instead of just predicting states.
Example: A Robot in a Maze
A robot navigating a maze learns which paths lead to the exit by considering:
- Actions: moving left, right, up, or down;
- Rewards: positive for reaching the goal, negative for hitting a wall or encountering an obstacle;
- Optimal Strategy: choosing actions that maximize the reward.
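The idea above can be sketched with value iteration on a tiny, hypothetical corridor world; the states, rewards, and discount factor below are illustrative assumptions, not taken from the text:

```python
# Toy MDP (hypothetical, for illustration): a 4-cell corridor.
# States 0..3; state 3 is the exit (reward +1); actions move left/right.
states = [0, 1, 2, 3]
actions = ["left", "right"]
gamma = 0.9  # discount factor

def next_state(s, a):
    if s == 3:  # the exit is terminal: the robot stays there
        return s
    return max(0, s - 1) if a == "left" else min(3, s + 1)

def reward(s, a, s2):
    return 1.0 if s2 == 3 and s != 3 else 0.0

# Value iteration: V(s) ← max over a of [ R(s, a, s') + gamma · V(s') ]
V = {s: 0.0 for s in states}
for _ in range(100):
    V = {s: max(reward(s, a, next_state(s, a)) + gamma * V[next_state(s, a)]
                for a in actions) for s in states}

# Extract the optimal strategy: the action with the highest expected value
policy = {s: max(actions, key=lambda a: reward(s, a, next_state(s, a))
                 + gamma * V[next_state(s, a)]) for s in states}
print(policy)  # the optimal action in every non-terminal cell is "right"
```

The values decay geometrically with distance from the goal (1.0, then 0.9, then 0.81), which is how the discount factor makes the robot prefer shorter paths.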
MDPs are widely used in game AI, robotics, and recommendation systems to optimize decision-making.
Hidden Markov Models (HMMs): Understanding Unseen Patterns
An HMM is a Markov model where some states are hidden, and AI must infer them based on observed data.
Example: Speech Recognition
When you speak to Siri or Alexa, AI doesn’t directly see the words. Instead, it processes sound waves and tries to determine the most probable sequence of words.
HMMs are essential in:
- Speech and Text Recognition: AI deciphers spoken language and handwriting;
- Stock Market Predictions: AI models hidden trends to forecast market fluctuations;
- Robotics and Gaming: AI-controlled agents infer hidden states from observable events.
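As an illustration of inferring hidden states from observations, here is a minimal Viterbi decoder on a toy two-state HMM; all of the probabilities below are hypothetical:

```python
# Toy HMM (hypothetical numbers): hidden weather states, observed activities.
# Viterbi finds the most probable hidden state sequence for the observations.
hidden = ["Sunny", "Rainy"]
start = {"Sunny": 0.6, "Rainy": 0.4}
trans = {"Sunny": {"Sunny": 0.7, "Rainy": 0.3},
         "Rainy": {"Sunny": 0.6, "Rainy": 0.4}}
emit = {"Sunny": {"walk": 0.7, "umbrella": 0.3},
        "Rainy": {"walk": 0.2, "umbrella": 0.8}}

def viterbi(obs):
    # prob[s] = probability of the best hidden path ending in state s
    prob = {s: start[s] * emit[s][obs[0]] for s in hidden}
    paths = {s: [s] for s in hidden}
    for o in obs[1:]:
        new_prob, new_paths = {}, {}
        for s in hidden:
            best_prev = max(hidden, key=lambda p: prob[p] * trans[p][s])
            new_prob[s] = prob[best_prev] * trans[best_prev][s] * emit[s][o]
            new_paths[s] = paths[best_prev] + [s]
        prob, paths = new_prob, new_paths
    return paths[max(hidden, key=lambda s: prob[s])]

print(viterbi(["walk", "umbrella", "umbrella"]))  # → ['Sunny', 'Rainy', 'Rainy']
```

This is the same computation a speech recognizer performs at much larger scale: the observations are acoustic features rather than activities, and the hidden states are words or phonemes.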
Conclusion
Bayesian inference provides a rigorous way to update beliefs in AI models, while Markov processes offer powerful tools for modeling sequential dependencies. These principles underpin key generative AI applications, including reinforcement learning, probabilistic graphical models, and structured sequence generation.
1. What is the primary role of Bayesian inference in AI?
2. In a Markov Decision Process, what does an AI consider when making a decision?
3. Which of the following is an application of Hidden Markov Models?