Supervised Learning: Formal Setup

Supervised learning is a central paradigm in statistical learning theory, where the goal is to learn a mapping from inputs to outputs based on example data. In this framework, you work with an input space (often denoted X), which contains all possible instances or feature vectors, and an output space (Y), which contains all possible labels or responses. For example, in a classification problem, X could be the set of all images represented as arrays of pixel values, and Y could be the set {0, 1} for binary classification.

To make predictions, you select a function from a hypothesis class (H), which is a set of candidate functions that map elements from X to Y. The choice of H is crucial: it reflects your assumptions about the kind of relationships that might exist between inputs and outputs, and it determines what your learning algorithm can possibly discover.
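
As a minimal sketch (the function names and thresholds here are hypothetical, chosen for illustration), a hypothesis class of threshold classifiers over a single numeric feature could look like this in Python:

    def make_threshold_classifier(t):
        # Each hypothesis h maps an input x in X to a predicted label in Y = {0, 1}.
        return lambda x: 1 if x >= t else 0

    # The hypothesis class H is the set of all such classifiers;
    # here we enumerate a few candidate thresholds for illustration.
    H = [make_threshold_classifier(t) for t in (0.25, 0.5, 0.75)]

A learning algorithm restricted to this H can only ever discover threshold rules, which illustrates how the choice of hypothesis class constrains what can be learned.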

Evaluating how well a hypothesis performs requires a loss function (L). The loss function quantifies the cost of predicting h(x) when the true label is y. For instance, in binary classification, a common loss is the 0-1 loss, defined as L(h(x), y) = 1 if h(x) ≠ y and 0 otherwise.
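
The 0-1 loss translates directly into code; a minimal sketch in Python:

    def zero_one_loss(prediction, y):
        # Cost is 1 when the prediction disagrees with the true label, 0 otherwise.
        return 1 if prediction != y else 0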

Definition of Key Terms:

  • Instance space (X): the set of all possible input objects or feature vectors;
  • Label space (Y): the set of all possible output labels or responses;
  • Hypothesis (h): a function h: X → Y from the hypothesis class H that maps inputs to predicted outputs;
  • Risk: the expected loss of a hypothesis, measuring its average performance over the data distribution.

When training a supervised learning model, you typically do not have access to the entire data distribution, but only to a finite sample, called the training set. This leads to two important concepts for measuring the performance of a hypothesis: true risk and empirical risk. The true risk (also called the expected risk) of a hypothesis is the average loss it would incur over the entire (unknown) data distribution. In contrast, the empirical risk is the average loss computed over the training data you actually observe. While empirical risk can be calculated directly, true risk is what ultimately matters for generalization, but it is generally inaccessible. Understanding the relationship between these two quantities is a foundational concern in statistical learning theory.
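
Formally, if D is the (unknown) data distribution over X × Y, the true risk of a hypothesis h is R(h) = E_{(x, y) ~ D}[L(h(x), y)], while the empirical risk over a training set (x_1, y_1), …, (x_n, y_n) is R̂(h) = (1/n) Σ_{i=1}^{n} L(h(x_i), y_i). Unlike the true risk, the empirical risk can be computed directly from a sample; a minimal sketch, reusing make_threshold_classifier and zero_one_loss from above with a hypothetical training set:

    def empirical_risk(h, data, loss):
        # Average loss of h over the finite training sample:
        # (1/n) * sum over i of loss(h(x_i), y_i).
        return sum(loss(h(x), y) for x, y in data) / len(data)

    # Hypothetical (feature, label) pairs for illustration:
    training_set = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
    h = make_threshold_classifier(0.5)
    print(empirical_risk(h, training_set, zero_one_loss))  # prints 0.0

A hypothesis that achieves low empirical risk on the sample may still have high true risk; quantifying when the two are close is exactly the generalization question studied in the rest of this course.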
