Training Set | Machine Learning Concepts
ML Introduction with scikit-learn
1. Machine Learning Concepts
2. Preprocessing Data with Scikit-learn
3. Pipelines
4. Modeling

Training Set

In both supervised and unsupervised learning, the training set is usually a table.

Consider the diabetes dataset, where the task is to predict whether a person has diabetes. It holds records for 768 women, described by measurements such as age, body mass index, and blood pressure. These measurements are called features.

The dataset also records whether each person has diabetes in an 'Outcome' column, which is what we want to predict. This column is called the target.

Each row in the table is called an instance (or a data point, or a sample). In this dataset, one instance is the information about one woman.

The table contains a target column, which means the training set is labeled.
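The terms above can be made concrete with a small table. The sketch below assumes pandas is available; the values are made up for illustration, and the feature column names (`Age`, `BMI`, `BloodPressure`) are assumptions, since only the 'Outcome' column is named in this chapter.

```python
import pandas as pd

# Made-up values for illustration only; these are not rows from the real dataset.
# The feature column names are assumed; only 'Outcome' is named in the text.
df = pd.DataFrame({
    "Age": [50, 31, 32, 21],
    "BMI": [33.6, 26.6, 23.3, 28.1],
    "BloodPressure": [72, 66, 64, 66],
    "Outcome": [1, 0, 1, 0],  # target: 1 = has diabetes, 0 = does not
})

# Every column except the target is a feature.
feature_names = [col for col in df.columns if col != "Outcome"]

# Each row is one instance, so the row count is the number of instances.
n_instances, n_columns = df.shape

# The table is labeled because it contains the target column.
is_labeled = "Outcome" in df.columns

print(feature_names)
print(n_instances)
print(df.iloc[0])  # a single instance: the first row
print(is_labeled)
```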

The task is to train an ML model on this training set. Once trained, the model can predict whether other people (new instances) have diabetes based on their features alone.

In code, the feature columns are usually assigned to X and the target column to y.

The features of new instances are assigned to X_new.
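As a minimal sketch of this naming convention, the code below splits a small table into X and y, trains a model, and predicts for X_new. It assumes pandas and scikit-learn are installed. The values and feature column names are made up for illustration, and `KNeighborsClassifier` is an arbitrary choice of model, not one prescribed by this chapter.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Made-up training set for illustration only (assumed column names).
df = pd.DataFrame({
    "Age": [50, 31, 32, 21, 45, 25],
    "BMI": [33.6, 26.6, 23.3, 28.1, 35.0, 22.0],
    "BloodPressure": [72, 66, 64, 66, 80, 60],
    "Outcome": [1, 0, 1, 0, 1, 0],
})

X = df.drop(columns="Outcome")  # feature columns
y = df["Outcome"]               # target column

# Train a model on the labeled training set (the model choice is arbitrary here).
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

# New instances have the same feature columns but no target column.
X_new = pd.DataFrame({
    "Age": [29, 52],
    "BMI": [24.0, 34.2],
    "BloodPressure": [62, 78],
})

# One predicted target value per new instance.
y_pred = model.predict(X_new)
print(y_pred)
```

Note that X_new must contain the same feature columns as X, in the same order, because the model only knows the features it was trained on.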


Match the variable names with the data they usually hold.

X –
y –

X_new –


Section 1. Chapter 3