Explore the Linear Regression Using Python
Working with Dataset
First, we need data to work with. Scikit-learn comes with a few small standard datasets that do not require downloading and are very helpful for learning new machine learning models. In this course, we will become high-class sommeliers and determine the quality of wine using statistics and regression. The Wine recognition dataset provides 13 measured characteristics of wine: alcohol, ash, magnesium, total phenols, color intensity, and so on.
from sklearn.datasets import load_wine
wine = load_wine()
We also create a pandas DataFrame for easier manipulation:
# Import the libraries
import pandas as pd
# Show all features
pd.set_option('display.max_rows', None, 'display.max_columns', None)
# Create DataFrame
data = pd.DataFrame(data=wine['data'], columns=wine['feature_names'])
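The object returned by load_wine() also carries the class labels under the 'target' key. As a minimal sketch, the labels can be kept next to the features in a copy of the DataFrame. The names labeled and 'target' (as a column name) are illustrative choices, not part of the course code.

```python
import pandas as pd
from sklearn.datasets import load_wine

# Load the bundled dataset and build the feature DataFrame as in the lesson
wine = load_wine()
data = pd.DataFrame(data=wine['data'], columns=wine['feature_names'])

# Work on a copy so the original feature table stays unchanged
# ('labeled' and the 'target' column name are illustrative assumptions)
labeled = data.copy()
labeled['target'] = wine['target']

# One extra column compared to the feature-only DataFrame
print(data.shape)
print(labeled.shape)
```

Copying first keeps data limited to the feature columns, so the later examples in this chapter behave the same whether or not the labels are attached.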
If you don't feel comfortable working with the pandas library, check out our course on this topic.
To inspect our data, we should know the number of records and which columns it contains. For this, use the attributes .shape and .columns. The first returns a tuple with the number of records and the number of columns, and the second returns the names of all columns.
# Get the number of records and columns
print(data.shape)
# Get the names of all columns
print(data.columns)
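Because .shape is a plain tuple, its two values can be unpacked into separate variables, and .columns can be turned into an ordinary Python list. The following is a small sketch of that idea; the variable names n_rows, n_cols, and column_names are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_wine

wine = load_wine()
data = pd.DataFrame(data=wine['data'], columns=wine['feature_names'])

# .shape is a (rows, columns) tuple, so it can be unpacked directly
n_rows, n_cols = data.shape
print(f"{n_rows} records, {n_cols} columns")

# .columns is a pandas Index; list() gives a plain list of names
column_names = list(data.columns)
print(column_names)
```

Note that neither .shape nor .columns takes parentheses: they are attributes, not methods.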
Now we know how to load our dataset and get some basic information about it. But what if we want to see only a certain number of records from our wine dataset? Printing everything can be very inconvenient, especially if we work with Big Data in the future, where millions of records may be stored. To see the first rows of a DataFrame, use .head(n), where n is the number of rows to select. By default, n is 5.
Look at the following example. This code shows the first 5 rows of our dataset:
# Print first 5 rows
print(data.head())
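Passing an explicit n works the same way. As a short sketch, the call below selects the first 3 rows; the result is itself a DataFrame, so it can be stored and inspected like the original. The name first_three is illustrative.

```python
import pandas as pd
from sklearn.datasets import load_wine

wine = load_wine()
data = pd.DataFrame(data=wine['data'], columns=wine['feature_names'])

# Select only the first 3 rows instead of the default 5
first_three = data.head(3)
print(first_three)

# The result is a smaller DataFrame with the same columns
print(first_three.shape)
```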
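To sum up, here are the tools covered in this chapter and their usage:
- .shape returns the number of records and columns.
- .columns returns the names of all columns.
- .head(n) returns the first n rows (5 by default).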
Task
Let’s explore our dataset. Using the functions discussed in this chapter, find out the number of records and columns that are in our set and the names of these columns. Print the first 9 rows of the wine dataset.
- [Lines #2-3] Import the pandas library and load our wine dataset.
- [Line #12] Create the DataFrame using the function pd.DataFrame() with parameters.
- [Lines #15-19] Use .shape and .columns to get information about the records. Print the first 9 rows using the function .head(n), where n is 9.
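One possible way to put these steps together is sketched below. It is not the exercise template itself, so it does not follow the line numbering used in the hints, and the name first_nine is an illustrative choice.

```python
import pandas as pd
from sklearn.datasets import load_wine

# Load the wine dataset and wrap the features in a DataFrame
wine = load_wine()
data = pd.DataFrame(data=wine['data'], columns=wine['feature_names'])

# Number of records and columns
print(data.shape)

# Names of all columns
print(data.columns)

# First 9 rows of the dataset
first_nine = data.head(9)
print(first_nine)
```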