Building the Regression Model Based on Several Variables | Multivariate Linear Regression
Explore the Linear Regression Using Python

1. What is the Linear Regression?
2. Correlation
3. Building and Training Model
4. Metrics to Evaluate the Model
5. Multivariate Linear Regression

Building the Regression Model Based on Several Variables

We often need to make predictions based on several characteristics at once rather than just one. For example, we might predict the length of a cat's tail from its weight and height. In that case, the equation looks like this:

tail_length = a + b * cat_weight + c * cat_height

Here we have to find three unknown parameters: the intercept a and the coefficients b and c. The more features we use for the prediction, the more unknown parameters there are. In this case, we fit a plane instead of a line.
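To make the idea of "three unknowns" concrete, here is a minimal sketch that recovers a, b and c with ordinary least squares. The cat measurements and the "true" parameter values are made up for illustration and are not part of the course data; NumPy's lstsq is used here only as one possible way to solve the problem.

```python
import numpy as np

# Hypothetical cat measurements, invented for this illustration.
cat_weight = np.array([3.0, 4.0, 5.0, 4.5, 3.5])
cat_height = np.array([23.0, 25.0, 24.0, 26.0, 22.0])

# Build tail lengths that lie exactly on a known plane,
# so we can check that the fit recovers the parameters.
true_a, true_b, true_c = 10.0, 2.0, 0.5
tail_length = true_a + true_b * cat_weight + true_c * cat_height

# Design matrix: a column of ones (for the intercept a) plus one column per feature.
A = np.column_stack([np.ones_like(cat_weight), cat_weight, cat_height])

# Least-squares solution for the three unknowns.
params, *_ = np.linalg.lstsq(A, tail_length, rcond=None)
a, b, c = params
print(a, b, c)
```

Because the points were generated without noise, the fitted values should match the chosen parameters up to floating-point error. With real data the points scatter around the plane and the fit returns the plane with the smallest squared error.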

So multivariate regression is a way to predict a value based on two or more variables. Let’s predict the amount of flavanoids from the amounts of total phenols and nonflavanoid phenols. With these two features, we will use the methods from the previous section to find the best model and get the unknown parameters:

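    # Imports needed by this snippet
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    # Preparing data (`data` is assumed to be loaded earlier in the course)
    X2 = data[['total_phenols', 'nonflavanoid_phenols']]
    Y = data['flavanoids']

    # Train and test model
    X2_train, X2_test, Y_train, Y_test = train_test_split(X2, Y, test_size=0.3, random_state=1)
    model2 = LinearRegression()
    model2.fit(X2_train, Y_train)
    print(model2.intercept_)
    print(model2.coef_)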

We use the same random_state value as before so that we get the same split.
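The effect of a fixed random_state can be checked directly. The sketch below splits a small stand-in list twice with the same seed; the list of ten integers is an assumption made for illustration and replaces the course's data.

```python
from sklearn.model_selection import train_test_split

# Stand-in data: ten integers instead of the course's dataset.
values = list(range(10))

# Two independent calls with the same random_state.
train_a, test_a = train_test_split(values, test_size=0.3, random_state=1)
train_b, test_b = train_test_split(values, test_size=0.3, random_state=1)

# Both calls shuffle the data identically, so the splits match.
print(train_a == train_b, test_a == test_b)
print(len(train_a), len(test_a))
```

Leaving random_state unset would shuffle differently on each call, which makes it harder to compare models trained at different points in the course.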

The method of obtaining forecasts is identical to what we considered in the previous chapter for one characteristic:

    y_test_predicted2 = model2.predict(X2_test)

This code returns the values predicted by the trained model, which is described by the following equation:

flavanoids = 2.1 - 0.74 * total_phenols + 1.46 * nonflavanoid_phenols
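The link between predict() and an equation of this form can be verified by hand: the prediction is the intercept plus each feature multiplied by its coefficient. The sketch below uses synthetic data generated with NumPy, not the course's dataset, so the names X, y and model are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with two features (an assumption, not the wine data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=50)

model = LinearRegression().fit(X, y)

# Apply the fitted equation manually: intercept + coefficients * features.
manual = model.intercept_ + X @ model.coef_

# The manual result should agree with model.predict up to rounding error.
print(np.allclose(manual, model.predict(X)))
```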

Task

Let’s predict the amount of total phenols from the amounts of flavanoids and nonflavanoid phenols.

  1. [Line #17] Set total_phenols as the target.
  2. [Line #24] Split the data 70-30 (70% of the data for training and 30% for testing) and set random_state to 1.
  3. [Line #25] Initialize the linear regression model.
  4. [Line #26] Fit the model using your training data.
  5. [Lines #29-30] Print the model’s parameters (model2.intercept_ and model2.coef_) by calling print() twice.
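The steps above can be sketched roughly as follows. This is one possible outline, not the course's reference solution: it assumes that the course's data matches the wine dataset bundled with scikit-learn and loads it with load_wine, which requires pandas. The line numbers in the task refer to the course's own editor and do not correspond to this sketch.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Assumption: scikit-learn's bundled wine data stands in for the course's `data`.
data = load_wine(as_frame=True).frame

# Features and target for the task.
X2 = data[['flavanoids', 'nonflavanoid_phenols']]
Y = data['total_phenols']

# 70-30 split with a fixed seed.
X2_train, X2_test, Y_train, Y_test = train_test_split(X2, Y, test_size=0.3, random_state=1)

# Initialize and fit the model on the training data.
model2 = LinearRegression()
model2.fit(X2_train, Y_train)

# Print the fitted parameters.
print(model2.intercept_)
print(model2.coef_)
```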


Section 5. Chapter 1