Building the Regression Model Based on Several Variables | Multivariate Linear Regression
Explore the Linear Regression Using Python
1. What is the Linear Regression?
2. Correlation
3. Building and Training Model
4. Metrics to Evaluate the Model
5. Multivariate Linear Regression

Building the Regression Model Based on Several Variables

Often we need to make predictions based not on one but on several characteristics at once. For example, we might predict the length of a cat's tail from its weight and height. In such a case, the equation looks like this:

tail_length = a + b * cat_weight + c * cat_height

Here we already have to find three unknown parameters: the intercept a and the coefficients b and c. The more features we use to make a prediction, the more unknown parameters there are. With two features, we fit a plane instead of a line.
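As a minimal sketch of how this equation is evaluated once the three parameters are known, the function below computes the prediction for one cat. The function name `predict_tail_length` and all numeric values are hypothetical and chosen only for illustration; they do not come from a fitted model.

```python
def predict_tail_length(cat_weight, cat_height, a, b, c):
    """Evaluate tail_length = a + b * cat_weight + c * cat_height."""
    return a + b * cat_weight + c * cat_height


# Hypothetical parameter values, chosen only to illustrate the formula.
a, b, c = 1.0, 2.0, 0.5

# One cat with weight 4.0 and height 25.0 (made-up units and values).
tail = predict_tail_length(4.0, 25.0, a, b, c)
print(tail)  # 1.0 + 2.0 * 4.0 + 0.5 * 25.0 = 21.5
```

Each additional feature would add one more coefficient and one more term to the sum.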

So multivariate regression is a way to predict a value based on two or more variables. Let’s predict the flavanoid content from the total phenols and nonflavanoid phenols. With these two features, we will use the methods from the previous section to find the best model and get the unknown parameters:

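    # Imports used below
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    # Preparing data (`data` is assumed to be loaded earlier in the course)
    X2 = data[['total_phenols', 'nonflavanoid_phenols']]
    Y = data['flavanoids']

    # Train and test model
    X2_train, X2_test, Y_train, Y_test = train_test_split(X2, Y, test_size=0.3, random_state=1)
    model2 = LinearRegression()
    model2.fit(X2_train, Y_train)

    # Print the intercept and the two coefficients
    print(model2.intercept_)
    print(model2.coef_)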

We pass the same random_state value as before to get the same split.
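The effect of a fixed random_state can be checked with a small sketch. The toy arrays below are invented for illustration and stand in for the course data; only `train_test_split` and its `test_size` and `random_state` parameters come from the lesson.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, invented for illustration: 10 rows, 2 feature columns.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Two independent calls with the same random_state.
X_train_a, X_test_a, y_train_a, y_test_a = train_test_split(
    X, y, test_size=0.3, random_state=1
)
X_train_b, X_test_b, y_train_b, y_test_b = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# Both calls select exactly the same rows for the test set.
same_split = np.array_equal(X_test_a, X_test_b) and np.array_equal(y_test_a, y_test_b)
print(same_split)
```

With a different random_state (or none at all), the rows in each part would generally differ between runs, which makes results harder to compare.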

The method of obtaining forecasts is identical to what we considered in the previous chapter for one characteristic:

    y_test_predicted2 = model2.predict(X2_test)

This code returns the predicted values using the trained model, which corresponds to the equation:

flavanoids = 2.1 - 0.74 * total_phenols + 1.46 * nonflavanoid_phenols
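To see how `predict()` relates to an equation of this form, the sketch below fits a model on synthetic, noise-free data and compares its predictions with the intercept plus the weighted sum of the features. The data and the "true" parameters (2.0, 3.0, -1.0) are assumptions made up for this example, not values from the course dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (illustrative only): two features and a target that
# lies exactly on the plane y = 2 + 3 * x1 - 1 * x2.
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(50, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]

model = LinearRegression().fit(X, y)

# predict() evaluates intercept_ + coef_[0] * x1 + coef_[1] * x2 for each row.
manual = model.intercept_ + X @ model.coef_
predicted = model.predict(X)

print(np.allclose(manual, predicted))
print(model.intercept_, model.coef_)
```

Because the synthetic target has no noise, the fitted intercept and coefficients should match the plane used to generate it up to floating-point error.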

Task

Let’s predict the total phenols based on the flavanoids and nonflavanoid phenols.

  1. [Line #17] Set total_phenols as the target.
  2. [Line #24] Split the data 70-30 (70% of the data for training and 30% for testing) and use 1 as the random_state value.
  3. [Line #25] Initialize the linear regression model.
  4. [Line #26] Fit the model using your train data.
  5. [Lines #29-30] Print the model’s parameters (model2.intercept_ and model2.coef_) by calling the function print() twice.
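One possible way to carry out these steps is sketched below. It assumes that scikit-learn's bundled wine dataset (`load_wine`) can stand in for the course's `data`, which may be loaded differently in the exercise; the variable names X2, Y, and model2 follow the lesson.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Assumption: the bundled wine dataset is used in place of the course's `data`.
data = load_wine(as_frame=True).frame

# Features and target for the task.
X2 = data[['flavanoids', 'nonflavanoid_phenols']]
Y = data['total_phenols']

# 70-30 split with random_state=1.
X2_train, X2_test, Y_train, Y_test = train_test_split(
    X2, Y, test_size=0.3, random_state=1
)

# Initialize and fit the model on the training part.
model2 = LinearRegression()
model2.fit(X2_train, Y_train)

# Print the intercept and the two coefficients.
print(model2.intercept_)
print(model2.coef_)
```

The printed values depend on the dataset actually used, so they are not shown here.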


