
Building the Regression Model Based on Several Variables

Often we need to make predictions based not on one but on several characteristics at once. For example, we might predict the length of a cat's tail from its weight and height. In that case, the equation looks like this:

tail_length = a + b * cat_weight + c * cat_height

Here we have to find three unknown parameters: the intercept a and the coefficients b and c. The more features we use for the forecast, the more unknown parameters there are. With two features, we fit a plane instead of a line.
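To make the three unknowns concrete, here is a minimal sketch that recovers a, b and c by least squares with NumPy. The cat measurements and the "true" parameter values are made up for illustration and do not come from the course data.

```python
import numpy as np

# Hypothetical, noise-free measurements for five cats (made-up numbers).
cat_weight = np.array([3.0, 4.0, 5.0, 4.5, 6.0])
cat_height = np.array([20.0, 25.0, 23.0, 30.0, 28.0])

# Generate tail lengths from known parameters so the fit can be checked.
true_a, true_b, true_c = 2.0, 0.5, 1.5
tail_length = true_a + true_b * cat_weight + true_c * cat_height

# Design matrix: a column of ones for the intercept, then one column per feature.
design = np.column_stack([np.ones_like(cat_weight), cat_weight, cat_height])

# Least squares solves for all three unknowns at once.
params, *_ = np.linalg.lstsq(design, tail_length, rcond=None)
a, b, c = params
print(a, b, c)
```

Because the data here contains no noise, the fitted values should match the parameters used to generate it. With real measurements the fit only approximates them.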

So multiple regression is a way to predict a value based on two or more variables. Let's predict the amount of flavanoids from the amounts of total phenols and nonflavanoid phenols. With these two features, we use the methods from the previous section to fit the best model and obtain the missing parameters:


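# Import the tools we need
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Preparing data (data is assumed to be the DataFrame loaded earlier in the course)
X2 = data[['total_phenols', 'nonflavanoid_phenols']]
Y = data['flavanoids']

# Split the data, then fit the model on the training part
X2_train, X2_test, Y_train, Y_test = train_test_split(X2, Y, test_size=0.3, random_state=1)
model2 = LinearRegression()
model2.fit(X2_train, Y_train)

# Print the fitted intercept and coefficients
print(model2.intercept_)
print(model2.coef_)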

We use the same random_state value as before so that we get the same split.
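The effect of random_state can be checked with a small sketch. The list of row indices below is a made-up stand-in for the real data.

```python
from sklearn.model_selection import train_test_split

# Ten hypothetical row indices standing in for a dataset.
rows = list(range(10))

# Two independent calls with the same random_state.
first_train, first_test = train_test_split(rows, test_size=0.3, random_state=1)
second_train, second_test = train_test_split(rows, test_size=0.3, random_state=1)

# Both calls pick exactly the same rows for the test set.
print(first_test == second_test)  # True
```

Changing the random_state value, or leaving it out, would generally produce a different split, so results from different runs would no longer be directly comparable.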

Predictions are obtained exactly as in the previous chapter, where we used a single feature:


y_test_predicted2 = model2.predict(X2_test)

This code returns the predicted values, which the trained model computes from the fitted equation:

flavanoids = 2.1 - 0.74 * total_phenols + 1.46 * nonflavanoid_phenols
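To see that predict() evaluates an equation of this form, the sketch below compares the model's predictions with a manual calculation from intercept_ and coef_. The small arrays are invented for illustration and are not the course data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: five samples with two features each.
X = np.array([[1.0, 0.2], [2.0, 0.5], [3.0, 0.1], [4.0, 0.7], [5.0, 0.3]])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

model = LinearRegression().fit(X, y)

# Manual prediction: intercept + coefficient_1 * feature_1 + coefficient_2 * feature_2
manual = model.intercept_ + X @ model.coef_

# The manual values agree with what predict() returns.
print(np.allclose(manual, model.predict(X)))  # True
```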

Task


Let’s predict the amount of total phenols from the amounts of flavanoids and nonflavanoid phenols.

[Line #17] Set total_phenols as our target.
[Line #24] Split the data 70-30 (70% of the data for training and 30% for testing) and set random_state to 1.
[Line #25] Initialize the linear regression model.
[Line #26] Fit the model using your training data.
[Lines #29-30] Print the model’s parameters (model2.intercept_ and model2.coef_) by calling print() twice.
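One possible way to carry out these steps is sketched below. It assumes that the course's data matches the wine dataset bundled with scikit-learn, which has columns with these names. If the course uses a different file, only the loading line would change.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Assumption: scikit-learn's wine dataset stands in for the course data.
data = load_wine(as_frame=True).frame

# Two features, with total_phenols as the target
X2 = data[['flavanoids', 'nonflavanoid_phenols']]
Y = data['total_phenols']

# 70-30 split with random_state set to 1
X2_train, X2_test, Y_train, Y_test = train_test_split(X2, Y, test_size=0.3, random_state=1)

# Initialize and fit the model on the training data
model2 = LinearRegression()
model2.fit(X2_train, Y_train)

# Print the model's parameters
print(model2.intercept_)
print(model2.coef_)
```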

