Building Multiple Linear Regression | Multiple Linear Regression
Linear Regression for ML

Building a Multiple Linear Regression model with scikit-learn is as easy as building a Simple Linear Regression model.
You use the same LinearRegression class.

There are two differences. The .coef_ attribute now returns an array of the parameters β₁ to βₙ, one per feature.
Also, X and X_new now hold more than one column, one per feature.
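
To make the shapes concrete, here is a minimal sketch on made-up data. The five instances and the exact relationship y = 1 + 2·x₁ + 3·x₂ are assumptions chosen for the demo; they are not the course dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data (NOT the course dataset): 5 instances, 2 features
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]  # exact linear relationship, chosen for the demo

model = LinearRegression().fit(X, y)

print(X.shape)            # (5, 2): one column per feature
print(model.coef_.shape)  # (2,): one coefficient per feature
print(model.coef_)        # approximately [2. 3.]
print(model.intercept_)   # approximately 1.0
```

Because the demo target is an exact linear function of the two features, the fitted parameters recover the values used to generate it, up to floating-point error.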

Looking at data

```python
import pandas as pd

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/heights_two_feature.csv'
df = pd.read_csv(file_link)  # Open the file
print(df.head())  # Print the first 5 rows of df
```

Here 'Height' is the target, and 'Father' and 'Mother' are the features.
So we will assign the 'Father' and 'Mother' columns to X and the 'Height' column to y.

```python
import pandas as pd

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/heights_two_feature.csv'
df = pd.read_csv(file_link)  # Open the file
X = df[['Father', 'Mother']]  # Assign X
y = df['Height']  # Assign y
print(X.head())  # Print the first 5 rows of the features
```

Now we can use the LinearRegression class the same way as before.
It automatically finds the ordinary least squares parameters for our features (the same solution the Normal Equation gives).

```python
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression  # Import LinearRegression

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/heights_two_feature.csv'
df = pd.read_csv(file_link)  # Read the file
X = df[['Father', 'Mother']]  # Assign the features (with double square brackets)
y = df['Height']  # Assign the target (no need for double square brackets)

model = LinearRegression()  # Initialize a model
model.fit(X, y)  # Train the model

# Create a 2-D array of new instances
X_new = np.array([
    [65, 62],
    [70, 65],
    [75, 70]
])
print("Predictions: ", model.predict(X_new))  # Predict the target for new instances

# Print the parameters (unnecessary if you only want to make predictions)
print("beta_1 and beta_2: ", model.coef_)
print("beta_0: ", model.intercept_)
```
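
For intuition about what fitting computes, here is a minimal NumPy sketch of the Normal Equation, β = (X̃ᵀX̃)⁻¹X̃ᵀy, where X̃ is X with a leading column of ones for the intercept. The data is made up for the demo (y = 1 + 2·x₁ + 3·x₂ exactly) and is not the course dataset. This sketch is not a description of scikit-learn's internal code, which may use a different least-squares solver.

```python
import numpy as np

# Made-up data (NOT the course dataset): y = 1 + 2*x1 + 3*x2 exactly
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]

# Prepend a column of ones so that beta_0 (the intercept) is estimated too
X_tilde = np.column_stack([np.ones(len(X)), X])

# Normal Equation: (X~^T X~) beta = X~^T y, solved without forming the inverse
beta = np.linalg.solve(X_tilde.T @ X_tilde, X_tilde.T @ y)

print(beta)  # approximately [1. 2. 3.] -> beta_0, beta_1, beta_2
```

Solving the linear system with np.linalg.solve rather than inverting X̃ᵀX̃ explicitly is a common choice for numerical stability.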

Note

Now that our training set has 2 features, we need to provide 2 features for each new instance we want to predict.
That's why np.array([[65, 62],[70, 65],[75, 70]]) was used in the example above.
It predicts y for 3 new instances: [Father: 65, Mother: 62], [Father: 70, Mother: 65], [Father: 75, Mother: 70].
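
A related slip is passing a single new instance as a 1-D array. The sketch below uses only NumPy to show the shapes involved: rows are instances, columns are features. The instance values are taken from the example above.

```python
import numpy as np

single = np.array([65, 62])        # shape (2,): 1-D, predict() rejects this
single_2d = single.reshape(1, -1)  # shape (1, 2): one instance, two features

batch = np.array([[65, 62], [70, 65], [75, 70]])  # shape (3, 2): three instances

print(single.shape, single_2d.shape, batch.shape)
```

Either single_2d or batch has the 2-D layout that model.predict expects for a model trained on two features.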

Nice! We made predictions using Multiple Linear Regression.
Using the β parameters, we obtain the following formula:
Person's height = 23.25 + 0.33 * Father's height + 0.32 * Mother's height.
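
The formula can also be evaluated by hand, as in the sketch below. The helper name predict_height is hypothetical, and the parameters are the rounded values from the formula, so the results may differ slightly from what model.predict returns with the unrounded parameters.

```python
# Rounded parameters copied from the formula above
BETA_0 = 23.25  # intercept
BETA_1 = 0.33   # coefficient for Father's height
BETA_2 = 0.32   # coefficient for Mother's height


def predict_height(father, mother):
    """Evaluate the fitted formula by hand (hypothetical helper)."""
    return BETA_0 + BETA_1 * father + BETA_2 * mother


# The same three instances used for X_new
for father, mother in [(65, 62), (70, 65), (75, 70)]:
    print(father, mother, round(predict_height(father, mother), 2))
```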

Section 2. Chapter 3