Linear Regression for ML
Building Multiple Linear Regression
Building a Multiple Linear Regression is as easy as building a Simple Linear Regression with scikit-learn!
You need to use the same LinearRegression class.
The only difference is that the .coef_ attribute now returns an array of parameters β₁ to βₙ.
Also, X and X_new now hold more than one column.
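To make "an array of parameters" concrete, here is a minimal sketch on made-up data. The numbers and the names X_demo, y_demo and model_demo are assumptions for illustration only, not the course dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): 5 instances, 2 feature columns
X_demo = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
# Target built from a known rule: y = 1 + 2*x1 + 3*x2
y_demo = 1 + 2 * X_demo[:, 0] + 3 * X_demo[:, 1]

model_demo = LinearRegression()
model_demo.fit(X_demo, y_demo)

print(model_demo.coef_.shape)  # one coefficient per feature column
print(model_demo.coef_)        # close to [2, 3]
print(model_demo.intercept_)   # close to 1
```

Because the target was generated from an exact linear rule, the fitted coefficients should recover that rule up to floating-point error.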
Looking at data
import pandas as pd

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/heights_two_feature.csv'
df = pd.read_csv(file_link)  # Read the file
print(df.head())  # Print the first 5 rows of df
Here 'Height' is the target, and 'Father' and 'Mother' are the features.
So we will assign the 'Father' and 'Mother' columns as X and 'Height' as y.
import pandas as pd

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/heights_two_feature.csv'
df = pd.read_csv(file_link)  # Read the file
X = df[['Father', 'Mother']]  # Assign X
y = df['Height']  # Assign y
print(X.head())  # Print the first 5 rows of the features
Now we can use the LinearRegression class the same way as before.
It will automatically find the best parameters for our features by solving the ordinary least squares problem, which gives the same solution as the Normal Equation.
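For reference, the Normal Equation mentioned above can be written out by hand with NumPy. This is a minimal sketch on made-up data (X_demo, y_demo and X_tilde are hypothetical names), meant only to show the formula β = (X̃ᵀX̃)⁻¹X̃ᵀy in action, not how scikit-learn computes the result internally.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 5 instances, 2 features
X_demo = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
# Target built from a known rule: y = 1 + 2*x1 + 3*x2
y_demo = 1 + 2 * X_demo[:, 0] + 3 * X_demo[:, 1]

# Prepend a column of ones so that the first parameter acts as the intercept β₀
X_tilde = np.column_stack([np.ones(len(X_demo)), X_demo])

# Normal Equation: solve (X̃ᵀ X̃) β = X̃ᵀ y for β
beta = np.linalg.solve(X_tilde.T @ X_tilde, X_tilde.T @ y_demo)

print(beta)  # close to [1, 2, 3], i.e. β₀, β₁, β₂
```

Solving the linear system with np.linalg.solve avoids forming the matrix inverse explicitly, which is usually the more stable choice.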
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression  # Import LinearRegression

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/heights_two_feature.csv'
df = pd.read_csv(file_link)  # Read the file
X = df[['Father', 'Mother']]  # Assign the features (with double square brackets)
y = df['Height']  # Assign the target (no need for double square brackets here)

model = LinearRegression()  # Initialize a model
model.fit(X, y)  # Train the model

# Create a 2-D array of new instances (one row per instance, one column per feature)
X_new = np.array([
    [65, 62],
    [70, 65],
    [75, 70]
])
print("Predictions: ", model.predict(X_new))  # Predict the target for the new instances

# Print the parameters (unnecessary if you only want to make predictions)
print("beta_1 and beta_2: ", model.coef_)
print("beta_0: ", model.intercept_)
Note
Now that our training set has 2 features, we need to provide 2 features for each new instance we want to predict.
That's why np.array([[65, 62], [70, 65], [75, 70]]) was used in the example above.
It predicts y for 3 new instances: [Father: 65, Mother: 62], [Father: 70, Mother: 65], [Father: 75, Mother: 70].
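As an alternative to a plain NumPy array, the new instances can also be passed as a DataFrame with the same column names as the training features. Recent scikit-learn versions warn about missing feature names when a model fitted on a DataFrame is given a plain array. The sketch below uses a tiny made-up DataFrame (df_demo and its values are assumptions, not the course data) so that it runs without downloading anything.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Tiny made-up dataset (an assumption for illustration only)
df_demo = pd.DataFrame({
    'Father': [65, 70, 75, 68, 72],
    'Mother': [62, 65, 70, 60, 66],
    'Height': [66, 69, 73, 66, 70],
})
X_demo = df_demo[['Father', 'Mother']]
y_demo = df_demo['Height']

model_demo = LinearRegression()
model_demo.fit(X_demo, y_demo)

# New instances: one row per instance, columns named like the training features
X_new_df = pd.DataFrame(
    [[65, 62], [70, 65], [75, 70]],
    columns=['Father', 'Mother'],
)
predictions = model_demo.predict(X_new_df)
print(predictions)  # one predicted value per row of X_new_df
```

Either form works as long as every new instance supplies both feature values in the same order as during training.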
Nice! We made predictions using Multiple Linear Regression.
Using the β parameters, we obtained the following formula:
Person's height = 23.25 + 0.33 * Father's height + 0.32 * Mother's height.
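The formula can be applied by hand as a quick sanity check. The helper below is a hypothetical function (predict_height is not part of scikit-learn) that simply plugs values into the rounded formula. Since the coefficients are rounded to two decimals, its results may differ slightly from model.predict.

```python
def predict_height(father, mother):
    """Apply the rounded formula from the text to one pair of parent heights."""
    return 23.25 + 0.33 * father + 0.32 * mother

# First new instance from the example: Father = 65, Mother = 62
print(round(predict_height(65, 62), 2))  # 64.54
```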