Linear Regression for ML

Building the Linear Regression with scikit-learn

You already know what Simple Linear Regression is and how to find the line that best fits the data. Let's go through all the steps of building a linear regression model on a real dataset.

Loading data and looking at it

We have a file, simple_height_data.csv, with the data from our examples. Let's load the file and take a look at it.


import pandas as pd

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/simple_height_data.csv'
df = pd.read_csv(file_link)  # Read the file

print(df.head())  # Print the first 5 rows of the dataset

So the dataset has two columns: 'Height', our target, and 'Father' (the father's height), our feature.
Let's assign the target values to the variable y and the feature values to X, then build a scatterplot.


import pandas as pd
import matplotlib.pyplot as plt

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/simple_height_data.csv'
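df = pd.read_csv(file_link)  # Read the file

X = df['Father']  # Assign the feature
y = df['Height']  # Assign the target
plt.scatter(X, y)  # Build the scatterplot
plt.xlabel('Father')  # Label the feature axis
plt.ylabel('Height')  # Label the target axis
plt.show()  # Display the plot when running as a script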
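
A few more quick checks can round out this first look at the data. The sketch below uses a small hand-made DataFrame with the same column names ('Father' and 'Height'); the values are made up for illustration and are not taken from simple_height_data.csv.

```python
import pandas as pd

# Hypothetical stand-in for the real file: same column names, made-up values
sample_df = pd.DataFrame({
    'Father': [65.0, 68.5, 70.0, 72.5],
    'Height': [66.2, 69.0, 70.1, 73.4],
})

n_rows, n_cols = sample_df.shape             # Number of rows and columns
missing_per_column = sample_df.isna().sum()  # Count of missing values in each column

print(n_rows, n_cols)
print(sample_df.dtypes)      # Data type of each column
print(missing_per_column)
print(sample_df.describe())  # Summary statistics for each numeric column
```

The same calls should work on the df loaded from the file, since they only rely on it being a pandas DataFrame.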

Now that we are acquainted with our data, let's build a model!

Building a Linear Regression

Building a Linear Regression model with scikit-learn is quite simple!
The LinearRegression class from sklearn.linear_model does the work.

You need to:
1. Initialize the LinearRegression class.

model = LinearRegression()

2. Train the model with a training set.

model.fit(X, y)

3. Now you can predict new instances.

model.predict(X_new)
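
The three steps can be sketched on a tiny synthetic dataset where the true relationship is known. The data below (y = 2x + 1) and all variable names are made up for illustration, and the feature is already shaped as a 2-D array, a requirement covered in the following paragraphs.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data that follows y = 2x + 1 exactly
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])  # 2-D: 4 instances, 1 feature
y_demo = np.array([3.0, 5.0, 7.0, 9.0])

demo_model = LinearRegression()   # 1. Initialize
demo_model.fit(X_demo, y_demo)    # 2. Train

X_demo_new = np.array([[5.0], [6.0]])
demo_pred = demo_model.predict(X_demo_new)  # 3. Predict

slope = demo_model.coef_[0]        # Learned slope
intercept = demo_model.intercept_  # Learned intercept
print(slope, intercept)
print(demo_pred)
```

Since the data is exactly linear, the learned slope and intercept should come out very close to 2 and 1, which is a handy sanity check before moving to real data.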

Before putting it all together, there is one more thing to figure out.
Both the .fit() and .predict() methods of the LinearRegression class expect X (or X_new) to be a 2-D array (or a pandas DataFrame).
Selecting a single column with df['col_name'] returns a pandas Series, which is 1-D, so passing it to .fit() or .predict() raises an error like this one (the exact wording depends on the scikit-learn version):
ValueError: Expected 2D array, got 1D array instead
To avoid it, select the column with double square brackets, which returns a one-column DataFrame:

X = df[['col_name']]  # with double square brackets
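
The difference is easy to see from the shapes. This sketch uses a small made-up DataFrame (the values are hypothetical) and also shows reshaping with NumPy as one alternative way to get a 2-D feature array.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up values, only to compare shapes
toy_df = pd.DataFrame({'Father': [65.0, 68.0, 71.0], 'Height': [66.0, 69.0, 72.0]})

as_series = toy_df['Father']    # Single brackets -> pandas Series (1-D)
as_frame = toy_df[['Father']]   # Double brackets -> pandas DataFrame (2-D)
as_reshaped = toy_df['Father'].to_numpy().reshape(-1, 1)  # NumPy alternative (2-D)

print(as_series.shape)    # (3,)
print(as_frame.shape)     # (3, 1)
print(as_reshaped.shape)  # (3, 1)

# Fitting on the 1-D Series is rejected with a ValueError
try:
    LinearRegression().fit(as_series, toy_df['Height'])
    error_raised = False
except ValueError:
    error_raised = True
print(error_raised)
```

The double-bracket form is usually the more convenient of the two because it keeps the column name attached to the feature.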

Now let's build a Linear Regression and predict new values!


import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression # Import LinearRegression

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/simple_height_data.csv'
df = pd.read_csv(file_link)  # Read the file

X = df[['Father']]  # Assign the feature (with double square brackets)
y = df['Height']  # Assign the target (no double square brackets needed for the target)
model = LinearRegression()  # Initialize the model
model.fit(X, y)  # Train the model
X_new = np.array([[61], [64], [67]])  # Create a 2-D array of new instances
print(model.predict(X_new))  # Predict the target for the new instances
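
Because the model above is fitted on a DataFrame, scikit-learn remembers the column name, and recent versions warn when predict() receives a plain NumPy array without feature names. One way to avoid that, sketched here on made-up data rather than the real file, is to pass the new instances as a DataFrame with the same column name. The sketch also prints the learned slope and intercept; the names train_df, height_model and X_new_df are illustrative only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up training data (not the real dataset); here Height = Father + 1 exactly
train_df = pd.DataFrame({'Father': [60.0, 64.0, 68.0, 72.0],
                         'Height': [61.0, 65.0, 69.0, 73.0]})

height_model = LinearRegression()
height_model.fit(train_df[['Father']], train_df['Height'])

# New instances as a DataFrame whose column name matches the training feature
X_new_df = pd.DataFrame({'Father': [61, 64, 67]})
new_pred = height_model.predict(X_new_df)

print(height_model.coef_[0], height_model.intercept_)  # Slope and intercept of the fitted line
print(new_pred)
```

With the real dataset, the same pattern would apply with df in place of train_df; the predictions themselves are the same whether X_new is an array or a DataFrame.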
