Linear Regression for ML
Challenge
Let's build a regression model on a real-world example. We have a file, houses_simple.csv, that holds information about housing prices, with each house's area as the only feature.
    import pandas as pd
    # Load the housing dataset and preview the first rows
    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/houses_simple.csv')
    print(df.head())
Let's assign variables and visualize our dataset!
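    import pandas as pd
    import matplotlib.pyplot as plt
    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/houses_simple.csv')
    # Feature as a single-column DataFrame, target as a Series
    X = df[['square_feet']]
    y = df['price']
    # Semi-transparent points make dense regions easier to see
    plt.scatter(X, y, alpha=0.5)
    plt.show()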
In the earlier example with a person's height, it was much easier to imagine a line that fits the data well.
Here the data has much more variance, because the price also depends heavily on things other than area, such as the house's age, location, and interior.
Still, the task is to build the line that best fits the data we have; it will at least show the trend. Use the LinearRegression class for that. Soon we will learn how to add more features to improve the predictions!
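To see what "at least show the trend" means in practice, here is a minimal sketch on synthetic data. The numbers are made up for illustration (a true slope of 150 per square foot plus large random noise) and are not taken from houses_simple.csv; only the column names mirror the lesson.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the housing data (assumed values, NOT the real file).
rng = np.random.default_rng(0)
area = rng.uniform(500, 3500, size=200)
noise = rng.normal(0, 60_000, size=200)  # everything that area does not explain
toy = pd.DataFrame({'square_feet': area,
                    'price': 50_000 + 150 * area + noise})

# Fit a straight line to the noisy cloud of points.
model = LinearRegression()
model.fit(toy[['square_feet']], toy['price'])

# The slope and intercept describe the overall trend.
slope = model.coef_[0]
intercept = model.intercept_
print(f'price = {intercept:.0f} + {slope:.1f} * square_feet')
```

Even though individual points are far from the line, the fitted slope should land close to the value used to generate the data, which is the sense in which the line captures the trend.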
Task
- Import the LinearRegression class from sklearn.linear_model.
- Assign the 'square_feet' column to X. Make sure you assign a pandas DataFrame with a single column instead of a pandas Series (refer to the hint if needed).
- Initialize the LinearRegression model.
- Train the model.
- Predict the target for the X_new array.
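The DataFrame-versus-Series point in the steps above is easy to trip over, so here is a small sketch of the whole workflow on a hypothetical five-row table. The toy values and the X_new contents are assumptions made for illustration; in the task, X_new is provided for you and may be a NumPy array rather than a DataFrame.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data; prices are exactly 200 per square foot.
toy = pd.DataFrame({
    'square_feet': [800, 1200, 1500, 2000, 2600],
    'price': [160_000, 240_000, 300_000, 400_000, 520_000],
})

X_series = toy['square_feet']   # single brackets: Series, shape (5,)
X = toy[['square_feet']]        # double brackets: DataFrame, shape (5, 1)
y = toy['price']
print(X_series.shape, X.shape)

model = LinearRegression()

# scikit-learn expects a 2D feature matrix, so a 1D Series is rejected.
try:
    model.fit(X_series, y)
    series_accepted = True
except ValueError:
    series_accepted = False

# Train on the single-column DataFrame instead.
model.fit(X, y)

# Predict for new areas (assumed values), using the same column name as in training.
X_new = pd.DataFrame({'square_feet': [1000, 3000]})
y_pred = model.predict(X_new)
print(y_pred)
```

Because the toy prices lie exactly on a line, the predictions should come out very close to 200000 and 600000. With real data the points scatter around the line, so predictions are only estimates.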