Learn Linear Regression | More Advanced Concepts


We will use a real dataset to implement linear regression in PyTorch. The dataset contains two columns:

'Number of Appliances': the number of appliances in a household (input feature, X);
'Electricity Bill': the corresponding electricity bill amount (target output, Y).

1. Loading and Inspecting the Dataset

The dataset is stored in a CSV file. We'll load it using pandas and inspect the first few rows:


import pandas as pd
# Load the dataset
bills_df = pd.read_csv('https://staging-content-media-cdn.codefinity.com/courses/1dd2b0f6-6ec0-40e6-a570-ed0ac2209666/section_2/electricity_bills.csv')
# Display the first five rows
print(bills_df.head())
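
Beyond head(), a few other quick checks are often useful before training. The sketch below runs them on a small hand-made DataFrame that only mimics the two column names; the values are invented for illustration and are not taken from the actual CSV file.

```python
import pandas as pd

# Hypothetical rows that mimic the dataset's two columns
demo_df = pd.DataFrame({
    'Number of Appliances': [3, 5, 8],
    'Electricity Bill': [60.5, 92.0, 141.3],
})

print(demo_df.shape)         # (number of rows, number of columns)
print(demo_df.dtypes)        # column types; both should be numeric
print(demo_df.isna().sum())  # count of missing values per column
```

The same three calls can be applied to bills_df once it has been loaded.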

2. Preparing the Data for PyTorch

Next, we extract the input X and target Y columns, convert them into PyTorch float tensors, and reshape them into 2D tensors of shape (number of samples, 1). The nn.Linear layer used later treats the last dimension as the feature dimension, so each sample needs its own row:


import torch
import pandas as pd
# Load the dataset
bills_df = pd.read_csv('https://staging-content-media-cdn.codefinity.com/courses/1dd2b0f6-6ec0-40e6-a570-ed0ac2209666/section_2/electricity_bills.csv')
# Extract input (Number of Appliances) and target (Electricity Bill)
X = torch.tensor(bills_df['Number of Appliances'].values).float().reshape(-1, 1)
Y = torch.tensor(bills_df['Electricity Bill'].values).float().reshape(-1, 1)
# Print the shapes of X and Y
print(f"Shape of X: {X.shape}")
print(f"Shape of Y: {Y.shape}")
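
To see what reshape(-1, 1) does in isolation, here is a minimal sketch on a tiny hand-made tensor. The values are invented for illustration and do not come from the dataset.

```python
import torch

# Hypothetical values, not from the electricity bills dataset
appliances = torch.tensor([3, 5, 8, 2])

# A 1D tensor of 4 integers: shape (4,), dtype int64
print(appliances.shape, appliances.dtype)

# -1 lets PyTorch infer the number of rows; 1 is the single feature column
X_demo = appliances.float().reshape(-1, 1)

# Now 4 samples with 1 feature each, stored as float32
print(X_demo.shape, X_demo.dtype)
```

The call to float() matters as well: the layer's parameters are float32 by default, so integer inputs would cause a dtype mismatch.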

3. Defining the Linear Model

The nn.Linear module in PyTorch defines a fully connected layer, performing y = xW^T + b. It's a fundamental building block in neural networks and can be combined with other layers for more complex architectures.

Its key parameters are as follows:

in_features: number of input features (independent variables);
out_features: number of output features (predicted values).

For simple linear regression, like in our case, we predict a single output based on one input. Thus:

in_features=1: one input variable;
out_features=1: one predicted value.

import torch.nn as nn
# Define the linear regression model
model = nn.Linear(in_features=1, out_features=1)
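
As a quick check of the formula y = xW^T + b, the sketch below compares the layer's output with the same computation written out by hand. The two input values are arbitrary, and the seed is set only so the randomly initialized parameters are repeatable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the random initial parameters repeatable

layer = nn.Linear(in_features=1, out_features=1)

# Two hypothetical inputs, shape (2, 1)
x = torch.tensor([[2.0], [4.0]])

# What the layer computes
out = layer(x)

# The same computation written out by hand: x @ W^T + b
manual = x @ layer.weight.T + layer.bias

print(layer.weight.shape)  # weight is stored as (out_features, in_features)
print(torch.allclose(out, manual))
```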

4. Defining the Loss Function and Optimizer

We'll use mean squared error (MSE) as the loss function and stochastic gradient descent (SGD) as the optimizer with the learning rate equal to 0.005.

The MSE loss can be defined using the nn.MSELoss class, and SGD using the respective class from the torch.optim module.

import torch.optim as optim
# Define the loss function (MSE)
loss_fn = nn.MSELoss()
# Define the optimizer (SGD)
optimizer = optim.SGD(model.parameters(), lr=0.005)
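
To make these two pieces less abstract, the sketch below reproduces MSE by hand and performs a single SGD update on one parameter. The predictions, targets, and the toy loss w^2 are made up for illustration; only the learning rate of 0.005 matches the lesson.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical predictions and targets
pred = torch.tensor([[2.0], [4.0], [6.0]])
target = torch.tensor([[1.0], [4.0], [8.0]])

# MSE is the mean of squared differences: (1 + 0 + 4) / 3
mse_builtin = nn.MSELoss()(pred, target)
mse_manual = ((pred - target) ** 2).mean()
print(mse_builtin.item(), mse_manual.item())

# One SGD step on a single parameter: w_new = w - lr * grad
w = torch.tensor([1.0], requires_grad=True)
optimizer = optim.SGD([w], lr=0.005)
toy_loss = (w ** 2).sum()  # gradient is 2 * w = 2.0 at w = 1.0
toy_loss.backward()
optimizer.step()
print(w.item())  # 1.0 - 0.005 * 2.0
```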

5. Training the Model

Training involves performing a forward pass and a backward pass for a specified number of epochs.

Forward pass: this step computes the model's predictions based on the input data and calculates the loss by comparing the predictions to the actual target values;
Backward pass: this step calculates gradients using backpropagation (based on the loss) and updates the model's weights and biases using an optimization algorithm, which is SGD in our case.

This process repeats for the specified number of epochs to minimize the loss and improve the model's performance.


import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd
# Load the dataset
bills_df = pd.read_csv('https://staging-content-media-cdn.codefinity.com/courses/1dd2b0f6-6ec0-40e6-a570-ed0ac2209666/section_2/electricity_bills.csv')
# Extract input (Number of Appliances) and target (Electricity Bill)
X = torch.tensor(bills_df['Number of Appliances'].values).float().reshape(-1, 1)
Y = torch.tensor(bills_df['Electricity Bill'].values).float().reshape(-1, 1)
# Define the linear regression model
model = nn.Linear(in_features=1, out_features=1)
# Define the loss function (MSE)
loss_fn = nn.MSELoss()
# Define the optimizer (SGD)
optimizer = optim.SGD(model.parameters(), lr=0.005)
# Training loop
epochs = 100
for epoch in range(epochs):
    # Forward pass
    Y_pred = model(X)
    loss = loss_fn(Y_pred, Y)

    # Backward pass
    optimizer.zero_grad()  # Reset gradients
    loss.backward()        # Compute gradients

    # Update parameters
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}")

# Final parameters
print(f"Trained weight: {model.weight.item()}")
print(f"Trained bias: {model.bias.item()}")
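
A loop like this one can be sanity-checked without downloading anything by running it on synthetic data whose true slope and intercept are known. The sketch below assumes made-up data generated from y = 2x + 1, a learning rate of 0.1, and 2000 epochs, all chosen for this toy data; none of these come from the lesson's dataset.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # repeatable initial parameters

# Synthetic data from a known line: y = 2x + 1 (invented for this check)
X_syn = torch.linspace(0, 1, 20).reshape(-1, 1)
Y_syn = 2 * X_syn + 1

model_syn = nn.Linear(in_features=1, out_features=1)
loss_fn = nn.MSELoss()
optimizer = optim.SGD(model_syn.parameters(), lr=0.1)  # lr picked for this toy data

initial_loss = loss_fn(model_syn(X_syn), Y_syn).item()

for epoch in range(2000):
    loss = loss_fn(model_syn(X_syn), Y_syn)  # forward pass
    optimizer.zero_grad()                    # reset gradients
    loss.backward()                          # compute gradients
    optimizer.step()                         # update parameters

final_loss = loss_fn(model_syn(X_syn), Y_syn).item()
print(f"Loss: {initial_loss:.4f} -> {final_loss:.6f}")
print(f"Weight: {model_syn.weight.item():.3f}, bias: {model_syn.bias.item():.3f}")
```

If the learned weight and bias end up close to 2 and 1, the loop is wired correctly.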

The parameters of the model, namely its weights and biases, can be accessed using the .weight and .bias attributes:

weights = model.weight.item()
biases = model.bias.item()
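
These two numbers are all that is needed to make a prediction. The sketch below uses a freshly initialized (untrained) layer as a stand-in for the trained model and a hypothetical household with 6 appliances, so the printed value itself is meaningless; the point is that calling the model and applying weight * x + bias by hand agree.

```python
import torch
import torch.nn as nn

# Stand-in for the trained model (untrained here, for illustration only)
model = nn.Linear(in_features=1, out_features=1)

weight = model.weight.item()
bias = model.bias.item()

# A new input must also be 2D: one sample, one feature
x_new = torch.tensor([[6.0]])

# no_grad() skips gradient tracking, which is not needed for inference
with torch.no_grad():
    y_new = model(x_new)

manual = weight * 6.0 + bias
print(y_new.item(), manual)
```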

Note

optimizer.zero_grad() is important because it resets the gradients of all parameters before backpropagation. Without this step, gradients would accumulate from previous steps, leading to incorrect updates to the model's weights. This ensures each backward pass starts with a clean slate.
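
The accumulation can be observed directly on a single parameter. In this minimal sketch, the gradient is reset manually with w.grad.zero_(), which here plays the role that optimizer.zero_grad() plays in the training loop; the function 3 * w is an arbitrary choice with a constant gradient of 3.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

# First backward pass: d(3w)/dw = 3
(3 * w).sum().backward()
first = w.grad.clone()

# Second backward pass without resetting: gradients add up, 3 + 3 = 6
(3 * w).sum().backward()
accumulated = w.grad.clone()

# Reset, then run backward again: the gradient is 3 once more
w.grad.zero_()
(3 * w).sum().backward()
after_reset = w.grad.clone()

print(first.item(), accumulated.item(), after_reset.item())  # 3.0 6.0 3.0
```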

Section 2. Chapter 3
