PyTorch Essentials
Training the Model
In this chapter, we'll focus on training the neural network we created in the previous chapter on the wine quality dataset. The goal is to predict a wine's quality category from its features. We'll define the loss function, the optimizer, and the training loop, monitoring the model's performance over multiple epochs.
Preparing for Training
First, we need to ensure that the model, loss function, and optimizer are properly defined. Let’s go through each step:
- Loss function: for multi-class classification we use CrossEntropyLoss, which expects raw logits as input and applies softmax (more precisely, log-softmax) internally, as illustrated in the sketch after this list.
- Optimizer: we'll use the Adam optimizer for efficient gradient updates.
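Here's a minimal sketch of these two definitions on dummy tensors; the 3 output classes, 11 input features, and placeholder linear model are assumptions made purely for illustration:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Raw logits for 4 samples and 3 classes, as a model would produce them (no softmax applied)
logits = torch.randn(4, 3)
# Integer class labels, one per sample
targets = torch.tensor([0, 2, 1, 1])

# CrossEntropyLoss takes raw logits plus class indices and applies log-softmax internally
criterion = nn.CrossEntropyLoss()
print(criterion(logits, targets).item())

# Adam optimizer over a placeholder model's parameters (11 input features is an assumption)
model = nn.Linear(11, 3)
optimizer = optim.Adam(model.parameters(), lr=0.01)
```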
Training Loop
The training loop performs the following steps in each epoch:
- Gradient Reset: Clear the gradients accumulated in the previous iteration (PyTorch accumulates them by default).
- Forward Pass: Pass the input features through the model to generate predictions.
- Loss Calculation: Compare the predictions with the ground truth using the loss function.
- Backward Pass: Compute gradients of the loss with respect to the model parameters using backpropagation.
- Parameter Update: Adjust the model parameters using the optimizer.
- Progress Monitoring: Print the loss periodically to observe convergence.
Implementation
Here's how the training loop is implemented (model, X_train, and y_train come from the previous chapters):

```python
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# Define the loss function (cross-entropy for multi-class classification)
criterion = nn.CrossEntropyLoss()

# Define the optimizer (Adam with a learning rate of 0.01)
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Set manual seed for reproducibility
torch.manual_seed(42)

# Number of epochs
epochs = 100

# Store losses for plotting
training_losses = []

# Training loop
for epoch in range(epochs):
    # Zero out gradients from the previous step
    optimizer.zero_grad()

    # Forward pass: compute predictions
    predictions = model(X_train)

    # Compute the loss
    loss = criterion(predictions, y_train)

    # Backward pass: compute gradients
    loss.backward()

    # Update parameters
    optimizer.step()

    # Store the loss
    training_losses.append(loss.item())

    # Print the loss every 10 epochs
    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch + 1}/{epochs}, Loss: {loss.item():.4f}")

# Plot the training loss
plt.plot(range(epochs), training_losses, label="Training Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Training Loss Over Epochs")
plt.legend()
plt.show()
```

Observing Convergence
- Convergence point: look for the point where the training loss stabilizes. If the loss stops decreasing significantly, the model has likely converged.
- Adjusting hyperparameters: if the loss doesn't decrease well, consider the options below (a short sketch follows this list):
- Lowering the learning rate.
- Increasing the number of epochs.
- Checking that the input data is properly scaled and free of quality issues.
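As a rough sketch of these adjustments, here is one way to standardize the features, lower the learning rate, and train for more epochs. The synthetic data, the 11-feature/3-class shapes, the small Sequential model, and the specific values 0.001 and 300 are illustrative assumptions, not fixed parts of this course's pipeline:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Synthetic stand-ins for the wine data (11 features and 3 classes are assumptions
# made only for this sketch; in the course you would reuse X_train and y_train)
X_raw = torch.randn(100, 11) * 50 + 10        # deliberately badly scaled features
y_train = torch.randint(0, 3, (100,))

# 1) Check input scaling: standardize each feature to zero mean and unit variance
X_train = (X_raw - X_raw.mean(dim=0)) / X_raw.std(dim=0)

# 2) Lower the learning rate: recreate the optimizer with a smaller lr (0.01 -> 0.001)
model = nn.Sequential(nn.Linear(11, 16), nn.ReLU(), nn.Linear(16, 3))
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# 3) Increase the number of epochs and retrain
epochs = 300
for epoch in range(epochs):
    optimizer.zero_grad()
    loss = criterion(model(X_train), y_train)
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 50 == 0:
        print(f"Epoch {epoch + 1}/{epochs}, Loss: {loss.item():.4f}")
```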