Multi-Step Backpropagation

Like TensorFlow, PyTorch allows you to build more complex computational graphs involving multiple intermediate tensors.


import torch
# Create a 2D tensor with gradient tracking
x = torch.tensor([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]], requires_grad=True)
# Define intermediate layers
y = 6 * x + 3
z = 10 * y ** 2
# Compute the mean of the final output
output_mean = z.mean()
print(f"Output: {output_mean}")
# Perform backpropagation
output_mean.backward()
# Print the gradient of x
print("Gradient of x:\n", x.grad)

The gradient of output_mean with respect to x is computed using the chain rule. The result shows how much a small change in each element of x affects output_mean.
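To make the chain rule concrete, the gradient can also be worked out by hand and compared with what autograd produces. With N elements in x, the derivative of the mean with respect to each element of z is 1/N, dz/dy is 20 * y, and dy/dx is 6, so each gradient entry should equal (1/N) * 20 * y * 6 summed over only the element it touches, which simplifies to 20 * y / N * 6. The sketch below checks this; the name manual_grad is introduced here purely for illustration.

```python
import torch

# Same graph as in the example above
x = torch.tensor([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]], requires_grad=True)
y = 6 * x + 3
z = 10 * y ** 2
output_mean = z.mean()
output_mean.backward()

# Chain rule by hand:
#   d(output_mean)/dz = 1 / N   (mean over N elements)
#   dz/dy             = 20 * y
#   dy/dx             = 6
n = x.numel()
manual_grad = (1.0 / n) * (20 * y.detach()) * 6

print("Autograd gradient:\n", x.grad)
print("Manual gradient:\n", manual_grad)
print("Match:", torch.allclose(x.grad, manual_grad))
```

Because N is 6 here, the factors 1/N and 6 cancel, so the gradient reduces to 20 * y, that is 120 * x + 60 for each element.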

Disabling Gradient Tracking

In some cases, you may want to disable gradient tracking to save memory and computation. Since requires_grad=False is the default behavior, you can simply create the tensor without specifying this parameter:

x = torch.tensor([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
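A quick way to confirm that tracking is off is to inspect the requires_grad and grad_fn attributes. The sketch below also shows torch.no_grad(), a context manager that is not covered in the text above but is a common way to switch tracking off temporarily for a tensor that normally tracks gradients; the tensors w and v are illustrative names only.

```python
import torch

# Tensor created without requires_grad: tracking is off by default
x = torch.tensor([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
y = 6 * x + 3

print(x.requires_grad)  # False
print(y.requires_grad)  # False
print(y.grad_fn)        # None, since no graph was recorded

# Alternative (an addition, not from the lesson text):
# temporarily disable tracking for a tensor that does require gradients
w = torch.tensor([1.0, 2.0], requires_grad=True)
with torch.no_grad():
    v = w * 2

print(v.requires_grad)  # False
```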

Task

You are tasked with building a simple neural network in PyTorch. Your goal is to compute the gradient of the loss with respect to the weight matrix.

Define a random weight matrix (tensor) W of shape 1x3 initialized with values from a uniform distribution over [0, 1], with gradient tracking enabled.
Create an input matrix (tensor) X based on this list: [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]].
Perform matrix multiplication between W and X to calculate Y.
Compute mean squared error (MSE): loss = mean((Y - Y_target)^2).
Calculate the gradient of the loss (loss) with respect to W using backpropagation.
Print the computed gradient of W.

Solution
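One possible way to work through the steps above is sketched here. The task does not specify Y_target, so the values used below are placeholders chosen only so the code runs; they are an assumption, not part of the exercise. The seed is likewise set only to make the run repeatable.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for repeatable output

# Random 1x3 weight matrix drawn uniformly from [0, 1), with gradient tracking
W = torch.rand(1, 3, requires_grad=True)

# Input matrix of shape 3x2
X = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Hypothetical target of shape 1x2 (not given in the task statement)
Y_target = torch.tensor([[20.0, 30.0]])

# Forward pass: (1x3) @ (3x2) -> (1x2)
Y = W @ X

# Mean squared error
loss = ((Y - Y_target) ** 2).mean()

# Backpropagation
loss.backward()

# Gradient of the loss with respect to W
print("Gradient of W:\n", W.grad)
```

Since W has shape 1x3 and X has shape 3x2, Y has shape 1x2, and W.grad has the same shape as W.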
