Introduction to Neural Networks

Challenge: Training the Perceptron

Before proceeding with training the perceptron, keep in mind that it uses the binary cross-entropy loss function discussed earlier. The final key concept before implementing backpropagation is the formula for the derivative of this loss function with respect to the output activations, $a^n$. Below are the formulas for the loss function and its derivative:

$$
\begin{aligned}
L &= -\bigl(y \log(\hat{y}) + (1 - y) \log(1 - \hat{y})\bigr)\\
da^n &= \frac{\hat{y} - y}{\hat{y}(1 - \hat{y})}
\end{aligned}
$$

where $a^n = \hat{y}$.
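
As a quick reference, both formulas translate directly into NumPy. This is only a sketch: the names output and target stand in for the prediction and the label and are not part of the course files.

import numpy as np

def binary_cross_entropy(output, target):
    # L = -(y * log(y_hat) + (1 - y) * log(1 - y_hat))
    return -(target * np.log(output) + (1 - target) * np.log(1 - output))

def binary_cross_entropy_derivative(output, target):
    # da^n = (y_hat - y) / (y_hat * (1 - y_hat))
    return (output - target) / (output * (1 - output))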

To verify that the perceptron is training correctly, the fit() method also prints the average loss at each epoch. This is calculated by averaging the loss over all training examples in that epoch:

for epoch in range(epochs):
    loss = 0

    for i in range(training_data.shape[0]):
        # target is the label of the current example and output is the
        # prediction returned by the forward pass for that example
        loss += -(target * np.log(output) + (1 - target) * np.log(1 - output))

    # Average the accumulated loss over all training examples in this epoch
    average_loss = loss[0, 0] / training_data.shape[0]
    print(f'Loss at epoch {epoch + 1}: {average_loss:.3f}')
$$
L = -\frac{1}{N} \sum_{i=1}^{N} \bigl(y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i)\bigr)
$$

Finally, the formulas for computing gradients are as follows:

$$
\begin{aligned}
dz^l &= da^l \odot f'^l(z^l)\\
dW^l &= dz^l \cdot (a^{l-1})^T\\
db^l &= dz^l\\
da^{l-1} &= (W^l)^T \cdot dz^l
\end{aligned}
$$
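
One possible way to turn these formulas into the backward() method of the Layer class is sketched below. Attribute names such as self.z, self.inputs, self.weights, self.biases, and the activation's derivative() method are assumptions about how the class is structured in this course, so adapt them to your actual code.

import numpy as np

class Layer:
    # ... __init__() and forward() are defined in the course files ...

    def backward(self, da, learning_rate):
        # dz^l: element-wise product of da^l and the activation derivative f'(z^l)
        dz = da * self.activation.derivative(self.z)

        # dW^l: dot product of dz^l and the transposed input vector (a^(l-1))^T
        d_weights = np.dot(dz, self.inputs.T)

        # db^l equals dz^l
        d_biases = dz

        # da^(l-1): transposed weight matrix times dz^l, computed before the update
        da_prev = np.dot(self.weights.T, dz)

        # Gradient descent step on the parameters
        self.weights -= learning_rate * d_weights
        self.biases -= learning_rate * d_biases

        return da_prev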

The sample training data (X_train) and the corresponding labels (y_train) are stored as NumPy arrays in the utils.py file, which also defines instances of the activation functions:

relu = ReLU()
sigmoid = Sigmoid()
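
For orientation, the activation classes in utils.py might look roughly like the sketch below. The exact class interfaces, and in particular a derivative() method used during backpropagation, are assumptions, since the file itself is not shown here.

import numpy as np

class ReLU:
    def __call__(self, z):
        # f(z) = max(0, z)
        return np.maximum(0, z)

    def derivative(self, z):
        # f'(z) = 1 where z > 0, otherwise 0
        return (z > 0).astype(float)

class Sigmoid:
    def __call__(self, z):
        # f(z) = 1 / (1 + e^(-z))
        return 1 / (1 + np.exp(-z))

    def derivative(self, z):
        # f'(z) = f(z) * (1 - f(z))
        s = self(z)
        return s * (1 - s)
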
Task


Your goal is to complete the training process for a multilayer perceptron by implementing backpropagation and updating the model parameters.

Follow these steps carefully:

  1. Implement the backward() method in the Layer class:
    • Compute the following gradients:
      • dz: derivative of the loss with respect to the pre-activation values, using the derivative of the activation function;
      • d_weights: gradient of the loss with respect to the weights, calculated as the dot product of dz and the transposed input vector;
      • d_biases: gradient of the loss with respect to the biases, equal to dz;
      • da_prev: gradient of the loss with respect to the activations of the previous layer, obtained by multiplying the transposed weight matrix by dz.
    • Update the weights and biases using the learning rate.
  2. Complete the fit() method in the Perceptron class (a sketch of one possible implementation follows this list):
    • Compute the model output by calling the forward() method;
    • Calculate the loss using the cross-entropy formula;
    • Compute $da^n$, the derivative of the loss with respect to the output activations;
    • Loop backward through the layers, performing backpropagation by calling each layer's backward() method.
  3. Check the training behavior:
    • If everything is implemented correctly, the loss should steadily decrease with each epoch when using a learning rate of 0.01.
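
As mentioned in step 2, one possible shape of the fit() method is sketched below. It assumes that each training example is reshaped into a NumPy column vector, that the Perceptron keeps its layers in a self.layers list, and that each layer's backward() method takes (da, learning_rate) as in the earlier sketch; treat it as a sketch rather than the reference solution.

import numpy as np

class Perceptron:
    # ... __init__() and forward() are defined in the course files ...

    def fit(self, training_data, labels, epochs, learning_rate):
        for epoch in range(epochs):
            loss = 0

            for i in range(training_data.shape[0]):
                # Treat each example and its label as column vectors
                inputs = np.reshape(training_data[i], (-1, 1))
                target = np.reshape(labels[i], (-1, 1))

                # Forward pass to compute the model output
                output = self.forward(inputs)

                # Accumulate the binary cross-entropy loss
                loss += -(target * np.log(output) + (1 - target) * np.log(1 - output))

                # Derivative of the loss with respect to the output activations
                da = (output - target) / (output * (1 - output))

                # Backpropagate through the layers in reverse order
                for layer in reversed(self.layers):
                    da = layer.backward(da, learning_rate)

            average_loss = loss[0, 0] / training_data.shape[0]
            print(f'Loss at epoch {epoch + 1}: {average_loss:.3f}')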
