Challenge: Training the Perceptron

Before training the perceptron, recall that it uses the binary cross-entropy loss function discussed earlier. The last concept needed to implement backpropagation is the derivative of this loss with respect to the output activations, $a^n$. The loss function and its derivative are:

$$
\begin{aligned}
L &= -\bigl(y \log(\hat{y}) + (1-y) \log(1 - \hat{y})\bigr)\\
da^n &= \frac{\hat{y} - y}{\hat{y}(1 - \hat{y})}
\end{aligned}
$$

where $a^n = \hat{y}$
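As a rough illustration, the two formulas above can be written in NumPy as follows. The function names `bce_loss` and `bce_derivative` and the clipping constant `eps` are assumptions made for this sketch and are not part of the course code; the clipping is there only to avoid `log(0)` and division by zero.

```python
import numpy as np

def bce_loss(y, y_hat, eps=1e-12):
    """Binary cross-entropy for one example (hypothetical helper)."""
    # Keep y_hat strictly inside (0, 1) so the logarithms stay finite.
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def bce_derivative(y, y_hat, eps=1e-12):
    """Derivative of the loss with respect to y_hat, i.e. da^n."""
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return (y_hat - y) / (y_hat * (1 - y_hat))

# Example: true label 1, predicted probability 0.8.
print(bce_loss(1.0, 0.8))
print(bce_derivative(1.0, 0.8))
```

A negative derivative here means that increasing $\hat{y}$ would lower the loss, which is what one expects when the true label is 1.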

To verify that the perceptron is training correctly, the fit() method also prints the average loss at each epoch. This is calculated by averaging the loss over all training examples in that epoch:


$$
L = -\frac{1}{N} \sum_{i=1}^N \bigl(y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i)\bigr)
$$
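The averaging step can be sketched in vectorized form as below. This is not the course's `fit()` code, which may instead accumulate the loss example by example inside the training loop; the name `average_bce_loss`, the sample values, and the `eps` clipping are assumptions for illustration.

```python
import numpy as np

def average_bce_loss(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over N examples (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions so that log(0) never occurs.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    per_example = -(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred))
    return per_example.mean()

# Made-up labels and predictions, used only to exercise the function.
y_true = [1, 0, 1, 0]
y_pred = [0.9, 0.1, 0.8, 0.3]
print(average_bce_loss(y_true, y_pred))
```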

Finally, the formulas for computing gradients are as follows:

$$
\begin{aligned}
dz^l &= da^l \odot f'^l(z^l)\\
dW^l &= dz^l \cdot (a^{l-1})^T\\
db^l &= dz^l\\
da^{l-1} &= (W^l)^T \cdot dz^l
\end{aligned}
$$
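To make the shapes in these four formulas concrete, here is a minimal standalone sketch for a single layer and a single training example stored as a column vector. The function `layer_backward`, its argument names, and the sigmoid helpers are hypothetical; the `Layer` class in the challenge may organize the same computation differently.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def layer_backward(da, z, a_prev, W, activation_derivative):
    """Apply the four gradient formulas to one layer (hypothetical helper)."""
    dz = da * activation_derivative(z)  # element-wise product, shape (n_out, 1)
    dW = dz @ a_prev.T                  # shape (n_out, n_in)
    db = dz                             # shape (n_out, 1)
    da_prev = W.T @ dz                  # shape (n_in, 1)
    return dW, db, da_prev

# Small example: a layer with 2 inputs and 3 neurons.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
b = np.zeros((3, 1))
a_prev = rng.normal(size=(2, 1))
z = W @ a_prev + b
da = np.ones((3, 1))  # stand-in for the gradient arriving from the next layer

dW, db, da_prev = layer_backward(da, z, a_prev, W, sigmoid_derivative)
print(dW.shape, db.shape, da_prev.shape)  # (3, 2) (3, 1) (2, 1)
```

Note that `dW` has the same shape as `W` and `db` the same shape as `b`, so both can be subtracted directly (scaled by the learning rate) during the update step.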

The sample training data (X_train) and the corresponding labels (y_train) are stored as NumPy arrays in the utils.py file. Instances of the activation functions are also defined there.

Task

Compute the following gradients: dz, d_weights, d_biases, and da_prev in the backward() method of the Layer class.
Compute the output of the model in the fit() method of the Perceptron class.
Compute da ( $da^n$ ) before the loop, which is the gradient of the loss with respect to output activations.
Compute da and perform backpropagation in the loop by calling the appropriate method for each of the layers.

If you implemented training correctly, given the learning rate of 0.01, the loss should steadily decrease with each epoch.
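One way to see the expected behavior outside the challenge environment is a toy run like the one below. It is only a sketch under stated assumptions: a single sigmoid neuron instead of the course's `Perceptron` and `Layer` classes, logical-AND data instead of `X_train` and `y_train`, and an arbitrary choice of 200 epochs. The learning rate of 0.01 matches the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data (logical AND); NOT the course's X_train / y_train.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(1, 2))
b = np.zeros((1, 1))
learning_rate = 0.01
losses = []

for epoch in range(200):
    total = 0.0
    for x_i, y_i in zip(X, y):
        a_prev = x_i.reshape(-1, 1)            # input as a column vector
        z = W @ a_prev + b
        y_hat = sigmoid(z)                     # forward pass
        loss_i = -(y_i * np.log(y_hat) + (1 - y_i) * np.log(1 - y_hat))
        total += loss_i.item()
        da = (y_hat - y_i) / (y_hat * (1 - y_hat))  # da^n
        dz = da * y_hat * (1 - y_hat)               # da * sigmoid'(z)
        W -= learning_rate * (dz @ a_prev.T)        # dW = dz . a_prev^T
        b -= learning_rate * dz                     # db = dz
    losses.append(total / len(X))              # average loss for this epoch

print(f"first epoch: {losses[0]:.4f}, last epoch: {losses[-1]:.4f}")
```

In this sketch the average loss in the last epoch should be lower than in the first; the exact values depend on the data and the initialization.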
