Introduction to Neural Networks
Perceptron Layers
The perceptron is the simplest neural network, consisting of only one neuron. To solve more complex problems, however, we will build a model called the multilayer perceptron (MLP). A multilayer perceptron consists of one or more hidden layers. Its structure looks like this:
- Input layer: receives the input data;
- Hidden layers: process the data and extract patterns;
- Output layer: produces the final prediction or classification.
In general, each layer consists of multiple neurons, and the output from one layer becomes the input for the next layer.
Layer Weights and Biases
Before implementing a layer, it is important to understand how to store the weights and biases of each neuron within it. In the previous chapter, you learned how to store the weights of a single neuron as a vector and its bias as a scalar (single number).
Since a layer consists of multiple neurons, it is natural to represent the weights as a matrix, where each row corresponds to the weights of a specific neuron. Consequently, biases can be represented as a vector, whose length is equal to the number of neurons.
Given a layer with $n$ inputs and $m$ neurons, its weights will be stored in an $m \times n$ matrix $W$, and its biases will be stored in a vector $b$ of length $m$. For a layer with two neurons, they look as follows:

$$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \end{pmatrix}, \qquad b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$$

Here, element $w_{ij}$ represents the weight of the $j$-th input to the $i$-th neuron, so the first row contains the weights of the first neuron, and the second row contains the weights of the second neuron. Element $b_i$ represents the bias of the $i$-th neuron (two neurons → two biases).
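As a quick sanity check in code (the three-input size here is an arbitrary example), these shapes map directly onto NumPy arrays:

```python
import numpy as np

# Example: a layer with 3 inputs and 2 neurons
W = np.random.uniform(-1, 1, (2, 3))  # row i holds the weights of neuron i
b = np.random.uniform(-1, 1, (2, 1))  # one bias per neuron, as a column vector

print(W.shape)  # (2, 3)
print(b.shape)  # (2, 1)
```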
Forward Propagation
Performing forward propagation for each layer means activating each of its neurons by computing the weighted sum of the inputs, adding the bias, and applying the activation function.
Previously, for a single neuron, you computed the weighted sum of the inputs as a dot product between the input vector and the weight vector, then added the bias.
Since each row of the weight matrix contains the weight vector of a particular neuron, all you have to do now is perform a dot product between each row of the matrix and the input vector. Luckily, this is exactly what matrix multiplication does:

$$Wx$$
To add the biases to the outputs of the respective neurons, the bias vector is added as well:

$$Wx + b$$
Finally, the activation function $f$ (sigmoid or ReLU, in our case) is applied to the result. The resulting formula for forward propagation in the layer is as follows:

$$a = f(Wx + b)$$

where $a$ is the vector of neuron activations (outputs).
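To make the formula concrete, here is a small worked example with made-up numbers:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# A layer with 3 inputs and 2 neurons (illustrative values)
W = np.array([[0.5, -0.2, 0.1],
              [0.3,  0.8, -0.5]])
b = np.array([[0.1],
              [-0.3]])
x = np.array([[1.0],
              [2.0],
              [3.0]])

z = W @ x + b    # weighted sums plus biases: [[0.5], [0.1]]
a = sigmoid(z)   # activations: approximately [[0.6225], [0.5250]]
print(a)
```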
Layer Class
The perceptron's fundamental building blocks are its layers; therefore, it makes sense to create a separate `Layer` class. Its attributes include:

- `inputs`: a vector of inputs (`n_inputs` is the number of inputs);
- `outputs`: a vector of raw output values (before applying the activation function) of the neurons (`n_neurons` is the number of neurons);
- `weights`: a weight matrix;
- `biases`: a bias vector;
- `activation_function`: the activation function used in the layer.
Like in the single neuron implementation, `weights` and `biases` will be initialized with random values between -1 and 1, drawn from a uniform distribution.
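A minimal sketch of the constructor under these assumptions (the course's exact code may differ):

```python
import numpy as np

class Layer:
    def __init__(self, n_inputs, n_neurons, activation_function):
        # One row of weights per neuron, one column per input
        self.weights = np.random.uniform(-1, 1, (n_neurons, n_inputs))
        # One bias per neuron, stored as a column vector
        self.biases = np.random.uniform(-1, 1, (n_neurons, 1))
        self.activation_function = activation_function
        # Zero-filled placeholders, used later during backpropagation
        self.inputs = np.zeros((n_inputs, 1))
        self.outputs = np.zeros((n_neurons, 1))
```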
The `inputs` and `outputs` attributes will be used later in backpropagation, so it makes sense to initialize them as NumPy arrays of zeros.
Initializing `inputs` and `outputs` as zero-filled NumPy arrays prevents errors when performing calculations in forward and backward propagation. It also ensures consistency across layers, allowing smooth matrix operations without requiring additional checks.
Forward propagation can be implemented in the `forward()` method, where `outputs` are computed based on the `inputs` vector using NumPy, following the formula above:
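A sketch of how this method might look inside the `Layer` class, assuming `outputs` stores the raw pre-activation values as described above:

```python
def forward(self, inputs):
    # Store the inputs as a column vector for later use in backpropagation
    self.inputs = np.array(inputs).reshape(-1, 1)
    # Raw outputs: weighted sums of the inputs plus the biases
    self.outputs = self.weights @ self.inputs + self.biases
    # Apply the activation function to get the neuron activations
    return self.activation_function(self.outputs)
```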
Reshaping `inputs` into a column vector ensures correct matrix multiplication with the weight matrix during forward propagation. This prevents shape mismatches and allows seamless computations across all layers.
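As a usage sketch, two such layers can be chained so that the output of one becomes the input of the next (the layer sizes and activation functions here are arbitrary choices for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def relu(z):
    return np.maximum(0, z)

# Using the Layer class sketched above:
# a hidden layer mapping 3 inputs to 4 neurons,
# and an output layer mapping those 4 activations to 2 outputs
hidden = Layer(3, 4, relu)
output = Layer(4, 2, sigmoid)

x = np.array([0.5, -1.0, 2.0])
h = hidden.forward(x)   # hidden activations, shape (4, 1)
y = output.forward(h)   # final predictions, shape (2, 1)
print(y)
```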
1. What makes a multilayer perceptron (MLP) more powerful than a simple perceptron?
2. Why do we reshape `inputs` into a column vector before multiplying it by the weight matrix?