Introduction to Neural Networks
Perceptron Layers
Perceptron is the name of the simplest neural network, consisting of only a single layer of neurons. However, in order to solve more complex problems, we will create a variation of the perceptron called the multilayer perceptron (MLP), which contains one or more hidden layers. The structure of a multilayer perceptron looks like this:
- Input layer: receives the input data;
- Hidden layers: process the data and extract patterns (our model has two hidden layers);
- Output layer: produces the final prediction or classification.
In general, each layer consists of multiple neurons, and the output from one layer becomes the input for the next layer.
Layer Weights and Biases
Before implementing a layer, it is important to understand how to store the weights and biases of each neuron within it. In the previous chapter, you learned how to store the weights of a single neuron as a vector and its bias as a scalar (single number).
Since a layer consists of multiple neurons, it is natural to represent the weights as a matrix, where each row corresponds to the weights of a specific neuron. Consequently, biases can be represented as a vector, whose length is equal to the number of neurons.
Given a layer with 3 inputs and 2 neurons, its weights will be stored in a 2x3 matrix W, and its biases will be stored in a 2x1 vector b, which look as follows:
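$$W = \begin{bmatrix} W_{11} & W_{12} & W_{13} \\ W_{21} & W_{22} & W_{23} \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$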
Here, the element W_ij represents the weight of the j-th input to the i-th neuron, so the first row contains the weights of the first neuron, and the second row contains the weights of the second neuron. The element b_i represents the bias of the i-th neuron (two neurons, two biases).
Forward Propagation
Performing forward propagation for each layer means activating each of its neurons by computing the weighted sum of the inputs, adding the bias, and applying the activation function.
Previously, for a single neuron, you implemented the weighted sum of the inputs by computing a dot product between the input vector and the weight vector and adding the bias.
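In vector form, this single-neuron computation of the weighted sum plus the bias can be written as:

$$w \cdot x + b$$

where w is the neuron's weight vector, x is the input vector, and b is its bias.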
Since each row of the weight matrix contains the weight vector for a particular neuron, all you have to do now is simply perform a dot product between each row of the matrix and the input vector. Luckily, this is exactly what matrix multiplication does:
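$$Wx = \begin{bmatrix} W_{11} & W_{12} & W_{13} \\ W_{21} & W_{22} & W_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} W_{11}x_1 + W_{12}x_2 + W_{13}x_3 \\ W_{21}x_1 + W_{22}x_2 + W_{23}x_3 \end{bmatrix}$$

Here, x is the input vector of the 3-input, 2-neuron layer from the example above, and each row of the result is the weighted sum for one neuron.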
To add the biases to the outputs of the respective neurons, a vector of biases should be added as well:
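$$Wx + b = \begin{bmatrix} W_{11}x_1 + W_{12}x_2 + W_{13}x_3 + b_1 \\ W_{21}x_1 + W_{22}x_2 + W_{23}x_3 + b_2 \end{bmatrix}$$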
Finally, the activation function is applied to the result — sigmoid or ReLU, in our case. The resulting formula for forward propagation in the layer is as follows:
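$$a = f(Wx + b)$$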
where a is the vector of neuron activations (outputs), x is the input vector, and f is the activation function.
Layer Class
The perceptron's fundamental building blocks are its layers, so it makes sense to create a separate Layer class. Its attributes include:
- inputs: a vector of inputs (n_inputs is the number of inputs);
- outputs: a vector of the neurons' raw output values, i.e., before the activation function is applied (n_neurons is the number of neurons);
- weights: a weight matrix;
- biases: a bias vector;
- activation_function: the activation function used in the layer.
As in the single-neuron implementation, weights and biases will be initialized with random values between -1 and 1, drawn from a uniform distribution.
The inputs and outputs attributes will be used later in backpropagation, so it makes sense to initialize them as NumPy arrays of zeros.
Forward propagation can be implemented in the forward() method, where outputs are computed based on the inputs vector using NumPy, following the formula above:
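Below is a minimal sketch of how such a Layer class might look, assuming a constructor that takes n_inputs, n_neurons, and the activation function, and assuming inputs, outputs, and biases are stored as column vectors (matching the 2x1 bias vector above); the exact signature and array shapes are assumptions rather than the course's reference implementation.

```python
import numpy as np

class Layer:
    def __init__(self, n_inputs, n_neurons, activation_function):
        # Weight matrix: one row per neuron, one column per input,
        # initialized with uniform random values between -1 and 1
        self.weights = np.random.uniform(-1, 1, (n_neurons, n_inputs))
        # Bias vector: one bias per neuron, also uniform in [-1, 1]
        self.biases = np.random.uniform(-1, 1, (n_neurons, 1))
        # Inputs and raw outputs are stored for later use in backpropagation
        self.inputs = np.zeros((n_inputs, 1))
        self.outputs = np.zeros((n_neurons, 1))
        self.activation_function = activation_function

    def forward(self, inputs):
        # Store the inputs as a column vector
        self.inputs = np.array(inputs).reshape(-1, 1)
        # Raw outputs: weighted sums plus biases, i.e. W x + b
        self.outputs = np.dot(self.weights, self.inputs) + self.biases
        # Apply the activation function element-wise to get the activations a
        return self.activation_function(self.outputs)
```

For example, with a sigmoid activation, the 3-input, 2-neuron layer from the example above could be used like this:

```python
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

layer = Layer(n_inputs=3, n_neurons=2, activation_function=sigmoid)
print(layer.forward([0.5, -0.2, 0.1]))  # prints two activations, one per neuron
```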