Introduction to Neural Networks

Perceptron Layers

A perceptron is the simplest neural network, consisting of a single neuron. To solve more complex problems, we will build a model called a multilayer perceptron (MLP), which contains one or more hidden layers. The structure of a multilayer perceptron looks like this:

  1. Input layer: receives the input data;

  2. Hidden layers: process the data and extract patterns;

  3. Output layer: produces the final prediction or classification.

In general, each layer consists of multiple neurons, and the output from one layer becomes the input for the next layer.

Layer Weights and Biases

Before implementing a layer, it is important to understand how to store the weights and biases of each neuron within it. In the previous chapter, you learned how to store the weights of a single neuron as a vector and its bias as a scalar (single number).

Since a layer consists of multiple neurons, it is natural to represent the weights as a matrix, where each row corresponds to the weights of a specific neuron. Consequently, biases can be represented as a vector whose length is equal to the number of neurons.

Given a layer with 3 inputs and 2 neurons, its weights will be stored in a $2 \times 3$ matrix $W$ and its biases in a $2 \times 1$ vector $b$, which look as follows:

W = \begin{bmatrix} W_{11} & W_{12} & W_{13} \\ W_{21} & W_{22} & W_{23} \end{bmatrix} \qquad b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}

Here, element $W_{ij}$ represents the weight of the $j$-th input to the $i$-th neuron, so the first row contains the weights of the first neuron and the second row contains the weights of the second neuron. Element $b_i$ represents the bias of the $i$-th neuron (two neurons, two biases).
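As a quick NumPy sketch of this layout (the numeric values below are made up purely for illustration, not taken from the course):

```python
import numpy as np

# Illustrative values only: a layer with 3 inputs and 2 neurons.
# Each row of W holds the weights of one neuron; b holds one bias per neuron.
W = np.array([[0.2, -0.5, 0.1],
              [0.4,  0.3, -0.7]])   # shape (2, 3)
b = np.array([[0.1],
              [-0.2]])              # shape (2, 1)

print(W.shape, b.shape)  # (2, 3) (2, 1)
```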

Forward Propagation

Performing forward propagation for each layer means activating each of its neurons by computing the weighted sum of the inputs, adding the bias, and applying the activation function.

Previously, for a single neuron, you implemented the weighted sum of the inputs by computing a dot product between the input vector and the weight vector and adding the bias.

Since each row of the weight matrix contains the weight vector for a particular neuron, all you have to do now is perform a dot product between each row of the matrix and the input vector. Luckily, this is exactly what the matrix-vector product $Wx$ computes: its $i$-th element is the dot product of the $i$-th row of $W$ with the input vector $x$.

To add the biases to the outputs of the respective neurons, the bias vector $b$ is then added to this product, giving $Wx + b$.

Finally, the activation function (sigmoid or ReLU, in our case) is applied to the result. The resulting formula for forward propagation in the layer is as follows:

a = \text{activation}(Wx + b)

where $a$ is the vector of neuron activations (outputs).
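As a rough illustration of this formula in NumPy (the sigmoid helper and the input values here are assumptions made for the example, not code from the course):

```python
import numpy as np

def sigmoid(z):
    # Elementwise sigmoid activation
    return 1 / (1 + np.exp(-z))

W = np.array([[0.2, -0.5, 0.1],
              [0.4,  0.3, -0.7]])   # (2, 3) weight matrix
b = np.array([[0.1],
              [-0.2]])              # (2, 1) bias vector
x = np.array([[1.0],
              [2.0],
              [3.0]])               # (3, 1) column vector of inputs

a = sigmoid(np.dot(W, x) + b)       # a = activation(Wx + b), shape (2, 1)
print(a)
```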

Layer Class

The perceptron's fundamental building blocks are its layers, so it makes sense to create a separate Layer class. Its attributes include:

  • inputs: a vector of inputs (n_inputs is the number of inputs);

  • outputs: a vector of raw output values (before applying the activation function) of the neurons (n_neurons is the number of neurons);

  • weights: a weight matrix;

  • biases: a bias vector;

  • activation_function: the activation function used in the layer.

Like in the single neuron implementation, weights and biases will be initialized with random values between -1 and 1 drawn from a uniform distribution.

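The original code block is not shown on this page, so the following is a minimal sketch of what the initialization could look like, based solely on the attributes and initialization rules described above (the exact course implementation may differ):

```python
import numpy as np

class Layer:
    def __init__(self, n_inputs, n_neurons, activation_function):
        # Weight matrix: one row per neuron, one column per input,
        # values drawn uniformly from [-1, 1]
        self.weights = np.random.uniform(-1, 1, (n_neurons, n_inputs))
        # Bias vector: one bias per neuron, also drawn from [-1, 1]
        self.biases = np.random.uniform(-1, 1, (n_neurons, 1))
        # Activation function shared by all neurons in the layer
        self.activation_function = activation_function
        # Inputs and raw outputs, stored for later use in backpropagation
        self.inputs = np.zeros((n_inputs, 1))
        self.outputs = np.zeros((n_neurons, 1))
```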

The inputs and outputs attributes will be used later in backpropagation, so it makes sense to initialize them as NumPy arrays of zeros.

Note

Initializing inputs and outputs as zero-filled NumPy arrays prevents errors when performing calculations in forward and backward propagation. It also ensures consistency across layers, allowing smooth matrix operations without requiring additional checks.

Forward propagation can be implemented in the forward() method, where outputs are computed based on the inputs vector using NumPy, following the formula above:
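Here is a sketch of how forward() might be implemented under the same assumptions as the snippet above (again, the actual course code may differ in its details):

```python
    def forward(self, inputs):
        # Reshape the inputs into a column vector so the matrix
        # multiplication with the weight matrix has matching dimensions
        self.inputs = np.array(inputs).reshape(-1, 1)
        # Raw outputs of the neurons: Wx + b
        self.outputs = np.dot(self.weights, self.inputs) + self.biases
        # Apply the activation function elementwise and return the result
        return self.activation_function(self.outputs)
```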

Note

Reshaping inputs into a column vector ensures correct matrix multiplication with the weight matrix during forward propagation. This prevents shape mismatches and allows seamless computations across all layers.

1. What makes a multilayer perceptron (MLP) more powerful than a simple perceptron?

2. Why do we apply this code before multiplying inputs by the weight matrix?
