Hyperparameter Tuning

Hyperparameters in Neural Networks

Neural networks, including perceptrons, have several hyperparameters that influence their performance. Unlike model parameters (e.g., weights and biases), hyperparameters are set before training begins. Some key hyperparameters in perceptrons include:

  • Number of Hidden Layers and Neurons per Layer: determines the model's capacity to learn complex patterns. Too few neurons can lead to underfitting, while too many can cause overfitting;

  • Learning Rate: controls how much the model adjusts its weights during training. A high learning rate can make training unstable, while a low one may lead to slow convergence (see the sketch after this list);

  • Number of Training Epochs: defines how many times the model sees the training data. More epochs allow better learning but may lead to overfitting if excessive.
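
To make the learning rate's role concrete, here is a minimal sketch of a single gradient descent update on one weight. The weight value and gradient are made-up numbers, chosen only to show how the step size scales the update.

# Illustrative only: one gradient descent step on a single weight w,
# with a hypothetical gradient dL/dw = 2.0 at the current point
w = 0.5
gradient = 2.0

for lr in (1.0, 0.1, 0.001):
    w_new = w - lr * gradient
    print(f'learning rate {lr}: w moves from {w} to {w_new}')

# A large rate (1.0) jumps far past the current value and can overshoot,
# while a tiny rate (0.001) barely moves the weight, hence slow convergence.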

Hyperparameter Tuning

Hyperparameter tuning is crucial for optimizing neural networks. A poorly tuned model can result in underfitting or overfitting.

You can tweak the number of epochs, the number of hidden layers, their size, and the learning rate to observe how the accuracy on the training and test sets changes:

from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
import numpy as np
import warnings

# Ignore warnings
warnings.filterwarnings("ignore")

# Download the dataset and load the train/test splits
import os
os.system('wget https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f9fc718f-c98b-470d-ba78-d84ef16ba45f/section_2/data.py 2>/dev/null')
from data import X_train, y_train, X_test, y_test

np.random.seed(10)

# Tweak hyperparameters here
model = MLPClassifier(max_iter=100, hidden_layer_sizes=(6, 6), learning_rate_init=0.01, random_state=10)
model.fit(X_train, y_train)

y_pred_train = model.predict(X_train)
y_pred_test = model.predict(X_test)

# Comparing train set accuracy and test set accuracy
train_accuracy = accuracy_score(y_train, y_pred_train)
test_accuracy = accuracy_score(y_test, y_pred_test)
print(f'Train accuracy: {train_accuracy:.3f}')
print(f'Test accuracy: {test_accuracy:.3f}')

Finding the right combination of hyperparameters involves systematic experimentation and adjustments. This is often done using techniques like grid search (trying all possible combinations of hyperparameters) and random search (testing a random subset of hyperparameter values).
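
As a minimal sketch of grid search, the snippet below uses scikit-learn's GridSearchCV to train and cross-validate every combination in a small grid; the candidate values are illustrative, not tuned recommendations, and X_train and y_train are assumed to be the arrays loaded in the snippet above. Swapping GridSearchCV for RandomizedSearchCV would test a random subset of the grid instead.

from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Illustrative grid: every combination below is trained and evaluated
param_grid = {
    'hidden_layer_sizes': [(6,), (6, 6), (12, 6)],
    'learning_rate_init': [0.001, 0.01, 0.1],
    'max_iter': [100, 200],
}

search = GridSearchCV(
    MLPClassifier(random_state=10),
    param_grid,
    cv=3,                # 3-fold cross-validation on the training set
    scoring='accuracy',
)
search.fit(X_train, y_train)  # data loaded in the snippet above

print('Best hyperparameters:', search.best_params_)
print(f'Best cross-validated accuracy: {search.best_score_:.3f}')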

Essentially, training a neural network follows an iterative cycle:

  1. Define the model with initial hyperparameters;
  2. Train the model using the training dataset;
  3. Evaluate performance on a test set;
  4. Adjust hyperparameters (e.g., number of layers, learning rate);
  5. Repeat the process until the desired performance is achieved.

This iterative refinement ensures that the model generalizes well to unseen data.
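
As a minimal sketch of this cycle, the loop below tries a few candidate hidden-layer configurations (the values are illustrative), reusing the data loaded earlier, and reports train and test accuracy for each, so the gap between the two scores reveals under- or overfitting.

from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Candidate architectures to try -- illustrative values, not recommendations
candidates = [(4,), (6, 6), (12, 12)]

for hidden in candidates:
    # Steps 1-2: define the model with these hyperparameters and train it
    model = MLPClassifier(max_iter=100, hidden_layer_sizes=hidden,
                          learning_rate_init=0.01, random_state=10)
    model.fit(X_train, y_train)  # data loaded in the earlier snippet
    # Step 3: evaluate on both sets to spot under- or overfitting
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    print(f'{hidden}: train {train_acc:.3f}, test {test_acc:.3f}')

# Steps 4-5: keep the configuration with the best test accuracy and refine it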
