Introduction to Neural Networks
Hyperparameter Tuning
Hyperparameters in Neural Networks
Neural networks, including multilayer perceptrons, have several hyperparameters that influence their performance. Unlike model parameters (e.g., weights and biases), hyperparameters are set before training begins. Some key hyperparameters in multilayer perceptrons include:
Number of hidden layers and neurons per layer: determines the model's capacity to learn complex patterns. Too few neurons can lead to underfitting, while too many can cause overfitting;
Learning rate: controls how much the model adjusts weights during training. A high learning rate can make training unstable, while a low one may lead to slow convergence;
Number of training epochs: defines how many times the model sees the training data. More epochs allow better learning but may lead to overfitting if excessive.
To recap, overfitting occurs when a model learns the training data too well, capturing noise instead of general patterns. This results in high accuracy on the training set but poor generalization to unseen data.
Underfitting, on the other hand, happens when a model is too simple to capture the underlying patterns in the data. This leads to both poor training and test performance, indicating that the model lacks sufficient capacity to learn effectively.
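To see both failure modes in action, here is a minimal, self-contained sketch. It uses a synthetic make_moons dataset rather than this chapter's data, and the layer sizes are illustrative: a network with a single hidden neuron typically underfits, while a very large one can fit the noise and score noticeably better on the train set than on the test set:

# Sketch on synthetic data (not this chapter's dataset):
# comparing a tiny and a very large network to observe
# underfitting and overfitting via the train/test accuracy gap.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = make_moons(n_samples=200, noise=0.3, random_state=10)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=10)

# Too little capacity vs. a lot of capacity (illustrative values)
for hidden in [(1,), (100, 100)]:
    model = MLPClassifier(hidden_layer_sizes=hidden, max_iter=2000, random_state=10)
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    print(f'{hidden}: train={train_acc:.3f}, test={test_acc:.3f}')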
Hyperparameter Tuning
Hyperparameter tuning is crucial for optimizing neural networks. A poorly tuned model can result in underfitting or overfitting.
You can tweak the number of epochs, the number of hidden layers, their size, and the learning rate to observe how the accuracy on the train and test sets changes:
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
import numpy as np
import warnings
# Ignore warnings
warnings.filterwarnings("ignore")
import os
os.system('wget https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f9fc718f-c98b-470d-ba78-d84ef16ba45f/section_2/data.py 2>/dev/null')
from data import X_train, y_train, X_test, y_test

np.random.seed(10)

# Tweak hyperparameters here
model = MLPClassifier(max_iter=100, hidden_layer_sizes=(6, 6), learning_rate_init=0.01, random_state=10)
model.fit(X_train, y_train)

y_pred_train = model.predict(X_train)
y_pred_test = model.predict(X_test)

# Comparing train set accuracy and test set accuracy
train_accuracy = accuracy_score(y_train, y_pred_train)
test_accuracy = accuracy_score(y_test, y_pred_test)
print(f'Train accuracy: {train_accuracy:.3f}')
print(f'Test accuracy: {test_accuracy:.3f}')
Finding the right combination of hyperparameters involves systematic experimentation and adjustments. This is often done using techniques like grid search (trying all possible combinations of hyperparameters) and random search (testing a random subset of hyperparameter values).
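For example, here is a sketch of grid search using scikit-learn's GridSearchCV, assuming the X_train and y_train arrays loaded in the example above; the grid values are illustrative, not prescribed by this chapter:

# Grid search sketch: tries every combination in param_grid
# with 3-fold cross-validation and keeps the best one.
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

param_grid = {
    'hidden_layer_sizes': [(6,), (6, 6), (12, 12)],
    'learning_rate_init': [0.001, 0.01, 0.1],
    'max_iter': [100, 200],
}
search = GridSearchCV(MLPClassifier(random_state=10), param_grid, cv=3)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)

RandomizedSearchCV has the same interface but samples a fixed number of random combinations (set via n_iter) instead of trying them all, which is often faster when the grid is large.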
Essentially, training a neural network follows an iterative cycle (a code sketch follows the list):
Define the model with initial hyperparameters;
Train the model using the training dataset;
Evaluate performance on a test set;
Adjust hyperparameters (e.g., number of layers, learning rate);
Repeat the process until the desired performance is achieved.
This iterative refinement ensures that the model generalizes well to unseen data.
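To make the cycle concrete, here is a simple manual-tuning sketch, reusing the arrays from the example above and looping over a few candidate learning rates (the candidate values are illustrative):

# Manual tuning loop: define, train, evaluate, adjust, repeat.
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

best_lr, best_acc = None, 0.0
for lr in [0.001, 0.01, 0.1]:  # 1. define the model with new hyperparameters
    model = MLPClassifier(hidden_layer_sizes=(6, 6), learning_rate_init=lr,
                          max_iter=100, random_state=10)
    model.fit(X_train, y_train)  # 2. train on the training set
    acc = accuracy_score(y_test, model.predict(X_test))  # 3. evaluate on the test set
    if acc > best_acc:  # 4. keep the best configuration so far
        best_lr, best_acc = lr, acc
print(f'Best learning rate: {best_lr} (test accuracy: {best_acc:.3f})')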
1. Which of the following is a hyperparameter rather than a model parameter?
2. A learning rate that is too high will most likely cause: