Neural Networks with TensorFlow
Hyperparameter Tuning
Hyperparameter tuning is the process of finding the optimal set of hyperparameters (parameters that are set before the learning process begins) for a machine learning model. This process is crucial as the right combination of hyperparameters can significantly improve model performance.
Why Use Keras Tuner?
Keras Tuner is an easy-to-use, scalable hyperparameter tuning framework that solves the pain points of manually searching for the best hyperparameters. It supports various tuning strategies and integrates seamlessly with TensorFlow and Keras.
Random Search Optimization
Random Search Optimization is a technique that searches the hyperparameter space randomly within predefined bounds. Despite its simplicity, it can be surprisingly effective.
Implementing Random Search
- Define the Model: Create a function that builds and compiles the model, leaving hyperparameters to be defined by the tuner.
- Set Hyperparameter Space: Define the search space for hyperparameters such as the learning rate, the number of layers, and the number of neurons.
- Initialize the RandomSearch Tuner: Create an instance of the `RandomSearch` tuner with your model-building function, hyperparameter space, and other configurations.
- Search: Run the `search` method to find the best hyperparameters.
Rather than specifying fixed values for the number of neurons and the learning rate, we define ranges for these hyperparameters to be explored:

- Using `hp.Int('units', min_value=32, max_value=512, step=32)`, we establish a range for the number of neurons, where `'units'` is the label for this hyperparameter. The search proceeds in steps of 32, from 32 to 512.
- With `hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])`, we offer a discrete set of values for the learning rate, naming this hyperparameter `'learning_rate'`.
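Putting these pieces together, here is a minimal sketch of a model-building function. The architecture is illustrative only: it assumes a simple fully connected classifier with a flattened 784-feature input (e.g., MNIST) and 10 output classes, none of which are specified in the text above.

```python
import tensorflow as tf
import keras_tuner as kt  # installable via: pip install keras-tuner

def build_model(hp):
    model = tf.keras.Sequential([
        # Number of neurons is sampled from 32 to 512 in steps of 32
        tf.keras.layers.Dense(
            units=hp.Int('units', min_value=32, max_value=512, step=32),
            activation='relu',
            input_shape=(784,)),  # assumed input size (e.g., flattened MNIST)
        tf.keras.layers.Dense(10, activation='softmax')  # assumed 10 classes
    ])
    model.compile(
        # Learning rate is chosen from a discrete set of candidate values
        optimizer=tf.keras.optimizers.Adam(
            learning_rate=hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model
```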
Note
The function for constructing the model must take a single argument, which is used for adjusting hyperparameters during tuning (`hp` in our case).
When creating a `RandomSearch` instance, it's necessary to define the metric of interest (`objective`) for the tuner to optimize when selecting the best model. Additionally, the total number of trials (`max_trials`) must be set, with each trial representing a complete model training session using a unique, randomly sampled set of hyperparameters.
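A minimal sketch of how such a tuner could be set up and run follows; the `directory` and `project_name` values, the trial counts, and the training data names (`x_train`, `y_train`) are placeholders, not values from the text above.

```python
tuner = kt.RandomSearch(
    build_model,                 # model-building function defined earlier
    objective='val_accuracy',    # metric the tuner optimizes
    max_trials=10,               # total number of hyperparameter sets to try
    directory='tuning_results',  # where trial logs and checkpoints are stored
    project_name='random_search')

# search() accepts the same arguments as model.fit()
tuner.search(x_train, y_train, epochs=5, validation_split=0.2)

# Retrieve the best hyperparameters and the best trained model
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
best_model = tuner.get_best_models(num_models=1)[0]
```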
Bayesian Optimization
Bayesian Optimization is a more sophisticated approach. It builds a probabilistic (surrogate) model that maps hyperparameters to the expected score on the objective function, and uses this model to select the most promising hyperparameters to evaluate on the true objective.
Implementing Bayesian Optimization
The process for implementing Bayesian Optimization is similar to that of Random Search, with the key distinction being the creation of a `BayesianOptimization` object instead of a `RandomSearch` one. Additionally, the parameter `num_initial_points` needs to be set, representing the number of initial random trials taken before the Bayesian optimization begins. In this initial phase, hyperparameter combinations are selected randomly, akin to Random Search. After this phase, the remaining `max_trials - num_initial_points` trials focus on Bayesian optimization.
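A sketch of the analogous setup for Bayesian optimization, again with placeholder names and trial counts; here 5 of the 20 trials are the random warm-up phase, leaving 15 Bayesian steps:

```python
tuner = kt.BayesianOptimization(
    build_model,
    objective='val_accuracy',
    max_trials=20,           # total trials, including the random warm-up
    num_initial_points=5,    # random trials before Bayesian optimization starts
    directory='tuning_results',
    project_name='bayesian_opt')

tuner.search(x_train, y_train, epochs=5, validation_split=0.2)
```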
Note
Increasing the number of initial random steps can enhance the likelihood of the Bayesian Optimization process finding the best parameters. When dealing with numerous hyperparameters, it's advisable to increase the initial points proportionally (e.g., `3 * (number of hyperparameters)`). Alternatively, if computational resources are a constraint, a baseline of 10-20 initial steps can be effective.