Hyperparameter Tuning and Early Stopping | Evaluation, Optimization, and Deployment

When fine-tuning transformer models, you need to carefully select and adjust certain key hyperparameters to achieve the best performance. The most important hyperparameters are the learning rate, the batch size, and the number of epochs.

The learning rate controls how much the model weights are updated with respect to the loss gradient during each optimization step. A learning rate that is too high may cause the model to converge too quickly to a suboptimal solution or even diverge, while a rate that is too low may result in slow training and risk getting stuck in local minima.

The batch size determines how many samples are processed before the model updates its weights. Smaller batch sizes can introduce more noise into the training process, potentially aiding generalization, but may also slow down training. Larger batch sizes can speed up training but require more memory and sometimes lead to poorer generalization.

The number of epochs specifies how many times the model will iterate over the entire training dataset. Too many epochs can cause the model to overfit, especially on small datasets, while too few can result in underfitting.
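The effect of the learning rate is visible even on a toy objective. The following standalone sketch (pure Python; the function and names are illustrative, not from any library) runs plain gradient descent on f(w) = w², whose gradient is 2w:

```python
# Toy gradient descent on f(w) = w**2 to illustrate learning-rate behavior.
def descend(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w  # gradient step: w_new = w - lr * df/dw
    return w

print(descend(0.01))  # small lr: slow, steady progress toward the minimum at 0
print(descend(0.4))   # moderate lr: converges quickly
print(descend(1.1))   # too-large lr: each step overshoots and |w| grows (divergence)
```

Each update multiplies w by (1 - 2·lr), so any lr above 1.0 makes the iterate oscillate with growing magnitude instead of converging.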

Note

Early stopping halts training once the validation metric stops improving, which helps prevent overfitting on small datasets.

from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="steps",
    eval_steps=50,
    save_steps=50,                      # checkpoint at the same cadence as evaluation
    learning_rate=2e-5,                 # adjusted learning rate
    per_device_train_batch_size=8,
    num_train_epochs=10,
    save_total_limit=2,
    load_best_model_at_end=True,        # required for early stopping
    metric_for_best_model="eval_loss",  # required by EarlyStoppingCallback
)

# Stop if the eval metric fails to improve for 2 consecutive evaluations
callbacks = [EarlyStoppingCallback(early_stopping_patience=2)]

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=callbacks,
)
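Under the hood, early stopping is just a patience counter over validation results. This minimal standalone sketch (not the actual transformers implementation) shows the logic: stop once the validation loss fails to improve for `patience` consecutive evaluations.

```python
def early_stop_index(val_losses, patience=2):
    """Return the evaluation index at which training would stop, or None."""
    best = float("inf")
    bad_evals = 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best = loss       # new best: reset the patience counter
            bad_evals = 0
        else:
            bad_evals += 1    # no improvement this evaluation
            if bad_evals >= patience:
                return i      # patience exhausted: stop here
    return None               # never triggered; training runs to completion

# Loss improves, then plateaus: with patience=2, stopping fires
# on the second consecutive non-improving evaluation.
print(early_stop_index([0.9, 0.7, 0.6, 0.65, 0.66]))  # 4
```

With `early_stopping_patience=2` and `eval_steps=50` as in the snippet above, the Trainer applies this same check every 50 steps.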
Note

Tune one hyperparameter at a time, so you can attribute any change in performance to a single cause.
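A one-at-a-time sweep can be sketched as a small loop: hold every other setting fixed, vary a single hyperparameter, and keep the value with the best validation loss. Everything here is illustrative; `run_trial` stands in for a real fine-tuning run, and the stand-in objective below exists only so the sketch executes without a GPU.

```python
def sweep_learning_rate(run_trial, candidates, fixed):
    """Try each candidate learning rate with all other settings fixed."""
    results = {lr: run_trial(learning_rate=lr, **fixed) for lr in candidates}
    best_lr = min(results, key=results.get)  # lowest eval loss wins
    return best_lr, results

# Hypothetical stand-in for a fine-tuning run: pretend 2e-5 is optimal.
def fake_trial(learning_rate, batch_size, epochs):
    return abs(learning_rate - 2e-5) + 0.1  # fake "eval loss"

best_lr, results = sweep_learning_rate(
    fake_trial,
    candidates=[1e-5, 2e-5, 5e-5],
    fixed={"batch_size": 8, "epochs": 3},
)
print(best_lr)  # 2e-05
```

Once the learning rate is fixed, the same loop can be repeated for batch size, then for the number of epochs.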



Section 4. Chapter 2
