Hyperparameter Tuning and Early Stopping | Evaluation, Optimization, and Deployment

When fine-tuning transformer models, you need to carefully select and adjust certain key hyperparameters to achieve the best performance. The most important hyperparameters are the learning rate, the batch size, and the number of epochs.

The learning rate controls how much the model weights are updated with respect to the loss gradient during each optimization step. A learning rate that is too high may cause the model to converge too quickly to a suboptimal solution or even diverge, while a rate that is too low may result in slow training and risk getting stuck in local minima.

The batch size determines how many samples are processed before the model updates its weights. Smaller batch sizes can introduce more noise into the training process, potentially aiding generalization, but may also slow down training. Larger batch sizes can speed up training but require more memory and sometimes lead to poorer generalization.

The number of epochs specifies how many times the model will iterate over the entire training dataset. Too many epochs can cause the model to overfit, especially on small datasets, while too few can result in underfitting.
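The effect of the learning rate is visible even on a toy objective. The following standalone sketch (pure Python; the function and names are illustrative, not from any library) runs plain gradient descent on f(w) = w², whose gradient is 2w:

```python
# Toy gradient descent on f(w) = w**2 to illustrate learning-rate behavior.
def descend(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w  # gradient step: w_new = w - lr * df/dw
    return w

print(descend(0.01))  # small lr: slow, steady progress toward the minimum at 0
print(descend(0.4))   # moderate lr: converges quickly
print(descend(1.1))   # too-large lr: each step overshoots and |w| grows (divergence)
```

Each update multiplies w by (1 - 2·lr), so any lr above 1.0 makes the iterate oscillate with growing magnitude instead of converging.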

Note

Early stopping halts training once the validation metric stops improving, which helps prevent overfitting on small datasets.

from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="steps",
    eval_steps=50,
    save_steps=50,                      # checkpoint at the same cadence as evaluation
    learning_rate=2e-5,                 # adjusted learning rate
    per_device_train_batch_size=8,
    num_train_epochs=10,
    save_total_limit=2,
    load_best_model_at_end=True,        # required for early stopping
    metric_for_best_model="eval_loss",  # required by EarlyStoppingCallback
)

# Stop if the eval metric fails to improve for 2 consecutive evaluations
callbacks = [EarlyStoppingCallback(early_stopping_patience=2)]

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=callbacks,
)
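Under the hood, early stopping is just a patience counter over validation results. This minimal standalone sketch (not the actual transformers implementation) shows the logic: stop once the validation loss fails to improve for `patience` consecutive evaluations.

```python
def early_stop_index(val_losses, patience=2):
    """Return the evaluation index at which training would stop, or None."""
    best = float("inf")
    bad_evals = 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best = loss       # new best: reset the patience counter
            bad_evals = 0
        else:
            bad_evals += 1    # no improvement this evaluation
            if bad_evals >= patience:
                return i      # patience exhausted: stop here
    return None               # never triggered; training runs to completion

# Loss improves, then plateaus: with patience=2, stopping fires
# on the second consecutive non-improving evaluation.
print(early_stop_index([0.9, 0.7, 0.6, 0.65, 0.66]))  # 4
```

With `early_stopping_patience=2` and `eval_steps=50` as in the snippet above, the Trainer applies this same check every 50 steps.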
Note

Tune one hyperparameter at a time, so you can attribute any change in performance to a single cause.
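A one-at-a-time sweep can be sketched as a small loop: hold every other setting fixed, vary a single hyperparameter, and keep the value with the best validation loss. Everything here is illustrative; `run_trial` stands in for a real fine-tuning run, and the stand-in objective below exists only so the sketch executes without a GPU.

```python
def sweep_learning_rate(run_trial, candidates, fixed):
    """Try each candidate learning rate with all other settings fixed."""
    results = {lr: run_trial(learning_rate=lr, **fixed) for lr in candidates}
    best_lr = min(results, key=results.get)  # lowest eval loss wins
    return best_lr, results

# Hypothetical stand-in for a fine-tuning run: pretend 2e-5 is optimal.
def fake_trial(learning_rate, batch_size, epochs):
    return abs(learning_rate - 2e-5) + 0.1  # fake "eval loss"

best_lr, results = sweep_learning_rate(
    fake_trial,
    candidates=[1e-5, 2e-5, 5e-5],
    fixed={"batch_size": 8, "epochs": 3},
)
print(best_lr)  # 2e-05
```

Once the learning rate is fixed, the same loop can be repeated for batch size, then for the number of epochs.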



Section 4. Chapter 2
