Learn Pre-Training and Fine-Tuning Concepts | Introduction to Transformers and Transfer Learning
Fine-Tuning Transformers

Pre-Training and Fine-Tuning Concepts


Transformers have revolutionized natural language processing (NLP) by enabling models to learn from vast amounts of text data and then adapt to specific applications with relatively little additional data. This process involves two key stages: pre-training and fine-tuning. During pre-training, a transformer model such as BERT is trained on a massive text corpus using general language modeling objectives. This stage allows the model to learn the structure, grammar, and semantics of the language, capturing rich representations that can be useful for a wide range of tasks. Once pre-trained, the model can be adapted to a downstream task—such as sentiment analysis, question answering, or text classification—through fine-tuning. Fine-tuning involves continuing the training of the pre-trained model on a smaller, task-specific dataset so the model can specialize its knowledge to solve the new problem effectively.

Note
Definitions: Pre-training, Fine-tuning, and Downstream Task

Pre-training: training a model on a large, generic dataset to learn general language patterns and representations.

Fine-tuning: further training a pre-trained model on a smaller, task-specific dataset to adapt it to a particular application.

Downstream task: a specific NLP task (such as sentiment analysis or named entity recognition) that leverages a pre-trained model.
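The pre-train-then-fine-tune idea can be illustrated without any transformer machinery at all. The toy sketch below (not part of the lesson's BERT example; the names `sgd` and `accuracy` and all datasets are made up for illustration) "pre-trains" a one-parameter logistic model on a large generic dataset, then fine-tunes it on a small dataset for a shifted, related task:

```python
import math
import random

random.seed(0)

def sgd(data, w, b, lr, epochs):
    """Train a tiny logistic-regression model (one weight, one bias) with SGD."""
    for _ in range(epochs):
        for x, y in data:
            p = 1 / (1 + math.exp(-(w * x + b)))  # predicted probability
            grad = p - y                           # dLoss/dz for log loss
            w -= lr * grad * x
            b -= lr * grad
    return w, b

def accuracy(data, w, b):
    correct = sum(1 for x, y in data
                  if (1 / (1 + math.exp(-(w * x + b))) > 0.5) == bool(y))
    return correct / len(data)

# "Pre-training": a large generic dataset where the label is sign(x)
big = [(x, 1 if x > 0 else 0)
       for x in (random.uniform(-3, 3) for _ in range(500))]
w, b = sgd(big, 0.0, 0.0, lr=0.1, epochs=5)

# "Fine-tuning": a small related task whose decision boundary sits at x > 1
small = [(x, 1 if x > 1 else 0)
         for x in (random.uniform(-3, 3) for _ in range(20))]
w_ft, b_ft = sgd(small, w, b, lr=0.1, epochs=20)

# Evaluate the fine-tuned model on a grid of held-out points
test = [(x / 10, 1 if x / 10 > 1 else 0) for x in range(-30, 31)]
print("fine-tuned accuracy:", accuracy(test, w_ft, b_ft))
```

The fine-tuning stage only has to nudge the bias of an already useful model, which is why it succeeds with 20 examples; the exact accuracy depends on the random seed.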

from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load pre-trained BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

# Prepare a sample sentence
sentence = "Transformers are powerful models for NLP tasks."
inputs = tokenizer(sentence, return_tensors="pt")

# Run inference
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

print("Logits:", logits)
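The logits printed above are raw, unnormalized scores. To turn them into class probabilities and a prediction, apply a softmax and take the highest-scoring index. A small standard-library sketch of that final step (the logit values here are made up, not output from the model above):

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical two-class logits, shaped like those the BERT snippet prints
logits = [0.3, -0.1]
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
print("Probabilities:", probs, "Predicted class:", predicted)
```

Note that without fine-tuning, the classification head's weights are randomly initialized, so these probabilities carry no task-specific meaning yet.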
Note

A common pitfall when fine-tuning on small datasets is overfitting, where the model learns the training data too closely and fails to generalize to new examples. Careful regularization and validation are essential to avoid this issue.
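One common guard against overfitting is early stopping: track loss on a held-out validation set and stop once it stops improving for a few epochs, keeping the weights from the best epoch. A minimal sketch of the selection logic (the function name, `patience` value, and loss curve are illustrative, not from any library):

```python
def early_stop(val_losses, patience=3):
    """Return the index of the best epoch, scanning until `patience`
    consecutive epochs fail to improve on the best validation loss."""
    best, best_i, bad = float("inf"), 0, 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best, best_i, bad = loss, i, 0
        else:
            bad += 1
            if bad >= patience:
                break
    return best_i

# Synthetic validation curve: the model improves, then starts to overfit
curve = [1.0, 0.7, 0.5, 0.45, 0.44, 0.46, 0.50, 0.55]
print("Best epoch:", early_stop(curve))  # → 4
```

In practice this is combined with regularization such as weight decay or dropout, both of which the lesson's BERT model supports during fine-tuning.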


Why do pre-trained models typically require less data to achieve good performance on downstream tasks?



Section 1. Chapter 2
