Fine-Tuning Transformers
Evaluation, Optimization, and Deployment

Exporting and Deploying Models

When you are ready to move your fine-tuned transformer model from experimentation to real-world applications, it is essential to export the model and integrate it into your workflows for inference. Exporting a model typically means saving its architecture and learned weights to disk, so you can reload it later without retraining. Once exported, you can use the model for batch inference—processing a large set of data at once—or for real-time inference, such as responding instantly to user queries in a web application. The process involves saving the model, loading it in your deployment environment, and ensuring the inference pipeline matches your training setup, including preprocessing steps like tokenization.
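The export step described above can be sketched as a save/reload round trip. In this minimal sketch, a tiny randomly initialized BERT classifier stands in for your fine-tuned model so the code runs anywhere, and a temporary directory stands in for your real output path; in practice you would call `save_pretrained` on the model (and tokenizer) you just trained.

```python
import os
import tempfile

import torch
from transformers import BertConfig, BertForSequenceClassification

# A tiny randomly initialized model stands in for your fine-tuned one
# (assumption: in practice, use the model you just trained)
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config)

# Export: writes config.json and the weight file to disk
save_dir = tempfile.mkdtemp()
model.save_pretrained(save_dir)

# Reload and verify the weights survive the round trip unchanged
reloaded = BertForSequenceClassification.from_pretrained(save_dir)
weights_match = all(
    torch.equal(model.state_dict()[name], reloaded.state_dict()[name])
    for name in model.state_dict()
)
print("config.json written:", os.path.exists(os.path.join(save_dir, "config.json")))
print("Weights identical after reload:", weights_match)
```

Because `save_pretrained` stores both the architecture (config) and the learned weights, `from_pretrained` can rebuild the exact same model later without retraining.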

Note

Always test your exported models on a variety of sample inputs before deploying to production. This helps catch any issues with serialization, preprocessing, or environmental differences that might affect predictions.
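A simple smoke test over edge-case inputs can catch many of these issues early. The sketch below runs fully offline: the tiny vocabulary and randomly initialized model are stand-ins for your real exported tokenizer and checkpoint, and the specific sample inputs are just illustrative edge cases.

```python
import os
import tempfile

import torch
from transformers import BertConfig, BertForSequenceClassification, BertTokenizer

# Stand-ins for your real artifacts: a tiny vocab file and a randomly
# initialized model, so this sketch runs without a trained checkpoint
tmp = tempfile.mkdtemp()
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]
with open(os.path.join(tmp, "vocab.txt"), "w") as f:
    f.write("\n".join(vocab))
tokenizer = BertTokenizer(os.path.join(tmp, "vocab.txt"))

config = BertConfig(vocab_size=len(vocab), hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config).eval()

# Edge cases that often expose preprocessing or serialization issues
samples = [
    "",                       # empty input
    "hello",                  # single token
    "hello world " * 300,     # overlong input -> relies on truncation
    "unseen wörds",           # out-of-vocabulary / non-ASCII text
]

with torch.no_grad():
    for text in samples:
        inputs = tokenizer(text, truncation=True, max_length=128,
                           return_tensors="pt")
        logits = model(**inputs).logits
        assert torch.isfinite(logits).all(), f"Bad logits for: {text!r}"

print("All smoke tests passed")
```

Running the same kind of check against your actual exported model and tokenizer verifies that tokenization, truncation, and the forward pass all behave before any user traffic hits the model.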

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the exported (saved) model and tokenizer
model_path = "path/to/your/saved-model"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)
model.eval()  # disable dropout for deterministic inference

# Example texts for inference
texts = [
    "Transformers are revolutionizing natural language processing.",
    "Fine-tuning allows models to adapt to specific tasks."
]

# Tokenize the texts for the model
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Run inference (no gradients needed)
with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.argmax(outputs.logits, dim=1)
print("Predicted classes:", predictions.tolist())
```
Note

Inference errors can occur if there are mismatches between the library or model versions used during training, export, and deployment. Always ensure your deployment environment matches your training environment as closely as possible.
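One way to guard against such mismatches is to record the library versions alongside the exported model and verify them at load time. The helpers below are hypothetical (not part of the `transformers` API) and the manifest filename is a placeholder; they simply compare recorded versions against the current environment.

```python
import json
import platform
import tempfile

import torch
import transformers

# Hypothetical helpers (not part of transformers): record library
# versions next to an exported model, then verify them at deployment time
def write_env_manifest(path):
    manifest = {
        "python": platform.python_version(),
        "torch": torch.__version__,
        "transformers": transformers.__version__,
    }
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest

def check_env_manifest(path):
    with open(path) as f:
        expected = json.load(f)
    current = {
        "python": platform.python_version(),
        "torch": torch.__version__,
        "transformers": transformers.__version__,
    }
    # Any differing entries are returned; an empty dict means a clean match
    return {k: (expected[k], current[k])
            for k in expected if expected[k] != current[k]}

manifest_path = tempfile.mktemp(suffix=".json")
write_env_manifest(manifest_path)
print("Version mismatches:", check_env_manifest(manifest_path))
```

Shipping such a manifest with the model (or pinning exact versions in a requirements file) makes environment drift visible before it silently changes predictions.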


What is the most important thing you should test before deploying a model to production?


Section 4. Chapter 3
