Learn Serving Models with FastAPI | Model Deployment with FastAPI and Docker

When you need to make your machine learning model available for use by other applications or users, serving it as a web service is a common, practical solution. FastAPI is a modern Python web framework that lets you quickly build REST APIs, making it an excellent choice for serving machine learning models. Using FastAPI, you can expose a trained model through HTTP endpoints, so predictions can be requested from anywhere, using any language or tool that can make web requests.

The typical workflow for serving an ML model with FastAPI includes several steps:

Train and serialize your model using a library such as scikit-learn;
Create a FastAPI app that loads the saved model at startup;
Define an endpoint (such as /predict) that accepts input data, runs inference, and returns the prediction;
Run the FastAPI app as a web server, so it can respond to HTTP requests.
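
The first step of this workflow can be sketched as follows. This is a minimal illustration, not a prescribed training procedure: the synthetic dataset, the choice of LogisticRegression, and the file name model.pkl are all assumptions made here so that the saved model takes two numeric features.

```python
# Hypothetical training script; the data, estimator, and file name are
# illustrative assumptions, not part of the original lesson.
import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data with exactly two numeric features
X, y = make_classification(
    n_samples=200,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    random_state=0,
)

# Fit a simple classifier on the synthetic data
model = LogisticRegression()
model.fit(X, y)

# Serialize the fitted model so a separate API process can load it later
joblib.dump(model, "model.pkl")
```

Running this script once should leave a model.pkl file in the working directory, which an API process can then read with joblib.load.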

This approach brings many benefits:

You can decouple your model from the training environment and make it accessible to other systems;
FastAPI automatically generates interactive documentation for your API, making it easy to test and share;
FastAPI supports asynchronous request handlers and runs on ASGI servers such as Uvicorn, which helps it handle many concurrent requests in real-time or production use.

Before you see how to implement this, let's clarify what FastAPI is.

FastAPI

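FastAPI is a modern, high-performance web framework for building APIs with Python, based on standard Python type hints. It uses Starlette for the web layer and Pydantic for data validation.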

To see how this works in practice, here is a simple FastAPI application that loads a scikit-learn model and exposes a /predict endpoint. This example assumes you have already trained a scikit-learn model and saved it to model.pkl with joblib (a separate library commonly used alongside scikit-learn; Python's built-in pickle module is an alternative). The API accepts JSON input with two numeric features and returns the model's output as JSON.

from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

# Define the request body schema
class InputData(BaseModel):
    feature1: float
    feature2: float

# Load the trained model (assumes model.pkl exists)
model = joblib.load("model.pkl")

app = FastAPI()

@app.post("/predict")
def predict(input_data: InputData):
    # Prepare input for the model
    data = np.array([[input_data.feature1, input_data.feature2]])
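    # Make prediction (returns a NumPy array with one element)
    prediction = model.predict(data)
    # Convert the NumPy scalar to a native Python type so it can be
    # serialized to JSON (NumPy integer types are not JSON-serializable)
    return {"prediction": prediction[0].item()}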
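
Once the app is running, any HTTP client can request predictions. The sketch below uses only Python's standard library. The base URL http://127.0.0.1:8000 is an assumption (it is the default address when the app is started with a command such as uvicorn main:app, where main is a hypothetical module name), and the helper names build_predict_request and request_prediction are made up for this example.

```python
import json
import urllib.request


def build_predict_request(
    base_url: str, feature1: float, feature2: float
) -> urllib.request.Request:
    """Build a POST request whose JSON body matches the InputData schema."""
    payload = json.dumps({"feature1": feature1, "feature2": feature2})
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_prediction(base_url: str, feature1: float, feature2: float) -> dict:
    """Send the request and decode the JSON response from the API."""
    request = build_predict_request(base_url, feature1, feature2)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling request_prediction("http://127.0.0.1:8000", 1.5, 2.0) should return a dictionary with a single "prediction" key. If a field is missing or is not a number, FastAPI rejects the request with a 422 validation error before the model is called.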
