Learn Serving Models with FastAPI | Model Deployment with FastAPI and Docker
MLOps for Machine Learning Engineers

Serving Models with FastAPI

When you need to make your machine learning model available for use by other applications or users, serving it as a web service is a common, practical solution. FastAPI is a modern Python web framework that lets you quickly build REST APIs, making it an excellent choice for serving machine learning models. Using FastAPI, you can expose a trained model through HTTP endpoints, so predictions can be requested from anywhere, using any language or tool that can make web requests.

The typical workflow for serving an ML model with FastAPI includes several steps:

  • Train and serialize your model using a library such as scikit-learn;
  • Create a FastAPI app that loads the saved model at startup;
  • Define an endpoint (such as /predict) that accepts input data, runs inference, and returns the prediction;
  • Run the FastAPI app as a web server, so it can respond to HTTP requests.
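The first step of this workflow can be sketched as follows. This is a minimal illustration rather than a recommended training setup: the toy dataset, the choice of LogisticRegression, and the file name model.pkl (picked to match the serving example later in this chapter) are all assumptions.

```python
import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data with two numeric features per row (made up for illustration)
X = np.array([[0.0, 0.1], [0.2, 0.3], [0.9, 1.0], [1.1, 0.8]])
y = np.array([0, 0, 1, 1])

# Fit a small classifier; any scikit-learn estimator could stand in here
model = LogisticRegression()
model.fit(X, y)

# Serialize the fitted model to disk so a serving process can load it later
joblib.dump(model, "model.pkl")

# Sanity check: reload the file and predict for one new sample
restored = joblib.load("model.pkl")
sample_prediction = restored.predict(np.array([[1.0, 0.9]]))
```

The reloaded object behaves like the original estimator, which is what allows the serving process to run separately from the training code.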

This approach brings many benefits:

  • You can decouple your model from the training environment and make it accessible to other systems;
  • FastAPI automatically generates interactive documentation for your API, making it easy to test and share;
  • The framework is built on ASGI, supports asynchronous request handlers, and is designed for high throughput, which matters for real-time or production use.

Before you see how to implement this, let's clarify what FastAPI is.

Note
FastAPI

FastAPI is a modern, fast web framework for building APIs with Python.

To see how this works in practice, here is a simple FastAPI application that loads a scikit-learn model and exposes a /predict endpoint. This example assumes you have already trained a scikit-learn model and saved it to model.pkl with joblib (a separate library commonly used alongside scikit-learn) or Python's built-in pickle module. The API accepts JSON input with two numeric features and returns the model's prediction as JSON.

from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

# Define the request body schema
class InputData(BaseModel):
    feature1: float
    feature2: float

# Load the trained model once, when the module is imported (assumes model.pkl exists)
model = joblib.load("model.pkl")

app = FastAPI()

@app.post("/predict")
def predict(input_data: InputData):
    # Arrange the features as a 2D array: one row (sample), two columns (features)
    data = np.array([[input_data.feature1, input_data.feature2]])
    # Make prediction
    prediction = model.predict(data)
    # Convert the NumPy result to native Python types so it serializes to JSON
    # (NumPy integers such as numpy.int64 are not JSON-serializable as-is)
    return {"prediction": prediction.tolist()[0]}
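Once the app is running, any HTTP client can request a prediction. The sketch below assumes the code above is saved as main.py and served locally with a command such as `uvicorn main:app` (Uvicorn is one common ASGI server; the file name, host, and port 8000 are assumptions, not something the example fixes). It uses only the Python standard library, and the helper names and sample feature values are made up for illustration.

```python
import json
import urllib.request

def build_predict_request(base_url: str, feature1: float, feature2: float) -> urllib.request.Request:
    """Build a POST request whose JSON body matches the InputData schema."""
    payload = json.dumps({"feature1": feature1, "feature2": feature2}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def request_prediction(base_url: str, feature1: float, feature2: float) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_predict_request(base_url, feature1, feature2)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling `request_prediction("http://127.0.0.1:8000", 1.0, 0.9)` should return the decoded JSON body as a dictionary with a "prediction" key. If a field is missing or not numeric, FastAPI rejects the request with a validation error instead of calling the model.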


