
Deploying a machine learning model is the next logical and most important step once the model has been developed and evaluated. Models can be deployed either as an API for real-time inference or in batch mode. Sadly, not many people talk about this step. This post will give you a sense of what it takes to deploy a model as a REST API using Flask and what that entails.

What is ML Model Deployment?

Most of the time, data scientists build models on their laptops or whatever local hardware is available to them; some build on machines in the cloud. Model deployment is the process of making the model developed by the data scientists available to the end user, client, or stakeholder. This end user may be internal or external depending on the business requirement. Deployment involves packaging the model assets in such a way that inference becomes straightforward and consistent.

Batch mode is the easiest kind of deployment: all predictions are made in a single batch. This mode is typically used when inference is not needed in real time and it is acceptable to run it periodically. One example is running a sales forecast once a month.
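A batch job can be as simple as a script run on a schedule (e.g. via cron). The sketch below shows the shape of such a script, assuming a scikit-learn style model persisted with joblib; the toy model, file names, and feature names here are hypothetical placeholders, not part of the app built later in this post.

```python
import joblib
import pandas as pd
from sklearn.linear_model import LinearRegression

# Stand-in model trained on toy data, purely for illustration; in practice
# you would already have a trained model saved to disk
X = pd.DataFrame({'feature_1': [1, 2, 3], 'feature_2': [0, 1, 0]})
model = LinearRegression().fit(X, [10, 20, 30])
joblib.dump(model, 'model.pkl')

# Batch inference: load the model once, score every row, write the results out
model = joblib.load('model.pkl')
batch = pd.DataFrame({'feature_1': [4, 5], 'feature_2': [1, 0]})
batch['prediction'] = model.predict(batch)
batch.to_csv('predictions.csv', index=False)
```

The key property of batch mode is that the model is loaded once and amortized over many rows, so even a slow-loading model is cheap per prediction.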

Some models, such as fraud detection, need an immediate response, so time is a constraint. In such cases a REST API is used to deploy the model. In this article we will create a simple API for prediction. It is just a template and can be used to deploy almost any model as an API.

What is Flask?

Flask is a micro web framework written in Python, best suited for small projects. It is very lightweight and has a gentle learning curve, yet it provides enough to build a proper microservice for an ML model. Flask also has good community support and many tools and extensions that can be easily installed and used to build a robust website or microservice.

Flask Setup

You can install Flask from PyPI using pip:

pip install Flask

Or use conda if you prefer:

conda install -c anaconda flask
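To verify the installation, here is a minimal sketch of a Flask app with a single route returning plain text; the route and message are just placeholders.

```python
from flask import Flask

app = Flask(__name__)

# A single route that returns plain text
@app.route('/')
def home():
    return 'Hello, Flask!'

if __name__ == '__main__':
    app.run(port=3000)
```

If this runs and responds at http://localhost:3000, Flask is set up correctly.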


Let us create a simple Flask application with a route to handle incoming prediction requests. Save the following code as app.py and modify it where needed.

import joblib
import pandas as pd
from flask import Flask, request, jsonify

# creating a Flask application
app = Flask(__name__)

# Load the pickled model which has a predict function
model = joblib.load('model.pkl') # Change this based on where you saved the file

def feature_transform(df):
    # Apply any transformations required, e.g. feature engineering, filling null values, one-hot encoding
    return df

# Route for serving predictions through the API
@app.route('/predict', methods=['POST'])
def predict():
    # Get data from the POST request as a Python dictionary
    data = request.get_json()

    # Convert the dictionary to a single-row pandas DataFrame
    df = pd.DataFrame([data])
    # Apply feature transformations if needed
    df = feature_transform(df)

    # Make the prediction and return it as a JSON response
    prediction = model.predict(df)
    return jsonify({'prediction': prediction.tolist()[0]})

if __name__ == '__main__':
    app.run(port=3000, debug=True)

Predictions over the API

Now that we have the code for the model's API, let us run it with the following command:

python app.py

The Flask app will now be available at http://localhost:3000. You can call the API using the curl command or the requests package.

Using the curl command

curl -X POST -H "Content-Type: application/json" -d '{"feature_1": value_1, "feature_2": value_2, "feature_n": value_n}' http://localhost:3000/predict

Using the requests package

import requests    # Install by running pip install requests

url = "http://localhost:3000/predict"

# Make the API request and print the response
features = {"feature_1": value_1, "feature_2": value_2, "feature_n": value_n}
response = requests.post(url, json=features)
print(response.json())


Deploying your machine learning model using Flask is simple, and you can do it in very few lines of code. The same code can be placed on a remote server so that everyone can access it. Of course, a few more things need to be added, such as authentication to protect the application and infrastructure to scale it.
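As a rough sketch of what authentication could look like, here is a simple API-key check implemented as a decorator. The header name, key value, and stub route are hypothetical; in a real deployment the key would come from an environment variable or secrets manager, not a constant in the code.

```python
from functools import wraps
from flask import Flask, request, jsonify

app = Flask(__name__)

API_KEY = 'change-me'  # Hypothetical key, for illustration only

def require_api_key(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        # Reject requests that do not carry the expected key header
        if request.headers.get('X-API-Key') != API_KEY:
            return jsonify({'error': 'unauthorized'}), 401
        return f(*args, **kwargs)
    return wrapper

# Stub route standing in for the real /predict handler
@app.route('/predict', methods=['POST'])
@require_api_key
def predict():
    return jsonify({'ok': True})
```

Applying the decorator to the real prediction route from app.py would make every request prove it holds the key before the model is ever invoked.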

Once the model is deployed, the next important aspect is monitoring; read more about it here.
