A step-by-step tutorial on how to convert machine learning models trained using PyCaret to ONNX format for high-performance scoring on edge devices


In this tutorial, I will show you how you can train machine learning models using PyCaret — an open-source, low-code machine learning library in Python — and convert them to ONNX format for deployment on an edge device or in any other non-Python environment. For example, you can train machine learning models using PyCaret in Python and deploy them in R, Java, or C. The learning goals of this tutorial are:

  • What is PyCaret and how to get started?
  • What are the different types of model formats (Pickle, ONNX, PMML, etc.)?
  • What is ONNX (pronounced "onyx") and what are its benefits?
  • Train a machine learning model using PyCaret and convert it to ONNX for deployment on the edge.


PyCaret is an open-source, low-code machine learning library and end-to-end model management tool built in Python to automate machine learning workflows. PyCaret is known for its ease of use, simplicity, and ability to quickly and efficiently build and deploy end-to-end machine learning pipelines. To learn more about PyCaret, check out their GitHub.


skl2onnx is an open-source project that converts scikit-learn models to ONNX. Once in the ONNX format, you can use tools like ONNX Runtime for high-performance scoring. This project was started by the engineers and data scientists at Microsoft in 2017. To learn more about this project, check out their GitHub.


You will need to install the following libraries for this tutorial. The installation will take only a few minutes.

# install pycaret
pip install pycaret

# install skl2onnx
pip install skl2onnx

# install onnxruntime
pip install onnxruntime

Different Model Formats

1. Pickle

This is the most common format and the default way of saving model objects into files for many Python libraries, including PyCaret. Pickle converts a Python object to a bitstream, allowing it to be stored on disk and reloaded at a later time. It is a good format for storing machine learning models, provided that the inference applications are also built in Python.
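As a minimal, self-contained sketch of the Pickle workflow (using a toy scikit-learn model rather than the insurance model from this tutorial):

```python
import pickle
from sklearn.linear_model import LinearRegression

# train a toy model
model = LinearRegression().fit([[0], [1], [2]], [0, 1, 2])

# serialize the fitted model to a bitstream and write it to disk
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# reload it later in any Python process
with open("model.pkl", "rb") as f:
    loaded = pickle.load(f)

print(loaded.predict([[3]]))  # same predictions as the original model
```

The key limitation, as noted above, is that both the writer and the reader of `model.pkl` must be Python processes (ideally with matching library versions).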


2. PMML

Predictive Model Markup Language (PMML) is another format for machine learning models, less common than Pickle. PMML has been around since 1997 and so has a large footprint of applications leveraging the format. Applications such as SAP and PEGA CRM are able to leverage certain versions of PMML. There are open-source libraries available that can convert scikit-learn models (and hence PyCaret models) to PMML. The biggest drawback of the PMML format is that it doesn't support all machine learning models.


3. ONNX

ONNX, the Open Neural Network Exchange format, is an open format that supports storing and porting machine learning models across libraries and languages. This means that you can train your machine learning model using any framework in any language and then convert it to ONNX, which can be used to generate inference in any environment (be it Java, C, .Net, Android, etc.). This language-agnostic capability of ONNX makes it really powerful compared to the other formats (for example, you cannot use a model saved as a Pickle file in any language other than Python).

What is ONNX?

ONNX is an open format to represent both deep learning and traditional models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners such as Microsoft, Facebook, and AWS.

ONNX is widely supported and can be found in many frameworks, tools, and hardware. Enabling interoperability between different frameworks and streamlining the path from research to production helps increase the speed of innovation in the AI community. ONNX helps to solve the challenge of hardware dependency related to AI models and enables deploying the same AI models to several hardware-accelerated targets.

There are many excellent machine learning libraries in various languages — PyTorch, TensorFlow, scikit-learn, PyCaret, etc. The idea is that you can train a model with any tool, language, or framework and then deploy it using another language or application for inference and prediction. For example, let’s say you have a web application built with .Net, an Android app, or even an edge device and you want to integrate your machine learning model predictions into those downstream systems. You can do that by converting your model into an ONNX format. You cannot do that with Pickle or PMML format.

Key Benefits

1. Interoperability

Develop in your preferred framework without worrying about downstream inferencing implications. ONNX enables you to use your preferred framework with your chosen inference engine.

2. Hardware Access

ONNX makes it easier to access hardware optimizations. Use ONNX-compatible runtimes and libraries designed to maximize performance across hardware. This means that you can even use ONNX models on GPU for inference if latency is something you care about.


For this tutorial, I am using a regression dataset from PyCaret’s repository called insurance. You can download the dataset from here.

# loading dataset
from pycaret.datasets import get_data
data = get_data('insurance')

# initialize setup / data preparation
from pycaret.regression import *
s = setup(data, target = 'charges')

Model Training & Selection

Now that the data is ready for modeling, let's start the training process with the compare_models function. It trains all the algorithms available in the model library and evaluates multiple performance metrics using k-fold cross-validation.

# compare all models
best = compare_models()

Based on the cross-validation metrics, the best model is the Gradient Boosting Regressor. You can save the model as a pickle file with the save_model function.

# save model to drive
save_model(best, 'c:/users/models/insurance')

Generate Predictions using Pickle format

You can load the saved model back in the Python environment with the load_model function and generate inference using the predict_model function.

# load the model
from pycaret.regression import load_model
loaded_model = load_model('c:/users/models/insurance')

# generate predictions / inference
from pycaret.regression import predict_model
pred = predict_model(loaded_model, data=data)

ONNX Conversion

So far, we have seen saving and loading trained models in Pickle format (the default format for PyCaret). However, using the skl2onnx library, we can convert the model to ONNX:

# convert best model to onnx
from skl2onnx import to_onnx
X_sample = get_config('X_train')[:1]
model_onnx = to_onnx(best, X_sample.to_numpy())

We can also save model_onnx to the local drive:

# save the model to drive
with open("c:/users/models/insurance.onnx", "wb") as f:
    f.write(model_onnx.SerializeToString())

Now, to generate inference from the insurance.onnx file, we will use the onnxruntime library in Python (just to demonstrate the point). Essentially, you can now use insurance.onnx on any other platform or in any other environment.

# generate inference on onnx
from onnxruntime import InferenceSession
sess = InferenceSession(model_onnx.SerializeToString())
X_test = get_config('X_test').to_numpy()
predictions_onnx =, {'X': X_test})[0]

# print predictions_onnx
print(predictions_onnx)

Notice that the output from predictions_onnx is a NumPy array, whereas the predict_model function from PyCaret returns a pandas DataFrame. If you compare the values, however, the numbers are all the same (with ONNX, you will occasionally find minor differences beyond the 4th decimal point, but very rarely).