Converting PyTorch Models to ONNX#
Introduction#
As of version 1.0.0, direct support for PyTorch 1 models on MXA chips has been completely removed. Until support for PyTorch 2 is released, the recommended way to use PyTorch models is to export them to ONNX (Open Neural Network Exchange) format. This tutorial will guide you through the steps to convert your PyTorch models to ONNX so they can be used with the Neural Compiler for MXA chips.
PyTorch models are typically defined as Python classes, with the computational graph built dynamically at runtime. This dynamic nature can pose challenges for tools that expect a static computational graph. Unlike frameworks that save a model in a single file containing both the architecture and the weights (e.g., TensorFlow’s .pb, Keras’s .h5, or ONNX’s .onnx files), PyTorch relies on the model’s code plus a state dictionary containing the weights.
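To make this concrete, the typical PyTorch workflow saves only a state dictionary of weights, and reloading it requires re-creating the model from its class definition first. The snippet below is a minimal sketch of that pattern (the file name is illustrative):
import torch
import torchvision
# Saving: only the weights (state dictionary) go into the file
model = torchvision.models.mobilenet_v2()
torch.save(model.state_dict(), "mobilenet_v2_weights.pth")
# Loading: the architecture must be re-created in code before the weights can be applied
model = torchvision.models.mobilenet_v2()
model.load_state_dict(torch.load("mobilenet_v2_weights.pth"))
model.eval()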
In this tutorial, we will demonstrate how to export a PyTorch model to ONNX format using MobileNetV2 as an example model, but the steps can be applied to any PyTorch model.
Step 1: Load or Define Your PyTorch Model#
Depending on your situation, you may be using a pre-trained model or defining your own custom model.
If you are using a pre-trained model from torchvision, you can load it as follows:
import torch
import torchvision
from torchvision.models import MobileNet_V2_Weights
# Load a pre-trained MobileNetV2 model
weights = MobileNet_V2_Weights.DEFAULT
model = torchvision.models.mobilenet_v2(weights=weights)
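The weights object also bundles the preprocessing pipeline the model was trained with. If you later want to run real images through the model (for example, to sanity-check the exported ONNX file), you can retrieve it as in this small optional sketch using the torchvision weights API:
# Retrieve the preprocessing transforms associated with these weights
preprocess = weights.transforms()
# A real image would then be preprocessed before inference, e.g.:
# input_tensor = preprocess(image).unsqueeze(0)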
If you have your own custom model, define it in your script:
import torch
import torch.nn as nn
class MyCustomModel(nn.Module):
    def __init__(self):
        super(MyCustomModel, self).__init__()
        # Define your layers here
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc = nn.Linear(16 * 112 * 112, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.pool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x
# Instantiate the model
model = MyCustomModel()
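In practice you will usually export a trained model rather than one with randomly initialized weights. Assuming you have previously saved a state dictionary (the file name below is illustrative), load it before exporting:
# Load previously trained weights (file name is illustrative)
state_dict = torch.load("my_custom_model.pth", map_location="cpu")
model.load_state_dict(state_dict)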
Step 2: Prepare a Sample Input Tensor#
The torch.onnx.export function requires a sample input tensor to trace the model’s computational graph. Create an input tensor with the appropriate shape and data type. For an image classification model like MobileNetV2, the input shape is typically (batch_size, channels, height, width).
# Set the model to evaluation mode
model.eval()
# Create a sample input tensor
sample_input = torch.randn(1, 3, 224, 224)
Step 3: Export the Model to ONNX Format#
Use the torch.onnx.export function to export the model. Specify the model, sample input, output file name, and any other parameters as needed.
# Export the model to ONNX format
torch.onnx.export(
    model,                     # The model to be exported
    sample_input,              # The sample input tensor
    "model.onnx",              # The output file name
    export_params=True,        # Store the trained parameter weights inside the model file
    opset_version=17,          # The ONNX version to export the model to
    do_constant_folding=True,  # Whether to execute constant folding for optimization
    input_names=['input'],     # The model's input names
    output_names=['output'],   # The model's output names
)
Make sure to set the opset_version to a value supported by your tools. As of this writing, ONNX supports opset versions up to 17.
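If you are unsure which opset your installed onnx package supports, you can query it directly; this quick check assumes the onnx package is installed:
import onnx.defs
# Print the highest opset version supported by the installed onnx package
print(onnx.defs.onnx_opset_version())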
Step 4: Verify the Exported ONNX Model (Optional)#
You can verify that the exported ONNX model works correctly by loading it and running a forward pass with ONNX Runtime.
import onnx
import onnxruntime
# Load the ONNX model
onnx_model = onnx.load("model.onnx")
# Check that the model is well-formed
onnx.checker.check_model(onnx_model)
# Run inference using ONNX Runtime
ort_session = onnxruntime.InferenceSession("model.onnx")
# Prepare the input
ort_inputs = {ort_session.get_inputs()[0].name: sample_input.numpy()}
# Run the model
ort_outs = ort_session.run(None, ort_inputs)
print("ONNX model output:", ort_outs)
This ensures that the exported model can be loaded and run as expected.
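For a stricter check, you can compare the ONNX Runtime output against the original PyTorch model’s output on the same input. The comparison below is a minimal sketch using NumPy’s tolerance-based assertion:
import numpy as np
# Run the original PyTorch model on the same sample input
with torch.no_grad():
    torch_out = model(sample_input)
# Compare PyTorch and ONNX Runtime outputs within a small tolerance
np.testing.assert_allclose(torch_out.numpy(), ort_outs[0], rtol=1e-3, atol=1e-5)
print("PyTorch and ONNX Runtime outputs match")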
Compiling the ONNX Model Using the Neural Compiler#
You can compile the ONNX model using the Neural Compiler either via the command-line interface (CLI) or the Python API.
You can compile the model using the Neural Compiler CLI with the following command:
mx_nc -m model.onnx
Replace model.onnx with the path to your ONNX model.
Alternatively, you can use the Neural Compiler’s Python API in a script:
from memryx import NeuralCompiler
# Initialize the Neural Compiler with the model path and output filename
nc = NeuralCompiler(models="model.onnx", dfp_fname="model.dfp")
# Compile the model
dfp = nc.run()
You can directly pass the loaded ONNX model to the Neural Compiler and use the DFP object without saving it to a file:
import onnx
from memryx import NeuralCompiler
# Load the ONNX model
onnx_model = onnx.load("model.onnx")
# Initialize the Neural Compiler with the loaded ONNX model
nc = NeuralCompiler(models=onnx_model)
# Compile the model
dfp = nc.run()
This allows you to integrate the compilation process into your Python workflow.
Note
If your model is defined across multiple files with submodules, ensure all necessary code is available when exporting. You may need to write a script that imports all required modules and defines the model.
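For example, an export script for such a project might look like the sketch below; the package and file names are hypothetical and stand in for your own project layout:
# export_model.py (hypothetical script; adjust imports and paths to your project)
import torch
from my_project.models import MyCustomModel  # hypothetical package containing the model and its submodules
model = MyCustomModel()
model.load_state_dict(torch.load("my_custom_model.pth", map_location="cpu"))  # illustrative weights file
model.eval()
sample_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, sample_input, "model.onnx", opset_version=17,
                  input_names=['input'], output_names=['output'])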