Compile and Benchmark Using APIs

In this tutorial, we’ll programmatically compile and benchmark a neural network model using MemryX’s Neural Compiler and Benchmark APIs. We’ll use MobileNet as the example model and run 1000 frames of random data through it to measure its performance on the MemryX Accelerator (MXA).

Note

The Hello, MXA! tutorial covers the same steps as this one, but uses the command-line interface (CLI). Here, we achieve the same results programmatically through the APIs, which gives greater flexibility.

Download and Compile the Model

First, we will download and compile the MobileNet model using the Neural Compiler API. This step transforms the neural network model into a Dataflow Program (DFP), which is optimized to run on the MXA.

from tensorflow import keras
from memryx import NeuralCompiler

# Load MobileNet model
mobilenet = keras.applications.MobileNet()

# Initialize NeuralCompiler with verbose mode
nc = NeuralCompiler(models=mobilenet, verbose=1)

# Compile the model into a DFP
mobilenet_dfp = nc.run()

# Optionally, save the compiled DFP to a file for later use
mobilenet_dfp.write("mobilenet.dfp")

This compiles the MobileNet model to run on the MXA chips. The verbose=1 argument prints the progress of each compilation stage. The output should resemble the following:

════════════════════════════════════════
Converting Model: (Done)
Optimizing Graph: (Done)
Cores optimization: (Done)
Flow optimization: (Done)
. . . . . . . . . . . . . . . . . . . .
Ports mapping: (Done)
MPU 0 input port 0: {'model_index': 0, 'layer_name': 'input_layer', 'shape': [224, 224, 1, 3]}
MPU 3 output port 0: {'model_index': 0, 'layer_name': 'predictions', 'shape': [1, 1, 1, 1000]}
────────────────────────────────────────
Assembling DFP: (Done)
════════════════════════════════════════

At this point, the MobileNet model is compiled into a DFP and ready to be deployed.
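
The port mapping lines in the output above describe the input and output shapes of the compiled model. If you capture that output as text and want to check those shapes in a script, a small parser is one option. The sketch below is only an illustration: `parse_port_lines` and `PORT_LINE` are hypothetical names, not part of the MemryX API, and the code assumes the log lines follow exactly the format shown in this tutorial.

```python
import ast
import re

# Matches lines such as:
#   MPU 0 input port 0: {'model_index': 0, 'layer_name': 'input_layer', ...}
# Assumption: real compiler output uses the same layout as the sample above.
PORT_LINE = re.compile(r"^MPU (\d+) (input|output) port (\d+): (\{.*\})$")


def parse_port_lines(log_text):
    """Extract port descriptions from captured compiler output (hypothetical helper)."""
    ports = []
    for line in log_text.splitlines():
        match = PORT_LINE.match(line.strip())
        if match is None:
            continue  # skip progress lines and separators
        mpu, direction, port, info = match.groups()
        # The port details are printed as a Python dict literal
        entry = ast.literal_eval(info)
        entry.update(mpu=int(mpu), direction=direction, port=int(port))
        ports.append(entry)
    return ports


sample_log = """\
Ports mapping: (Done)
MPU 0 input port 0: {'model_index': 0, 'layer_name': 'input_layer', 'shape': [224, 224, 1, 3]}
MPU 3 output port 0: {'model_index': 0, 'layer_name': 'predictions', 'shape': [1, 1, 1, 1000]}
"""

ports = parse_port_lines(sample_log)
for p in ports:
    print(p["direction"], p["layer_name"], p["shape"])
```

Run against the sample text, this yields one input port entry and one output port entry, each carrying the layer name and shape from the log.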

Deploy and Benchmark

Now that the model is compiled, we can deploy and benchmark it on the MXA by running 1000 frames of random data. This step allows us to measure the Frames Per Second (FPS) performance of the accelerator.

from memryx import Benchmark

# Initialize the Benchmark with the compiled DFP
benchmark = Benchmark(dfp=mobilenet_dfp)

# Run the benchmark with 1000 frames
with benchmark as accl:
    _, _, fps = accl.run(frames=1000)
    print(f"FPS of MobileNet Accelerated on MXA: {fps:.2f}")

This code runs 1000 frames of random data through the MXA and reports the FPS performance. The output should look like this:

FPS of MobileNet Accelerated on MXA: 2242.32

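This result is the throughput of the MobileNet model running on MXA hardware. At about 2,242 FPS, the accelerator completes one frame roughly every 0.45 ms on average (1000 frames in about 0.45 seconds). The exact number will vary with your setup.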
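
To make the FPS figure concrete, the sketch below shows one common way to measure throughput: time a fixed number of frames and divide the frame count by the elapsed time. This is not the Benchmark implementation, which handles timing internally. `measure_fps` and `fake_inference` are hypothetical names, and the stand-in function runs on the host CPU rather than the MXA.

```python
import random
import time


def measure_fps(process_frame, frames, num_frames=1000):
    """Call process_frame num_frames times and return frames per second."""
    start = time.perf_counter()
    for i in range(num_frames):
        # Cycle through the supplied frames so a small pool can feed a long run
        process_frame(frames[i % len(frames)])
    elapsed = time.perf_counter() - start
    return num_frames / elapsed


def fake_inference(frame):
    """Hypothetical stand-in for a model: just sums the frame's values."""
    return sum(frame)


# A small pool of random "frames" (flat lists of floats, not real images)
random.seed(0)
frame_pool = [[random.random() for _ in range(1024)] for _ in range(8)]

fps = measure_fps(fake_inference, frame_pool, num_frames=1000)
print(f"FPS of stand-in workload on CPU: {fps:.2f}")
```

The printed value depends entirely on the host machine and says nothing about MXA performance; the point is only the relationship between frame count, elapsed time, and FPS.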

Summary

In this tutorial, we explored how to programmatically compile and benchmark a neural network model using MemryX’s Neural Compiler and Benchmark APIs. By compiling the MobileNet model into a Dataflow Program (DFP) and running 1000 frames of random data, we measured the Frames Per Second (FPS) performance on the MemryX Accelerator (MXA).

The full code implementation for this tutorial is available here:

compile_benchmark_api.py