YOLO11 Object Detection with MxPrepost Library#
Introduction#
In this tutorial, we will show how to use the MxAccl Python API to perform real-time object detection using the pre-trained YOLO11s model, combined with the MxPrepost library for fast and easy pre/post-processing.
Download & Run
Download
This tutorial provides a high-level overview of the application’s key components. To run the full application, download the complete code package and the compiled DFP. After downloading, refer to the Run section below for step-by-step instructions.
Run
Requirements
Ensure the following dependencies are installed:
pip install --extra-index-url https://developer.memryx.com/pip "mxprepost~=2.2.0" "opencv-python~=4.11.0"
Run Command
Run the Python example for real-time object detection using MX3:
# ensure a camera is connected, as the default video input is a camera device
cd src/python/
python run_objectiondetection.py
1. Download the Model#
The YOLO11 pre-trained models are available on the Official YOLO11 GitHub page. For this tutorial, we exported and compiled the model for you; it can be found in the compressed folder attached to this tutorial.
Steps are for explanation and learning
These step-by-step snippets are provided to explain the process and help you understand the concepts. For a complete, runnable version, please use the full scripts from the “Download & Run” section above.
2. Compile the Model#
The YOLO11s model was exported with the option to include a post-processing section in the model graph, so it needs to be compiled with the Neural Compiler --autocrop option. The compiler then generates a DFP file for the main section of the model (YOLO11_small_640_640_3_onnx.dfp) and an ONNX file for the cropped post-processing section (YOLO11_small_640_640_3_onnx_post.onnx). The compilation step is typically needed only once and can be done using the Neural Compiler API or Tool.
Hint
You won’t need the cropped section of the model for this tutorial. MxPrepost implements its own operations to handle all pre/post steps.
from memryx import NeuralCompiler
nc = NeuralCompiler(num_chips=4, models="YOLO11_small_640_640_3_onnx.onnx", verbose=1, autocrop=True)
dfp = nc.run()
In your command line, you need to type,
# note that we've renamed the model to YOLO11_small_640_640_3_onnx for consistency with the tutorial code
mx_nc -m YOLO11_small_640_640_3_onnx.onnx -v --autocrop
This will produce a DFP file ready to be used by the accelerator. In your Python code, you need to point the dfp variable to the generated file path,
dfp = "YOLO11_small_640_640_3_onnx.dfp"
3. CV Initializations#
Import the needed libraries, initialize the CV pipeline, and define common variables in this step.
import os
import time
import argparse
import cv2
import numpy as np
from queue import Queue
from threading import Thread
# CV and Queues
self.num_frames = 0
self.cap_queue = Queue(maxsize=4)
self.dets_queue = Queue(maxsize=5)

if "/dev/video" in str(video_path):
    self.src_is_cam = True
else:
    self.src_is_cam = False

self.vidcap = cv2.VideoCapture(video_path)
self.dims = (int(self.vidcap.get(cv2.CAP_PROP_FRAME_WIDTH)),
             int(self.vidcap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
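The source-type check can also be expressed as a small predicate. The sketch below is illustrative only (`is_camera_source` is a hypothetical helper, not part of the tutorial code); it additionally treats an integer as a camera index, which is a common `cv2.VideoCapture` convention:

```python
def is_camera_source(video_path):
    """Treat V4L2 device paths (and, as an extension, integer
    indices) as live camera sources; anything else is a file."""
    return isinstance(video_path, int) or "/dev/video" in str(video_path)

print(is_camera_source("/dev/video0"))  # True
print(is_camera_source("video.mp4"))    # False
```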
4. Import MxAccl and MxPrepost#
Here, mxapi is the Python wrapper for MxAccl, used to connect to the accelerator, and mxprepost is a helper library that handles preprocessing and postprocessing without needing cropped model sections.
from memryx import mxapi
import mxprepost
5. Initialize MxPrepost#
Later, in the main run function, after the MxAccl object is created, we'll initialize the MxPrepost object for YOLO11 object detection:
self.prepost = mxprepost.MxPrepost(
accl=accl,
task='yolov11-det',
conf=0.3,
iou=0.4,
)
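The conf value filters out low-confidence boxes, while iou is the overlap threshold used during non-maximum suppression. As a refresher, the intersection-over-union of two boxes can be computed as follows (an illustrative helper, not part of the mxprepost API):

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # 1.0 (identical boxes)
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # 0.0 (disjoint boxes)
```

During NMS, boxes whose IoU with a higher-scoring box exceeds the iou threshold (0.4 here) are suppressed as duplicates.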
6. Define an Input Function#
We need to define an input function for the accelerator to use. In this case, our input function gets a new frame from the camera and pre-processes it using mxprepost.preprocess.
# Capture frames for streams and pre-process
def capture_and_preprocess(self, stream_id=0):
    """
    Captures a frame from the video device and pre-processes it.
    """
    while True:
        got_frame, frame = self.vidcap.read()
        if not got_frame:
            return None

        if self.src_is_cam and self.cap_queue.full():
            # Drop the frame and try again
            continue
        else:
            self.num_frames += 1

            # Put the frame in the cap_queue to be overlayed later
            self.cap_queue.put(frame)

            # Convert BGR to RGB
            rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

            # Pre-process the frame using mxprepost
            rgb_frame = self.prepost.preprocess(rgb_frame)

            return rgb_frame
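mxprepost.preprocess takes care of resizing and normalizing the frame for the model. Conceptually, YOLO-style preprocessing is a letterbox resize (aspect ratio preserved, gray padding) followed by scaling pixel values to [0, 1]. The sketch below illustrates the idea with a dependency-free nearest-neighbor resize; the exact steps mxprepost performs may differ:

```python
import numpy as np

def letterbox(img, target=640):
    """Resize with preserved aspect ratio, pad to target x target with
    the conventional gray value 114, and scale to [0, 1]. Uses a
    nearest-neighbor resize to stay dependency-free; a real pipeline
    would use cv2.resize with proper interpolation."""
    h, w = img.shape[:2]
    scale = min(target / h, target / w)
    nh, nw = int(round(h * scale)), int(round(w * scale))

    # Nearest-neighbor index maps back into the source image
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]

    # Center the resized image on a gray canvas
    canvas = np.full((target, target, img.shape[2]), 114, dtype=img.dtype)
    top, left = (target - nh) // 2, (target - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas.astype(np.float32) / 255.0

out = letterbox(np.zeros((480, 640, 3), dtype=np.uint8))
print(out.shape)  # (640, 640, 3)
```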
7. Define Output Functions#
We also need to define an output function for the accelerator to use. Our output function will use mxprepost.postprocess to post-process the accelerator output.
Besides the MXA data collection and post-processing, the output function hands the detections to a display thread, which overlays them on the captured frame and displays it on the screen.
# Post-process the output from the MXA
def postprocess(self, mxa_output, stream_id=0):
    """
    Post-processes the MXA output.
    """
    dets = self.prepost.postprocess(mxa_output, self.dims[1], self.dims[0])

    # Push the results to the queue to be used by the display_save thread
    self.dets_queue.put(dets)

    # Calculate the current FPS
    self.dt_array[self.dt_index] = time.time() - self.frame_end_time
    self.dt_index += 1
    if self.dt_index % 15 == 0:
        self.fps = 1 / np.average(self.dt_array)
    if self.dt_index >= 30:
        self.dt_index = 0

    self.frame_end_time = time.time()
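The display thread (not shown in these snippets) pops detections from dets_queue and draws them over the matching frame from cap_queue. The exact structure of the dets returned by mxprepost.postprocess may differ, but assuming plain (x1, y1, x2, y2) pixel boxes, a minimal dependency-free overlay could look like this sketch:

```python
import numpy as np

def draw_boxes(frame, dets, color=(0, 255, 0)):
    """Draw one-pixel box outlines on a BGR frame in place.
    Assumes each det is an (x1, y1, x2, y2) tuple in pixel
    coordinates; a real app would use cv2.rectangle instead."""
    h, w = frame.shape[:2]
    for (x1, y1, x2, y2) in dets:
        # Clamp coordinates to the frame bounds
        x1, x2 = max(0, int(x1)), min(w - 1, int(x2))
        y1, y2 = max(0, int(y1)), min(h - 1, int(y2))
        frame[y1:y2 + 1, [x1, x2]] = color  # vertical edges
        frame[[y1, y2], x1:x2 + 1] = color  # horizontal edges
    return frame
```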
8. Connect the Accelerator and MxPrepost#
Now we need to connect the input and output functions to the MxAccl Python API.
mxapi.MxAccl wraps AsyncAccl, passing the parameters through to it.
accl = mxapi.MxAccl(
dfp_path=self.dfp_path,
local_mode=True,
use_model_shape=[False, False],
)
The mxprepost.MxPrepost library will handle the preprocessing and postprocessing of the input and output data.
# Initialize mxprepost before connecting streams
self.prepost = mxprepost.MxPrepost(
accl=accl,
task='yolov11-det',
conf=0.3,
iou=0.4,
)
And then we can connect the accelerator with the input and output functions.
# Connect the input and output functions and let the accl run
accl.connect_stream(self.capture_and_preprocess, self.postprocess, stream_id=0, model_id=0)
accl.start()
accl.wait()
The accelerator will automatically call the connected input and output functions in a fully pipelined fashion.
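The pipelined callback pattern can be mimicked with plain Python threads and a bounded queue. This is a conceptual sketch only, with a stand-in multiplication in place of inference; the real AsyncAccl pipelines these stages on the MX3 hardware:

```python
import queue
import threading

def run_pipeline(input_fn, output_fn, depth=4):
    """Call input_fn until it returns None, pass each item through a
    stand-in 'inference' stage, and feed results to output_fn."""
    q = queue.Queue(maxsize=depth)

    def producer():
        # Keep calling the input function until it signals end-of-stream
        while (item := input_fn()) is not None:
            q.put(item)
        q.put(None)  # sentinel: no more frames

    def consumer():
        while (item := q.get()) is not None:
            output_fn(item * 2)  # stand-in for accelerator inference

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

results = []
frames = iter(range(5))
run_pipeline(lambda: next(frames, None), results.append)
print(results)  # [0, 2, 4, 6, 8]
```

As in the tutorial code, returning None from the input function ends the stream, and the bounded queue keeps the producer from running ahead of the consumer.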
Third-Party Licenses#
This tutorial uses third-party models. Below are the details of the licenses for these dependencies:
Model: YOLO11-small from GitHub
License: AGPLv3
Summary#
This tutorial showed how to use the MxAccl Python API to run real-time object detection with the YOLO11s model, using the MxPrepost library for fast and easy pre/post-processing.