YOLO11 Object Detection with MxPrepost Library#
Introduction#
In this tutorial, we will show how to use the MxAccl Python API to perform real-time object detection using the pre-trained YOLO11s model, combined with the MxPrepost library for fast and easy pre/post-processing.
Download & Run
Download
This tutorial provides a high-level overview of the application’s key components. To run the full application, download the complete code package and the compiled DFP. After downloading, refer to the Run section below for step-by-step instructions.
Run
Requirements
Ensure the following dependencies are installed:
pip install --extra-index-url https://developer.memryx.com/pip "mxprepost~=2.2.0" "opencv-python~=4.11.0"
Run Command
Run the Python example for real-time object detection using MX3:
# ensure a camera is connected, as the default video input is a camera device
cd src/python/
python run_objectiondetection.py
1. Download the Model#
The YOLO11 pre-trained models are available on the Official YOLO11 GitHub page. For this tutorial, we exported and compiled the model for you; it can be found in the compressed folder attached to this tutorial.
Steps are for explanation and learning
These step-by-step snippets are provided to explain the process and help you understand the concepts. For a complete, runnable version, please use the full scripts from the “Download & Run” section above.
2. Compile the Model#
The YOLO11s model was exported with the option to include a post-processing section in the model graph, so it needs to be compiled with the Neural Compiler --autocrop option. The compiler then generates a DFP file for the main section of the model (YOLO11_small_640_640_3_onnx.dfp) and an ONNX file for the cropped post-processing section (YOLO11_small_640_640_3_onnx_post.onnx). The compilation step is typically needed only once and can be done using the Neural Compiler API or Tool.
Hint
You won’t need the cropped section of the model for this tutorial. MxPrepost implements its own operations to handle all pre/post steps.
from memryx import NeuralCompiler
nc = NeuralCompiler(num_chips=4, models="YOLO11_small_640_640_3_onnx.onnx", verbose=1, autocrop=True)
dfp = nc.run()
In your command line, you need to type,
# note that we've renamed the model to YOLO11_small_640_640_3_onnx for consistency with the tutorial code
mx_nc -m YOLO11_small_640_640_3_onnx.onnx -v --autocrop
This will produce a DFP file ready to be used by the accelerator. In your Python code, you need to point the dfp variable to the generated file path,
dfp = "YOLO11_small_640_640_3_onnx.dfp"
3. CV Initializations#
Import the needed libraries, initialize the CV pipeline, and define common variables in this step.
import os
import time
import argparse
import cv2
import numpy as np
from queue import Queue
from threading import Thread
# CV and Queues
self.num_frames = 0
self.cap_queue = Queue(maxsize=4)
self.dets_queue = Queue(maxsize=5)

if "/dev/video" in str(video_path):
    self.src_is_cam = True
else:
    self.src_is_cam = False

self.vidcap = cv2.VideoCapture(video_path)
self.dims = (int(self.vidcap.get(cv2.CAP_PROP_FRAME_WIDTH)),
             int(self.vidcap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
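The source-type check can also be expressed as a small predicate. The sketch below is illustrative only (`is_camera_source` is a hypothetical helper, not part of the tutorial code); it additionally treats an integer as a camera index, which is a common `cv2.VideoCapture` convention:

```python
def is_camera_source(video_path):
    """Treat V4L2 device paths (and, as an extension, integer
    indices) as live camera sources; anything else is a file."""
    return isinstance(video_path, int) or "/dev/video" in str(video_path)

print(is_camera_source("/dev/video0"))  # True
print(is_camera_source("video.mp4"))    # False
```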
4. Import MxAccl and MxPrepost#
Here, mxapi is the Python wrapper for MxAccl, used to connect to the accelerator, and mxprepost is a helper library that handles preprocessing and postprocessing without needing cropped model sections.
from memryx import mxapi
import mxprepost
5. Initialize MxPrepost#
Later, in the main run function, after the MxAccl object is created, we'll initialize the MxPrepost object for YOLO11 object detection:
self.prepost = mxprepost.MxPrepost(
accl=accl,
task='yolov11-det',
conf=0.3,
iou=0.4,
)
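The conf value filters out low-confidence boxes, while iou is the overlap threshold used during non-maximum suppression. As a refresher, the intersection-over-union of two boxes can be computed as follows (an illustrative helper, not part of the mxprepost API):

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # 1.0 (identical boxes)
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # 0.0 (disjoint boxes)
```

During NMS, boxes whose IoU with a higher-scoring box exceeds the iou threshold (0.4 here) are suppressed as duplicates.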
6. Define an Input Function#
We need to define an input function for the accelerator to use. In this case, our input function gets a new frame from the camera and pre-processes it using mxprepost.preprocess.
# Capture frames for streams and pre-process
def capture_and_preprocess(self, stream_id=0):
    """
    Captures a frame from the video device and pre-processes it.
    """
    while True:
        got_frame, frame = self.vidcap.read()
        if not got_frame:
            return None

        if self.src_is_cam and self.cap_queue.full():
            # Drop the frame and try again
            continue
        else:
            self.num_frames += 1

            # Put the frame in the cap_queue to be overlayed later
            self.cap_queue.put(frame)

            # Convert BGR to RGB
            rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

            # Pre-process the frame using mxprepost
            rgb_frame = self.prepost.preprocess(rgb_frame)

            return rgb_frame
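mxprepost.preprocess takes care of resizing and normalizing the frame for the model. Conceptually, YOLO-style preprocessing is a letterbox resize (aspect ratio preserved, gray padding) followed by scaling pixel values to [0, 1]. The sketch below illustrates the idea with a dependency-free nearest-neighbor resize; the exact steps mxprepost performs may differ:

```python
import numpy as np

def letterbox(img, target=640):
    """Resize with preserved aspect ratio, pad to target x target with
    the conventional gray value 114, and scale to [0, 1]. Uses a
    nearest-neighbor resize to stay dependency-free; a real pipeline
    would use cv2.resize with proper interpolation."""
    h, w = img.shape[:2]
    scale = min(target / h, target / w)
    nh, nw = int(round(h * scale)), int(round(w * scale))

    # Nearest-neighbor index maps back into the source image
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]

    # Center the resized image on a gray canvas
    canvas = np.full((target, target, img.shape[2]), 114, dtype=img.dtype)
    top, left = (target - nh) // 2, (target - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas.astype(np.float32) / 255.0

out = letterbox(np.zeros((480, 640, 3), dtype=np.uint8))
print(out.shape)  # (640, 640, 3)
```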
7. Define Output Functions#
We also need to define an output function for the accelerator to use. Our output function will use mxprepost.postprocess to post-process the accelerator output.
Besides the MXA data collection and post-processing, the output function hands the detections to a display thread, which overlays them on the captured frame and displays it on the screen.
# Post-process the output from the MXA
def postprocess(self, mxa_output, stream_id=0):
    """
    Post-processes the MXA output.
    """
    dets = self.prepost.postprocess(mxa_output, self.dims[1], self.dims[0])

    # Push the results to the queue to be used by the display_save thread
    self.dets_queue.put(dets)

    # Calculate the current FPS
    self.dt_array[self.dt_index] = time.time() - self.frame_end_time
    self.dt_index += 1
    if self.dt_index % 15 == 0:
        self.fps = 1 / np.average(self.dt_array)
    if self.dt_index >= 30:
        self.dt_index = 0

    self.frame_end_time = time.time()
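The display thread (not shown in these snippets) pops detections from dets_queue and draws them over the matching frame from cap_queue. The exact structure of the dets returned by mxprepost.postprocess may differ, but assuming plain (x1, y1, x2, y2) pixel boxes, a minimal dependency-free overlay could look like this sketch:

```python
import numpy as np

def draw_boxes(frame, dets, color=(0, 255, 0)):
    """Draw one-pixel box outlines on a BGR frame in place.
    Assumes each det is an (x1, y1, x2, y2) tuple in pixel
    coordinates; a real app would use cv2.rectangle instead."""
    h, w = frame.shape[:2]
    for (x1, y1, x2, y2) in dets:
        # Clamp coordinates to the frame bounds
        x1, x2 = max(0, int(x1)), min(w - 1, int(x2))
        y1, y2 = max(0, int(y1)), min(h - 1, int(y2))
        frame[y1:y2 + 1, [x1, x2]] = color  # vertical edges
        frame[[y1, y2], x1:x2 + 1] = color  # horizontal edges
    return frame
```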
8. Connect the Accelerator and MxPrepost#
Now we need to connect the input and output functions to the MxAccl Python API.
mxapi.MxAccl wraps AsyncAccl, passing the parameters through to it.
accl = mxapi.MxAccl(
dfp_path=self.dfp_path,
local_mode=True,
use_model_shape=[False, False],
)
The mxprepost.MxPrepost library will handle the preprocessing and postprocessing of the input and output data.
# Initialize mxprepost before connecting streams
self.prepost = mxprepost.MxPrepost(
accl=accl,
task='yolov11-det',
conf=0.3,
iou=0.4,
)
And then we can connect the accelerator with the input and output functions.
# Connect the input and output functions and let the accl run
accl.connect_stream(self.capture_and_preprocess, self.postprocess, stream_id=0, model_id=0)
accl.start()
accl.wait()
The accelerator will automatically call the connected input and output functions in a fully pipelined fashion.
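The pipelined callback pattern can be mimicked with plain Python threads and a bounded queue. This is a conceptual sketch only, with a stand-in multiplication in place of inference; the real AsyncAccl pipelines these stages on the MX3 hardware:

```python
import queue
import threading

def run_pipeline(input_fn, output_fn, depth=4):
    """Call input_fn until it returns None, pass each item through a
    stand-in 'inference' stage, and feed results to output_fn."""
    q = queue.Queue(maxsize=depth)

    def producer():
        # Keep calling the input function until it signals end-of-stream
        while (item := input_fn()) is not None:
            q.put(item)
        q.put(None)  # sentinel: no more frames

    def consumer():
        while (item := q.get()) is not None:
            output_fn(item * 2)  # stand-in for accelerator inference

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

results = []
frames = iter(range(5))
run_pipeline(lambda: next(frames, None), results.append)
print(results)  # [0, 2, 4, 6, 8]
```

As in the tutorial code, returning None from the input function ends the stream, and the bounded queue keeps the producer from running ahead of the consumer.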
Third-Party Licenses#
This tutorial uses third-party models. Below are the details of the licenses for these dependencies:
Model: YOLO11-small from GitHub
License: AGPLv3
Summary#
This tutorial showed how to use the MxAccl Python API to run real-time object detection with the YOLO11s model, using the MxPrepost library for fast and easy pre/post-processing.