MxPrepost Usage#
MxPrepost is a high-performance pre/post-processing library for YOLO models on MemryX systems. It is designed to work alongside the MemryX Runtime callbacks and provides optimized implementations for common operations such as preprocessing/formatting, output decoding, NMS, and drawing results on frames.
The library supports common YOLO tasks (detection, segmentation, and pose), and is intended to simplify deployment while maintaining high throughput.
Note
MxPrepost replaces the cropped post-processing models (*_post.onnx, generated with --autocrop) by running an optimized version for CPUs directly inside callback code. For background and alternatives, see:
See also
For a runnable app using MxPrepost, see this YOLO11 tutorial.
Install MxPrepost#
Install and Verify
Important
Install MxPrepost in the same Python environment as the MemryX SDK.
1. Verify SDK version
mx_nc --version
Optional: Upgrade SDK tools (only if SDK is below 2.2)
pip3 install --extra-index-url https://developer.memryx.com/pip --upgrade memryx
2. Install MxPrepost
pip3 install --extra-index-url https://developer.memryx.com/pip mxprepost
3. Confirm installation
python3 -m pip show mxprepost
Typical Integration Pattern#
The recommended flow is:

1. Create a MxAccl runtime object.
2. Create one mxprepost.MxPrepost object (per model/task).
3. In the input callback, call preprocess(frame) and return the result to the runtime.
4. In the output callback, call postprocess(ofmaps, ori_h, ori_w) or postprocess(ofmaps, ori_frame).
5. Optionally call draw(frame, result) to overlay output annotations.
import cv2
import mxprepost
from memryx import mxapi

cap = cv2.VideoCapture(0)  # example video source
ori_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
ori_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

accl = mxapi.MxAccl("model.dfp", use_model_shape=[False, False])
prepost = mxprepost.MxPrepost(accl=accl, task="yolov11-det")

def input_callback(stream_id):
    ok, frame = cap.read()
    if not ok:
        return None
    return prepost.preprocess(frame)

def output_callback(ofmaps, stream_id):
    result = prepost.postprocess(ofmaps, ori_h, ori_w)
    # result.boxes / result.masks / result.keypoints
    return True

accl.connect_stream(input_callback, output_callback, stream_id=0)
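MxPrepost's preprocess and postprocess handle the resizing and the mapping of results back to the original frame internally; passing ori_h and ori_w is what enables that rescaling. For intuition only, here is a generic letterbox calculation (not MxPrepost code) showing how a point in model-input coordinates maps back to the original frame. The 640x640 input size and the helper names are assumptions for illustration:

```python
def letterbox_params(ori_w, ori_h, in_w=640, in_h=640):
    """Scale factor and padding for an aspect-preserving resize (generic sketch)."""
    scale = min(in_w / ori_w, in_h / ori_h)
    new_w, new_h = round(ori_w * scale), round(ori_h * scale)
    pad_x = (in_w - new_w) / 2  # symmetric horizontal padding
    pad_y = (in_h - new_h) / 2  # symmetric vertical padding
    return scale, pad_x, pad_y

def to_original(x, y, ori_w, ori_h, in_w=640, in_h=640):
    """Map a point from model-input coordinates back to the original frame."""
    scale, pad_x, pad_y = letterbox_params(ori_w, ori_h, in_w, in_h)
    return (x - pad_x) / scale, (y - pad_y) / scale

# Example: a 1920x1080 frame letterboxed into 640x640
scale, pad_x, pad_y = letterbox_params(1920, 1080)
print(scale, pad_x, pad_y)  # ~0.333, 0.0, 140.0
```

This is why postprocess needs the original dimensions: without them, box and keypoint coordinates would stay in the model's input resolution.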
For C++, the recommended flow is:

1. Create a MxAccl runtime object.
2. Create one MX::Prepost::MxPrepost object (per model/task) using create or create_safe.
3. In the input callback, call preprocess(frame) and feed the returned tensor to the runtime input.
4. In the output callback, gather output feature maps as float* pointers and call postprocess(ofmaps, result, ori_w, ori_h) (or postprocess(ofmaps, result, original_frame)).
5. Optionally call draw(frame, result) to overlay output annotations.
#include <memx/accl/MxAccl.h>
#include <memx/prepost/MxPrepost.h>
#include <opencv2/opencv.hpp>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>
using namespace MX::Runtime;
using namespace MX::Prepost;
// device_ids_to_use is positional here; keep the default {0}.
// use_model_shape is the key setting to control input/output shape behavior.
MxAccl accl{"model.dfp", {0}, {false, false}};
cv::VideoCapture cap{0}; // example video source (use your own capture)
YoloUserConfig cfg; // COCO defaults
cfg.conf = 0.3f;
cfg.iou = 0.4f;
std::unique_ptr<MxPrepost> pp;
std::string err;
if (!MxPrepost::create_safe(&accl, "yolov11-det", cfg, pp, err)) {
throw std::runtime_error("Failed to create MxPrepost: " + err);
}
int ori_w = static_cast<int>(cap.get(cv::CAP_PROP_FRAME_WIDTH));
int ori_h = static_cast<int>(cap.get(cv::CAP_PROP_FRAME_HEIGHT));
auto in_cb = [&](std::vector<const MX::Types::FeatureMap*>& input, int stream_id) -> bool {
cv::Mat frame;
if (!cap.read(frame)) {
return false;
}
cv::Mat preprocessed = pp->preprocess(frame);
input[0]->set_data(reinterpret_cast<float*>(preprocessed.data));
return true;
};
auto out_cb = [&](std::vector<const MX::Types::FeatureMap*>& output, int stream_id) -> bool {
// Copy runtime outputs into your own float buffers, then fill:
std::vector<float*> ofmap_ptrs;
// ... populate ofmap_ptrs from output[i]->get_data(...)
Result result;
pp->postprocess(ofmap_ptrs, result, ori_w, ori_h);
// or: pp->postprocess(ofmap_ptrs, result, original_frame);
// Optional visualization:
// cv::Mat annotated = original_frame.clone();
// pp->draw(annotated, result);
return true;
};
accl.connect_stream(in_cb, out_cb, 0);
accl.start();
accl.wait();
Result fields depend on the task:
# Access outputs (depending on task)

# ---------------------------
# Detection (Bounding Boxes)
# ---------------------------
if result.boxes is not None:
    for box in result.boxes:
        print("xyxy:", box.xyxy)          # [x1, y1, x2, y2]
        print("xywh:", box.xywh)          # [x_center, y_center, w, h]
        print("conf:", box.conf)          # confidence score
        print("cls_id:", box.cls_id)      # class index
        print("cls_name:", box.cls_name)  # class label string
        print("----")

# ---------------------------
# Segmentation (Masks)
# ---------------------------
if result.masks is not None:
    for mask in result.masks:
        # mask is a polygon (list of Point2f), not a binary HxW array
        pts = [(p.x, p.y) for p in mask.xys]
        print("mask cls_id:", int(mask.cls_id))
        print("num polygon points:", len(pts))
        if pts:
            print("first 5 points:", pts[:5])
        print("----")

# ---------------------------
# Pose Estimation (Keypoints)
# ---------------------------
if result.keypoints is not None:
    for det_id, kps in enumerate(result.keypoints):
        print(f"detection {det_id} keypoints:", len(kps))
        # each kp has kp.xy (Point2f) and kp.conf
        for kp_id, kp in enumerate(kps):
            print(f"  kp[{kp_id}] = (x={kp.xy.x:.2f}, y={kp.xy.y:.2f}, conf={kp.conf:.3f})")
        print("----")
// Detection (Bounding Boxes)
if (!result.boxes.empty()) {
for (const auto& box : result.boxes) {
// box.xyxy, box.xywh, box.conf, box.cls_id, box.cls_name
}
}
// Segmentation (Masks as polygons)
if (!result.masks.empty()) {
for (const auto& mask : result.masks) {
// mask.xys (vector<Point2f>), mask.cls_id, mask.cls_name
}
}
// Pose Estimation (Keypoints)
if (!result.keypoints.empty()) {
for (const auto& kps : result.keypoints) {
for (const auto& kp : kps) {
// kp.xy.x, kp.xy.y, kp.conf
}
}
}
You can also check the full example showing MxAccl + MxPrepost in a real-time object detection pipeline.
Configuration Reference#
| Argument | Type | Description |
|---|---|---|
| accl | MxAccl | The MxAccl runtime object to attach to. |
| task | str | YOLO task string, e.g. "yolov11-det". |
| conf | float | Confidence threshold (default: …). |
| iou | float | IoU threshold for NMS (default: …). |
| classmap_path | str | Path to class labels text file (one label per line). |
| custom_class_labels | list[str] | Override class labels directly in code. |
| valid_classes | list[int] | Keep only selected class IDs. |
| … | bool | Enable class-agnostic NMS when set to True. |
| … | bool | Use approximate sigmoid for slightly faster processing. |
| model_id | int | Select model when multiple models are compiled into one DFP. |
| override_layer_mapping | dict | Advanced override for output layer-to-port mapping. |
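For intuition on the IoU threshold: NMS suppresses a box when its overlap with a higher-confidence box of the same class exceeds the threshold. A generic IoU computation for xyxy boxes (illustrative only, not MxPrepost internals):

```python
def iou_xyxy(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

# Two heavily overlapping boxes: IoU ~0.667, so either would be
# suppressed relative to the other at an iou threshold of 0.4
print(iou_xyxy([0, 0, 10, 10], [2, 0, 12, 10]))
```

Lowering iou makes NMS more aggressive (fewer overlapping detections survive); raising conf filters out low-confidence detections before NMS runs.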
For Custom YOLO Models#
Use these arguments to align MxPrepost with your custom model and dataset:
- classmap_path: load class labels from a text file (one label per line).
- custom_class_labels: provide labels directly in code.
- valid_classes: return only selected class IDs.
- model_id: choose the correct model when a DFP contains multiple models.
Note
If both classmap_path and custom_class_labels are provided, custom_class_labels takes precedence.
For custom models, ensure task (det / seg / pose) and class count are consistent with model outputs.
If those are correct, auto-mapping usually works without override_layer_mapping.
prepost = mxprepost.MxPrepost(
accl=accl,
task="yolov11-det",
custom_class_labels=["person", "helmet", "vest"],
valid_classes=[0, 1],
model_id=0,
)
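The classmap file referenced by classmap_path is plain text with one label per line. A quick way to generate and sanity-check such a file (generic Python, not MxPrepost API; this assumes labels map to class IDs by line order):

```python
labels = ["person", "helmet", "vest"]  # example custom dataset

# Write one label per line
with open("classmap.txt", "w") as f:
    f.write("\n".join(labels) + "\n")

# Re-read and verify round-trip: line index should match class ID
with open("classmap.txt") as f:
    loaded = [line.strip() for line in f if line.strip()]
assert loaded == labels
print(loaded)  # ['person', 'helmet', 'vest']
```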
Troubleshooting#
Auto Layer Detection Fails#
Use this only when auto layer detection fails and your task, class count, and model_id are already correct.
Inspect model output layer names and shapes:
dfp_inspect path_dfp_file
or using C++ code:
#include <iostream>
#include <memx/accl/MxAccl.h>
MX::Runtime::MxAccl accl{"model.dfp", {0}, {false, false}};
auto info = accl.get_model_info(0); // use your model_id
for (size_t i = 0; i < info.output_layer_names.size(); ++i) {
std::cout << i << " | "
<< info.output_layer_names[i] << " | "
<< info.out_featuremap_shapes[i].to_string() << "\n";
}
Build override_layer_mapping with those exact layer names:

- Detection: each stride maps to [coord_layer_name, conf_layer_name].
- Segmentation: each stride maps to [coord_layer_name, conf_layer_name, mask_coef_layer_name], plus 0: [mask_proto_layer_name].
- Pose: each stride maps to [coord_layer_name, conf_layer_name, keypoint_layer_name].

Include all required stride entries (commonly 8, 16, 32):
# Example: explicit mapping for YOLO detection
prepost = mxprepost.MxPrepost(
accl=accl,
task="yolov11-det",
override_layer_mapping={
8: ["/model.22/cv2.0/cv2.0.2/Conv_output_0", "/model.22/cv3.0/cv3.0.2/Conv_output_0"],
16: ["/model.22/cv2.1/cv2.1.2/Conv_output_0", "/model.22/cv3.1/cv3.1.2/Conv_output_0"],
32: ["/model.22/cv2.2/cv2.2.2/Conv_output_0", "/model.22/cv3.2/cv3.2.2/Conv_output_0"],
},
)
# Segmentation additionally needs global mask proto:
# override_layer_mapping = {
# 8: ["coord_s8", "conf_s8", "maskcoef_s8"],
# 16: ["coord_s16", "conf_s16", "maskcoef_s16"],
# 32: ["coord_s32", "conf_s32", "maskcoef_s32"],
# 0: ["mask_proto_global"],
# }
Supported Tasks#
| Model | Supported tasks |
|---|---|
| … | Detection |
| … | Detection, Segmentation, Pose |
| … | Detection |
| … | Detection |
| … | Detection, Segmentation, Pose |

Custom class labels and custom resolutions are also supported.
Compatibility Notes#
Warning
MxPrepost is designed for MxAccl runtime flows. Legacy Python bindings such as SyncAccl, AsyncAccl, and MultistreamAsyncAccl are not supported.
See also
YOLO Pre/Post API: Full Python and C++ API reference.
YOLO11 Object Detection with MxPrepost Library: End-to-end MxPrepost tutorial.
Callback Functions: Runtime callback integration model.
Runtime Usage: Runtime usage overview (async flow, multi-stream, and multi-device guidance).