MxPrepost Usage#

MxPrepost is a high-performance pre/post-processing library for YOLO models on MemryX systems. It is designed to work alongside the MemryX Runtime callbacks and provides optimized implementations for common operations such as preprocessing/formatting, output decoding, NMS, and drawing results on frames.

The library supports common YOLO tasks (detection, segmentation, and pose), and is intended to simplify deployment while maintaining high throughput.

Note

MxPrepost is designed to work hand-in-hand with MxAccl callback-based runtime flows.
It is an optimized replacement for the cropped post-model code (for example, the *_post.onnx file generated with --autocrop), running an optimized CPU implementation directly inside your callback code.

See also

For a runnable app using MxPrepost, see this YOLO11 tutorial.

Install MxPrepost#

Install and Verify

Important

MxPrepost requires MemryX Runtime / SDK 2.2+.
If MemryX tools are not installed yet, follow Install Tools first.

Install MxPrepost in the same Python environment as the MemryX SDK.

1. Verify SDK version

mx_nc --version

Optional: upgrade the SDK tools (only if the SDK is below 2.2):

pip3 install --extra-index-url https://developer.memryx.com/pip --upgrade memryx

2. Install MxPrepost

pip3 install --extra-index-url https://developer.memryx.com/pip mxprepost

3. Confirm installation

python3 -m pip show mxprepost
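If you prefer checking from Python, a small standard-library helper can report the installed version without shelling out to pip (a sketch; it simply queries package metadata for the name mxprepost):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg: str):
    """Return the installed version of `pkg`, or None if it is not installed."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

print(installed_version("mxprepost"))  # e.g. a version string, or None if missing
```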

Typical Integration Pattern#

In Python, the recommended flow is:

  1. Create an MxAccl runtime object.

  2. Create one mxprepost.MxPrepost object (per model/task).

  3. In the input callback, call preprocess(frame) and return the result to the runtime.

  4. In the output callback, call postprocess(ofmaps, ori_h, ori_w) or postprocess(ofmaps, ori_frame).

  5. Optionally call draw(frame, result) to overlay the annotations.

import cv2
import mxprepost
from memryx import mxapi

accl = mxapi.MxAccl("model.dfp", use_model_shape=[False, False])
prepost = mxprepost.MxPrepost(accl=accl, task="yolov11-det")

cap = cv2.VideoCapture(0)  # any cv2.VideoCapture source
ori_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
ori_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

def input_callback(stream_id):
    ok, frame = cap.read()
    if not ok:
        return None  # no more frames
    return prepost.preprocess(frame)

def output_callback(ofmaps, stream_id):
    result = prepost.postprocess(ofmaps, ori_h, ori_w)
    # result.boxes / result.masks / result.keypoints
    return True

accl.connect_stream(input_callback, output_callback, stream_id=0)

In C++, the recommended flow is:

  1. Create an MxAccl runtime object.

  2. Create one MX::Prepost::MxPrepost object (per model/task) using create or create_safe.

  3. In the input callback, call preprocess(frame) and feed the returned tensor to the runtime input.

  4. In the output callback, gather the output feature maps as float* pointers and call postprocess(ofmaps, result, ori_w, ori_h) (or postprocess(ofmaps, result, original_frame)).

  5. Optionally call draw(frame, result) to overlay the annotations.

#include <memx/accl/MxAccl.h>
#include <memx/prepost/MxPrepost.h>
#include <opencv2/opencv.hpp>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

using namespace MX::Runtime;
using namespace MX::Prepost;

// device_ids_to_use is positional here; keep default {0}.
// use_model_shape is the key setting to control input/output shape behavior.
MxAccl accl{"model.dfp", {0}, {false, false}};

YoloUserConfig cfg;  // COCO defaults
cfg.conf = 0.3f;
cfg.iou = 0.4f;

std::unique_ptr<MxPrepost> pp;
std::string err;
if (!MxPrepost::create_safe(&accl, "yolov11-det", cfg, pp, err)) {
    throw std::runtime_error("Failed to create MxPrepost: " + err);
}

cv::VideoCapture cap(0);  // any cv::VideoCapture source
int ori_w = static_cast<int>(cap.get(cv::CAP_PROP_FRAME_WIDTH));
int ori_h = static_cast<int>(cap.get(cv::CAP_PROP_FRAME_HEIGHT));

auto in_cb = [&](std::vector<const MX::Types::FeatureMap*>& input, int stream_id) -> bool {
    cv::Mat frame;
    if (!cap.read(frame)) {
        return false;
    }

    cv::Mat preprocessed = pp->preprocess(frame);
    input[0]->set_data(reinterpret_cast<float*>(preprocessed.data));
    return true;
};

auto out_cb = [&](std::vector<const MX::Types::FeatureMap*>& output, int stream_id) -> bool {
    // Copy runtime outputs into your own float buffers, then fill:
    std::vector<float*> ofmap_ptrs;
    // ... populate ofmap_ptrs from output[i]->get_data(...)

    Result result;
    pp->postprocess(ofmap_ptrs, result, ori_w, ori_h);
    // or: pp->postprocess(ofmap_ptrs, result, original_frame);

    // Optional visualization:
    // cv::Mat annotated = original_frame.clone();
    // pp->draw(annotated, result);
    return true;
};

accl.connect_stream(in_cb, out_cb, 0);
accl.start();
accl.wait();
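preprocess() handles resizing and formatting internally. For intuition, YOLO-style preprocessing typically letterboxes the frame: scale it to fit the model input while preserving aspect ratio, then pad the remainder. A minimal sketch of that computation (illustrative of typical YOLO pipelines only, not MxPrepost internals; the 640x640 input size is an assumption):

```python
def letterbox_params(ori_w: int, ori_h: int, in_w: int = 640, in_h: int = 640):
    """Compute the scale factor and per-side padding for an aspect-preserving resize."""
    scale = min(in_w / ori_w, in_h / ori_h)
    new_w, new_h = round(ori_w * scale), round(ori_h * scale)
    pad_x = (in_w - new_w) / 2  # left/right padding
    pad_y = (in_h - new_h) / 2  # top/bottom padding
    return scale, pad_x, pad_y

# Example: a 1920x1080 frame into a 640x640 model input
scale, pad_x, pad_y = letterbox_params(1920, 1080)
```

This is also why postprocess() takes the original frame dimensions: they let results be mapped back from model-input coordinates to frame coordinates.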

For accessing different results depending on the task:

# Access outputs (depending on task)
# ---------------------------
# Detection (Bounding Boxes)
# ---------------------------
if result.boxes is not None:
    for box in result.boxes:
        print("xyxy:", box.xyxy)        # [x1, y1, x2, y2]
        print("xywh:", box.xywh)        # [x_center, y_center, w, h]
        print("conf:", box.conf)        # confidence score
        print("cls_id:", box.cls_id)    # class index
        print("cls_name:", box.cls_name)# class label string
        print("----")

# ---------------------------
# Segmentation (Masks)
# ---------------------------
if result.masks is not None:
    for mask in result.masks:
        # mask is a polygon (list of Point2f), not a binary HxW array
        pts = [(p.x, p.y) for p in mask.xys]
        print("mask cls_id:", int(mask.cls_id))
        print("num polygon points:", len(pts))
        if pts:
            print("first 5 points:", pts[:5])
        print("----")

# ---------------------------
# Pose Estimation (Keypoints)
# ---------------------------
if result.keypoints is not None:
    for det_id, kps in enumerate(result.keypoints):
        print(f"detection {det_id} keypoints:", len(kps))
        # each kp has kp.xy (Point2f) and kp.conf
        for kp_id, kp in enumerate(kps):
            print(f"  kp[{kp_id}] = (x={kp.xy.x:.2f}, y={kp.xy.y:.2f}, conf={kp.conf:.3f})")
        print("----")
The equivalent access pattern in C++:

// Detection (Bounding Boxes)
if (!result.boxes.empty()) {
    for (const auto& box : result.boxes) {
        // box.xyxy, box.xywh, box.conf, box.cls_id, box.cls_name
    }
}

// Segmentation (Masks as polygons)
if (!result.masks.empty()) {
    for (const auto& mask : result.masks) {
        // mask.xys (vector<Point2f>), mask.cls_id, mask.cls_name
    }
}

// Pose Estimation (Keypoints)
if (!result.keypoints.empty()) {
    for (const auto& kps : result.keypoints) {
        for (const auto& kp : kps) {
            // kp.xy.x, kp.xy.y, kp.conf
        }
    }
}
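The two box formats shown above are related by a fixed conversion between corner coordinates and center/size coordinates; a standalone sketch (pure Python, independent of the library):

```python
def xyxy_to_xywh(x1, y1, x2, y2):
    """Corner format [x1, y1, x2, y2] -> center/size format [xc, yc, w, h]."""
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)

def xywh_to_xyxy(xc, yc, w, h):
    """Center/size format [xc, yc, w, h] -> corner format [x1, y1, x2, y2]."""
    return (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)

print(xyxy_to_xywh(10, 20, 110, 220))  # -> (60.0, 120.0, 100, 200)
```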

You can also check the YOLO11 End-to-End Tutorial, a full example showing MxAccl + MxPrepost in a real-time object detection pipeline.

Configuration Reference#

You can explore the MxPrepost API reference for full details. Common constructor arguments:

  • accl (required, mxapi.MxAccl): runtime object used by MxPrepost for shape/task context.

  • task (required, str): YOLO task string, e.g. yolov8-det, yolov11-seg, yolov11-pose.

  • conf (float): confidence threshold (default: 0.3).

  • iou (float): IoU threshold for NMS (default: 0.4).

  • classmap_path (str): path to a class labels text file (one label per line).

  • custom_class_labels (list[str]): override class labels directly in code.

  • valid_classes (list[int]): keep only selected class IDs.

  • class_agnostic (bool): enable class-agnostic NMS when True.

  • fast_sigmoid (bool): use an approximate sigmoid for slightly faster processing.

  • model_id (int): select a model when multiple models are compiled into one DFP.

  • override_layer_mapping (dict[int, list[str]]): advanced override for output layer-to-port mapping.
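conf and iou drive a standard filter-then-NMS pipeline: detections below the confidence threshold are dropped, then boxes overlapping an already-kept box by more than the IoU threshold are suppressed. The IoU computation itself looks like this (illustrative sketch, independent of MxPrepost internals):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

# Two equal boxes shifted by half their width overlap with IoU ~0.333,
# which is below the default iou=0.4, so NMS would keep both.
print(iou([0, 0, 100, 100], [50, 0, 150, 100]))
```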

For Custom YOLO Models#

Use these arguments to align MxPrepost with your custom model and dataset:

  • classmap_path: load class labels from a text file (one label per line).

  • custom_class_labels: provide labels directly in code.

  • valid_classes: return only selected class IDs.

  • model_id: choose the correct model when a DFP contains multiple models.

Note

If both classmap_path and custom_class_labels are provided, custom_class_labels takes precedence.

For custom models, ensure task (det / seg / pose) and class count are consistent with model outputs. If those are correct, auto-mapping usually works without override_layer_mapping.

prepost = mxprepost.MxPrepost(
    accl=accl,
    task="yolov11-det",
    custom_class_labels=["person", "helmet", "vest"],
    valid_classes=[0, 1],
    model_id=0,
)
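The classmap file format is one label per line. A minimal loader that produces the equivalent custom_class_labels list (a hypothetical helper for illustration, not part of the MxPrepost API):

```python
def load_classmap(path: str):
    """Read one class label per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

# Usage sketch: pass the result as custom_class_labels, e.g.
# mxprepost.MxPrepost(accl=accl, task="yolov11-det",
#                     custom_class_labels=load_classmap("labels.txt"))
```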

Troubleshooting#

Auto Layer Detection Fails#

Use this only when auto layer detection fails and your task, class count, and model_id are already correct.

  1. Inspect model output layer names and shapes:

dfp_inspect <path_to_dfp_file>

or using C++ code:

#include <iostream>
#include <memx/accl/MxAccl.h>

MX::Runtime::MxAccl accl{"model.dfp", {0}, {false, false}};
auto info = accl.get_model_info(0);  // use your model_id

for (size_t i = 0; i < info.output_layer_names.size(); ++i) {
    std::cout << i << " | "
              << info.output_layer_names[i] << " | "
              << info.out_featuremap_shapes[i].to_string() << "\n";
}
  2. Build override_layer_mapping with those exact layer names:

  • Detection: each stride maps to [coord_layer_name, conf_layer_name].

  • Segmentation: each stride maps to [coord_layer_name, conf_layer_name, mask_coef_layer_name] plus 0: [mask_proto_layer_name].

  • Pose: each stride maps to [coord_layer_name, conf_layer_name, keypoint_layer_name].

  3. Include all required stride entries (commonly 8, 16, 32):

# Example: explicit mapping for YOLO detection
prepost = mxprepost.MxPrepost(
    accl=accl,
    task="yolov11-det",
    override_layer_mapping={
        8:  ["/model.22/cv2.0/cv2.0.2/Conv_output_0", "/model.22/cv3.0/cv3.0.2/Conv_output_0"],
        16: ["/model.22/cv2.1/cv2.1.2/Conv_output_0", "/model.22/cv3.1/cv3.1.2/Conv_output_0"],
        32: ["/model.22/cv2.2/cv2.2.2/Conv_output_0", "/model.22/cv3.2/cv3.2.2/Conv_output_0"],
    },
)

# Segmentation additionally needs global mask proto:
# override_layer_mapping = {
#     8:  ["coord_s8",  "conf_s8",  "maskcoef_s8"],
#     16: ["coord_s16", "conf_s16", "maskcoef_s16"],
#     32: ["coord_s32", "conf_s32", "maskcoef_s32"],
#     0:  ["mask_proto_global"],
# }
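Before passing a mapping, it can help to sanity-check it against the rules above. A small validator sketch (a hypothetical helper, not part of the MxPrepost API; it hardcodes the common strides 8/16/32 and the per-stride entry counts from the bullet list, 2 for detection and 3 for segmentation/pose, plus segmentation's 0: mask-proto entry):

```python
def check_layer_mapping(mapping: dict, task: str):
    """Return a list of problems found in an override_layer_mapping dict."""
    problems = []
    per_stride = {"det": 2, "seg": 3, "pose": 3}[task]
    for stride in (8, 16, 32):
        entry = mapping.get(stride)
        if entry is None:
            problems.append(f"missing stride {stride}")
        elif len(entry) != per_stride:
            problems.append(f"stride {stride}: expected {per_stride} layers, got {len(entry)}")
    if task == "seg" and len(mapping.get(0, [])) != 1:
        problems.append("segmentation needs 0: [mask_proto_layer_name]")
    return problems

# Example: a detection mapping that forgot stride 32
print(check_layer_mapping({8: ["c8", "s8"], 16: ["c16", "s16"]}, "det"))
# -> ['missing stride 32']
```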

Supported Tasks#

  • YOLOv7: Detection

  • YOLOv8: Detection, Segmentation, Pose

  • YOLOv9: Detection

  • YOLOv10: Detection

  • YOLOv11: Detection, Segmentation, Pose

  • Custom support: custom class labels and custom resolutions

Compatibility Notes#

Warning

MxPrepost is designed for MxAccl runtime flows. Legacy Python bindings such as SyncAccl, AsyncAccl, and MultistreamAsyncAccl are not supported.
