YOLO Pre/Post API

See also

For full setup, configuration, and troubleshooting, see MxPrepost Usage.

Python

class MxPrepost

__init__(self: mxprepost.MxPrepost, accl: object, task: str, conf: SupportsFloat | SupportsIndex = 0.3, iou: SupportsFloat | SupportsIndex = 0.4, classmap_path: str = '', custom_class_labels: collections.abc.Sequence[str] = [], valid_classes: collections.abc.Sequence[SupportsInt | SupportsIndex] = [], class_agnostic: bool = False, fast_sigmoid: bool = False, model_id: SupportsInt | SupportsIndex = 0, override_layer_mapping: dict = {}) → None

Create an MxPrepost object for a specific task.

Parameters:

accl (MxAccl): The MemryX accelerator object to use for post-processing. This should be an instance of the MxAccl class from the Python bindings.
task (str): Task for post-processing, in the form yolov[7|8|9|10|11]-[det|seg|pose], e.g., "yolov10-det" for detection using YOLOv10.
conf (float, optional): Confidence score threshold for post-processing. Default is 0.3.
iou (float, optional): Intersection over Union (IoU) threshold for post-processing. Default is 0.4.
classmap_path (str, optional): Path to a file containing class names, with one class name per line. If not provided, COCO classes are used by default.
custom_class_labels (list of str, optional): List of custom class labels to use instead of loading from a file. If both classmap_path and custom_class_labels are provided, custom_class_labels takes precedence.
valid_classes (list of int, optional): List of class IDs to consider during post-processing. If not provided, all classes are considered.
class_agnostic (bool, optional): Use class-agnostic NMS if True. Default is False.
fast_sigmoid (bool, optional): Use a fast sigmoid approximation if True. Default is False.
model_id (int, optional): The ID of the model to use from the MemryX accelerator object. Default is 0. Useful when multiple models are mapped in the same DFP.
override_layer_mapping (dict, optional): A dictionary mapping stride values to lists of layer names for the coord, conf, mask_coef, and keypoint ports (where applicable). Use it to override the automatic layer mapping, which is based on output shapes, if that mapping fails. Keys are stride values (e.g., 8, 16, 32) and values are lists of the layer names for that stride, for example: {16: ["coord_layer_name", "conf_layer_name"], 32: ["coord_layer_name", "conf_layer_name"]}. Use stride 0 for global feature maps such as mask_proto for segmentation. Default is an empty dict, which means auto-mapping is used.
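The parameters above can be assembled and sanity-checked before the object is created. The following is a minimal sketch, not part of the library: build_prepost_kwargs and TASK_PATTERN are hypothetical helpers, the range checks on conf and iou are an assumption about sensible values, and the layer names are placeholders. The only library call appears in the final comment.

```python
import re

# Task strings documented above: yolov[7|8|9|10|11]-[det|seg|pose].
TASK_PATTERN = re.compile(r"yolov(7|8|9|10|11)-(det|seg|pose)")


def build_prepost_kwargs(task, conf=0.3, iou=0.4, custom_class_labels=None,
                         valid_classes=None, override_layer_mapping=None):
    """Hypothetical helper: validate arguments and return constructor kwargs."""
    if not TASK_PATTERN.fullmatch(task):
        raise ValueError(f"unsupported task string: {task!r}")
    for name, value in (("conf", conf), ("iou", iou)):
        # Assumption: thresholds are fractions in [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    mapping = dict(override_layer_mapping or {})
    for stride, layers in mapping.items():
        # Keys are strides; 0 is used for global maps such as mask_proto.
        if not isinstance(stride, int) or stride < 0:
            raise ValueError(f"stride keys must be non-negative ints: {stride!r}")
        if not all(isinstance(layer, str) for layer in layers):
            raise ValueError(f"layer names for stride {stride} must be strings")
    return {
        "task": task,
        "conf": conf,
        "iou": iou,
        "custom_class_labels": list(custom_class_labels or []),
        "valid_classes": list(valid_classes or []),
        "override_layer_mapping": mapping,
    }


kwargs = build_prepost_kwargs(
    "yolov10-det",
    conf=0.25,
    valid_classes=[0, 2],
    # Placeholder layer names; substitute the names from your own model.
    override_layer_mapping={
        16: ["coord_layer_name", "conf_layer_name"],
        32: ["coord_layer_name", "conf_layer_name"],
    },
)
# With a real accelerator object: prepost = mxprepost.MxPrepost(accl, **kwargs)
```

Validating the task string up front gives a clear error message before any accelerator resources are involved.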

preprocess(self: mxprepost.MxPrepost, input_image: numpy.ndarray) → numpy.ndarray

Preprocesses the input image for model inference.

Parameters:

input_image (np.ndarray): The input image as a numpy array.

Returns:

np.ndarray: The preprocessed image ready for model input.

postprocess(*args, **kwargs)

Overloaded function.

postprocess(self: mxprepost.MxPrepost, ofmaps: collections.abc.Sequence[numpy.ndarray], ori_h: typing.SupportsInt | typing.SupportsIndex, ori_w: typing.SupportsInt | typing.SupportsIndex) -> mxprepost.Result
postprocess(self: mxprepost.MxPrepost, ofmaps: collections.abc.Sequence[numpy.ndarray], original_image: numpy.ndarray) -> mxprepost.Result

Postprocesses the output feature maps from model inference, using the original image dimensions to scale the results back to the original image space.

Parameters:

ofmaps (List[np.ndarray]): A list of output feature maps from the model inference, as numpy arrays.
ori_h (int, overload 1 only): The original height of the input image. This is not the model's input height; use the stream's original resolution.
ori_w (int, overload 1 only): The original width of the input image. This is not the model's input width; use the stream's original resolution.
original_image (np.ndarray, overload 2 only): The original input image as a numpy array. It is used only to obtain the original shape for post-processing.

Returns:

Result: The post-processing result containing boxes, masks, keypoints, etc.
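To illustrate how the conf, iou, valid_classes, and class_agnostic constructor parameters typically interact in YOLO-style post-processing, here is a plain-Python sketch of confidence filtering followed by greedy NMS. It is not the library's implementation: the function names, the tuple layout of a detection, and the exact comparison operators are assumptions made for illustration.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(dets, conf=0.3, iou=0.4, valid_classes=(), class_agnostic=False):
    """Greedy NMS over detections given as (xyxy, score, cls_id) tuples."""
    # Drop low-confidence detections and, if requested, unwanted classes.
    candidates = [
        d for d in dets
        if d[1] >= conf and (not valid_classes or d[2] in valid_classes)
    ]
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for det in candidates:
        # A detection is suppressed by a higher-scoring kept box that overlaps
        # it too much; per-class NMS only compares boxes of the same class.
        suppressed = any(
            (class_agnostic or k[2] == det[2]) and iou_xyxy(k[0], det[0]) > iou
            for k in kept
        )
        if not suppressed:
            kept.append(det)
    return kept


dets = [
    ((10, 10, 110, 110), 0.90, 0),
    ((12, 12, 112, 112), 0.80, 0),    # overlaps the first box, same class
    ((14, 14, 114, 114), 0.70, 1),    # overlaps the first box, other class
    ((300, 300, 350, 350), 0.20, 0),  # below the confidence threshold
]
per_class = nms(dets)                       # keeps the 0.90 and 0.70 boxes
agnostic = nms(dets, class_agnostic=True)   # keeps only the 0.90 box
```

With per-class NMS the overlapping box of a different class survives; with class-agnostic NMS it is suppressed by the higher-scoring box.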

draw(self: mxprepost.MxPrepost, image: numpy.ndarray, result: mxprepost.Result) → numpy.ndarray

Draws the post-processing results on the input image.

Parameters:

image (np.ndarray): The input image as a numpy array.
result (Result): The post-processing result containing boxes, masks, keypoints, etc., to be drawn on the image.

Returns:

np.ndarray: The image with the post-processing results drawn on it, as a numpy array.
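The three methods above are normally chained once per frame. The sketch below shows that flow under stated assumptions: process_frame is a hypothetical helper, and run_inference stands in for whatever call sends the preprocessed input to the accelerator and returns the output feature maps, which this page does not cover.

```python
def process_frame(prepost, run_inference, frame):
    """Run one frame through preprocess -> inference -> postprocess -> draw.

    prepost: object exposing the preprocess/postprocess/draw methods above.
    run_inference: hypothetical callable that maps the preprocessed input to
        a list of output feature maps.
    frame: original image as an array with shape (height, width, channels).
    """
    model_input = prepost.preprocess(frame)
    ofmaps = run_inference(model_input)
    # Use the original frame size, not the model's input size.
    ori_h, ori_w = frame.shape[:2]
    # The Python overload takes height first, then width.
    result = prepost.postprocess(ofmaps, ori_h, ori_w)
    annotated = prepost.draw(frame, result)
    return result, annotated
```

Note that the Python overload takes ori_h before ori_w, whereas the C++ signature takes ori_w before ori_h. Passing the original image instead (overload 2) avoids the ordering question altogether.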

C++

class MxPrepost

Class for MX pre/post-processing. This class defines the interface for preprocessing input images, postprocessing model outputs, and drawing results on images.

You should use the factory functions to create task-specific pre/post-processing objects, and assign to a pointer of this base class type. For example:

std::unique_ptr<MxPrepost> prepost;  // MxPrepost is abstract, so it cannot be constructed directly
try {
   prepost.reset(MxPrepost::create(accl, "yolov8-det", config));
} catch (const UnsupportedTaskError& e) {
   std::cerr << "Error creating MxPrepost: " << e.what() << std::endl;
}

Or using the safe factory function:

std::unique_ptr<MxPrepost> prepost;
std::string err;
if (!MxPrepost::create_safe(accl, "yolov8-det", config, prepost, err)) {
  std::cerr << "Error creating MxPrepost: " << err << std::endl;
} else {
  // prepost is successfully created and can be used
}

Public Functions

virtual cv::Mat preprocess(const cv::Mat &input) = 0

Preprocess input image for model inference.

Parameters: input – Input image as cv::Mat (in RGB format).
Returns: Preprocessed image as cv::Mat, ready to be fed into the MXA.

virtual void postprocess(const std::vector<float*> &outputs, Result &result, int ori_w, int ori_h) = 0

Postprocess the output feature maps from the MXA and use original image dimensions for scaling.

Parameters:

outputs – Vector of pointers to float arrays, each pointing to an output feature map from the MXA (that is, the data available after a call to FeatureMap::get_data()).
result – Output Result object to be filled with post-processing results (e.g., detected boxes, masks, keypoints).
ori_w – Original width of the input image before preprocessing (used for scaling post-processing results back to original image space).
ori_h – Original height of the input image before preprocessing (used for scaling post-processing results back to original image space).

virtual void postprocess(const std::vector<float*> &outputs, Result &result, const cv::Mat &original_image) = 0

Postprocess the output feature maps from the MXA, taking the original dimensions from the original image itself instead of from separate ori_w and ori_h arguments.

Parameters:

outputs – Vector of pointers to float arrays, each pointing to an output feature map from the MXA (that is, the data available after a call to FeatureMap::get_data()).
result – Output Result object to be filled with post-processing results (e.g., detected boxes, masks, keypoints).
original_image – The original input image as a cv::Mat (in RGB format). This is used to obtain the original dimensions for scaling post-processing results back to original image space.

virtual void draw(cv::Mat &image, const Result &result) = 0

Draw the post-processing Result data on the given image (e.g., draw detected boxes, masks, keypoints, etc.).

Parameters:

image – The image (in BGR format) on which to draw the post-processing results. This will be modified in-place.
result – The Result object containing post-processing results to be drawn on the image.

Public Static Functions

static MxPrepost *create(MX::Runtime::MxAcclBase *accl, const std::string &task, const YoloUserConfig &config)

Factory function to create a pre/post-processing object for the given task.

Parameters:

accl – Accelerator runtime object: MxAccl or MxAcclMT.
task – Task string such as "yolov8-det".
config – User config.

Throws:

UnsupportedTaskError – if task is not recognized.

Returns:

Raw pointer owned by the caller. Wrap it in a std::unique_ptr to manage its lifetime.

static bool create_safe(MX::Runtime::MxAcclBase *accl, const std::string &task, const YoloUserConfig &config, std::unique_ptr<MxPrepost> &out, std::string &err) noexcept

No-throw factory function to create a pre/post-processing object for the given task.

Parameters:

accl – Accelerator runtime object: MxAccl or MxAcclMT.
task – Task string such as "yolov8-det".
config – User config.
out – Output unique_ptr to hold the created MxPrepost object if successful.
err – Output string to hold error message if creation fails.

Returns:

true if creation succeeds, false otherwise. On failure, out is set to nullptr and err contains the error message.

Python

class Result

Result class for post-processing output, containing detected bounding boxes, masks, and/or keypoints.

Attributes:

boxes (List[Box]): List of detected bounding boxes. Each box is represented as a Box object containing coordinates, confidence score, class ID, and class name.
masks (List[Mask]): List of detected masks. Each mask is represented as a Mask object containing the polygon coordinates, class ID, and class name.
keypoints (List[Keypoint]): List of detected keypoints. Each keypoint is represented as a Keypoint object containing the (x, y) coordinates and confidence score.
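A Result is usually consumed by iterating over its lists. The sketch below counts detections per class name using only the attribute names documented here. count_by_class is a hypothetical helper, and the SimpleNamespace objects are stand-ins for a real Result so the example runs without the accelerator.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_class(result, min_conf=0.5):
    """Count boxes per class name, ignoring boxes below min_conf."""
    return Counter(
        box.cls_name for box in result.boxes if box.conf >= min_conf
    )


# Stand-in for a real Result, using only the documented attribute names.
demo_result = SimpleNamespace(
    boxes=[
        SimpleNamespace(conf=0.91, cls_id=0, cls_name="person"),
        SimpleNamespace(conf=0.78, cls_id=0, cls_name="person"),
        SimpleNamespace(conf=0.35, cls_id=2, cls_name="car"),
    ],
    masks=[],
    keypoints=[],
)
counts = count_by_class(demo_result)
```

The extra min_conf filter here is applied on top of whatever conf threshold the MxPrepost object was created with, which is handy when one stream feeds consumers with different strictness.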

C++

struct Result

Struct representing the post-processing result for a single input image, containing detected bounding boxes, masks, and/or keypoints.

Public Members

std::vector<BBox> boxes: List of detected bounding boxes in the input image, each represented as a BBox struct.

std::vector<Mask> masks: List of detected segmentation masks in the input image, each represented as a Mask struct.

std::vector<std::vector<Keypoint>> keypoints: List of detected keypoints for pose estimation in the input image. Each element in the outer vector corresponds to a detected instance (e.g., a person), and contains a vector of Keypoint structs representing the keypoints for that instance.

Python

class Box

Box class for representing detected bounding boxes.

Attributes:

xywh (Tuple[float, float, float, float]): Bounding box represented as (x_center, y_center, width, height).
xyxy (Tuple[float, float, float, float]): Bounding box represented as (x_min, y_min, x_max, y_max).
conf (float): Confidence score of the detected bounding box.
cls_id (int): Class ID of the detected object.
cls_name (str): Class name of the detected object, as a string.
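The two box representations describe the same rectangle and relate by simple arithmetic. The following sketch makes that relationship explicit. The function names are illustrative and not part of the library, which already provides both attributes on each Box.

```python
def xywh_to_xyxy(xywh):
    """(x_center, y_center, width, height) -> (x_min, y_min, x_max, y_max)."""
    x_c, y_c, w, h = xywh
    return (x_c - w / 2, y_c - h / 2, x_c + w / 2, y_c + h / 2)


def xyxy_to_xywh(xyxy):
    """(x_min, y_min, x_max, y_max) -> (x_center, y_center, width, height)."""
    x_min, y_min, x_max, y_max = xyxy
    return ((x_min + x_max) / 2, (y_min + y_max) / 2,
            x_max - x_min, y_max - y_min)


corners = xywh_to_xyxy((60.0, 120.0, 100.0, 200.0))
# corners == (10.0, 20.0, 110.0, 220.0)
center = xyxy_to_xywh(corners)
# center == (60.0, 120.0, 100.0, 200.0)
```

The xyxy form is convenient for cropping and drawing rectangles, while xywh is convenient for tracking a box center over time.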

C++

struct BBox

Struct representing a bounding box detection result, with multiple representations (xyxy and xywh) for convenience.

Public Members

std::array<float, 4> xyxy: (x_min, y_min, x_max, y_max)

std::array<float, 4> xywh: (x_center, y_center, width, height)

float conf = 0.0f: Confidence score.

int cls_id = -1: Class index.

std::string cls_name: Class name as a string.

Python

class Mask

Mask class for representing segmentation masks.

Attributes:

xys (List[Point2f]): List of (x, y) coordinates representing the polygon of the detected mask. Each coordinate is a Point2f object.
cls_id (int): Class ID of the detected mask.
cls_name (str): Class name of the detected mask, as a string.
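Because a mask is a polygon of Point2f vertices, common quantities such as its area or bounding rectangle can be derived directly from xys. The sketch below uses the shoelace formula. polygon_area and polygon_bounds are hypothetical helpers, and the SimpleNamespace points are stand-ins for Point2f objects.

```python
from types import SimpleNamespace


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    points: sequence of objects with .x and .y attributes, in vertex order.
    """
    n = len(points)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        p, q = points[i], points[(i + 1) % n]
        twice_area += p.x * q.y - q.x * p.y
    return abs(twice_area) / 2.0


def polygon_bounds(points):
    """Axis-aligned bounds as (x_min, y_min, x_max, y_max)."""
    xs = [p.x for p in points]
    ys = [p.y for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


# A 4 x 3 rectangle standing in for Mask.xys.
demo_xys = [
    SimpleNamespace(x=0.0, y=0.0),
    SimpleNamespace(x=4.0, y=0.0),
    SimpleNamespace(x=4.0, y=3.0),
    SimpleNamespace(x=0.0, y=3.0),
]
area = polygon_area(demo_xys)      # 12.0
bounds = polygon_bounds(demo_xys)  # (0.0, 0.0, 4.0, 3.0)
```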

C++

struct Mask

Struct representing a detected segmentation mask, defined by a polygon of (x, y) coordinates and associated class information.

Public Members

std::vector<Point2f> xys: List of (x, y) coordinates representing the polygon of the detected mask. Each coordinate is a Point2f object.

int cls_id: Class ID of the detected mask.

std::string cls_name: Class name of the detected mask, as a string.

Python

class Keypoint

Keypoint class for representing pose estimation keypoints.

Attributes:

xy (Point2f): (x, y) coordinates of the pose keypoint as a Point2f object.
conf (float): Confidence score of the detected pose keypoint.
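Keypoints with low confidence are often occluded or outside the frame, so callers commonly filter on conf before drawing or measuring. A minimal sketch, assuming a flat sequence of Keypoint-like objects: confident_keypoints is a hypothetical helper and the SimpleNamespace objects are stand-ins for real Keypoint and Point2f instances.

```python
from types import SimpleNamespace


def confident_keypoints(keypoints, min_conf=0.5):
    """Return (index, x, y) for each keypoint with conf >= min_conf."""
    return [
        (i, kp.xy.x, kp.xy.y)
        for i, kp in enumerate(keypoints)
        if kp.conf >= min_conf
    ]


demo_keypoints = [
    SimpleNamespace(xy=SimpleNamespace(x=120.0, y=80.0), conf=0.92),
    SimpleNamespace(xy=SimpleNamespace(x=130.0, y=95.0), conf=0.30),
    SimpleNamespace(xy=SimpleNamespace(x=140.0, y=150.0), conf=0.75),
]
points = confident_keypoints(demo_keypoints)
# points == [(0, 120.0, 80.0), (2, 140.0, 150.0)]
```

Keeping the original index in each tuple preserves which keypoint in the sequence a coordinate belongs to after filtering.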

C++

struct Keypoint

Struct representing a detected keypoint for pose estimation, defined by its (x, y) coordinates and confidence score.

Public Members

Point2f xy: (x, y) coordinates of the pose keypoint.

float conf: Confidence score of the detected pose keypoint.

Python

class Point2f

Point2f class for representing a point with (x, y) coordinates as floats.

Attributes:

x (float): X coordinate of the point.
y (float): Y coordinate of the point.

C++

struct Point2f

Struct representing a point with floating-point coordinates, used for mask polygons and pose keypoints.

Public Members

float x: X coordinate of the point.

float y: Y coordinate of the point.