YOLO Pre/Post API
See also
For full setup, configuration, and troubleshooting, see MxPrepost Usage.
- class MxPrepost
- __init__(self: mxprepost.MxPrepost, accl: object, task: str, conf: SupportsFloat | SupportsIndex = 0.3, iou: SupportsFloat | SupportsIndex = 0.4, classmap_path: str = '', custom_class_labels: collections.abc.Sequence[str] = [], valid_classes: collections.abc.Sequence[SupportsInt | SupportsIndex] = [], class_agnostic: bool = False, fast_sigmoid: bool = False, model_id: SupportsInt | SupportsIndex = 0, override_layer_mapping: dict = {}) -> None
Create MxPrepost object for a specific task.
- Parameters:
- accl : MxAccl
The MemryX accelerator object to use for post-processing. This should be an instance of the MxAccl class from the Python bindings.
- task : str
Task for post-processing. Pass yolov[7|8|9|10|11]-[det|seg|pose], e.g., "yolov10-det" for detection using YOLOv10.
- conf : float, optional
Confidence score threshold for post-processing. Default is 0.3.
- iou : float, optional
Intersection over Union (IoU) threshold for post-processing. Default is 0.4.
- classmap_path : str, optional
Path to a file containing class names, with one class name per line. If not provided, COCO classes will be used by default.
- custom_class_labels : list of str, optional
List of custom class labels to use instead of loading from a file. If both classmap_path and custom_class_labels are provided, custom_class_labels will take precedence.
- valid_classes : list of int, optional
List of class IDs to consider during post-processing. If not provided, all classes will be considered.
- fast_sigmoid : bool, optional
Use fast sigmoid approximation if True. Default is False.
- class_agnostic : bool, optional
Use class-agnostic NMS if True. Default is False.
- model_id : int, optional
The ID of the model to be used from the MemryX accelerator object. Default is 0. Useful when multiple models are mapped in the same DFP.
- override_layer_mapping : dict, optional
A dictionary mapping stride values to lists of layer names for coord, conf, mask_coef, and keypoint ports (if applicable). This is used to override the automatic layer mapping based on output shapes, in case it fails. The keys should be stride values (e.g., 8, 16, 32) and the values should be lists of layer names corresponding to that stride. For example: {16: ["coord_layer_name", "conf_layer_name"], 32: ["coord_layer_name", "conf_layer_name"]}. Use stride=0 for global feature maps such as mask_proto for segmentation. Default is an empty dict, which means auto-mapping will be used.
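The parameter rules above can be checked on the caller's side before the constructor is invoked. The following is a minimal sketch that relies only on what the parameter descriptions state; the helper names check_task, resolve_labels, and check_layer_mapping are hypothetical and are not part of mxprepost, and the documentation does not say whether the library performs equivalent checks itself.

```python
import re
from pathlib import Path

# Pattern taken from the documented task format: yolov[7|8|9|10|11]-[det|seg|pose].
TASK_PATTERN = re.compile(r"yolov(7|8|9|10|11)-(det|seg|pose)")


def check_task(task):
    """Return True if `task` matches the documented task-string format."""
    return TASK_PATTERN.fullmatch(task) is not None


def resolve_labels(classmap_path="", custom_class_labels=()):
    """Mirror the documented precedence: custom labels win over the class-map file.

    Returns None when neither is given (the docs say COCO classes are used then).
    """
    if custom_class_labels:
        return list(custom_class_labels)
    if classmap_path:
        # One class name per line, as described for classmap_path.
        lines = Path(classmap_path).read_text(encoding="utf-8").splitlines()
        return [line.strip() for line in lines if line.strip()]
    return None


def check_layer_mapping(mapping):
    """Check the documented shape: int stride keys, lists of layer-name strings."""
    for stride, layers in mapping.items():
        if not isinstance(stride, int) or stride < 0:
            raise ValueError(f"stride must be a non-negative int, got {stride!r}")
        if not isinstance(layers, list) or not all(isinstance(n, str) for n in layers):
            raise ValueError(f"stride {stride}: expected a list of layer names")
    return mapping


# Placeholder layer names, copied from the example in the parameter description.
override = check_layer_mapping({
    16: ["coord_layer_name", "conf_layer_name"],
    32: ["coord_layer_name", "conf_layer_name"],
})
```

The validated values would then be passed to the constructor as the task, custom_class_labels, and override_layer_mapping arguments.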
- preprocess(self: mxprepost.MxPrepost, input_image: numpy.ndarray) -> numpy.ndarray
Preprocesses the input image for model inference.
- Parameters:
- input_image : np.ndarray
The input image as a numpy array.
- Returns:
- np.ndarray
The preprocessed image ready for model input.
- postprocess(*args, **kwargs)
Overloaded function.
1. postprocess(self: mxprepost.MxPrepost, ofmaps: collections.abc.Sequence[numpy.ndarray], ori_h: typing.SupportsInt | typing.SupportsIndex, ori_w: typing.SupportsInt | typing.SupportsIndex) -> mxprepost.Result
2. postprocess(self: mxprepost.MxPrepost, ofmaps: collections.abc.Sequence[numpy.ndarray], original_image: numpy.ndarray) -> mxprepost.Result
Postprocesses the output feature maps from model inference, using the original image dimensions to scale the results back to the original image space.
- Parameters:
- ofmaps : List[np.ndarray]
A list of output feature maps from the model inference, as numpy arrays.
- ori_h : int (function overload 1 only)
The original height of the input image. This is not the model's input height; use the stream's original resolution.
- ori_w : int (function overload 1 only)
The original width of the input image. This is not the model's input width; use the stream's original resolution.
- original_image : np.ndarray (function overload 2 only)
The original input image as a numpy array. It is used only to obtain the original shape for post-processing.
- Returns:
- Result
The post-processing result containing boxes, masks, keypoints, etc.
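Note that the first Python overload takes the height before the width. As a small sketch of deriving both values from a frame, the hypothetical helper below (not part of mxprepost) reads them from a NumPy-style shape of (rows, columns, channels); the FakeFrame class is only a stand-in so the snippet runs without NumPy or a camera.

```python
def original_size(frame):
    """Return (ori_h, ori_w) from an array-like whose shape is (rows, cols, ...)."""
    ori_h, ori_w = frame.shape[:2]
    return int(ori_h), int(ori_w)


class FakeFrame:
    """Stand-in for a decoded 1280x720 frame; a real one would be a numpy.ndarray."""
    shape = (720, 1280, 3)


ori_h, ori_w = original_size(FakeFrame())

# With a real MxPrepost instance `pp`, model outputs `ofmaps`, and the original
# frame `frame`, the two documented overloads would be called as:
#   result = pp.postprocess(ofmaps, ori_h, ori_w)
#   result = pp.postprocess(ofmaps, frame)
```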
- draw(self: mxprepost.MxPrepost, image: numpy.ndarray, result: mxprepost.Result) -> numpy.ndarray
Draws the post-processing results on the input image.
- Parameters:
- image : np.ndarray
The input image as a numpy array.
- result : Result
The post-processing result containing boxes, masks, keypoints, etc., to be drawn on the image.
- Returns:
- np.ndarray
The image with the post-processing results drawn on it, as a numpy array.
-
class MxPrepost
Class for MX pre/post-processing. This class defines the interface for preprocessing input images, postprocessing model outputs, and drawing results on images.
You should use the factory functions to create task-specific pre/post-processing objects, and assign to a pointer of this base class type. For example:
std::unique_ptr<MxPrepost> prepost;  // MxPrepost is abstract, so start with an empty pointer
try {
    prepost.reset(MxPrepost::create(accl, "yolov8-det", config));
} catch (const UnsupportedTaskError& e) {
    std::cerr << "Error creating MxPrepost: " << e.what() << std::endl;
}
Or using the safe factory function:
std::unique_ptr<MxPrepost> prepost;
std::string err;
if (!MxPrepost::create_safe(accl, "yolov8-det", config, prepost, err)) {
    std::cerr << "Error creating MxPrepost: " << err << std::endl;
} else {
    // prepost is successfully created and can be used
}
Public Functions
-
virtual cv::Mat preprocess(const cv::Mat &input) = 0
Preprocess input image for model inference.
- Parameters:
input – Input image as cv::Mat (in RGB format).
- Returns:
Preprocessed image as cv::Mat, ready to be fed into the MXA.
-
virtual void postprocess(const std::vector<float*> &outputs, Result &result, int ori_w, int ori_h) = 0
Postprocess the output feature maps from the MXA and use original image dimensions for scaling.
- Parameters:
outputs – Vector of pointers to float arrays, each pointing to an output feature map from the MXA, as obtained from a call to FeatureMap::get_data().
result – Output Result object to be filled with post-processing results (e.g., detected boxes, masks, keypoints).
ori_w – Original width of the input image before preprocessing (used for scaling post-processing results back to original image space).
ori_h – Original height of the input image before preprocessing (used for scaling post-processing results back to original image space).
-
virtual void postprocess(const std::vector<float*> &outputs, Result &result, const cv::Mat &original_image) = 0
Postprocess the output feature maps from the MXA and get the original image's dimensions from the original image itself (instead of passing ori_w and ori_h separately).
- Parameters:
outputs – Vector of pointers to float arrays, each pointing to an output feature map from the MXA, as obtained from a call to FeatureMap::get_data().
result – Output Result object to be filled with post-processing results (e.g., detected boxes, masks, keypoints).
original_image – The original input image as a cv::Mat (in RGB format). This is used to obtain the original dimensions for scaling post-processing results back to original image space.
-
virtual void draw(cv::Mat &image, const Result &result) = 0
Draw the post-processing Result data on the given image (e.g., draw detected boxes, masks, keypoints, etc.).
- Parameters:
image – The image (in BGR format) on which to draw the post-processing results. This will be modified in-place.
result – The Result object containing post-processing results to be drawn on the image.
Public Static Functions
-
static MxPrepost *create(MX::Runtime::MxAcclBase *accl, const std::string &task, const YoloUserConfig &config)
Factory function to create a pre/post-processing object for the given task.
- Parameters:
accl – Accelerator runtime object: MxAccl or MxAcclMT
task – Task string like “yolov8-det”.
config – User config.
- Throws:
UnsupportedTaskError – if task is not recognized.
- Returns:
Raw pointer owned by the caller. Wrap it in a std::unique_ptr to have it released automatically.
-
static bool create_safe(MX::Runtime::MxAcclBase *accl, const std::string &task, const YoloUserConfig &config, std::unique_ptr<MxPrepost> &out, std::string &err) noexcept
No-throw factory function to create a pre/post-processing object for the given task.
- Parameters:
accl – Accelerator runtime object: MxAccl or MxAcclMT
task – Task string like “yolov8-det”.
config – User config.
out – Output unique_ptr to hold the created MxPrepost object if successful.
err – Output string to hold error message if creation fails.
- Returns:
true if creation is successful, false otherwise. If false is returned, ‘out’ will be set to nullptr and ‘err’ will contain the error message.
- class Result
Result class for post-processing output, containing detected bounding boxes, masks, and/or keypoints.
- Attributes:
- boxes : List[Box]
List of detected bounding boxes. Each box is represented as a Box object containing coordinates, confidence score, class ID, and class name.
- masks : List[Mask]
List of detected masks. Each mask is represented as a Mask object containing the polygon coordinates, class ID, and class name.
- keypoints : List[Keypoint]
List of detected keypoints. Each keypoint is represented as a Keypoint object containing the (x, y) coordinates and confidence score.
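As an illustration of consuming a Result, the sketch below counts detections per class name. It uses only the attribute names documented above; count_by_class is a hypothetical helper, and the SimpleNamespace objects are stand-ins for a real Result returned by postprocess.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_class(result, min_conf=0.0):
    """Count boxes per class name, skipping boxes below `min_conf`."""
    return Counter(box.cls_name for box in result.boxes if box.conf >= min_conf)


# Stand-in objects that carry only the documented attribute names.
demo = SimpleNamespace(
    boxes=[
        SimpleNamespace(cls_id=0, cls_name="person", conf=0.91),
        SimpleNamespace(cls_id=0, cls_name="person", conf=0.42),
        SimpleNamespace(cls_id=16, cls_name="dog", conf=0.77),
    ],
    masks=[],
    keypoints=[],
)

counts = count_by_class(demo, min_conf=0.5)
```

The extra min_conf filter here is applied on top of whatever conf threshold was given to the constructor; it is a caller-side choice, not a library feature.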
-
struct Result
Struct representing the post-processing result for a single input image, containing detected bounding boxes, masks, and/or keypoints.
- class Box
Box class for representing detected bounding boxes.
- Attributes:
- xywh : Tuple[float, float, float, float]
Bounding box represented as (x_center, y_center, width, height).
- xyxy : Tuple[float, float, float, float]
Bounding box represented as (x_min, y_min, x_max, y_max).
- conf : float
Confidence score of the detected bounding box.
- cls_id : int
Class ID of the detected object.
- cls_name : str
Class name as string of the detected object.
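The two box representations are related by simple arithmetic, and the xyxy form is the convenient one for computing Intersection over Union, the quantity the iou constructor threshold refers to. The sketch below shows the standard conversions and the standard IoU definition; the function names are hypothetical, and this is not a claim about how the library computes these values internally.

```python
def xywh_to_xyxy(box):
    """(x_center, y_center, width, height) -> (x_min, y_min, x_max, y_max)."""
    x_c, y_c, w, h = box
    return (x_c - w / 2, y_c - h / 2, x_c + w / 2, y_c + h / 2)


def xyxy_to_xywh(box):
    """(x_min, y_min, x_max, y_max) -> (x_center, y_center, width, height)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min)


def iou(a, b):
    """Intersection over Union of two boxes given in xyxy form."""
    # Overlap extent along each axis, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For instance, two 10x10 boxes offset horizontally by 5 pixels overlap in a 5x10 region, giving an IoU of 50 / 150, which is below the default threshold of 0.4.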
-
struct BBox
Struct representing a bounding box detection result, with multiple representations (xyxy and xywh) for convenience.
- class Mask
Mask class for representing segmentation masks.
- Attributes:
- xys : List[Point2f]
List of (x, y) coordinates representing the polygon of the detected mask. Each coordinate is a Point2f object.
- cls_id : int
Class ID of the detected mask.
- cls_name : str
Class name as string of the detected mask.
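Because a mask is stored as a polygon rather than a pixel grid, quantities such as its area can be computed directly from the points. The sketch below applies the shoelace formula to a list of objects with x and y attributes, as documented for Point2f; polygon_area is a hypothetical helper, and the SimpleNamespace points stand in for real Mask.xys entries.

```python
from types import SimpleNamespace


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula; 0.0 for fewer than 3 points."""
    n = len(points)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        p = points[i]
        q = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += p.x * q.y - q.x * p.y
    return abs(twice_area) / 2.0


# Stand-in for Mask.xys: a 4x3 axis-aligned rectangle.
rectangle = [
    SimpleNamespace(x=0.0, y=0.0),
    SimpleNamespace(x=4.0, y=0.0),
    SimpleNamespace(x=4.0, y=3.0),
    SimpleNamespace(x=0.0, y=3.0),
]

area = polygon_area(rectangle)
```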
-
struct Mask
Struct representing a detected segmentation mask, defined by a polygon of (x, y) coordinates and associated class information.
- class Keypoint
Keypoint class for representing pose estimation keypoints.
- Attributes:
- xy : Point2f
(x, y) coordinates of the pose keypoint as a Point2f object.
- conf : float
Confidence score of the detected pose keypoint.
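Each keypoint carries its own confidence score, so a caller may want to drop low-confidence points before drawing or further processing. The sketch below is one way to do that with the documented xy and conf attributes; confident_keypoints and its 0.5 default are assumptions made for illustration, not library behaviour.

```python
from types import SimpleNamespace


def confident_keypoints(keypoints, min_conf=0.5):
    """Return (x, y) tuples for keypoints whose confidence is at least `min_conf`."""
    return [(kp.xy.x, kp.xy.y) for kp in keypoints if kp.conf >= min_conf]


# Stand-ins for Keypoint objects, each holding a Point2f-like `xy` and a `conf`.
demo_keypoints = [
    SimpleNamespace(xy=SimpleNamespace(x=120.0, y=80.0), conf=0.93),
    SimpleNamespace(xy=SimpleNamespace(x=131.5, y=92.0), conf=0.12),
    SimpleNamespace(xy=SimpleNamespace(x=140.0, y=150.0), conf=0.67),
]

kept = confident_keypoints(demo_keypoints)
```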
- class Point2f
Point2f class for representing a point with (x, y) coordinates as floats.
- Attributes:
- x : float
X coordinate of the point.
- y : float
Y coordinate of the point.