CenterNet Object Detection#
Introduction#
In this tutorial, we will demonstrate how to use the Accelerator C++ API to perform object detection with CenterNet on the MX3. We will use the centernet_mobilenetv2_fpn_kpts
model for our demo. The goal of this tutorial is to demonstrate the end-to-end inference capability of the API in C++, including how to connect any pre-processing and/or post-processing sections that may have been cropped from the model.
Background#
Some models, like the centernet model in this tutorial, have layers at the beginning and end that are not supported natively on MX3 hardware. The neural compiler’s model cropping functionality handles this. More details are available in the Model Cropping tutorial.
Both the Python and C++ APIs support connecting pre and post models into the accelerator runtime object so that you don’t have to create and manage additional runtimes.
Note that not all models have both pre and post models; for example, the YoloV7 model in the Object Detection tutorial only has a post model.
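In C++, connecting these cropped sections comes down to two calls on the accelerator runtime object before inference starts (a minimal sketch; the concrete paths used in this tutorial appear in the Connect the Accelerator step below):
// Attach the cropped pre- and post-processing models to the runtime so the
// accelerator object runs them automatically around the DFP.
accl->connect_pre_model(onnx_preprocessing_model_path, 0);
accl->connect_post_model(onnx_postprocessing_model_path, 0);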
Note
This tutorial assumes a four-chip solution is correctly connected.
C++ users will have to install the TFLite library from source. Refer to the thirdparty libraries section for installation steps.
Download & Run
Download
This tutorial provides a high-level overview of the application’s key components. To run the full application, download the complete code package and the compiled DFP. After downloading, refer to the Run section below for step-by-step instructions.
Run
Install Requirements
Before running the application, ensure the following dependencies are installed:
sudo apt install qtbase5-dev
Run Command
To run the C++ example for object detection with CenterNet on the MX3, execute the following steps:
Build the project using CMake. From the project directory, execute:
# ensure a camera device is connected as default video input is a cam
cd src/cpp
mkdir build && cd build
cmake ..
make -j
Run the application
To run the application using the default DFP file and a camera as input, use the following command:
./CenterNet onnx cam:<cam index>
./CenterNet tf cam:<cam index>
./CenterNet tflite cam:<cam index>
To run the application using the default DFP file and a video file as input, use the following command:
./CenterNet onnx vid:<video file>
./CenterNet tf vid:<video file>
./CenterNet tflite vid:<video file>
1. Download the Model#
The CenterNet pre-trained models are available on the TensorFlow Centernet GitHub page. For convenience, we have provided the exported and compiled models in the following compressed folder attached
to this tutorial.
2. Compile the Model#
CenterNet needs to be compiled with the autocrop flag/argument, which generates a DFP file for the main section of the model (centernet_onnx.dfp), the pre-processing model (centernet_pre.onnx), and the post-processing model (centernet_post.onnx). The compilation step is typically needed once and can be done using the Neural Compiler API or Tool.
from memryx import NeuralCompiler
nc = NeuralCompiler(num_chips=4, models="centernet.onnx", verbose=1, dfp_fname = "centernet_onnx", autocrop=True)
dfp = nc.run()
In your command line, you need to type,
mx_nc -v -m centernet.onnx --autocrop -c 4 --dfp_fname centernet_onnx
In your C++ code, you need to point to the generated DFP file path:
fs::path onnx_model_path = "centernet_onnx.dfp";
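The cropped pre- and post-processing models generated by autocrop are referenced the same way; a minimal sketch, assuming the output filenames listed above:
fs::path onnx_preprocessing_model_path = "centernet_pre.onnx";
fs::path onnx_postprocessing_model_path = "centernet_post.onnx";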
Note
The code above uses the ONNX version of the model for compilation, but you can also pass in the TFLite or TensorFlow versions.
3. Pipelines#
In this tutorial, OpenCV is used for image loading, image processing, and display. The following flowchart shows the different parts of the pipeline. Note that the input camera frame should be saved (queued) so that it can later be overlaid and displayed.
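A small queue shared between the input and output callbacks is enough for this; a minimal sketch of the queue the code below uses (assuming a std::deque as the container, which matches how the queue is accessed later):
#include <deque>
#include <mutex>
#include <opencv2/opencv.hpp>

// Original frames pushed by the input callback and popped by the output
// callback, so each detection result can be overlaid on its source frame.
std::deque<cv::Mat> frames_queue;
std::mutex frame_queue_mutex;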
4. CV Initializations#
First, we import the required libraries, initialize the CV pipeline, and define common variables.
#include "memx/accl/MxAccl.h"
#include <signal.h>
#include <iostream>
#include <opencv2/opencv.hpp> /* imshow */
#include <opencv2/imgproc.hpp> /* cvtcolor */
#include <opencv2/imgcodecs.hpp> /* imwrite */
#include <chrono>
#include <memx/mxutils/gui_view.h>
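The snippets in the following sections also rely on a few pieces of shared state; a minimal sketch of the relevant declarations (the SIGINT handler shown here is an assumption about how the full code stops the pipeline):
#include <atomic>

// Shared state used by the input/output callbacks below
std::atomic_bool runflag{true};   // cleared to stop the pipeline
cv::VideoCapture vcap;            // camera or video-file source
bool src_is_cam = false;

// Assumed Ctrl+C handler: clears runflag so the callbacks stop and release vcap
void signal_handler(int) { runflag.store(false); }
// ... registered in main() with: signal(SIGINT, signal_handler);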
if(video_src.substr(0,3) == "cam"){
src_is_cam = true;
#ifdef __linux__
if (!openCamera(vcap, video_src[4]-'0', cv::CAP_V4L2)) {
throw(std::runtime_error("Failed to open: "+video_src));
}
#elif defined(_WIN32)
if (!openCamera(vcap, video_src[4]-'0', cv::CAP_ANY)) {
throw(std::runtime_error("Failed to open: "+video_src));
}
#endif
}
else if(video_src.substr(0,3) == "vid"){
vcap.open(video_src.substr(4),cv::CAP_ANY);
src_is_cam = false;
}
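The openCamera helper used above is defined in the full code package; a minimal sketch of the assumed behavior (open the device at the given index with the requested OpenCV backend and report success):
// Sketch of the openCamera helper: open the device with the chosen backend
// and report whether the capture is usable.
static bool openCamera(cv::VideoCapture& cap, int index, int backend) {
    cap.open(index, backend);
    return cap.isOpened();
}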
5. Define an Input Function#
We need to define an input function for the accelerator, which will get a new frame from the camera and pre-process it.
Note
This is not the same as the cropped pre-processing discussed before. This section refers to the pre-processing that needs to be done on the input image. In this example, pre-processing refers to image loading, resizing, and normalization.
bool incallback_getframe(vector<const MX::Types::FeatureMap*> dst, int streamLabel){
if(runflag.load()){
cv::Mat inframe;
cv::Mat rgbImage;
bool got_frame = vcap.read(inframe);
if (!got_frame) {
std::cout << "No frame \n\n\n";
return false; // return false if frame retrieval fails
}
{
std::lock_guard<std::mutex> ilock(frame_queue_mutex);
cv::cvtColor(inframe, rgbImage, cv::COLOR_BGR2RGB);
frames_queue.push_back(rgbImage);
}
// Preprocess frame
cv::Mat preProcframe = preprocess(rgbImage);
if(type_ == App_Onnx){
// For ONNX models, we need to convert the image to CHW format
cv::Mat chwImage;
cv::dnn::blobFromImage(preProcframe, chwImage, 1.0, cv::Size(model_input_width, model_input_height), cv::Scalar(0, 0, 0), true, false);
preProcframe = chwImage;
}
dst[0]->set_data((float*)preProcframe.data);
return true;
}
else{
vcap.release();
return false;
}
}
Note
In the above code, the method preprocess is used as the pre-processing step. This method can be found as part of the full code package.
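For reference, a minimal sketch of what this preprocess step could look like, resizing to the model input resolution and converting to float; the normalization constants here are an assumption, so check the full source for the values the cropped pre-processing model expects:
// Resize to the model input size and convert to normalized float (sketch).
// The 1/255 scale is an assumption; the real code may use different constants.
cv::Mat preprocess(const cv::Mat& rgbImage) {
    cv::Mat resized, floatImage;
    cv::resize(rgbImage, resized, cv::Size(model_input_width, model_input_height));
    resized.convertTo(floatImage, CV_32FC3, 1.0 / 255.0);
    return floatImage;
}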
6. Define Output Functions#
We also need to define an output function for the accelerator to use. Our output function will post-process the accelerator output and display it on the screen.
Note
This is not the same as the cropped post-processing discussed before. This section refers to the post-processing that needs to be done on the image. In this example, post-processing refers to decoding the output, drawing boxes, and displaying the image.
In addition to collecting the MXA data and post-processing it, the output function overlays the results and displays the output frame.
bool outcallback_getmxaoutput(vector<const MX::Types::FeatureMap*> src, int streamLabel){
for(size_t i = 0; i < src.size(); ++i){
src[i]->get_data(output[i]);
}
{
std::lock_guard<std::mutex> ilock(frame_queue_mutex);
// pop from frame queue
displayImage = frames_queue.front();
frames_queue.pop_front();
}// releases in frame queue lock
//Get the detections from model output
num_boxes = output[outmap_.num_boxes_idx][0];
//printf("num_boxes: %d\n", num_boxes);
std::vector<detectedObj> detected_objectVector = get_detections(output);
// draw bounding boxes
draw_bounding_box(displayImage, detected_objectVector );
// using mx QT util to update the display frame
gui_->screens[0]->SetDisplayFrame(streamLabel,&displayImage,fps_number);
//Calculate FPS once every AVG_FPS_CALC_FRAME_COUNT frames
frame_count++;
if (frame_count == 1)
{
start_ms = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now().time_since_epoch());
}
else if (frame_count % AVG_FPS_CALC_FRAME_COUNT == 0)
{
std::chrono::milliseconds duration =
std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now().time_since_epoch()) - start_ms;
fps_number = (float)AVG_FPS_CALC_FRAME_COUNT * 1000 / (float)(duration.count());
frame_count = 0;
}
return true;
}
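The helpers get_detections and draw_bounding_box used above are part of the full code package. The drawing step reduces to standard OpenCV calls; a minimal sketch, assuming detectedObj carries pixel-space box corners, a class id, and a score (the field names are assumptions):
// Draw one rectangle and label per detection (sketch; detectedObj field
// names are assumptions based on how the struct is used above).
void draw_bounding_box(cv::Mat& image, const std::vector<detectedObj>& objs) {
    for (const auto& obj : objs) {
        cv::rectangle(image, cv::Point(obj.x1, obj.y1), cv::Point(obj.x2, obj.y2),
                      cv::Scalar(0, 255, 0), 2);
        cv::putText(image, std::to_string(obj.class_id) + " " + std::to_string(obj.score),
                    cv::Point(obj.x1, obj.y1 - 5),
                    cv::FONT_HERSHEY_SIMPLEX, 0.6, cv::Scalar(0, 255, 0), 2);
    }
}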
7. Connect the Accelerator#
The main() function creates the accelerator and the CenterNet object, starts the accelerator, and waits for it to finish.
//Create the Accl object and load the DFP
accl = new MX::Runtime::MxAccl(onnx_model_path.c_str(), {0}, {true,true}, false, {20, 0, false, 12, 12}, {false, 0});
//Connecting the pre-processing and post-processing models
accl->connect_pre_model(onnx_preprocessing_model_path,0);
accl->connect_post_model(onnx_postprocessing_model_path,0);
//Creating a CenterNet object for each stream which also connects the corresponding stream to accl.
CenterNet* obj;
if(plugin_name=="onnx"){
obj = new CenterNet(accl,video_src,&gui,App_Onnx);
}
else if (plugin_name=="tf"){
obj = new CenterNet(accl,video_src,&gui,App_Tf);
}
else{
obj = new CenterNet(accl,video_src,&gui,App_Tflite);
}
//Run the accelerator and wait
accl->start();
gui.Run(); //This command waits for exit to be pressed in Qt window
accl->stop();
The CenterNet() constructor connects the input stream to the accelerator.
auto in_cb = std::bind(&CenterNet::incallback_getframe, this, std::placeholders::_1, std::placeholders::_2);
auto out_cb = std::bind(&CenterNet::outcallback_getmxaoutput, this, std::placeholders::_1, std::placeholders::_2);
accl->connect_stream(in_cb, out_cb, 0, 0);
8. How to Use#
Users can download the attached zip file and compile the application with CMake. This will result in an executable, CenterNet. The following commands should be run in a terminal in the same directory as the executable.
Default run: starts the application with the ONNX models and uses a pre-stored video file,
./CenterNet
Users can specify their desired model library; the application will then start with that library and run on a pre-stored video file,
./CenterNet tflite
./CenterNet tf
Users can specify both the desired model library and the desired input to the application,
./CenterNet onnx vid:<path to video file>
./CenterNet tflite cam:<camera index>
9. Summary#
This tutorial showed how to use the Accelerator C++ API to run inference using a CenterNet model. The code and the resources used in the tutorial are available to download: