Python Benchmark#
The mx_bench command-line benchmark tool provides an easy way to measure the FPS and latency of models.
After successful installation of the SDK and runtime libraries, the accelerator CLI can be run using the mx_bench
command.
mx_bench -h
usage: mx_bench [-h] [-v] [-hello] [-d] [-f]
Note
The python benchmark tool and API (Python benchmark API) provides a simple way to characterize the neural network model performance on the MX3 chips. For integration, it’s advised to use the Accelerator APIs.
Usage#
The benchmark tool requires a compiled DFP, which is generated by the Neural Compiler. For a quick start using the neural compiler, please refer to Hello, Mobilenet!.
Arguments#
| Option | Description |
|---|---|
| `-h, --help` | show this help message and exit |
| `-v, --verbose` | set the level of verbose messaging |
| `-hello` | identify the connected system |
| `-d, --dfp` | select which .dfp file to use (default: model.dfp) |
| `-f, --frames` | set the number of frames to run (default: 100) |
Note
If a DFP file is not specified, mx_bench will default to using model.dfp in the current working directory.
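The default-DFP lookup described in the note above can be sketched in Python. Note that `resolve_dfp` is a hypothetical helper written for illustration, not part of the SDK:

```python
from pathlib import Path

def resolve_dfp(dfp_arg=None):
    """Mirror mx_bench's behavior: fall back to model.dfp in the
    current working directory when no -d/--dfp argument is given."""
    path = Path(dfp_arg) if dfp_arg else Path.cwd() / "model.dfp"
    if not path.is_file():
        raise FileNotFoundError(f"DFP not found: {path}")
    return path
```

Passing an explicit path behaves like `-d`; omitting it reproduces the default shown in the note.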
Identify#
Start by checking the health of the MXA connection and getting some basic information about the system.
mx_bench --hello
Hello from MXA!
Group: 0
Number of chips: 4
Interface: PCIe 3.0
Device Map:
/dev/memx0
Warning
The C++ API performs load balancing when the given DFP is compiled for 2 chips. The load-balancing feature uses the remaining 2 chips on a 4-chip MXA to improve performance for 2-chip models, which may cause a discrepancy between the FPS numbers reported by the mx_bench and acclBench tools.
Benchmark with random input data#
You can get a quick estimate of your model's FPS and latency by letting the benchmark run inference on randomly generated data of the correct size. This saves you the hassle of generating feature maps for the accelerator to consume. It is recommended to set the frame count to at least 500.
mx_bench -v -d model.dfp -f 500
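"Randomly generated data of the correct size" simply means tensors shaped like the model's input feature maps. A minimal NumPy sketch, assuming a hypothetical 224×224×3 float32 input (e.g. a MobileNet-style model; the actual shape comes from your compiled DFP):

```python
import numpy as np

def random_input(shape=(224, 224, 3), frames=500, seed=0):
    """Generate `frames` random feature maps of the given shape,
    analogous to what the benchmark feeds the accelerator."""
    rng = np.random.default_rng(seed)
    return [rng.random(shape, dtype=np.float32) for _ in range(frames)]

batch = random_input(frames=500)
```

In a real integration these tensors would instead come from actual images or video frames via the Accelerator APIs.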
Benchmark with real input data#
This tool is intended for rapidly conducting inference using an MXA and benchmarking the performance. For integration of the MXA with actual images or video streams, please consult the Accelerator APIs.
See also