Shared Mode#

The Accelerator C++ API has two modes of operation:

Local Mode: only 1 application process can use MXA hardware at a time (default)
Shared Mode: multiple processes can ‘share’ MXA hardware simultaneously

This tutorial explains the differences, then shows examples of how to use Shared Mode.

Note

The Python API only supports Local Mode.

Important

Remember: Local Mode is the default, and was the only option in all previous SDK releases. Don’t switch to Shared Mode unless you really need it!

Introduction#

For the following discussion, it’s important to understand the differences between processes and threads. If you need a refresher, consult an introductory resource on the topic first.

Local Mode: Single Process (Default)#

By default, the C++ runtime assumes that a single process will use an MXA, preferably with multiple threads for parallelism. If multiple processes attempt to use the same MXA at the same time, only the first one to connect gets access, and it holds that access exclusively until it exits.

This default behavior is referred to as “Local Mode”, because the application has a direct local connection to the MXA with no mediator.

Shared Mode: Multiple Processes#

Starting with SDK 1.1, multiple processes may now use the same MXA via “Shared Mode”, in which a server process acts as an intermediary between clients and the MXA hardware. The client’s use of the Accelerator API is unchanged, with both Auto-Threading and Manual-Threading supported.

Shared Mode Usage#

Important

Shared Mode functionality is currently in a Feature Preview state. The APIs will not change much going forward, but there are internal optimizations that are still being made. On most systems, we do not currently recommend exceeding 4 client applications at a time.

Switching your application from Local Mode to Shared Mode takes a single change: when the MxAccl or MxAcclMT object is created, set the use_shared_mode flag to true.

For example:

// Passing true sets use_shared_mode, so this process connects through mx_server.
MxAccl *accl = new MxAccl(true);
accl->connect_dfp("my_model.dfp");
...

Congrats! You can now run multiple processes that use this DFP simultaneously!
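If you want to start several client processes from one launcher, the pattern can be sketched as follows. This is a minimal, POSIX-only sketch and not part of the SDK: run_clients and kMaxRecommendedClients are hypothetical names, and the client body is passed in as a callable so the sketch stays independent of the Accelerator API. In a real client, that callable is where the process would presumably create its own MxAccl object with use_shared_mode set to true and call connect_dfp.

```cpp
#include <cassert>
#include <functional>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Upper bound taken from the Feature Preview recommendation (4 client applications).
constexpr int kMaxRecommendedClients = 4;

// Hypothetical helper: forks `count` child processes, runs `client_body(index)`
// in each one, and returns every child's exit code in launch order.
// An entry is -1 if the fork failed or the child did not exit normally.
std::vector<int> run_clients(int count, const std::function<int(int)>& client_body) {
    assert(count >= 1 && count <= kMaxRecommendedClients);

    std::vector<pid_t> pids;
    for (int i = 0; i < count; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            // Child process: run the client and leave without returning
            // into the launcher's code.
            _exit(client_body(i));
        }
        pids.push_back(pid);  // pid is -1 here if fork() failed
    }

    std::vector<int> exit_codes;
    for (pid_t pid : pids) {
        int status = 0;
        if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
            exit_codes.push_back(WEXITSTATUS(status));
        } else {
            exit_codes.push_back(-1);
        }
    }
    return exit_codes;
}
```

Calling run_clients(2, client_body) would start two client processes and return once both have exited. The assertion keeps the launcher within the currently recommended limit of 4 clients.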

Explicitly Set Address (optional)#

There is an additional, optional argument: server_ip. By default, the MxAccl constructor will attempt to connect to the mx_server process running on the localhost interface, 127.0.0.1. In some use cases, including Docker, you may need to set a different address.

For example, to use the IP address 192.168.1.100:

// Shared Mode, with server_ip set to the address where mx_server is reachable.
MxAccl *accl = new MxAccl(true, "192.168.1.100");
accl->connect_dfp("my_model.dfp");
...
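In Docker and similar deployments, it can be convenient to configure the server address at run time instead of hard-coding it. The sketch below shows one way to do that. It rests on assumptions: resolve_server_ip is a hypothetical helper, and MX_SERVER_IP is an environment variable name invented for this example, not one defined by the SDK. It also assumes the server_ip argument accepts a std::string; if it expects a C string, pass the result of .c_str() instead.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the address of the mx_server to connect to.
// MX_SERVER_IP is an illustrative variable name, not something the SDK reads.
std::string resolve_server_ip(const char* env_name = "MX_SERVER_IP") {
    assert(env_name != nullptr);
    const char* value = std::getenv(env_name);
    if (value == nullptr || *value == '\0') {
        // Unset or empty: fall back to the localhost default described above.
        return "127.0.0.1";
    }
    return value;
}

// Intended use, following the constructor call shown above:
//   MxAccl *accl = new MxAccl(true, resolve_server_ip());
```

With this in place, a container could be started with the variable set to the host's address, while a process running directly on the host would fall back to 127.0.0.1.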

Multi-Device Load Balancing#

As with Local Mode, you can still use automatic duplication of a DFP across multiple M.2 cards by passing a list of devices to the connect_dfp function.

For example, to use 2 MXAs:

MxAccl *accl = new MxAccl(true);
// Duplicate the DFP across devices 0 and 1.
accl->connect_dfp("my_model.dfp", {0,1});
...

Considerations#

Shared Mode comes with a few important considerations that users must be aware of.

All Clients Must Use The Same DFP: Client processes can run in parallel only if they use the same DFP. Whether two clients request the “same DFP” is determined by a hash and checksum of the DFP contents, so the filenames do not need to match. If a client requests a different DFP, the server replies that the MXA is occupied and rejects the request. Once all clients for a given DFP have stopped, the MXA is free again for a new DFP.
Shared And Local Are Mutually Exclusive: If a process is currently running in Local Mode, all requests for Shared Mode are rejected by the server, even if the DFP is the same. The Local Mode process must exit before the MXA can be used in Shared Mode.
Lower FPS, Higher Latency: Because data is sent to and from the mx_server process as network packets, Shared Mode can be slower than Local Mode for the same DFP. The impact varies by model: some show very little difference, while others (typically high-FPS models) can show a significant drop. We are working to improve performance for these scenarios in future SDK releases.
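To illustrate why filenames do not matter for the “same DFP” check, the sketch below compares two files purely by a hash of their bytes. It is only an illustration of content-based identity: the actual hash and checksum algorithms used by mx_server are not specified here, so FNV-1a is a stand-in chosen for brevity, and content_hash and same_contents are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>

// 64-bit FNV-1a over the raw bytes of a file. This is a simple
// non-cryptographic hash used only as a stand-in for this example.
std::uint64_t content_hash(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV-1a offset basis
    char byte;
    while (in.get(byte)) {
        hash ^= static_cast<unsigned char>(byte);
        hash *= 0x100000001b3ULL;                // FNV-1a prime
    }
    assert(in.eof());  // the loop should stop at end-of-file, not on a read error
    return hash;
}

// Two paths count as the "same" file when their contents hash identically,
// regardless of what the files are called.
bool same_contents(const std::string& a, const std::string& b) {
    return content_hash(a) == content_hash(b);
}
```

Under this kind of scheme, a renamed copy of a DFP still counts as the same DFP, while a recompiled or otherwise modified file does not.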