Neural Compiler#
- class NeuralCompiler(models=None, num_chips='4', input_shapes=[], effort='normal', inputs=None, outputs=None, model_in_out=None, autocrop=False, target_fps='max', dfp_fname=None, no_sim_dfp=False, verbose=0, show_optimization=False, hpoc=None, hpoc_file=None, wbtable=None, exp_auto_dp=False, *args, **kwargs)#
The MemryX Neural Compiler.
The Neural Compiler (NC) attempts to transform a Neural Network model into a Dataflow Program (DFP) which can be programmed and run on MXAs.
- Parameters:
- models: model or list of models
Neural Network model(s) to compile. Can be a path to a model, a loaded NN model, or a list of models. The different frameworks expect these file extensions or model types:
Keras:
.h5 .json keras.Model
TensorFlow:
.pb .meta
TF-Lite:
.tflite
ONNX:
.onnx
- num_chips: int or string
Number of MXAs; use 'min' for the minimum number of required MXAs.
- effort: string
Set the compiler's optimization effort. Select from [lazy, normal, hard].
Lazy: compiles very quickly, but with low inference performance (FPS).
Normal (default): strikes a good balance between compile time and inference performance.
Hard: achieves the best inference performance, but greatly increases compile time.
- target_fps: float or string
Sets the target FPS for the cores optimizer. Defaults to 'max'.
- autocrop: bool
Automatically crop the pre/post-processing layers from the input model(s).
- inputs: string
String specifying the names of the input layers of the model(s). For multi-input models, delimit inputs with the , symbol. For multiple models, separate each model's inputs with the | symbol. This argument overrides model_in_out.
- outputs: string
String specifying the names of the output layers of the model(s). For multi-output models, delimit outputs with the , symbol. For multiple models, separate each model's outputs with the | symbol. This argument overrides model_in_out.
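The `,` and `|` delimiter convention above is easy to build programmatically from plain lists. A minimal sketch (the helper name and layer names below are ours, not part of the memryx API):

```python
# Build the delimited inputs/outputs strings from per-model layer-name lists.
# Layer names here are hypothetical placeholders.
def join_layer_names(per_model_names):
    """Join layer names: ',' between layers of one model, '|' between models."""
    return "|".join(",".join(names) for names in per_model_names)

# Two models: the first has two input layers, the second has one.
inputs = join_layer_names([["conv1_input", "aux_input"], ["input_1"]])
print(inputs)  # conv1_input,aux_input|input_1
```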
- model_in_out: string (path)
JSON file that contains the names of the input and output layers. Use this to extract a subgraph from the full graph, for example to remove pre/post-processing.
- dfp_fname: string
File path at which to save the DFP. The DFP is not saved if unspecified.
- no_sim_dfp: bool
Skip including Simulator info in the .dfp file. Useful for making smaller files for hardware-only deployments.
- verbose: int
Controls how verbose the NeuralCompiler is.
- show_optimization: bool
Animates mapper optimization steps.
- input_shapes: list of strings
This is only needed if the model input shapes cannot be inferred from the model itself. Specify the batchless input shape for each model: one string per model, with comma-delimited shapes. If a model has multiple inputs, separate the shapes with a space. "AUTO" can be used for models that do not need specified shapes.
- Examples:
input_shapes=["300,300,3"] # 1 model with 1 input
input_shapes=["300,300,3 400,400,3"] # 1 model with 2 inputs
input_shapes=["300,300,3 400,400,3", "200,200,3"] # 2 models with 2 and 1 inputs respectively
input_shapes=["300,300,3", "AUTO", "224,224,3"] # 3 models (only model 1 and 3 need shapes)
- hpoc: string of ints
List of final layer output channels for which to increase precision. If hpoc is specified, the model must have a single output, OR each output must share the same number of output channels. If more flexibility is required (multi-model and/or multi-output), use the hpoc_file argument.
- Example:
hpoc="0 1 2 3 4"
- hpoc_file: path to json
JSON file used to increase precision for the specified output channels. For more details see formats.
- wbtable: path to json
Optional JSON file with per-layer weight quantization information. Valid precision values are 4, 8, and 16 bits. You can set a global precision (i.e., for ALL layers) with the __DEFAULT__ JSON tag. For more details see formats.
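For orientation only, a sketch of what such a file might look like. The exact schema is defined in the formats reference; the `__DEFAULT__` tag comes from the description above, but the layer names and the assumption that each key maps directly to a bit width are ours:

```json
{
    "__DEFAULT__": 8,
    "conv2d_1": 16,
    "dense_head": 4
}
```

Here all layers would use 8-bit weights except the two named layers (hypothetical names), which would use 16-bit and 4-bit weights respectively.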
- exp_auto_dp: bool
EXPERIMENTAL auto double precision. Uses 16-bit weights on auto-selected layers; other layers use 8-bit weights. Note: cannot be combined with wbtable.
- run()#
Run the Neural Compiler.
Runs the Neural Compiler with the current configuration. It will convert the model(s) into a single dataflow program which can be used to program MXAs or simulated using the Simulator.
Note
The models arg must be configured before calling run().
- Returns:
- dfp: Dfp object
See DFP for more details.
- Raises:
- CompilerError, OperatorError, ResourceError
Examples
from tensorflow import keras
mobilenet = keras.applications.MobileNet()
resnet50 = keras.applications.ResNet50()

from memryx import NeuralCompiler

# Compile MobileNet to 1 chip
nc = NeuralCompiler(models=mobilenet, num_chips=1)
dfp = nc.run()

# Compile ResNet50 to 4 chips
dfp = NeuralCompiler(models=resnet50, num_chips=4).run()

# Compile MobileNet+ResNet50 to 4 chips
dfp = NeuralCompiler(models=[mobilenet, resnet50], num_chips=4).run()

# Compile MobileNet but crop with the inputs/outputs arguments
inputs = mobilenet.layers[3].name
outputs = mobilenet.layers[5].name
dfp = NeuralCompiler(models=mobilenet, inputs=inputs, outputs=outputs).run()
- set_config(**kwargs)#
Configure the Neural Compiler.
Configure the Neural Compiler with the keyword arguments listed in the __init__ function.
- Parameters:
- **kwargs
Keyword args used to configure the Neural Compiler.
Examples
nc = NeuralCompiler()
nc.set_config(num_chips=4, chip_gen="mx3")
nc.set_config(models=mobilenet)
- reset_config()#
Reset config.
Reset configuration to the default values that the Neural Compiler was configured with.
Examples
nc = NeuralCompiler()
nc.set_config(effort='hard')
print(nc.get_config()['effort'])

# Config reset
nc.reset_config()
print(nc.get_config()['effort'])
outputs:
>> 'hard'
>> 'normal'
- get_config()#
Return the current config.
Get a dictionary of current Neural Compiler configuration.
- Returns:
- Configdict
Dictionary of Neural Compiler configurations.
Examples
nc = NeuralCompiler()
print(nc.get_config())
outputs:
{'models': [None], 'num_chips': 4, 'input_shapes': [], 'effort': 'normal', 'inputs': None, 'outputs': None, 'model_in_out': None, 'autocrop': False, 'target_fps': 'max', 'dfp_fname': None, 'verbose': 0, 'show_optimization': False, 'hpoc': None, 'hpoc_file': None, 'wbtable': None, 'exp_auto_dp': False}