Keras Applications Accuracy Calculation
Introduction
In this tutorial, our goal is to compare the performance of Keras models on the MXA accelerator versus a CPU. We will be using models from Keras Applications.
Note
The MXA accelerator has on-chip memory, so it currently supports only models with fewer than 40 million parameters.
This tutorial assumes a four-chip solution is correctly connected.
Download & Run
Download
This tutorial provides a high-level overview of the application’s key components. To run the full application, download the complete code package and the compiled DFP. After downloading, refer to the Run section below for step-by-step instructions.
Run
Install Requirements
Before running the application, ensure the following dependencies are installed:
pip install opencv-python==4.11.0.86
Run Command
If you have not yet downloaded the dataset, run the following script first:
chmod +x get_imagenet_valdata.sh
./get_imagenet_valdata.sh
To run the application, execute one of the following commands:
python keras_accuracy.py --model_name 'MobileNet' --num_images 10 # Use the MobileNet model to calculate accuracy on 10 images
python keras_accuracy.py --model_name 'ResNet50' --backend 'mxa' --dfp 'ResNet50.dfp' # Calculate accuracy with the ResNet50 model on the entire dataset, using an existing DFP file
python keras_accuracy.py --model_name 'MobileNet' --num_images 100 --backend 'cpu' # Calculate accuracy with the MobileNet model on 100 images using the CPU backend
1. ImageNet Dataset Information
We will use the validation set of the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) dataset. Follow the instructions at this link to download and examine the data.
Once the data is downloaded, you can run the following command to count the images:
ls -1 *.JPEG 2>/dev/null | wc -l # Should return 50000
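If you prefer to sanity-check from Python, the snippet below counts both the images and the ground-truth labels. The two paths are placeholders for this sketch; point them at your download location:
import glob

imagenet_path = '/path/to/ILSVRC2012/val'        # placeholder path
ground_truth_path = '/path/to/ground_truth.txt'  # placeholder path

num_images = len(glob.glob(imagenet_path + '/*.JPEG'))
with open(ground_truth_path, 'r') as f:
    num_labels = len([line for line in f.read().split('\n') if line])
print(num_images, num_labels)  # Both should be 50000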
2. Understanding the Code
Let’s walk through some key functions to understand how data is loaded, processed, and then passed to the model to generate predictions. We will also cover how predictions are processed before calculating accuracy.
2.1 Image Loading and Preprocessing
Hint
We use the OpenCV library to load JPEG images. You can install it using:
pip install opencv-python==4.11.0.86
The following Python code loads and preprocesses the images into numpy arrays for inference:
import glob
import numpy as np
import tensorflow as tf

class ImageBatchIterator:
    """
    The ImageBatchIterator class reads images in batches from the ImageNet dataset and
    preprocesses them before sending them to the model for inference. It is especially
    useful when running inference on the CPU.
    """
    def __init__(self, image_paths, batch_size, image_height, image_width, module):
        self.image_paths = image_paths
        self.batch_size = batch_size
        self.index = 0
        self.total_images = len(image_paths)
        self.image_height = image_height
        self.image_width = image_width
        self.module = module

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= self.total_images:
            raise StopIteration
        # Get the current batch of image paths
        batch_paths = self.image_paths[self.index : self.index + self.batch_size]
        batch_images = []
        # Load and preprocess each image in the batch
        for img_path in batch_paths:
            image_string = tf.io.read_file(img_path)
            image = tf.image.decode_jpeg(image_string, channels=3)
            # Resize and center crop (https://github.com/keras-team/keras/issues/15822#issuecomment-1027178496)
            size = self.image_height
            h, w = tf.shape(image)[0], tf.shape(image)[1]
            ratio = (tf.cast(size, tf.float32) / tf.cast(tf.minimum(h, w), tf.float32))
            h = tf.cast(tf.round(tf.cast(h, tf.float32) * ratio), tf.int32)
            w = tf.cast(tf.round(tf.cast(w, tf.float32) * ratio), tf.int32)
            image = tf.image.resize(image, [h, w])
            top, left = (h - size) // 2, (w - size) // 2
            image = tf.image.crop_to_bounding_box(image, top, left, size, size)
            # Additional preprocessing based on the model provided
            image = self.module.preprocess_input(image)
            image = np.expand_dims(np.array(image), axis=0)
            batch_images.append(image)
        # Stack images into a single batch array
        batch_images = np.vstack(batch_images)
        self.index += self.batch_size
        return batch_images

def load_images_and_labels():
    # ground_truth_path and imagenet_path are module-level paths set elsewhere in the script
    with open(ground_truth_path, 'r') as f:
        ground_truth = f.read().split('\n')[:-1]
    image_paths = glob.glob(imagenet_path + '/*.JPEG')
    image_paths.sort()
    return image_paths, ground_truth
The load_images_and_labels function loads the paths of the ImageNet 2012 validation images along with their corresponding ground-truth labels.
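As a quick usage sketch, the two pieces above combine as follows. The MobileNet preprocessing module and its 224×224 input size are assumptions for this example, and the dataset paths are expected to be set as in the full script:
import tensorflow as tf

image_paths, ground_truth = load_images_and_labels()
module = tf.keras.applications.mobilenet  # preprocessing module matching the model
iterator = ImageBatchIterator(image_paths[:256], batch_size=128,
                              image_height=224, image_width=224, module=module)
for batch in iterator:
    print(batch.shape)  # (128, 224, 224, 3)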
Note
There are two preprocessing functions: one for the CPU and another for the MXA. The function preprocess_images(image_paths, model_name, module, module_name) handles images for the MXA.
2.2 Loading the Model
The model specified in the command-line argument is loaded as a Keras model with pre-trained weights, ready for inference. Let’s take a quick look at how this is done:
def get_keras_module_name(model_name):
    # Find the Keras preprocessing module that provides the requested model
    keras_preprocessing_module_name = None
    for k, v in application_library.items():
        if model_name in v:
            keras_preprocessing_module_name = k
            break
    if keras_preprocessing_module_name is None:
        raise ValueError('Unknown model. Please refer to https://keras.io/api/applications/ for the list of models.')
    return keras_preprocessing_module_name

def get_keras_module_and_model(module_name, model_name):
    # Note: the RegNet and EfficientNet models used here come from the older
    # Keras 2 API, so they are loaded through the tf_keras package instead
    if module_name == 'regnet' or module_name == 'efficientnet':
        module = getattr(tf_keras.applications, module_name)
        model = getattr(module, model_name)(weights='imagenet')
    else:
        module = getattr(tf.keras.applications, module_name)
        model = getattr(module, model_name)(weights='imagenet')
    return module, model
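application_library is defined in the full script; a hypothetical excerpt of its shape, together with how the two helpers combine, might look like this:
# Hypothetical excerpt: maps each Keras preprocessing module name to the
# model classes it provides (the real script defines the complete mapping)
application_library = {
    'mobilenet': ['MobileNet'],
    'resnet50': ['ResNet50'],
    'efficientnet': ['EfficientNetB0', 'EfficientNetB1', 'EfficientNetB2'],
}

module_name = get_keras_module_name('MobileNet')  # -> 'mobilenet'
module, model = get_keras_module_and_model(module_name, 'MobileNet')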
2.3 Running Inference
Next, we’ll run inference on the MXA and compare the results against the CPU. The MX3’s Async API is the most straightforward way to utilize the accelerator fully. Simply connect an input and output function, and the API will handle threading and data streaming behind the scenes.
def run_inference_cpu(image_paths, ground_truth, new_height, new_width, module, model):
    batch_size = 128  # Set batch size
    # Create the iterator
    image_iterator = ImageBatchIterator(image_paths, batch_size, new_height, new_width, module)
    # Collect predictions
    cpu_outputs = []
    start = time.time()
    # Iterate through the dataset batch by batch
    for batch in image_iterator:
        batch_preds = model.predict(batch)
        cpu_outputs.extend(batch_preds)
    cpu_inference_time = time.time() - start
    cpu_outputs = np.stack([np.squeeze(arr) for arr in cpu_outputs])
    cpu_predictions = module.decode_predictions(cpu_outputs, top=5)
    get_accuracy(cpu_predictions, ground_truth)
    print("CPU Inference time: {:.1f} msec".format(cpu_inference_time * 1000))
def run_inference_mxa(image_paths, ground_truth, user_provided_model_name, module, module_name, user_provided_model_dfp, model):
    def process_output(*outputs):
        mxa_outputs.append(np.squeeze(outputs[0], 0))

    def preprocess_images():
        expected_input_shape = get_expected_model_input_shape(user_provided_model_name, module_name)
        new_height, new_width, _ = expected_input_shape
        for img_path in image_paths:
            image_string = tf.io.read_file(img_path)
            image = tf.image.decode_jpeg(image_string, channels=3)
            # Resize and center crop (https://github.com/keras-team/keras/issues/15822#issuecomment-1027178496)
            size = new_height
            h, w = tf.shape(image)[0], tf.shape(image)[1]
            ratio = (tf.cast(size, tf.float32) / tf.cast(tf.minimum(h, w), tf.float32))
            h = tf.cast(tf.round(tf.cast(h, tf.float32) * ratio), tf.int32)
            w = tf.cast(tf.round(tf.cast(w, tf.float32) * ratio), tf.int32)
            image = tf.image.resize(image, [h, w])
            top, left = (h - size) // 2, (w - size) // 2
            image = tf.image.crop_to_bounding_box(image, top, left, size, size)
            # Additional preprocessing based on the model provided
            image = tf.expand_dims(image, 0)
            image = module.preprocess_input(image)
            yield np.array(image)

    # Use the user-provided DFP if there is one; otherwise compile the model to generate it
    if user_provided_model_dfp:
        dfp = user_provided_model_dfp
    else:
        # Compile model and generate DFP
        dfp = compile_model(model)

    # Test on MXA
    mxa_outputs = []
    accl = AsyncAccl(dfp=dfp)
    start = time.time()
    accl.connect_input(preprocess_images)
    accl.connect_output(process_output)
    accl.wait()
    # Postprocess the outputs
    mxa_outputs = np.stack([arr for arr in mxa_outputs])
    mxa_inference_time = time.time() - start
    mxa_predictions = module.decode_predictions(mxa_outputs, top=5)
    # Display results
    get_accuracy(mxa_predictions, ground_truth)
    print("MXA Inference time: {:.1f} msec".format(mxa_inference_time * 1000))
2.4 Predictions
Now that we have outputs from both the CPU and the MXA, let’s decode and process them before calculating accuracy:
cpu_outputs = np.stack([np.squeeze(arr) for arr in cpu_outputs])
cpu_predictions = module.decode_predictions(cpu_outputs, top=5)
mxa_outputs = np.stack([arr for arr in mxa_outputs])
mxa_predictions = module.decode_predictions(mxa_outputs, top=5)
Finally, let’s compare the predictions to the ground truth to calculate the accuracy:
get_accuracy(cpu_predictions, ground_truth)
print("CPU Inference time: {:.1f} msec".format(cpu_inference_time*1000))
get_accuracy(mxa_predictions, ground_truth)
print("MXA Inference time: {:.1f} msec".format(mxa_inference_time*1000))
After running the script, you should see the Top-1 and Top-5 accuracy values along with the MXA and CPU inference times.
3. Running the Script
To run the script, type the following command in your terminal. The model name provided in the --model_name argument should match a name from Keras Applications.
You can also specify the number of images to run inference on using the --num_images argument. By default, all 50,000 images in the dataset are used. You can choose to run inference on the CPU or MXA by setting the --backend argument. If you already have a DFP and wish to use it instead of compiling a new one, provide the DFP file path using the --dfp argument.
Example command:
python keras_accuracy.py --model_name 'MobileNet' --num_images 10 --backend 'mxa' --dfp 'MobileNet.dfp'
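For reference, the command-line flags described above could be declared with argparse roughly as follows. This is a sketch: the defaults mirror the behavior described in this section, not necessarily the script's exact definitions:
import argparse

parser = argparse.ArgumentParser(description='Keras Applications accuracy on CPU vs MXA')
parser.add_argument('--model_name', required=True,
                    help="Model name as listed in Keras Applications, e.g. 'MobileNet'")
parser.add_argument('--num_images', type=int, default=50000,
                    help='Number of validation images to run inference on')
parser.add_argument('--backend', choices=['cpu', 'mxa'], default='mxa',
                    help='Backend to run inference on')
parser.add_argument('--dfp', default=None,
                    help='Path to an existing DFP file; if omitted, the model is compiled')
args = parser.parse_args()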
4. Model Accuracy Table
Below is a table listing various Keras models, their Top-1 and Top-5 accuracy scores, and links to download the DFP for each model if you want to skip the compilation step:
Model | Top-1 CPU (%) | Top-1 MXA (%) | Top-5 CPU (%) | Top-5 MXA (%) | DFPs
---|---|---|---|---|---
DenseNet121 | 74.65 | 74.48 | 92.19 | 91.95 |
DenseNet169 | 76.05 | 75.83 | 93.02 | 92.88 |
DenseNet201 | 76.99 | 76.87 | 93.51 | 93.29 |
EfficientNetB0 | 76.99 | 76.67 | 93.4 | 93.17 |
EfficientNetB1 | 78.75 | 78.66 | 94.26 | 94.15 |
EfficientNetB2 | 79.79 | 79.74 | 94.87 | 94.72 |
EfficientNetV2B0 | 78.41 | 78.3 | 94.29 | 94.15 |
EfficientNetV2B1 | 79.67 | 79.5 | 94.8 | 94.64 |
EfficientNetV2B2 | 80.48 | 80.39 | 95.19 | 95.09 |
EfficientNetV2B3 | 81.79 | 81.76 | 95.81 | 95.62 |
InceptionV3 | 77.76 | 77.48 | 93.85 | 93.51 |
MobileNet | 70.65 | 68.82 | 89.59 | 88.23 |
MobileNetV2 | 71.91 | 70.15 | 90.59 | 89.35 |
RegNetX002 | 66.53 | 65.78 | 87.4 | 86.87 |
RegNetX004 | 70.86 | 70.28 | 89.74 | 89.5 |
RegNetX006 | 71.92 | 71.82 | 90.59 | 90.55 |
RegNetX008 | 73.01 | 72.96 | 91.39 | 91.32 |
RegNetX016 | 75.49 | 75.22 | 92.78 | 92.67 |
RegNetY002 | 67.72 | 66.92 | 87.98 | 87.58 |
RegNetY004 | 71.72 | 71.37 | 90.36 | 90.1 |
RegNetY006 | 73.21 | 72.72 | 91.53 | 91.28 |
RegNetY008 | 74.25 | 74.01 | 91.99 | 91.88 |
RegNetY016 | 76.59 | 76.4 | 93.36 | 93.19 |
ResNet50 | 74.32 | 74.24 | 91.74 | 91.52 |
ResNet50V2 | 75.3 | 75.2 | 92.58 | 92.53 |
Xception | 78.95 | 78.9 | 94.49 | 94.33 |
5. Third-Party License
This tutorial uses third-party models available through the Keras Applications API. Below are the details of the licenses for these dependencies:
Models: Models sourced from the Keras Applications API
License: Apache License 2.0
6. Summary
This tutorial demonstrates how to replicate Keras Applications’ accuracy comparison on the ImageNet dataset while comparing performance between the MXA and the CPU.
You can download the full script below. Please refer to the README.md file included in the download for more information on how to run the application.