Keras Applications Accuracy Calculation#

Introduction#

This tutorial compares the accuracy and inference time of Keras models on the MXA accelerator against a CPU. We will be using models from Keras Applications.

Note

  • The MXA accelerator has on-chip memory, so it currently supports only models with fewer than 40 million parameters.

  • This tutorial assumes a four-chip solution is correctly connected.

ImageNet Dataset Information#

We will use the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) dataset. Follow the instructions at this link to download and examine the data.

Once the data is downloaded, you can run the following command in the image directory to count the number of images:

ls -1 *.JPEG 2>/dev/null | wc -l  # Should return 50000

Understanding the Code#

Let’s walk through some key functions to understand how data is loaded, processed, and then passed to the model to generate predictions. We will also cover how predictions are processed before calculating accuracy.

Image Loading and Preprocessing#

Hint

We use the OpenCV library to load JPEG images. You can install it using:

pip install opencv-python

The following Python code loads and preprocesses the images into numpy arrays for inference:


import glob
import time

import numpy as np
import tensorflow as tf

# Specify data paths
imagenet_path = 'ImageNet2012_50k'
ground_truth_path = 'ground_truth'


class ImageBatchIterator:
    """
    Reads images from the ImageNet dataset in batches and preprocesses them
    before they are sent to the model for inference. This is especially
    useful when running inference on the CPU.
    """

    def __init__(self, image_paths, batch_size, image_height, image_width, module):

        self.image_paths = image_paths
        self.batch_size = batch_size
        self.index = 0
        self.total_images = len(image_paths)
        self.image_height = image_height
        self.image_width = image_width
        self.module = module

    def __iter__(self):
        return self

    def __next__(self):
      
        if self.index >= self.total_images:
            raise StopIteration

        # Get the current batch of image paths
        batch_paths = self.image_paths[self.index : self.index + self.batch_size]
        batch_images = []

        # Load and preprocess each image in the batch
        for img_path in batch_paths:
            
            image_string = tf.io.read_file(img_path)
            image = tf.image.decode_jpeg(image_string, channels=3)

            # Resize and Center Crop (https://github.com/keras-team/keras/issues/15822#issuecomment-1027178496)
            size = self.image_height

            h, w = tf.shape(image)[0], tf.shape(image)[1]
            ratio = (tf.cast(size, tf.float32) / tf.cast(tf.minimum(h, w), tf.float32))
            h = tf.cast(tf.round(tf.cast(h, tf.float32) * ratio), tf.int32)
            w = tf.cast(tf.round(tf.cast(w, tf.float32) * ratio), tf.int32)
            image = tf.image.resize(image, [h, w])
            
            top, left = (h - size) // 2, (w - size) // 2
            image = tf.image.crop_to_bounding_box(image, top, left, size, size)
        
            # Additional preprocessing based on the model provided
            image = self.module.preprocess_input(image)
            image = np.expand_dims(np.array(image), axis=0)
            batch_images.append(image)

        # Stack images into a batch array
        batch_images = np.vstack(batch_images)
        self.index += self.batch_size

        # Return the preprocessed batch
        return batch_images


### Create helper functions

def load_images_and_labels():
    
    with open(ground_truth_path, 'r') as f:
        # splitlines() keeps the last label even if the file has no trailing newline
        ground_truth = f.read().splitlines()

    image_paths = glob.glob(imagenet_path+'/*.JPEG')
    image_paths.sort()
    
    return image_paths, ground_truth


The load_images_and_labels function loads the ImageNet 2012 validation images along with their corresponding labels.
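Because the image paths are sorted and the labels are read line by line, the two lists are paired by position. A small consistency check can catch a partial download or a mismatched label file early. This is a sketch under stated assumptions: `check_dataset` is a hypothetical helper that is not part of the tutorial script, and it assumes the ground-truth file lists one label per image in sorted file-name order.

```python
def check_dataset(image_paths, ground_truth, expected_count=50000):
    """Validate that images and labels can be paired one-to-one.

    Returns a list of (image_path, label) pairs, or raises ValueError.
    """
    if len(image_paths) != expected_count:
        raise ValueError(
            f"expected {expected_count} images, found {len(image_paths)}"
        )
    if len(ground_truth) != len(image_paths):
        raise ValueError(
            f"{len(image_paths)} images but {len(ground_truth)} labels"
        )
    # Pair by position: the i-th sorted image goes with the i-th label line.
    return list(zip(image_paths, ground_truth))


if __name__ == "__main__":
    # Tiny synthetic example; the file names and labels are made up.
    pairs = check_dataset(["a.JPEG", "b.JPEG"], ["label_a", "label_b"], expected_count=2)
    print(pairs)
```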

Note

There are two preprocessing paths: one for the CPU and another for the MXA. The ImageBatchIterator class handles images for the CPU, and the preprocess_images function handles images for the MXA. Both apply the same resize, center crop, and model-specific preprocess_input steps.
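The resize and center-crop step in the preprocessing code resizes each image so that its shorter side equals the target size, then crops the central square. The arithmetic can be sketched in plain Python. `resize_and_crop_geometry` is a hypothetical helper for illustration; the tutorial code performs the same calculation with TensorFlow ops in float32, so results could differ by a pixel in rare rounding cases.

```python
def resize_and_crop_geometry(height, width, size):
    """Return (new_h, new_w, top, left) for a shorter-side resize and center crop."""
    # Scale factor that maps the shorter side onto `size`.
    ratio = size / min(height, width)
    new_h = round(height * ratio)
    new_w = round(width * ratio)
    # Offsets of the size x size window centered in the resized image.
    top = (new_h - size) // 2
    left = (new_w - size) // 2
    return new_h, new_w, top, left


if __name__ == "__main__":
    # A 375 x 500 (height x width) image prepared for a 224 x 224 model input.
    print(resize_and_crop_geometry(375, 500, 224))  # (224, 299, 0, 37)
```

In this example the height already matches the target after resizing, so only the width is cropped, starting 37 pixels from the left edge.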

Loading the Model#

The model specified in the command-line argument is loaded as a Keras model with pre-trained weights, ready for inference. Let’s take a quick look at how this is done:


def get_keras_module_name(model_name):

    # Get the keras preprocessing module name for the model
    keras_preprocessing_module_name = None

    for k,v in application_library.items():

        if model_name in v:
            keras_preprocessing_module_name = k
            break

    if keras_preprocessing_module_name is None:
        raise ValueError('Unknown model. Please refer to https://keras.io/api/applications/ for the list of models.')

    return keras_preprocessing_module_name


def get_keras_module_and_model(module_name, model_name):

    # Note: RegNet models are provided by the older version of Keras (Keras 2, imported here as tf_keras)
    if module_name == 'regnet':
        module = getattr(tf_keras.applications, module_name)
        model = getattr(module, model_name)(weights = 'imagenet')   

    else:
        module = getattr(tf.keras.applications, module_name)
        model = getattr(module, model_name)(weights = 'imagenet')

    return module, model
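The lookup in get_keras_module_name depends on application_library, a dictionary that maps each Keras Applications submodule to the model names it contains. Its contents are not shown in this tutorial, so the excerpt below is an assumption made for illustration: `APPLICATION_LIBRARY` and `find_module_name` are hypothetical names, and only a handful of entries are listed.

```python
# Hypothetical excerpt: submodule name -> model constructors it provides.
APPLICATION_LIBRARY = {
    "densenet": ["DenseNet121", "DenseNet169", "DenseNet201"],
    "mobilenet": ["MobileNet"],
    "mobilenet_v2": ["MobileNetV2"],
    "resnet": ["ResNet50"],
    "resnet_v2": ["ResNet50V2"],
}


def find_module_name(model_name, library=APPLICATION_LIBRARY):
    """Return the submodule that provides model_name, or raise ValueError."""
    for module_name, model_names in library.items():
        if model_name in model_names:
            return module_name
    raise ValueError(f"Unknown model: {model_name!r}")


if __name__ == "__main__":
    print(find_module_name("DenseNet169"))  # densenet
    print(find_module_name("ResNet50V2"))   # resnet_v2
```

The returned name is what get_keras_module_and_model resolves with getattr, so "densenet" leads to tf.keras.applications.densenet, which also supplies the matching preprocess_input and decode_predictions functions.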
  

Running Inference#

Next, we’ll run inference on the MXA and compare the results against the CPU. The MX3’s Async API is the most straightforward way to utilize the accelerator fully. Simply connect an input and output function, and the API will handle threading and data streaming behind the scenes.


def run_inference_cpu(image_paths, ground_truth, new_height, new_width, module, model):

    batch_size = 128  # Set batch size

    # Create the iterator
    image_iterator = ImageBatchIterator(image_paths, batch_size, new_height, new_width, module) 

    # Collect predictions
    cpu_outputs = []

    start = time.time()

    # Iterate through the iterator using a for loop
    for batch in image_iterator:
        
        batch_preds = model.predict(batch)
        cpu_outputs.extend(batch_preds)

    cpu_inference_time = time.time() - start
    cpu_outputs = np.stack([np.squeeze(arr) for arr in cpu_outputs])
    cpu_predictions = module.decode_predictions(cpu_outputs, top=5)

    get_accuracy(cpu_predictions, ground_truth)
    print("CPU Inference time: {:.1f} msec".format(cpu_inference_time*1000))


def run_inference_mxa(image_paths, ground_truth, user_provided_model_name, module, module_name, user_provided_model_dfp, model):

    def process_output(*outputs):
        mxa_outputs.append(np.squeeze(outputs[0], 0))

    def preprocess_images():

        expected_input_shape = get_expected_model_input_shape(user_provided_model_name, module_name)
        new_height, new_width, _ = expected_input_shape

        images = []

        for img_path in image_paths:

            image_string = tf.io.read_file(img_path)
            image = tf.image.decode_jpeg(image_string, channels=3)

            # Resize and Center Crop (https://github.com/keras-team/keras/issues/15822#issuecomment-1027178496)
            size = new_height

            h, w = tf.shape(image)[0], tf.shape(image)[1]
            ratio = (tf.cast(size, tf.float32) / tf.cast(tf.minimum(h, w), tf.float32))
            h = tf.cast(tf.round(tf.cast(h, tf.float32) * ratio), tf.int32)
            w = tf.cast(tf.round(tf.cast(w, tf.float32) * ratio), tf.int32)
            image = tf.image.resize(image, [h, w])
            
            top, left = (h - size) // 2, (w - size) // 2
            image = tf.image.crop_to_bounding_box(image, top, left, size, size)

            # Additional preprocessing based on the model provided
            image = module.preprocess_input(image)
    
            yield np.array(image)
    

    # Check if the user provided a DFP or else compile the model and generate the DFP
    if user_provided_model_dfp:
        dfp = user_provided_model_dfp

    else:
        # Compile model and generate DFP
        dfp = compile_model(model)

    # Test on MXA
    mxa_outputs = []

    accl = AsyncAccl(dfp = dfp)
    start = time.time()
    accl.connect_input(preprocess_images)
    accl.connect_output(process_output)
    accl.wait()

    # Stop the timer before postprocessing, as in the CPU path
    mxa_inference_time = time.time() - start

    # Postprocess the outputs
    mxa_outputs = np.stack([np.squeeze(arr) for arr in mxa_outputs])
    mxa_predictions = module.decode_predictions(mxa_outputs, top=5)

    # Display results
    get_accuracy(mxa_predictions, ground_truth)
    print("MXA Inference time: {:.1f} msec".format(mxa_inference_time*1000))
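The input generator and output callback used in run_inference_mxa follow a producer and consumer pattern. The toy class below mimics only the shape of that pattern (connect_input, connect_output, wait) so it can run without an accelerator. It is not the MemryX implementation: `ToyAsyncAccl` is a hypothetical stand-in that doubles each input on a single worker thread, and the point at which the real API starts streaming is an assumption here.

```python
import threading


class ToyAsyncAccl:
    """Toy stand-in for the connect_input / connect_output / wait pattern."""

    def __init__(self):
        self._input_fn = None
        self._output_fn = None
        self._thread = None

    def connect_input(self, input_fn):
        # input_fn is a generator function that yields one frame at a time.
        self._input_fn = input_fn

    def connect_output(self, output_fn):
        # output_fn is called once per result; streaming starts here.
        if self._input_fn is None:
            raise RuntimeError("connect_input must be called first")
        self._output_fn = output_fn
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        for frame in self._input_fn():
            self._output_fn(frame * 2)  # placeholder for real inference

    def wait(self):
        # Block until the input generator is exhausted and all outputs are delivered.
        self._thread.join()


results = []


def frames():
    for i in range(5):
        yield i


def collect(*outputs):
    results.append(outputs[0])


accl = ToyAsyncAccl()
accl.connect_input(frames)
accl.connect_output(collect)
accl.wait()
print(results)  # [0, 2, 4, 6, 8]
```

As in run_inference_mxa, the caller never loops over the data itself: the generator ends the stream by returning, and wait() returns once every output has reached the callback.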


Predictions#

Now that we have outputs from both the CPU and the MXA, let’s decode and process them before calculating accuracy:

    cpu_outputs = np.stack([np.squeeze(arr) for arr in cpu_outputs])
    cpu_predictions = module.decode_predictions(cpu_outputs, top=5)

    mxa_outputs = np.stack([np.squeeze(arr) for arr in mxa_outputs])
    mxa_predictions = module.decode_predictions(mxa_outputs, top=5)

Finally, let’s compare the predictions to the ground truth to calculate the accuracy:

    get_accuracy(cpu_predictions, ground_truth)
    print("CPU Inference time: {:.1f} msec".format(cpu_inference_time*1000))

    get_accuracy(mxa_predictions, ground_truth)
    print("MXA Inference time: {:.1f} msec".format(mxa_inference_time*1000))

After running the script, you should see the Top-1 and Top-5 accuracy values along with the MXA and CPU inference times.
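The get_accuracy function is not shown in this tutorial. A minimal sketch of a Top-k calculation is given below, under two assumptions: each prediction is a list of (class_name, class_description, score) tuples sorted by score, which is the format returned by Keras decode_predictions, and each ground-truth entry is the class_name string of the correct class. `top_k_accuracy` is a hypothetical name and the data is synthetic.

```python
def top_k_accuracy(predictions, ground_truth, k):
    """Return the percentage of samples whose label is among the top k predictions."""
    pairs = list(zip(predictions, ground_truth))
    if not pairs:
        raise ValueError("no predictions to score")
    hits = 0
    for preds, label in pairs:
        top_ids = [pred[0] for pred in preds[:k]]
        if label in top_ids:
            hits += 1
    return 100.0 * hits / len(pairs)


# Synthetic predictions for four images, two candidates each.
predictions = [
    [("n01", "a", 0.90), ("n02", "b", 0.05)],
    [("n03", "c", 0.60), ("n01", "a", 0.30)],
    [("n02", "b", 0.50), ("n03", "c", 0.40)],
    [("n02", "b", 0.70), ("n01", "a", 0.20)],
]
ground_truth = ["n01", "n01", "n01", "n02"]

print(top_k_accuracy(predictions, ground_truth, k=1))  # 50.0
print(top_k_accuracy(predictions, ground_truth, k=2))  # 75.0
```

Top-1 counts only the highest-scoring class, while Top-5 (Top-2 in this small example) counts a sample as correct if the label appears anywhere among the first k candidates, so it is never lower than Top-1.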

Running the Script#

Run the script from your terminal as shown in the example below. The model name passed to the --model_name argument must match a name from Keras Applications.

You can also specify the number of images to run inference on using the --num_images argument. By default, all 50,000 images in the dataset are used. Choose whether to run inference on the CPU or the MXA with the --backend argument. If you already have a DFP and want to use it instead of compiling a new one, provide the DFP file path using the --dfp argument.

Example command:

python keras_accuracy.py --model_name 'MobileNet' --num_images 10 --backend 'mxa' --dfp 'MobileNet.dfp'
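The four arguments described above could be declared with argparse roughly as follows. This is a sketch, not the parser from keras_accuracy.py: the lowercase backend choices, the default backend, and the help strings are assumptions, while the argument names and the 50,000-image default come from the text above.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Compare Keras model accuracy on the CPU and the MXA."
    )
    parser.add_argument("--model_name", required=True,
                        help="Model name as listed in Keras Applications")
    parser.add_argument("--num_images", type=int, default=50000,
                        help="Number of validation images to evaluate")
    # Assumption: the accepted values are 'cpu' and 'mxa', defaulting to 'mxa'.
    parser.add_argument("--backend", choices=["cpu", "mxa"], default="mxa",
                        help="Where to run inference")
    parser.add_argument("--dfp", default=None,
                        help="Path to an existing DFP; compile the model if omitted")
    return parser


# Parse the example command's arguments from an explicit list.
args = build_parser().parse_args([
    "--model_name", "MobileNet",
    "--num_images", "10",
    "--backend", "mxa",
    "--dfp", "MobileNet.dfp",
])
print(args.model_name, args.num_images, args.backend, args.dfp)
```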

Model Accuracy Table#

Below is a table listing various Keras models, their Top-1 and Top-5 accuracy scores, and links to download the DFP for each model if you want to skip the compilation step:

Accuracy Table#

| Model | Top-1 CPU (%) | Top-1 MXA (%) | Top-5 CPU (%) | Top-5 MXA (%) | DFP |
|---|---|---|---|---|---|
| DenseNet121 | 74.65 | 74.48 | 92.19 | 91.95 | DenseNet121.dfp |
| DenseNet169 | 76.05 | 75.83 | 93.02 | 92.88 | DenseNet169.dfp |
| DenseNet201 | 76.99 | 76.87 | 93.51 | 93.29 | DenseNet201.dfp |
| EfficientNetB0 | 76.99 | 76.67 | 93.4 | 93.17 | EfficientNetB0.dfp |
| EfficientNetB1 | 78.75 | 78.66 | 94.26 | 94.15 | EfficientNetB1.dfp |
| EfficientNetB2 | 79.79 | 79.74 | 94.87 | 94.72 | EfficientNetB2.dfp |
| EfficientNetV2B0 | 78.41 | 78.3 | 94.29 | 94.15 | EfficientNetV2B0.dfp |
| EfficientNetV2B1 | 79.67 | 79.5 | 94.8 | 94.64 | EfficientNetV2B1.dfp |
| EfficientNetV2B2 | 80.48 | 80.39 | 95.19 | 95.09 | EfficientNetV2B2.dfp |
| EfficientNetV2B3 | 81.79 | 81.76 | 95.81 | 95.62 | EfficientNetV2B3.dfp |
| InceptionV3 | 77.76 | 77.48 | 93.85 | 93.51 | InceptionV3.dfp |
| MobileNet | 70.65 | 68.82 | 89.59 | 88.23 | MobileNet.dfp |
| MobileNetV2 | 71.91 | 70.15 | 90.59 | 89.35 | MobileNetV2.dfp |
| RegNetX002 | 66.53 | 65.78 | 87.4 | 86.87 | RegNetX002.dfp |
| RegNetX004 | 70.86 | 70.28 | 89.74 | 89.5 | RegNetX004.dfp |
| RegNetX006 | 71.92 | 71.82 | 90.59 | 90.55 | RegNetX006.dfp |
| RegNetX008 | 73.01 | 72.96 | 91.39 | 91.32 | RegNetX008.dfp |
| RegNetX016 | 75.49 | 75.22 | 92.78 | 92.67 | RegNetX016.dfp |
| RegNetY002 | 67.72 | 66.92 | 87.98 | 87.58 | RegNetY002.dfp |
| RegNetY004 | 71.72 | 71.37 | 90.36 | 90.1 | RegNetY004.dfp |
| RegNetY006 | 73.21 | 72.72 | 91.53 | 91.28 | RegNetY006.dfp |
| RegNetY008 | 74.25 | 74.01 | 91.99 | 91.88 | RegNetY008.dfp |
| RegNetY016 | 76.59 | 76.4 | 93.36 | 93.19 | RegNetY016.dfp |
| ResNet50 | 74.32 | 74.24 | 91.74 | 91.52 | ResNet50.dfp |
| ResNet50V2 | 75.3 | 75.2 | 92.58 | 92.53 | ResNet50V2.dfp |
| Xception | 78.95 | 78.9 | 94.49 | 94.33 | Xception.dfp |
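To see how much accuracy each model gives up on the MXA, subtract the MXA column from the CPU column. The sketch below does this for five rows copied from the table above; `ROWS` and `accuracy_drops` are hypothetical names, and the values are in percentage points.

```python
# (model, top1_cpu, top1_mxa, top5_cpu, top5_mxa), copied from the table above
ROWS = [
    ("DenseNet121", 74.65, 74.48, 92.19, 91.95),
    ("MobileNet", 70.65, 68.82, 89.59, 88.23),
    ("MobileNetV2", 71.91, 70.15, 90.59, 89.35),
    ("ResNet50", 74.32, 74.24, 91.74, 91.52),
    ("Xception", 78.95, 78.90, 94.49, 94.33),
]


def accuracy_drops(rows):
    """Map each model to its (top1_drop, top5_drop) in percentage points."""
    return {
        name: (round(t1_cpu - t1_mxa, 2), round(t5_cpu - t5_mxa, 2))
        for name, t1_cpu, t1_mxa, t5_cpu, t5_mxa in rows
    }


drops = accuracy_drops(ROWS)
for name, (top1, top5) in drops.items():
    print(f"{name:12s} Top-1 drop: {top1:.2f}  Top-5 drop: {top5:.2f}")
```

For these rows the Top-1 gap ranges from 0.05 points (Xception) to 1.83 points (MobileNet).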

Third-Party License#

This tutorial uses third-party models available through the Keras Applications API. Below are the details of the licenses for these dependencies:

Summary#

This tutorial demonstrates how to replicate Keras Applications’ accuracy comparison on the ImageNet dataset while comparing performance between the MXA and the CPU.

You can download the full script here: keras_accuracy.py