Grad-CAM Implementation in pycaffe

You can find the code discussed in this post in this git repository.

This post discusses how to implement the Gradient-weighted Class Activation Mapping (Grad-CAM) approach described in the paper Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization.

Grad-CAM is a technique that makes Convolutional Neural Network (CNN) based models more interpretable by visualizing the input regions the model looks at while making a prediction.

Grad-CAM model architecture

I'm not going to go deeper into the paper here; for a more detailed explanation, please refer to the paper.
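
For orientation, these are the two equations from the paper that the code later in this post computes. Here y^c is the score of class c before the softmax, A^k is the k-th feature map of the chosen convolution layer, and Z is the number of spatial positions in that feature map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_k \alpha_k^c A^k \right)
```

The first equation averages the gradients over each feature map to get one weight per channel; the second is the weighted sum of the feature maps, with negative values clipped.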

You can find different implementations of this technique in Keras, Torch+Caffe, and Tensorflow.
However, I was not able to find a pycaffe implementation of Grad-CAM on the web. As pycaffe, the Python interface to Caffe, is commonly used for developing CNN-based classification models, it would be useful to have a pycaffe implementation as well.

If you are looking for a quick solution to interpret your Caffe classification model, this post is for you!

Install

If you are completely new to Caffe, refer to the official Caffe page for installation instructions and some tutorials. As we are going to use the Python interface to Caffe (pycaffe), make sure you install pycaffe as well. All the required instructions are given on the Caffe website.

Implementation

For this implementation, I'm using a pretrained image classification model from the Caffe Model Zoo.

For this example, I will use the BVLC reference CaffeNet model, which is trained to classify images into 1000 classes. To download the model, go to the folder where you installed Caffe (e.g. C:\Caffe) and run

 ./scripts/download_model_binary.py models/bvlc_reference_caffenet
./data/ilsvrc12/get_ilsvrc_aux.sh

Then let's write the gradCAM.py script

import numpy as np
import cv2
import caffe

# load the model
net = caffe.Net('---path to caffe installation folder---/models/bvlc_reference_caffenet/deploy.prototxt',
                '---path to caffe installation folder---/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel',
                caffe.TEST)

# load input and preprocess it
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_mean('data', np.load('--path to caffe installation folder--/python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1))
transformer.set_transpose('data', (2,0,1))
transformer.set_channel_swap('data', (2,1,0))
transformer.set_raw_scale('data', 255.0)

# Reshape the data blob to a batch of one, as we classify only one image
net.blobs['data'].reshape(1,3,227,227)

#load the image to the data layer of the model
im = caffe.io.load_image('--path to caffe installation folder--/examples/images/cat.jpg')
net.blobs['data'].data[...] = transformer.preprocess('data', im)

#classify the image
out = net.forward()

#predicted class
print (out['prob'].argmax())
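
The script above prints only the index of the winning class. If you also want readable names, here is a minimal sketch of a helper (top_k is a name I made up for this example, not part of pycaffe) that pairs probabilities with labels. The demo values are stand-ins; with the real model you would pass out['prob'][0] and the lines of data/ilsvrc12/synset_words.txt, which get_ilsvrc_aux.sh downloads.

```python
import numpy as np

def top_k(probs, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = np.asarray(probs).ravel()
    # argsort is ascending, so reverse it and keep the first k indices
    order = probs.argsort()[::-1][:k]
    return [(labels[i], float(probs[i])) for i in order]

# Hypothetical stand-in values, only to show the call
demo_probs = np.array([0.05, 0.70, 0.25])
demo_labels = ['dog', 'tabby cat', 'tiger cat']
print(top_k(demo_probs, demo_labels, k=2))
```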

Next, we have to calculate the gradient of the predicted class score with respect to the convolution layer of interest. This is the tricky part. The Caffe framework provides a built-in function

 net.backward()

to calculate gradients of the network. However, if you study the documentation of backward() function you would understand that, this method calculates gradients of  loss w.r.t. input layer (or as commonly used in Caffe 'data' layer).

To implement Grad-CAM, we need the gradients of the class score in the layer just before the softmax layer with respect to a convolution layer, preferably the last convolution layer. To achieve this, you have to modify the deploy.prototxt file: remove the softmax layer and add the following line just after the model name.

 force_backward: true
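
If you would rather not edit the file by hand, the two changes can be sketched as plain text processing. This is only an illustration under my own assumptions: patch_deploy_prototxt is a hypothetical helper, it does not use any Caffe API, and it assumes the model name sits on an unindented name: line and that layers are written as layer { ... } or layers { ... } blocks. Keep a backup of the original file before overwriting it.

```python
import re

def patch_deploy_prototxt(text):
    """Return prototxt text without Softmax layers and with force_backward enabled."""
    layer_start = re.compile(r'\blayers?\s*\{')
    kept, pos = [], 0
    while True:
        m = layer_start.search(text, pos)
        if m is None:
            kept.append(text[pos:])
            break
        # Walk forward to the brace that closes this layer block
        depth, end = 1, m.end()
        while depth > 0 and end < len(text):
            if text[end] == '{':
                depth += 1
            elif text[end] == '}':
                depth -= 1
            end += 1
        block = text[m.start():end]
        kept.append(text[pos:m.start()])
        # Keep every layer except plain Softmax (SoftmaxWithLoss is left alone)
        if not re.search(r'type:\s*"?Softmax"?(?!\w)', block, flags=re.I):
            kept.append(block)
        pos = end
    patched = ''.join(kept)
    if 'force_backward' not in patched:
        # Insert the flag right after the first unindented name: line
        patched = re.sub(r'^(name:.*)$', r'\1\nforce_backward: true',
                         patched, count=1, flags=re.M)
    return patched

sample = ('name: "CaffeNet"\n'
          'layer {\n  name: "fc8"\n  type: "InnerProduct"\n'
          '  inner_product_param { num_output: 1000 }\n}\n'
          'layer {\n  name: "prob"\n  type: "Softmax"\n'
          '  bottom: "fc8"\n  top: "prob"\n}\n')
print(patch_deploy_prototxt(sample))
```

To use it on the real model, read deploy.prototxt into a string, pass it through the function, and write the result to a new file that you then give to caffe.Net.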

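Reload the network with the modified deploy.prototxt and run the forward pass again, so that out now holds the 'fc8' scores instead of 'prob'. Then, using the following code snippet, we can derive Grad-CAM: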


final_layer = 'fc8'           # output layer whose gradients are being calculated
image_size = (227, 227)       # input image size
feature_map_shape = (13, 13)  # size of the feature map generated by 'conv5'
layer_name = 'conv5'          # convolution layer of interest
# Use the predicted class here, or set any other class index to get its saliency map
category_index = out['fc8'].argmax()

# Make the loss value class specific: seed the output diff with a one-hot vector
label = np.zeros(net.blobs[final_layer].data.shape)
label[0, category_index] = 1

imdiff = net.backward(diffs=['data', layer_name], **{net.outputs[0]: label})
gradients = imdiff[layer_name]  # gradients of the predicted class score w.r.t. the conv5 layer

#Normalizing gradients for better visualization
gradients = gradients/(np.sqrt(np.mean(np.square(gradients)))+1e-5)
gradients = gradients[0,:,:,:]

print("Gradients Calculated")

activations = net.blobs[layer_name].data[0, :, :, :] 

#Calculating importance of each activation map
weights = np.mean(gradients, axis=(1, 2))

cam = np.zeros(feature_map_shape, dtype=np.float32)  # weighted sum starts from zero, as in the paper

for i, w in enumerate(weights):
    cam += w * activations[i, :, :]    

#Let's visualize Grad-CAM
cam = cv2.resize(cam, image_size)
cam = np.maximum(cam, 0)
heatmap = cam / np.max(cam)
cam = cv2.applyColorMap(np.uint8(255 * heatmap), cv2.COLORMAP_JET) 

#We are going to overlay the saliency map on the image
new_image = cv2.imread('--path to caffe installation folder--/examples/images/cat.jpg')
new_image = cv2.resize(new_image, image_size)

cam = np.float32(cam) + np.float32(new_image)
cam = 255 * cam / np.max(cam)
cam = np.uint8(cam)

#Finally saving the result
cv2.imwrite("gradcam.jpg", cam) 
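
The core of the computation can also be isolated from Caffe and OpenCV. Below is a minimal NumPy sketch of it, assuming activations and gradients are arrays of shape (channels, height, width) such as the conv5 data and diff for one image. grad_cam_map is a name I chose for this example; it accumulates from zero as in the paper's formula, skips the gradient normalization step, and leaves resizing and colouring to the caller.

```python
import numpy as np

def grad_cam_map(activations, gradients):
    """Return an (H, W) heatmap in [0, 1] from (K, H, W) activations and gradients."""
    activations = np.asarray(activations, dtype=np.float32)
    gradients = np.asarray(gradients, dtype=np.float32)
    # One weight per channel: the spatial mean of its gradients
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps over the channel axis -> (H, W)
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only the regions with a positive influence on the class score
    cam = np.maximum(cam, 0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Tiny made-up input: two 2x2 feature maps with opposite gradient signs
acts = np.zeros((2, 2, 2), dtype=np.float32)
acts[0, 0, 0] = 1.0
acts[1, 1, 1] = 1.0
grads = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
print(grad_cam_map(acts, grads))
```

The guard on peak avoids a division by zero when no location has a positive contribution, in which case the map is returned as all zeros.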

That's it. If everything goes smoothly you will get the following result.



Input Image

Grad-CAM image

I hope this is helpful. If you need any clarification, please feel free to comment below; I'm happy to help.





