Computer Vision and Deep Learning - Part 2

Published in

Analytics Vidhya

7 min read · Feb 13, 2021

Introduction

In the previous post, I gave a general idea about images. Once you have understood the basics of an image, it's time to understand how to play with it.

For a human being, vision plays a significant role in performing daily chores. Likewise, robots need information about their environment in the form of images. If you followed the previous post of this series, you came across a few colourful images and their representation in different image formats. When we simply “process” an image, i.e. tweak properties like sharpness or stretching, we are performing image processing (more like preparing an image for a further application). Extracting “what is the subject of the image”, by contrast, is the aim of computer vision. I would first like to introduce the basic image processing operations that we can perform using OpenCV. I am introducing them at this point because you need to keep these operations handy when we deal with higher-level problem statements later.

Image Processing — Basic Operations

Tip: Try replicating the code I provide and explain by typing it out, NOT by copy-pasting it. Use pip install opencv-python on the command line to install OpenCV.

Changing Color Space:

As we discussed, the choice of colour space (BGR, HSV, HSL or any other) is application dependent. For colour detection in real-world applications, we often convert BGR to HSV because HSV separates the colour (hue) from the brightness (value), which makes it less sensitive to lighting variation. In the code below, we aim to detect an object based on its colour. Keep in mind that detection and tracking are two different concepts. In detection, the algorithm only detects the colour in each frame and keeps no record across time. That is, it does not care whether the object in the previous frame is the same as the object in the current frame. A tracker, on the other hand, makes sure it is detecting and following the same object across successive frames.

Main image. Observe how reflection changes the colour shade. Image credits: pinterest.com
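To make the point about lighting concrete, here is a small sketch that uses only Python's standard colorsys module, not OpenCV itself. The helper name to_opencv_hsv and the three sample colours are made up for illustration; the scaling (H in 0-179, S and V in 0-255) mirrors OpenCV's convention for 8-bit HSV images, although OpenCV's own rounding may differ slightly.

```python
import colorsys

def to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to HSV scaled like OpenCV's 8-bit images (H: 0-179, S and V: 0-255)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))

# Hypothetical samples of the "same" blue object under different lighting.
bright_blue = to_opencv_hsv(0, 0, 255)      # well-lit part of the object
shadow_blue = to_opencv_hsv(0, 0, 102)      # the same blue in shadow
glare_blue = to_opencv_hsv(120, 160, 255)   # washed out by a reflection

for name, hsv in [("bright", bright_blue), ("shadow", shadow_blue), ("glare", glare_blue)]:
    print(name, hsv)
```

The hue stays in the blue band for all three samples while S and V move around, which is why a colour detector can use a narrow range for H and wide ranges for S and V.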

import cv2
import numpy as np

# cv2.imread reads an image from the given path (it returns None if the file cannot be read)
cv_image = cv2.imread("/home/rupali/tutorials/blue_mug.jpg")

# cv2.cvtColor converts the colour space of the image, here from BGR to HSV
hsv = cv2.cvtColor(cv_image, cv2.COLOR_BGR2HSV)

# Range for H (100-140), S (50-255) and V (50-255) so that only blue is selected.
# For 8-bit images OpenCV stores H in the range 0-179.
lower_blue = np.array([100, 50, 50])
upper_blue = np.array([140, 255, 255])

# Build a black-and-white mask: white where the HSV pixel lies within the limits
mask = cv2.inRange(hsv, lower_blue, upper_blue)

# AND the colour image with itself (source 1 and source 2), keeping only the pixels
# where the mask is white; everything else turns black
result = cv2.bitwise_and(cv_image, cv_image, mask=mask)

cv2.imshow('main_image', cv_image)
cv2.imshow('Mask prepared', mask)
cv2.imshow('Result', result)
cv2.waitKey(10000)  # keep the windows open for 10 seconds or until a key is pressed
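If the masking step feels like magic, the following NumPy-only sketch shows the idea behind it. The functions in_range and apply_mask are hypothetical stand-ins written for this post, not OpenCV functions, and the two-pixel image is made-up data; they only imitate what cv2.inRange and cv2.bitwise_and with a mask produce for a simple case.

```python
import numpy as np

def in_range(hsv, lower, upper):
    """Return 255 where all three channels lie within [lower, upper] (inclusive), else 0."""
    inside = np.all((hsv >= lower) & (hsv <= upper), axis=2)
    return inside.astype(np.uint8) * 255

def apply_mask(image, mask):
    """Keep the pixels of image where mask is non-zero and blacken the rest."""
    return np.where(mask[..., None] > 0, image, 0).astype(image.dtype)

# A made-up 1x2 image: the first pixel is blue, the second is yellow.
bgr = np.array([[[255, 0, 0], [0, 255, 255]]], dtype=np.uint8)
hsv = np.array([[[120, 255, 255], [30, 255, 255]]], dtype=np.uint8)

mask = in_range(hsv, np.array([100, 50, 50]), np.array([140, 255, 255]))
result = apply_mask(bgr, mask)
print(mask)    # white only at the blue pixel
print(result)  # the blue pixel survives, the yellow one turns black
```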

Resizing Image and Image Thresholding:

As mentioned, the role of basic image processing operations is to prepare an image for further use. Resizing an image and converting a colour image into a binary one (every pixel either black or white) are two common practices. A binary image is much easier to work with than a 3-channel image.

Resize an image:

import cv2
import numpy as np

# Read the image using cv2.imread by providing the full path
cv_image = cv2.imread('/home/rupali/tutorials/tiger.jpg')

# Resize cv_image by scaling x by 0.5 and y by 0.5. To fix the output size instead,
# change None to the desired (width, height), e.g. (10, 10); fx and fy are then ignored.
# Interpolation is the resampling technique used when expanding or shrinking the image.
resize_image = cv2.resize(cv_image, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_CUBIC)

cv2.imshow('original_image', cv_image)
cv2.imshow('resize_image', resize_image)
cv2.waitKey(10000)
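To get a feel for what "sampling up or down" means, here is a rough NumPy sketch of the simplest possible scheme, nearest-neighbour resizing. It is not the bicubic interpolation (cv2.INTER_CUBIC) used above and it is not OpenCV's implementation; resize_nearest is a hypothetical helper written only to show how each output pixel is mapped back to a source pixel.

```python
import numpy as np

def resize_nearest(image, fx, fy):
    """Resize by scale factors fx (width) and fy (height) using nearest-neighbour sampling."""
    h, w = image.shape[:2]
    new_w = max(1, int(round(w * fx)))
    new_h = max(1, int(round(h * fy)))
    # For every output column/row, find the source column/row it falls on.
    cols = np.minimum((np.arange(new_w) / fx).astype(int), w - 1)
    rows = np.minimum((np.arange(new_h) / fy).astype(int), h - 1)
    return image[rows[:, None], cols]

image = np.arange(16, dtype=np.uint8).reshape(4, 4)
half = resize_nearest(image, 0.5, 0.5)    # shrink: keeps every second pixel
double = resize_nearest(image, 2.0, 2.0)  # expand: repeats every pixel twice
print(half)
print(double.shape)
```

Smarter interpolation methods blend several neighbouring source pixels instead of picking just one, which gives smoother results.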

Resize Result:

Resize image by 0.5 in height and 0.5 in width. Original Image credits: Pinterest.com

Image Thresholding:

The concept of thresholding is pretty straightforward. A threshold is a limit; let’s understand this with an example. Suppose that in a test marked out of 100, 40 is the threshold, i.e. if you score 40 or more you “pass”, and if you score less than 40 you “fail”. In image processing, “pass” (i.e. above the threshold) does not necessarily mean a fixed colour. It can be black or white, depending on what we are interested in.
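The pass/fail idea can be sketched in a few lines of NumPy. The function threshold_binary below is a hypothetical helper, not an OpenCV call; it imitates the rule of cv2.THRESH_BINARY (and, with inverse=True, cv2.THRESH_BINARY_INV), where a pixel passes only if it is strictly greater than the threshold.

```python
import numpy as np

def threshold_binary(gray, thresh, max_value=255, inverse=False):
    """Pixels strictly above thresh become max_value, the rest 0 (swapped if inverse)."""
    above = gray > thresh
    if inverse:
        above = ~above
    return np.where(above, max_value, 0).astype(np.uint8)

marks = np.array([[39, 40, 41]], dtype=np.uint8)
print(threshold_binary(marks, 40))                # "pass" is painted white
print(threshold_binary(marks, 40, inverse=True))  # "pass" is painted black
```

Because the comparison is strict, a score of exactly 40 does not pass here; to reproduce the "40 or more" rule of the exam example you would set the threshold to 39.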
OpenCV provides many techniques to perform thresholding. There are basic methods like cv2.THRESH_BINARY as well as more complex ones like cv2.THRESH_OTSU. So why do we need complex methods when the task can be done with a simple one? The answer is noise, variable lighting conditions and the difficulty of selecting the correct threshold. What if the exam paper was too tough for most students to score 40 marks? If students find it hard to cross the defined threshold, is setting 40 as the threshold still a wise choice? Complex methods are built on these basic ideas. Before I proceed to the code, I would like to introduce the concept of histograms. Weird stacks of sand, aren’t they?

Let’s read the graph with the help of the same example. Students who scored 50 marks form the largest group in the population (18 here), whereas students who scored as low as 15 or as high as 90 are very few (here 1 each). If we apply this concept to an image, the histogram tells us how many pixels (in place of the number of students) have each intensity value, and hence which intensities make up the majority of the image. The graph shown above is unimodal, i.e. a graph with just one peak. If there were two peaks, we would call it a bimodal graph.
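Counting "how many pixels have each intensity" takes one line of NumPy. In the sketch below, intensity_histogram is a hypothetical helper and the marks array is made-up data (it is not the data behind the graph above); for a real image you would pass the grayscale array instead.

```python
import numpy as np

def intensity_histogram(gray):
    """Count how many values fall on each of the 256 possible 8-bit levels."""
    return np.bincount(gray.ravel(), minlength=256)

# Made-up marks standing in for pixel intensities.
marks = np.array([15, 40, 50, 50, 50, 65, 90], dtype=np.uint8)
hist = intensity_histogram(marks)
peak = int(np.argmax(hist))
print("most common value:", peak, "occurring", int(hist[peak]), "times")
```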

Thresholding Code:

import cv2
import numpy as np
from matplotlib import pyplot as plt

# Read the main image; the flag 0 loads it in grayscale
cv_image = cv2.imread("/home/rupali/tutorials/tiger.jpg", 0)

# Basic binary threshold: pixels above 127 become 255 (white), the rest become 0 (black)
ret, one = cv2.threshold(cv_image, 127, 255, cv2.THRESH_BINARY)

# Adaptive thresholding: remember the exam paper that was too tough? The threshold
# for each pixel should depend on the intensity values of its neighbourhood.
# Syntax: cv2.adaptiveThreshold(source_image, maximum value, adaptive method, threshold type, block size, constant)
# Maximum value: the value assigned to pixels that pass the threshold.
# Threshold (adaptive mean) = mean of the neighbourhood (size given by block size) - constant
two = cv2.adaptiveThreshold(cv_image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)

# Same syntax as above. Threshold (Gaussian) = Gaussian-weighted sum of the neighbourhood - constant
three = cv2.adaptiveThreshold(cv_image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)

labels = ['Original Image(In gray scale)', 'Global Thresholding', 'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [cv_image, one, two, three]

for i in range(4):
    plt.subplot(2, 2, i + 1), plt.imshow(images[i], 'gray')
    plt.title(labels[i])
    plt.xticks([]), plt.yticks([])
plt.show()
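The formula in the comments (threshold = mean of the neighbourhood - constant) can be written out by hand. The NumPy sketch below is only an approximation of the idea, assuming edge-replicated borders and floating-point means; OpenCV's own rounding and border handling may differ, and adaptive_mean_threshold is a hypothetical helper written for this post.

```python
import numpy as np

def adaptive_mean_threshold(gray, block_size=11, c=2, max_value=255):
    """Compare each pixel with (mean of its block_size x block_size neighbourhood - c)."""
    pad = block_size // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    h, w = gray.shape
    local_sum = np.zeros((h, w), dtype=np.float64)
    # Add up the neighbourhood by shifting the padded image over every offset in the block.
    for dy in range(block_size):
        for dx in range(block_size):
            local_sum += padded[dy:dy + h, dx:dx + w]
    local_mean = local_sum / (block_size * block_size)
    return np.where(gray > local_mean - c, max_value, 0).astype(np.uint8)

# Made-up image: dark left half, bright right half, one slightly darker speck on the right.
gray = np.full((6, 8), 50, dtype=np.uint8)
gray[:, 4:] = 200
gray[3, 6] = 180

local = adaptive_mean_threshold(gray, block_size=3, c=2)
global_result = np.where(gray > 127, 255, 0).astype(np.uint8)
print(local[3, 6], global_result[3, 6])
```

The speck at row 3, column 6 is darker than its bright surroundings, so the adaptive version marks it black, while the global threshold of 127 cannot tell it apart from its neighbours.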

Result:

Complex Methods For Thresholding (Otsu’s Threshold):

As discussed, when an image is bimodal (two peaks in its histogram), choosing a single threshold by hand becomes difficult. In such cases, cv2.THRESH_OTSU comes in handy: it picks the threshold automatically from the histogram. For mathematical details, please refer to Otsu’s method.
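For readers who want a peek at the mathematics, here is a minimal NumPy sketch of the core idea of Otsu's method: try every possible threshold and keep the one that maximises the between-class variance of the two resulting groups of pixels. The function otsu_threshold and the synthetic pixel values are made up for illustration, and the result may not match cv2.threshold digit for digit on every image.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t (0-255) that maximises the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    weight_low = np.cumsum(prob)        # fraction of pixels with intensity <= t
    weight_high = 1.0 - weight_low      # fraction of pixels with intensity > t
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]

    between = np.zeros(256)
    valid = (weight_low > 1e-12) & (weight_high > 1e-12)  # skip thresholds with an empty class
    between[valid] = (total_mean * weight_low[valid] - cum_mean[valid]) ** 2 / (
        weight_low[valid] * weight_high[valid])
    return int(np.argmax(between))

# Synthetic bimodal data: one dark cluster (40-60) and one bright cluster (190-210).
dark = np.repeat(np.arange(40, 61), 100)
bright = np.repeat(np.arange(190, 211), 100)
gray = np.concatenate([dark, bright]).astype(np.uint8)

t = otsu_threshold(gray)
binary = np.where(gray > t, 255, 0).astype(np.uint8)
print("chosen threshold:", t)
```

The chosen threshold lands in the valley between the two clusters, so every dark pixel maps to 0 and every bright pixel to 255 without us supplying a threshold.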

Input image with histogram:

A bimodal image. Pixels with intensity 18 form the highest peak (12047 in number), followed by pixels with intensity 207 (4289 in number). Image credits: kisscc0

Code for Histogram Plot:

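import cv2
import numpy as np
from matplotlib import pyplot as plt

cv_image = cv2.imread("/home/rupali/tutorials/ostu5.jpg", 0)
# Flatten the image and plot the number of pixels at each of the 256 intensity levels
plt.hist(cv_image.ravel(), bins=256, range=(0, 256))
plt.show()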

Code for Otsu’s Thresholding:

import cv2
import numpy as np
from matplotlib import pyplot as plt

cv_image = cv2.imread("/home/rupali/tutorials/ostu5.jpg", 0)

# Adaptive mean thresholding, for comparison
one = cv2.adaptiveThreshold(cv_image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
# Otsu's thresholding: the threshold passed in (0) is ignored and the computed one is returned in ret2
ret2, two = cv2.threshold(cv_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Otsu's thresholding after smoothing the image with a 5x5 Gaussian blur
blur = cv2.GaussianBlur(cv_image, (5, 5), 0)
ret3, three = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Each row of the plot: source image, its histogram (placeholder 0), thresholded result
images = [cv_image, 0, one,
          cv_image, 0, two,
          blur, 0, three]
titles = ['Original Image', 'Histogram', 'Adaptive Mean Thresholding',
          'Original Image', 'Histogram', "Otsu's Thresholding",
          'Gaussian filtered Image', 'Histogram', "Otsu's Thresholding of Gaussian Blur Image"]

for i in range(3):  # range replaces Python 2's xrange
    plt.subplot(3, 3, i*3+1), plt.imshow(images[i*3], 'gray')
    plt.title(titles[i*3]), plt.xticks([]), plt.yticks([])
    plt.subplot(3, 3, i*3+2), plt.hist(images[i*3].ravel(), bins=256)
    plt.title(titles[i*3+1]), plt.xticks([]), plt.yticks([])
    plt.subplot(3, 3, i*3+3), plt.imshow(images[i*3+2], 'gray')
    plt.title(titles[i*3+2]), plt.xticks([]), plt.yticks([])
plt.show()

Observe the lowermost histogram closely. Blurring helps in removing noise (sharp changes in the intensity of neighbouring pixels). If the input image had just two shades of grey, you would get two sharp peaks rather than the valley shown above.
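To see the noise-reducing effect of a blur in numbers, here is a small NumPy sketch. It uses a 5x5 binomial kernel (1, 4, 6, 4, 1)/16 applied along rows and then columns as a stand-in for a Gaussian; this is an assumption made for illustration and is not guaranteed to match the kernel or border handling of cv2.GaussianBlur. The helper blur_5x5 and the noisy test image are made up.

```python
import numpy as np

def blur_5x5(gray):
    """Smooth with a separable 5-tap binomial kernel, an approximation of a Gaussian."""
    kernel = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0
    padded = np.pad(gray.astype(np.float64), 2, mode="reflect")
    h, w = gray.shape
    # Horizontal pass: weighted sum of five column-shifted copies.
    horizontal = sum(kernel[k] * padded[:, k:k + w] for k in range(5))
    # Vertical pass: weighted sum of five row-shifted copies.
    return sum(kernel[k] * horizontal[k:k + h, :] for k in range(5))

# Made-up flat grey image (intensity 100) with random noise added.
rng = np.random.default_rng(0)
noisy = 100 + rng.normal(0, 20, size=(64, 64))
blurred = blur_5x5(noisy)
print("spread before blur:", noisy.std())
print("spread after blur: ", blurred.std())
```

The average brightness stays about the same while the spread of the pixel values shrinks, which is what makes the peaks in the histogram narrower and the valley between them easier for Otsu's method to find.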

The ret3 variable in the Otsu’s threshold code holds the threshold value that the function computed and used for thresholding. In this case, the threshold value is 104.

Conclusion:

So today we understood how to perform masking (for any colour), resize an image, and apply thresholds, namely the binary threshold, adaptive mean, adaptive Gaussian and Otsu’s threshold, along with the need to blur an image. I must admit that was a lot to grasp for a newbie. I appreciate your efforts! In the next post of this series, I will introduce a few more basic image processing concepts before we attack the deep learning part. Till then, keep learning!