Computer Vision and Deep Learning - Part 4

Published in

Analytics Vidhya

9 min read · Sep 25, 2021

Introduction:

Welcome back to the series on computer vision and deep learning. In the past few posts you learned what an image is, how to perform basic operations such as resizing an image and changing its color space, and how gradients help locate boundaries in an image. All the operations we have performed so far required the machine to have zero knowledge of the subject of the image.

Fig 4.1 Beautiful Peacock Image credit: Visit

By subject I mean (refer to fig 4.1) the knowledge that a peacock is standing on a rock under a tree. As soon as a human looks at the image, these are the first few elements they notice. The human ability to recognize what is in an image is associated with human intelligence. To give a machine this ability, we need artificial intelligence. But before I direct these posts completely towards artificial intelligence, I want to share some fabulous algorithms from the 2D Features framework, introduced by geeks over time to understand an image. Why use a complex method when the same problem can be solved simply? Artificial intelligence applications are computationally heavy, so avoid them whenever a simpler method does the job; your life will be easier.

2D Features Framework:

As the title suggests, we will play around with the features of a 2D image. This framework comes in handy for object recognition and object tracking. Recall that we learned about gradients in the previous post: a sharp variation in pixel intensity as you move in any one direction tells us that an edge is present.

Fig 4.2 Vector Representation. Image credit : Visit

To store and manipulate it further, this data (the gradient detail) is kept in the form of vectors. The direction of a vector tells us the direction in which pixel intensity changes in the image, and the length of the vector tells us the magnitude of that change.
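As a small illustration of the vector idea, here is a minimal sketch in plain Python. It assumes we already have the horizontal and vertical intensity changes at one pixel (the names `gx` and `gy` and the helper `gradient_vector` are made up for this example) and turns them into a length and a direction.

```python
import math


def gradient_vector(gx, gy):
    """Return (magnitude, direction in degrees) of the gradient (gx, gy)."""
    magnitude = math.hypot(gx, gy)                # length of the vector
    direction = math.degrees(math.atan2(gy, gx))  # angle measured from the x axis
    return magnitude, direction


# hypothetical gradient: intensity rises by 3 along x and by 4 along y
mag, ang = gradient_vector(3.0, 4.0)
print(f"magnitude={mag:.2f}, direction={ang:.2f} degrees")
```

A long vector means a sharp intensity change (a likely edge), while a vector close to zero length means a flat region.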

Fig 4.3 Key areas and Feature explanation. Image credits: Visit

Consider figure 4.3. The main variation in pixel intensity is on the left side of the image, while the center and right side have nearly constant pixel intensity. Hence, we can conclude that the key area of the image is the left side. The components of an image that we take forward for further use are called features. In the image shown, the wings, tail, head, beak and legs of the bird are features that we might need to extract or match in later sections.

Fig 4.4 An 8x8 pixel section of the image is considered, and this 8x8 box is further divided into 4 blocks of 4x4 pixels each. Inside every 4x4 block, the image gradient is represented in the form of vectors. Key points are found by searching the image for the most unique or distinct features. Here, the key point descriptor is formed by combining the vectors of the 4 adjacent blocks, so it shows the direction and magnitude of gradient change in that section of the image. The area around the key point is normalized and a local descriptor is calculated for it. The local descriptor is a vector of numbers that describes the visual appearance of the key point. Image credit : Visit

The main subsections of the 2D Features framework are:

Feature Detection
Feature Matching

If you look at an image at a lower level, i.e. without focusing on its subject, the prominent features are corners, edges and flat regions with little variation in pixel intensity. Among these, the most important is a corner. Any guess why? You are right if you guessed that it is the intersection point of two edges. Think of the four corners of a rectangle: each one has its own pattern of intensity variation. The bottom-left corner looks like a capital L (a sharp intensity change both when moving from right to left and from top to bottom across the image), the bottom-right corner looks like the mirror image of a capital L, and the upper two follow in a similar way. A corner is a point with a strong gradient change in two directions because of the adjacent edges. OpenCV provides some really cool methods to detect corners in an image. Let's list them.

Feature Detection Methods:

a) Harris Corner Detection

b) Shi-Tomasi Corner Detection

c) ORB (Oriented FAST and Rotated BRIEF)

d) FAST algorithm for corner detection

e) SIFT (Scale Invariant Feature Transform)

f) SURF (Speeded Up Robust Features), PATENTED (Pay and Use)

(SURF is patented and not available for free commercial use. SIFT was patented as well, but its patent expired in 2020 and it has been part of the main OpenCV module since version 4.4. We will still skip the demonstration of these 2 methods.)

Fig 4.5 Image to be used for corner detection algorithm implementation. Image Credits: Visit

Harris Corner Detection:

Harris corner detection starts from the following function:

$$E(u, v) = \sum_{x, y} w(x, y) \, [I(x + u, y + v) - I(x, y)]^2$$

Let's understand what this equation means. E(u, v) is the function that needs to be maximized to detect a corner. The equation finds the difference in intensity for a displacement of (u, v) in all directions. For example, if we take u as 2 and v as 3, we keep varying the (x, y) values and calculate the squared difference in pixel intensity between I(x + 2, y + 3) and I(x, y). The window function w(x, y) that multiplies the intensity difference is a kernel (refer to the previous post if you don't understand kernels), here a rectangular or Gaussian window that gives weights to the pixels underneath.

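Applying a Taylor expansion to E(u, v) gives the approximation

$$E(u, v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$

where M stands for

$$M = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

Here I_x and I_y stand for the image derivatives in the x and y directions.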

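The main part of the method is the score R, which is

$$R = \det(M) - k \, (\operatorname{trace}(M))^2$$

Here k is a small empirical constant (0.04 in the code further down). R decides whether the window contains a corner, an edge or a flat region.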

If we consider L1 and L2 as the eigenvalues of the matrix M, then det(M) is the product of the eigenvalues (L1 x L2) and trace(M) is their sum (L1 + L2). As per the method, the value of R decides the content of the region under test:

If the magnitude of R is small, the region under consideration is flat (L1 and L2 are both small).
If R is negative, the region contains an edge (L1 is much bigger than L2, or vice versa).
If R is large, the region contains a corner (L1 and L2 are both large and comparable).

Fig 4.6 Graphical Representation of Harris corner Detection
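To make the three cases concrete, here is a minimal sketch in plain Python (not how OpenCV implements it). It assumes a uniform window, simple central differences for the derivatives, and the usual Harris score R = det(M) - k * trace(M)^2 with k = 0.04. The helper names and the tiny 5x5 patches are made up for illustration.

```python
def structure_tensor(patch):
    """Sum Ix^2, Ix*Iy and Iy^2 over the interior of a patch (uniform window)."""
    sxx = sxy = syy = 0.0
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            ix = (patch[y][x + 1] - patch[y][x - 1]) / 2.0  # derivative in x
            iy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0  # derivative in y
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    return sxx, sxy, syy


def harris_response(patch, k=0.04):
    """R = det(M) - k * trace(M)^2 for M = [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = structure_tensor(patch)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace


# hypothetical patches: constant, vertical step edge, bright bottom-right quadrant
flat = [[10] * 5 for _ in range(5)]
edge = [[0, 0, 0, 10, 10] for _ in range(5)]
corner = [[0] * 5 for _ in range(3)] + [[0, 0, 0, 10, 10] for _ in range(2)]

for name, patch in (("flat", flat), ("edge", edge), ("corner", corner)):
    print(name, harris_response(patch))
```

With these patches R comes out as zero for the flat patch, negative for the edge and positive for the corner, which matches the three rules listed above.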

Harris Corner Detection Code:

import cv2
import numpy as np

# read the original image
cv_image = cv2.imread("/home/rupali/tutorials/blue_flower.jpg")
# convert into gray scale
gray_image = cv2.cvtColor(cv_image, cv2.COLOR_BGR2GRAY)
# cornerHarris expects a float32 single-channel image
gray_image = np.float32(gray_image)
# syntax: cv2.cornerHarris(input_image, blockSize (neighborhood size), ksize (Sobel aperture size), k (Harris free parameter))
result_image = cv2.cornerHarris(gray_image, blockSize=2, ksize=3, k=0.04)
# dilate to highlight corners
result_image = cv2.dilate(result_image, None)
# mark in red every pixel whose response exceeds 1% of the maximum response
cv_image[result_image > 0.01 * result_image.max()] = [0, 0, 255]
cv2.imshow("harris", cv_image)
cv2.waitKey()

Shi-Tomasi Corner Detection Method:

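This method of corner detection is similar to Harris. In their paper Good Features to Track, Shi and Tomasi proposed a different scoring function and a way to find the N strongest corners in the image. The scoring function is given as

$$R = \min(L_1, L_2)$$

where L1 and L2 are again the eigenvalues of M. A region is accepted as a corner when R is above a threshold, i.e. when both eigenvalues are large enough.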

In practice, Shi-Tomasi often gives better results than Harris. Graphically we can visualize this method as

Shi-Tomasi Method to find strongest N corners
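Here is a minimal sketch of the Shi-Tomasi score in plain Python, taking the score to be the smaller eigenvalue of M (the usual formulation, R = min(L1, L2)). The helper names and the two example matrices are assumptions made for illustration, not values from a real image.

```python
import math


def eigenvalues_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2.0
    spread = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + spread, mean - spread


def shi_tomasi_score(a, b, c):
    """Shi-Tomasi score: the smaller of the two eigenvalues."""
    l1, l2 = eigenvalues_2x2(a, b, c)
    return min(l1, l2)


# hypothetical M for a corner-like window and for an edge-like window
corner_score = shi_tomasi_score(50.0, 25.0, 50.0)
edge_score = shi_tomasi_score(150.0, 0.0, 0.0)
print("corner:", corner_score, "edge:", edge_score)
```

The edge-like window has one large and one zero eigenvalue, so its score collapses to zero, while the corner-like window keeps a clearly positive score.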

import cv2

# input image
cv_img = cv2.imread('/home/rupali/tutorials/bird.jpg')
cv_gray = cv2.cvtColor(cv_img, cv2.COLOR_BGR2GRAY)
# syntax: cv2.goodFeaturesToTrack(input_image, maxCorners, qualityLevel, minDistance)
corners = cv2.goodFeaturesToTrack(cv_gray, maxCorners=25, qualityLevel=0.01, minDistance=10)
# corner coordinates come back as floats, but cv2.circle needs integer pixel positions
for item in corners:
    x, y = item.ravel()
    cv2.circle(cv_img, (int(x), int(y)), 3, (0, 0, 255), -1)
cv2.imshow("image", cv_img)
cv2.imwrite("shi_result.jpg", cv_img)
cv2.waitKey()

Result:

ORB (Oriented FAST and Rotated BRIEF):

This method is an efficient alternative to SIFT and SURF. ORB came into the picture mainly because SIFT and SURF were patented, so the need for an effective open source method resulted in ORB. ORB uses FAST to detect key points and BRIEF to compute image descriptors.

import numpy as np
import cv2

#input image
cv_img = cv2.imread('/home/rupali/tutorials/bird.jpg')
cv_gray = cv2.cvtColor(cv_img, cv2.COLOR_BGR2GRAY)
orb = cv2.ORB_create(nfeatures=200)
key_point, descriptors = orb.detectAndCompute(cv_gray, None)
keypoint_image = cv2.drawKeypoints(cv_img, key_point, None, color=(0, 0, 255), flags=0)
cv2.imshow("ORB", keypoint_image)
cv2.imwrite("orb.jpg", keypoint_image)
cv2.waitKey()

FAST Algorithm for Corner Detection:

The corner detection techniques discussed above are good but not fast enough for real-time applications. As a solution, the FAST (Features from Accelerated Segment Test) algorithm was proposed by Edward Rosten and Tom Drummond in their 2006 paper “Machine learning for high-speed corner detection” (later revised in 2010).

A summary of Features from Accelerated Segment Test is as follows:

Select a pixel p in the image which is to be tested as an interest point. Let its intensity be Ip.
Select an appropriate threshold value t.
Consider a circle of 16 pixels around the pixel under test (see the image below).
The pixel p is a corner if there exists a set of n contiguous pixels in the circle (of 16 pixels) which are all brighter than Ip+t, or all darker than Ip−t (shown as white dashed lines in the image). n was chosen to be 12.
A high-speed test was proposed to exclude a large number of non-corners. This test examines only the four pixels at positions 1, 9, 5 and 13 (first 1 and 9 are tested to see whether they are too bright or too dark; if so, 5 and 13 are checked). If p is a corner, then at least three of these must all be brighter than Ip+t or all darker than Ip−t. If neither is the case, then p cannot be a corner. The full segment test criterion can then be applied to the remaining candidates by examining all pixels in the circle.
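The steps above can be sketched in plain Python. This is only an illustration of the segment test, not OpenCV's implementation: it assumes the 16 circle pixels are already collected in a list `ring` in circular order (index 0 is position 1), and the function names are made up for this example.

```python
def longest_circular_run(flags):
    """Length of the longest run of True values, allowing wrap-around."""
    if all(flags):
        return len(flags)
    best = run = 0
    for flag in flags + flags:  # doubling the list handles the wrap-around
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def passes_high_speed_test(ring, ip, t):
    """Quick rejection using positions 1, 5, 9 and 13 (indices 0, 4, 8, 12)."""
    compass = [ring[i] for i in (0, 4, 8, 12)]
    brighter = sum(v > ip + t for v in compass)
    darker = sum(v < ip - t for v in compass)
    return brighter >= 3 or darker >= 3


def is_fast_corner(ring, ip, t, n=12):
    """Full segment test: n contiguous pixels all brighter or all darker."""
    if not passes_high_speed_test(ring, ip, t):
        return False
    brighter = [v > ip + t for v in ring]
    darker = [v < ip - t for v in ring]
    return longest_circular_run(brighter) >= n or longest_circular_run(darker) >= n


# hypothetical ring: 12 bright pixels followed by 4 pixels equal to the centre
print(is_fast_corner([150] * 12 + [100] * 4, ip=100, t=20))
```

The quick test is safe for n = 12 because any 12 contiguous pixels out of 16 always cover at least three of the four compass positions.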

FAST has a further problem: it tends to detect multiple interest points in adjacent locations. Non-Maximum Suppression is used to solve this.

In Non-Maximum Suppression, a score function V is computed for all the detected feature points.

V is the sum of the absolute differences between p (the pixel under consideration) and the values of the 16 surrounding pixels.
Consider two adjacent keypoints and compute their V values.
Discard the one with the lower V value.
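A minimal sketch of this scoring step in plain Python is shown here. The keypoint representation (a tuple of position, centre intensity and the 16 ring values) and the function names are assumptions for illustration only.

```python
def fast_score(ip, ring):
    """V: sum of absolute differences between the centre pixel and its ring."""
    return sum(abs(ip - v) for v in ring)


def keep_stronger(kp_a, kp_b):
    """Of two adjacent keypoints (position, ip, ring), keep the higher V."""
    score_a = fast_score(kp_a[1], kp_a[2])
    score_b = fast_score(kp_b[1], kp_b[2])
    return kp_a if score_a >= score_b else kp_b


# two hypothetical neighbouring detections
kp_a = ((10, 10), 100, [150] * 12 + [100] * 4)  # strong contrast
kp_b = ((11, 10), 100, [130] * 12 + [100] * 4)  # weaker contrast
winner = keep_stronger(kp_a, kp_b)
print("kept keypoint at", winner[0])
```

Here the first keypoint has the larger V, so it survives and its neighbour is discarded.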

In a nutshell, FAST is faster than many existing feature detectors but performs poorly in the presence of high levels of noise, because noise alters the pixel values that the segment test compares against the threshold.

import numpy as np
import cv2

#input image
cv_img = cv2.imread('/home/rupali/tutorials/bird.jpg')
cv_gray = cv2.cvtColor(cv_img, cv2.COLOR_BGR2GRAY)
fast = cv2.FastFeatureDetector_create()
# non-maximum suppression is switched off here, so clustered detections are kept
fast.setNonmaxSuppression(False)
keypoint = fast.detect(cv_gray, None)
keypoint_image = cv2.drawKeypoints(cv_img, keypoint, None, color=(0, 0, 255))
cv2.imshow("FAST", keypoint_image)
cv2.imwrite("fast.jpg", keypoint_image)
cv2.waitKey()

Feature Matching

The OpenCV documentation describes two feature matching methods: the Brute-Force matcher and the FLANN matcher.

Brute Force Matcher:

The Brute-Force matcher is simple. It takes the descriptor of one feature in the first set and matches it against all the features in the second set using some distance calculation (there are multiple ways to calculate the distance). The closest one is returned.
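To show the idea without any library, here is a minimal sketch of brute-force matching in plain Python. It assumes binary descriptors stored as integers and uses the Hamming distance (the number of differing bits). The 8-bit descriptors and function names are made up for illustration; real ORB descriptors are 256 bits long.

```python
def hamming(a, b):
    """Number of bit positions in which two binary descriptors differ."""
    return bin(a ^ b).count("1")


def brute_force_match(set1, set2):
    """For every descriptor in set1, return (index1, index2, distance) of its closest match in set2."""
    matches = []
    for i, d1 in enumerate(set1):
        j = min(range(len(set2)), key=lambda idx: hamming(d1, set2[idx]))
        matches.append((i, j, hamming(d1, set2[j])))
    return matches


# hypothetical 8-bit descriptors
set1 = [0b10110010, 0b00001111]
set2 = [0b00001110, 0b10110011, 0b11110000]
matches = brute_force_match(set1, set2)
print(matches)
```

Every descriptor in the first set is compared with every descriptor in the second set, which is why the method is simple but gets slow on large sets.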

import cv2

cv_img1 = cv2.imread('/home/rupali/tutorials/bird.jpg', 0)
cv_img2 = cv2.imread('/home/rupali/tutorials/bird_rotated.jpg', 0)
orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(cv_img1, None)
kp2, des2 = orb.detectAndCompute(cv_img2, None)
# matcher takes normType, which is set to cv2.NORM_L2 for SIFT and SURF, cv2.NORM_HAMMING for binary descriptors such as ORB, BRIEF and BRISK
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)
matches = sorted(matches, key=lambda x: x.distance)
# draw first 50 matches
match_img = cv2.drawMatches(cv_img1, kp1, cv_img2, kp2, matches[:50], None)
cv2.imshow('Matches', match_img)
cv2.imwrite("Match.jpg", match_img)
cv2.waitKey()

Brute Force Image feature Matcher between the original image and 300 degrees rotated image

Visibly good performance even in colored space

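FLANN Matcher:

FLANN stands for Fast Library for Approximate Nearest Neighbors; it speeds up matching on large sets of descriptors. The OpenCV tutorial pairs it with the SIFT detector to get the keypoints. To avoid SIFT and SURF, we will implement the matching operation with ORB descriptors and then use the matches to correct the distorted (rotated) image.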

import cv2
import numpy as np

FLANN_INDEX_LSH = 6  # LSH index, suited to binary descriptors such as ORB


def get_corrected_img(cv_img1, cv_img2):
    MIN_MATCHES = 50
    orb = cv2.ORB_create(nfeatures=500)
    kp1, des1 = orb.detectAndCompute(cv_img1, None)
    kp2, des2 = orb.detectAndCompute(cv_img2, None)
    index_params = dict(algorithm=FLANN_INDEX_LSH,
                        table_number=6,
                        key_size=12,
                        multi_probe_level=2)
    search_params = {}
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(des1, des2, k=2)
    # As per Lowe's ratio test to filter good matches
    # (the LSH index can return fewer than 2 neighbours, so skip incomplete pairs)
    good_matches = []
    for pair in matches:
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good_matches.append(pair[0])
    if len(good_matches) > MIN_MATCHES:
        src_points = np.float32([kp1[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
        dst_points = np.float32([kp2[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)
        homography, mask = cv2.findHomography(src_points, dst_points, cv2.RANSAC, 5.0)
        corrected_img = cv2.warpPerspective(cv_img1, homography, (cv_img2.shape[1], cv_img2.shape[0]))
        return corrected_img
    # not enough good matches to estimate a homography: return the input unchanged
    return cv_img1


if __name__ == "__main__":
    cv_im1 = cv2.imread('/home/rupali/tutorials/bird.jpg')
    cv_im2 = cv2.imread('/home/rupali/tutorials/bird_rotated.jpg')
    img = get_corrected_img(cv_im2, cv_im1)
    cv2.imshow('Corrected image', img)
    cv2.imwrite("corrected_image.jpg", img)
    cv2.waitKey()

300 Degree rotated image presented as per the orientation of the base image.
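The ratio test used in the FLANN code is easy to try on its own. Here is a minimal sketch in plain Python: it assumes each feature comes with the distances to its best and second-best candidates, and it keeps a match only when the best is clearly closer than the runner-up. The distances and the function name are made up for illustration.

```python
def ratio_test(knn_distances, ratio=0.75):
    """Return indices of features whose best match beats the second best by the given ratio."""
    kept = []
    for index, distances in enumerate(knn_distances):
        if len(distances) < 2:
            continue  # no second neighbour to compare with, so skip
        best, second = distances[0], distances[1]
        if best < ratio * second:
            kept.append(index)
    return kept


# hypothetical (best, second best) distances for four features
knn = [(10, 40), (30, 32), (5,), (12, 20)]
good = ratio_test(knn)
print(good)
```

The second feature is dropped because its two candidates are almost equally close, which usually means the match is ambiguous.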

Conclusion:

So that was it with the core OpenCV applications! As a reminder, the discussion was restricted to the most useful features of OpenCV. In the upcoming posts I will start with machine learning in general and explain why I decided to go ahead with deep learning. It's gonna be fun! Until then, stay tuned and keep learning!