Computer Vision and Deep Learning - Part 3

Rupali Garewal · Published in Analytics Vidhya · Feb 21, 2021 · 9 min read

Introduction:

Welcome back to this series on robotics, with computer vision and deep learning as the prime focus. In this post, I will introduce you to some slightly more advanced image-processing concepts, namely image smoothing, morphological transformations, image gradients and Canny edge detection. Before we get started, believe me, these are simple concepts, and once you understand one, the others become predictable.

Image 3.1 Original Image. Image credit : Vix

Consider the image of a happy kid shown above; we will use it to understand the concepts in this post.

Image Smoothing:

Image 3.2 The 10x10 matrix of pixel intensity values shown above is a rough estimate of the values present in the yellow box of the grayscale image. The matrix is intended to make the concept easier to understand.

Image smoothing is used to remove noise from an image, so first we need to understand what noise is. Consider image 3.2 and observe the yellow-coloured circles closely. Do you notice something odd about them? Take the yellow circle around the intensity value 90: every pixel around it has an intensity above 200, yet this pixel, out of nowhere, scores 90! The key is the pattern of increase or decrease in intensity. If there is no pattern, just a sudden glitch, it is noise.

Now consider the red-marked pixel intensity. Sunlight falling on the kid's arm shows up as a fine white line, represented by the intensity value 255. The values to the right of the red-marked area are lower than those to the left. Moreover, as you follow the lower blue arrow, intensity decreases continuously (the image gets darker), whereas along the upper blue arrow it increases continuously. You will always find this continuity in values until a new entity appears in the frame. Observe the pink rectangular box: to the left of 255 is 230 (the kid in the frame), whereas to the right is 124 (the green background). This sudden change in intensity (230 → 124) still has continuity on either side: the pixel intensities stay in the order of 230 on the left, and after dropping to 124 they keep decreasing steadily on the right. Such a dramatic but consistent change in pixel intensity marks the edge of a new object in the frame. So we can conclude: if a change in intensity appears out of nowhere, it is noise; otherwise, the changed pixel intensity implies a newly detected edge (boundary).
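To make the distinction concrete, here is a minimal sketch (the intensity values are illustrative, loosely inspired by image 3.2, not read from the actual picture). A sliding median removes an isolated glitch but keeps a genuine step in intensity:

import numpy as np
# one row of pixel intensities: a bright region, one noisy pixel (90), then a genuine edge (255 -> 124)
row = np.array([230, 228, 226, 90, 224, 222, 255, 124, 120, 116], dtype=np.uint8)
# a 3-wide sliding median replaces each pixel with the median of its neighbourhood
padded = np.pad(row, 1, mode='edge')
median_3 = np.array([np.median(padded[i:i + 3]) for i in range(len(row))])
print(median_3)
# the isolated 90 disappears (it was noise), while the large drop to 124 survives (it is an edge)

This is exactly the intuition behind the median filter described further below.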

Kernels and 2D Convolution:

Kernels are the filters that are convolved with an image to obtain the desired result. The elements of a kernel determine the technique used for smoothing an image. Consider the images below to understand how 2D convolution is performed using 2D kernels.

KERNEL SLIDES FROM LEFT END TO THE RIGHT END OF THE IMAGE(OVER ENTIRE WIDTH OF THE IMAGE)
Image 3.3 AFTER COMPLETING EACH ROW, THE KERNEL SLIDES DOWN ONE ROW AND TRAVERSES FROM LEFT TO RIGHT AGAIN

The kernel is shown in red for better understanding. Each element of the kernel is multiplied with the image pixel it overlaps, the products are summed, and the result is placed at the yellow-coloured pixel. This way the kernel slides across the entire width and height of the image.
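If you want to try this sliding-window operation yourself, cv2.filter2D convolves an image with any kernel you define. Here is a minimal sketch (the image path is just a placeholder) using a 3x3 kernel of ones divided by 9, which simply averages each neighbourhood:

import cv2
import numpy as np
cv_image = cv2.imread("your_image.jpg")  # placeholder path, replace with your own image
# 3x3 averaging kernel: every element is 1/9, so the sum of products is the neighbourhood mean
kernel = np.ones((3, 3), np.float32) / 9
# ddepth = -1 keeps the output in the same depth as the input image
convolved = cv2.filter2D(cv_image, -1, kernel)
cv2.imshow("2D Convolution", convolved)
cv2.waitKey(1000)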

Image Smoothing techniques:

a) Average : cv2.blur(input_image, kernel_dimension_for_average) ; the average of all the pixel values under the kernel is placed at the yellow pixel.

b) Gaussian Blur : cv2.GaussianBlur(input_image, kernel_size, sigmaX[, output_image[, sigmaY[, borderType]]]) ; a Gaussian-weighted kernel is used, so pixels near the centre of the kernel contribute more to the result than those near its border.

c) Median Filter : cv2.medianBlur(input_image, size_of_kernel) ; effective in removing salt-and-pepper noise. The median of the values under the kernel is placed at the yellow pixel.

d) Bilateral Filtering : cv2.bilateralFilter(input_image, pixel_neighbourhood_diameter, sigma_color, sigma_space) ; the bilateral filter retains edges while removing noise. A Gaussian filter in the space domain removes noise, while a second, multiplicative Gaussian (a function of pixel-intensity difference) ensures that only pixels with similar intensities are averaged, which preserves edges.

Image 3.4 NOISY INPUT IMAGE
import cv2
import numpy as np
cv_image = cv2.imread("/home/rupali/tutorials/noise.jpg")
# average filter with a 5x5 kernel
average_filter = cv2.blur(cv_image,(5,5))
# Gaussian filter with a 5x5 kernel; sigmaX = 0 lets OpenCV compute sigma from the kernel size
gaussian_filter = cv2.GaussianBlur(cv_image,(5,5),0)
# median filter with a 5x5 aperture
median_filter = cv2.medianBlur(cv_image, 5)
# bilateral filter: neighbourhood diameter 9, sigma_color 75, sigma_space 75
bilateral_f = cv2.bilateralFilter(cv_image,9,75,75)
cv2.imshow("original_image", cv_image)
cv2.imshow("Average Filter", average_filter)
cv2.imshow("Gaussian Filter", gaussian_filter)
cv2.imshow("Median Filter", median_filter)
cv2.imshow("Bilateral Filter", bilateral_f)
cv2.waitKey(1000)
Image 3.5 LEFT AVERAGE FILTER , RIGHT GAUSSIAN FILTER
Image 3.6 LEFT MEDIAN FILTER, RIGHT BILATERAL FILTER(edges are distinctly retained)

MORPHOLOGICAL TRANSFORMATION:

Morphological transformations are performed on an image to enhance its structure and texture. The 2D convolution of a kernel with the image is the crux of these operations. The structuring element (kernel) can be a rectangle, an ellipse, a diamond or a circle, depending on the application (see the sketch after the results below). In the following code, we use a 5x5 kernel filled with ones.

import cv2
import numpy as np
cv_image = cv2.imread("/home/rupali/tutorials/noise.jpg")
# you can select kernel of your choice, a structuring element can be a circle or ellipse
kernel = np.ones((5,5),np.uint8)
# erosion keeps a zero at the yellow pixel (refer to the 2D convolution concept) unless all image pixels under the kernel are 1
ero = cv2.erode(cv_image,kernel,iterations = 1)
# dilation is the opposite: if even one image pixel under the kernel is 1, a 1 is kept at the yellow-coloured pixel
dil = cv2.dilate(cv_image,kernel, iterations = 1)
# opening = erosion followed by dilation; opening reduces noise in the image
open_ = cv2.morphologyEx(cv_image, cv2.MORPH_OPEN, kernel)
# closing = dilation followed by erosion; closing fills small dark holes inside the foreground
close_ = cv2.morphologyEx(cv_image, cv2.MORPH_CLOSE, kernel)
# morphological gradient = dilation minus erosion; gives the outline of objects (applied here to the opened image)
gradient = cv2.morphologyEx(open_, cv2.MORPH_GRADIENT, kernel)
cv2.imshow("erosion", ero)
cv2.imshow("Dilate", dil)
cv2.imshow("opening", open_)
cv2.imshow("closing", close_)
cv2.imshow("Gradient", gradient)
cv2.waitKey(1000)

Result:

Image 3.7 (Starting from left)Noise input image, Erosion result, Dilation result
Image 3.8 (Starting from left) Opening result, Closing result, Gradient result
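As mentioned above, the structuring element does not have to be a square of ones. Here is a minimal sketch (assuming the same 5x5 size) of how elliptical or cross-shaped kernels can be built with cv2.getStructuringElement and reused in the calls above:

import cv2
# elliptical and cross-shaped 5x5 structuring elements
ellipse_kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
cross_kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5))
print(ellipse_kernel)
# any of the morphological calls above accept these kernels, for example:
# ero = cv2.erode(cv_image, ellipse_kernel, iterations = 1)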

Image Gradients:

Image 3.9 Concept to find slope as dy/dx. Image credits: VivaxSolutions

A gradient is another term for slope. Recall the concept of finding a slope as dy/dx, i.e. the change in y for a small change in the x-direction. In images, to find the gradient we consider the change in pixel intensity for a small change in either the x-direction or the y-direction. But why are we learning about gradients/slopes/derivatives? The answer is edges. Whenever there is a sharp change in pixel intensity (as discussed in the image smoothing topic), that is, a “good” slope, it implies the presence of an edge.
There are various built-in functions available in OpenCV to find the gradient of an image, namely cv2.Sobel(), cv2.Scharr() and cv2.Laplacian().
All of these functions are high-pass filters, i.e. they suppress low-frequency data and pass high-frequency data. In terms of images, high-pass filters help sharpen an image by enhancing contrast where there is a major change in brightness or darkness while ignoring small variations. Hence these functions help in detecting “good” slopes in an image.
The Sobel function 2D-convolves the input image with a Sobel kernel to obtain the derivative of the image; the operation combines Gaussian smoothing with differentiation. For a more accurate derivative with a 3x3 kernel, Scharr introduced an optimised kernel, which OpenCV uses when ksize is set to -1. We can find the derivative in the x-direction as well as the y-direction by setting the x and y orders.

Sobel Scharr Derivative Code:

import cv2
import numpy as np
#input image in gray scale
cv_image = cv2.imread('/home/rupali/tutorials/happy_kid.jpg',0)
# x order is 1, y order is 1; ksize = 3 uses the 3x3 Sobel kernel (pass ksize = -1 to use the Scharr kernel instead)
sobel = cv2.Sobel(cv_image, cv2.CV_64F, 1,1,ksize = 3)
cv2.imshow("sobel", sobel)
cv2.imwrite("sobel.jpg", sobel)
cv2.waitKey(1000)

Results:

Image 3.10 Original Image and Its Sobel Derivative
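A small side note (a sketch that continues from the Sobel code block above, not part of the original pipeline): since the result is stored as 64-bit floats and may contain negative values, you may want to take absolute values and convert to 8-bit before displaying or saving it as a regular image:

# continuing from the Sobel code above
sobel_8u = cv2.convertScaleAbs(sobel)  # absolute value, scaled and clipped into the 0-255 range of uint8
cv2.imshow("sobel_8u", sobel_8u)
cv2.imwrite("sobel_8u.jpg", sobel_8u)
cv2.waitKey(1000)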

Laplacian Derivative:

Formula used for the double-derivative calculation with respect to x and y: Laplacian(f) = ∂²f/∂x² + ∂²f/∂y²

To compute the Laplacian of a 2D function (here, the happy-kid image) in OpenCV, cv2.Laplacian() is used. One may wonder why we use the double derivative for finding edges when the same was possible with single differentiation. The reason is that the second-order derivative has a stronger response to fine detail, i.e. thin lines (that is why we first smooth an image and then apply the Laplacian). The second-order derivative also produces a double response at a step change in grey level, which helps in detecting zero crossings.

Image 3.11 Zero Crossing for second derivative of an image. Image Credits : mipav.cit

Consider image 3.11, shown along with its first and second derivatives. In the intensity profile of the binary image, the transition between dark and bright is represented as a slope: positive when the transition is from dark to bright and negative when it is the other way around. In the first derivative, these edges (where black meets white) appear as positive (black to white) and negative (white to black) spikes. When we take the second derivative of the same function, we find zero crossings wherever edges are present in the image.
For successful image enhancement, we need to combine techniques for the best results. No one technique is superior to another; the detail in an image decides what combination will give the best enhancement results.
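A tiny worked example of the zero-crossing idea (the numbers are illustrative, not taken from image 3.11). Take a one-dimensional black-to-white step and difference it twice:

import numpy as np
# a black-to-white step: intensity jumps from 0 to 255
step = np.array([0, 0, 0, 255, 255, 255], dtype=np.int32)
first = np.diff(step)    # [0, 0, 255, 0, 0]  -> a single positive spike at the edge
second = np.diff(first)  # [0, 255, -255, 0]  -> a positive/negative pair; the zero crossing between them marks the edge
print(first, second)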

Laplacian Code:

import cv2
import numpy as np
cv_image = cv2.imread("/home/rupali/tutorials/happy_kid.jpg")
cv_image = cv2.GaussianBlur(cv_image,(3,3),0)
# the resultant image is stored as 64F (or 16S) to retain both positive and negative values; storing the output directly in 8U would lose the negative slope values
laplace = cv2.Laplacian(cv_image, cv2.CV_64F)
cv2.imshow("Laplace_result", laplace)
cv2.imwrite("Laplace.jpg", laplace)
cv2.waitKey(10000)
Image 3.12 Original Image and its Laplacian derivative

Canny Edge Detection:

Just like the other edge-detection techniques we have come across, Canny edge detection first removes noise and then detects edges. What is new in this technique is the method Canny used to decide whether an edge is a real edge or just a set of noisy pixels. The thresholding technique aims to make sure unnecessary regions are not included in the edge and that the edge is as thin as possible. With reference to the gradient-detection graph (image 3.9), the angle of the gradient is arctan(dy/dx).
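To see where that angle comes from in code, here is a minimal sketch (the image path is a placeholder, and the Canny function below computes all of this internally, so this is only for intuition):

import cv2
import numpy as np
img = cv2.imread("happy_kid.jpg", 0)  # placeholder path, read in grayscale
# derivatives in x and y, kept as 64-bit floats so negative slopes survive
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize = 3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize = 3)
# gradient magnitude and direction; the angle is arctan(dy/dx), used during non-maximum suppression
magnitude, angle = cv2.cartToPolar(gx, gy, angleInDegrees = True)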

Image 3.13 Non- maximum suppression(Left) and Hysteresis Thresholding(Right). Image credit : Opencv-python-tutorials

Consider image 3.13. While following the direction of the gradient (as shown by the arrow through C, A and B), we need to decide whether the regions at point C and point B should be included in the edge representation (made white in the binary image). For this, refer to the hysteresis-thresholding figure, where two shades of grey mark the upper and lower threshold limits. If a pixel has a gradient value higher than the upper limit (maxVal), it is assigned one (white); if lower than the lower limit (minVal), it is assigned zero (black). If the value lies between the two limits, the concept of connectivity comes into the picture. In image 3.13 (right), point C is connected to point A (a sure edge), so point C is assigned one (white), whereas point B is not connected to any sure-edge pixel and is given the value zero (black).

Canny Edge Detection Code:

import cv2
import numpy as np
cv_image = cv2.imread("/home/rupali/tutorials/happy_kid.jpg", 0)
# the Canny function takes Gaussian blurring and edge detection into account
canny_result = cv2.Canny(cv_image,100,200)
# 100 is the lower threshold limit (minVal) and 200 is the upper limit (maxVal)
cv2.imshow("Canny Result", canny_result)
cv2.imwrite("Canny.jpg", canny_result)
cv2.waitKey(1000)

Result:

Image 3.14 Original Image and Its Canny Detection Result

Conclusion:

I admire your persistence in taking all of the above in one go! I know it wasn't easy, but the effort is worth what lies next in this series. Today, in a nutshell, we learnt how to enhance an image by removing noise and outlining the boundaries of its content. In the upcoming posts, we will step up the level of complexity in computer vision. Stay tuned and keep learning!
