Image Basics with OpenCV

In this article, I am going to discuss Image Basics with OpenCV. Please read our previous article where we discussed Images and NumPy in Computer Vision.

Image Basics with OpenCV

OpenCV (Open Source Computer Vision Library) is a popular library for computer vision applications, including image and video processing. In this article, we’ll explore the basics of image processing with OpenCV.

An image is a collection of pixels, where each pixel contains a value representing the color or grayscale intensity of that pixel. In OpenCV, images can be loaded from files or captured from a camera. Here’s an example of how to load an image using OpenCV:

import cv2

# Load an image from a file
img = cv2.imread('example_image.jpg')

In this example, we use the cv2.imread() function to load an image file. The function returns a NumPy array representing the image (or None if the file cannot be found or read), which we can then manipulate and process using OpenCV functions. Once we have loaded an image, we can access properties such as its height, width, and number of channels using the shape attribute of the NumPy array:

# Get the height, width, and number of channels of the image
height, width, channels = img.shape

In this example, we use the shape attribute of the NumPy array to get the height, width, and number of channels of the image, in that order. We can also display the image using the imshow() function:

# Display the image
cv2.imshow('Example Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this example, we use the cv2.imshow() function to display the image in a window with the title “Example Image”. Passing 0 to cv2.waitKey() makes it wait indefinitely for a key press, and cv2.destroyAllWindows() then closes all OpenCV windows.

We can perform a wide range of operations on images using OpenCV functions. For example, we can convert an image to grayscale using the cvtColor() function:

# Convert the image to grayscale
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

In this example, we use the cv2.cvtColor() function to convert the image from its default BGR color space to grayscale. We can also apply filters to an image, such as blurring or sharpening, using functions such as GaussianBlur() and filter2D():

import numpy as np  # needed to define the sharpening kernel

# Apply a Gaussian blur filter to the image
blurred_img = cv2.GaussianBlur(img, (5, 5), 0)

# Apply a sharpening filter to the image
kernel = np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]])
sharpened_img = cv2.filter2D(img, -1, kernel)

In these examples, we use the cv2.GaussianBlur() function to apply a 5×5 Gaussian blur to the image, and the cv2.filter2D() function to apply a sharpening kernel; the -1 argument tells filter2D() to keep the same depth (data type) as the source image.

In conclusion, OpenCV provides powerful tools for image processing and computer vision applications. By loading an image as a NumPy array and using OpenCV functions, we can manipulate and process the image in a wide range of ways, from basic operations such as color conversion and filtering to advanced computer vision techniques.

Drawing on Images

Drawing on images is a crucial aspect of image processing and computer vision. OpenCV provides several functions for drawing different shapes and text on images, such as lines, rectangles, circles, and polygons. In this article, we’ll explore how to draw on images using OpenCV in Python. Firstly, let’s load an image using OpenCV’s imread() function:

import cv2

# Load an image
img = cv2.imread('example_image.jpg')

In this example, we use the imread() function to load an image file named “example_image.jpg” and store it in the img variable. To draw on the image, we can use various OpenCV functions, such as line(), rectangle(), circle(), and putText(). Let’s look at some examples:

# Draw a line on the image
cv2.line(img, (0, 0), (100, 100), (0, 255, 0), 2)

# Draw a rectangle on the image
cv2.rectangle(img, (50, 50), (150, 150), (0, 0, 255), 2)

# Draw a circle on the image
cv2.circle(img, (200, 200), 50, (255, 0, 0), 2)

# Draw text on the image
cv2.putText(img, 'OpenCV Rocks!', (250, 250), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)

In this example, we draw a green line, a red rectangle, a blue circle, and white text on the image using the line(), rectangle(), circle(), and putText() functions, respectively. Each function takes specific parameters, such as the coordinates of the shape, its color (specified in BGR order), and the thickness of its border. After drawing on the image, we can display it using the imshow() function:

# Display the image
cv2.imshow('Drawn Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this example, we use the imshow() function to display the modified image in a window with the title “Drawn Image”. The waitKey(0) call waits indefinitely for a key press, and the destroyAllWindows() function closes all OpenCV windows.

In conclusion, drawing on images using OpenCV is a simple process that involves loading an image and using various OpenCV functions to draw shapes and text on the image. By understanding how to draw on images, we can annotate images with additional information and create visualizations to aid in our computer vision tasks.

Direct Drawing on Images with a Mouse – Advanced Image Processing

Directly drawing on images using a mouse is an advanced technique in image processing that can be useful for a variety of applications, such as image annotation, object tracking, and object detection. In this article, we will explore how to directly draw on images using a mouse in Python with the OpenCV library. First, we need to import the necessary libraries:

import cv2
import numpy as np

Next, we will create a function called draw_circle that will draw circles on an image at the position of the mouse click:

def draw_circle(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONUP:
        cv2.circle(img, (x, y), 20, (0, 0, 255), -1)

In this function, we check whether the left mouse button was released (cv2.EVENT_LBUTTONUP). If it was, we draw a filled circle (a thickness of -1 fills the shape) on the image at the mouse position (x, y) with a radius of 20 pixels and a red color in BGR, (0, 0, 255). Now, we will create an image and set up a window to display it:

img = np.zeros((512, 512, 3), np.uint8)
cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)

In this code, we create a 512×512 image with a black background using the NumPy library. We then create a window called ‘image’ and register draw_circle as its mouse callback function. Finally, we will display the image in a loop and wait for a keyboard event to exit the window:

while True:
    cv2.imshow('image', img)
    if cv2.waitKey(20) & 0xFF == 27:
        break
cv2.destroyAllWindows()

In this code, we use a while loop to continuously redraw the image and check for keyboard input. The imshow function displays the image, and waitKey(20) waits up to 20 milliseconds for a key press. We exit the loop and destroy all OpenCV windows when the Esc key is pressed (its key code is 27; the & 0xFF mask keeps only the lowest byte of the value returned by waitKey).

In conclusion, drawing on images directly with a mouse is an advanced technique that can be useful for a variety of applications in image processing. By understanding how to implement this technique using OpenCV in Python, we can annotate images with precision and ease, thereby enhancing our ability to perform complex image-processing tasks.

Color Mappings

Color mappings, also known as color spaces, are used in image processing to represent color information in a standardized way. There are several color mappings used in computer vision and image processing, each with its own advantages and disadvantages. In this article, we will explore some of the most common color mappings used in image processing.

RGB Color Space:

The RGB (Red, Green, Blue) color space is the most common color mapping used in computer vision and image processing. It represents colors by combining three primary colors (red, green, and blue) in varying intensities to create a range of colors. In the RGB color space, each pixel is represented by three values ranging from 0 to 255, representing the intensity of red, green, and blue respectively.
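
One practical detail worth noting is that OpenCV loads color images with the channels in BGR order rather than RGB. As a minimal sketch (assuming an image file named 'example_image.jpg' that is at least 101 pixels wide and tall), we can read a single pixel and unpack its blue, green, and red intensities:

import cv2

img = cv2.imread('example_image.jpg')

# OpenCV stores color pixels in BGR order, each channel in the range 0 to 255
b, g, r = img[100, 100]
print('Blue:', b, 'Green:', g, 'Red:', r)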

HSV Color Space:

HSV (Hue, Saturation, Value) color space is another popular color mapping used in image processing. It is a cylindrical representation of the RGB color space, which makes it more intuitive to understand and work with. In the HSV color space, hue represents the color itself, saturation represents the purity or intensity of the color, and the value represents the brightness of the color. This makes it easier to manipulate the color of an image without affecting its brightness.

Grayscale Color Space:

Grayscale is a single-channel color space that represents an image in shades of gray. In the grayscale color space, each pixel is represented by a single value ranging from 0 to 255, with 0 being black and 255 being white. Grayscale images are commonly used in image processing for edge detection and other operations that do not require color information.

YCrCb Color Space:

YCrCb (luma, chroma) color space is used for image and video compression. In this color space, Y represents the brightness of the image (luma), while Cr and Cb represent the red-difference and blue-difference color components (chroma). This color space separates the brightness and color information of an image, making it easier to compress without a noticeable loss of quality.
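
In OpenCV, converting between these color spaces is typically done with the cvtColor() function. The sketch below (assuming a BGR image loaded from a hypothetical file 'example_image.jpg') converts the same image into the color spaces discussed above:

import cv2

img = cv2.imread('example_image.jpg')  # loaded in BGR order by default

# Convert to the other color spaces discussed above
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ycrcb_img = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)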

Color mappings play a vital role in image processing, as they allow us to manipulate images in a standardized way. By understanding the different color mappings and their applications, we can better process and manipulate images to achieve our desired results.

Blending and Pasting Images

Blending and pasting images are common techniques used in image processing to combine multiple images into a single composite image. These techniques can be used to create artistic effects, remove unwanted objects from images, or enhance the visual appeal of an image. In this article, we will explore how to blend and paste images using Python and the OpenCV library.

Blending Images:

Blending images involves combining two or more images by assigning a weight to each image. Each pixel of the output is the weighted sum of the corresponding input pixels: dst = alpha * img1 + beta * img2 + gamma, where gamma is an optional scalar added to every pixel. The weights determine how much influence each image has on the final result. In OpenCV, blending can be achieved with the addWeighted function, which requires the input images to have the same size and type.

import cv2

# Both images must have the same size and type for addWeighted()
img1 = cv2.imread('image1.jpg')
img2 = cv2.imread('image2.jpg')
blended_img = cv2.addWeighted(img1, 0.5, img2, 0.5, 0)
cv2.imshow('Blended Image', blended_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this code, we load two images (img1 and img2) and blend them with equal weights of 0.5 using the addWeighted function; the final argument (0) is a scalar added to every pixel. The resulting image (blended_img) is displayed using the imshow function.

Pasting Images:

Pasting images involves inserting one image into another at a specific location. In OpenCV, this can be achieved by selecting a region of interest (ROI) with NumPy slicing and combining the images with masking operations.

import cv2

img1 = cv2.imread('image1.jpg')
img2 = cv2.imread('image2.jpg')

# Select a region of interest (ROI) in img1 the same size as img2
rows, cols, channels = img2.shape
roi = img1[0:rows, 0:cols]

# Build a binary mask of img2: foreground pixels become white, background black
img2gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 10, 255, cv2.THRESH_BINARY)
mask_inv = cv2.bitwise_not(mask)

# Black out the foreground area in the ROI, and keep only the foreground of img2
img1_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)
img2_fg = cv2.bitwise_and(img2, img2, mask=mask)

# Combine background and foreground, then paste the result back into img1
dst = cv2.add(img1_bg, img2_fg)
img1[0:rows, 0:cols] = dst

cv2.imshow('Pasted Image', img1)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this code, we load two images (img1 and img2) and insert img2 into img1 at the top-left corner. We first extract a region of interest (ROI) from img1 that is the same size as img2. We then convert img2 to grayscale and create a binary mask using the threshold function. We then invert the mask using the bitwise_not function and extract the background of img1 using the bitwise_and function. We also extract the foreground of img2 using the bitwise_and function. We then combine the background and foreground using the add function and paste the resulting image onto img1.

In conclusion, blending and pasting images are common techniques used in image processing to combine multiple images into a single composite image. These techniques can be used to create artistic effects, remove unwanted objects from images, or enhance the visual appeal of an image. By understanding how to blend and paste images using Python and the OpenCV library, we can better manipulate and enhance images to achieve our desired results.

Advanced Image Thresholding

Thresholding is a fundamental operation in image processing that involves the separation of an image into two classes based on a threshold value. It is commonly used to segment images, i.e., to identify regions of interest within an image. Advanced image thresholding techniques use more complex algorithms to improve the accuracy and reliability of image segmentation.

In this article, we will discuss some advanced image thresholding techniques that are commonly used in computer vision and image processing applications.

Otsu’s Method

Otsu’s method is a widely used thresholding technique based on maximizing the between-class variance of the pixel intensities. The algorithm computes the histogram of the image and then iterates through all possible threshold values, selecting the one that maximizes the between-class variance; that threshold is then used to segment the image.

One of the advantages of Otsu’s method is that it determines the threshold value automatically, making it useful in applications where the threshold is not known a priori. Otsu’s method is also relatively fast. Because it operates on the intensity histogram, it is typically applied to grayscale images; a color image is usually converted to grayscale first.
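
In OpenCV, Otsu’s method can be applied with the threshold() function by adding the THRESH_OTSU flag. Here is a minimal sketch, assuming a grayscale image loaded from a hypothetical file 'example_image.jpg':

import cv2

# Load the image directly as grayscale
gray = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)

# The threshold value of 0 is ignored; Otsu's method computes the value automatically
ret, otsu_img = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('Otsu threshold value:', ret)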

Adaptive Thresholding

Adaptive thresholding is a technique that adjusts the threshold value based on the local characteristics of the image. This is particularly useful in cases where the illumination conditions of the image are not uniform, as it allows for better segmentation of the image.

The algorithm works by dividing the image into small regions, and then computing a threshold value for each region based on its local statistics. The threshold value for each region is then used to segment the image.

Adaptive thresholding is useful in applications where the illumination conditions of the image vary, such as in microscopy or surveillance applications.
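
In OpenCV, adaptive thresholding is available through the adaptiveThreshold() function, which thresholds each pixel against the mean (or Gaussian-weighted mean) of its local neighborhood. A minimal sketch, assuming a grayscale input image and illustrative parameter values:

import cv2

gray = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)

# Threshold each pixel against the Gaussian-weighted mean of its 11x11 neighborhood,
# minus a small constant offset of 2 (block size and offset are tunable)
adaptive_img = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                     cv2.THRESH_BINARY, 11, 2)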

Mean Shift Thresholding

Mean shift thresholding is a technique that is based on the mean shift algorithm, which is commonly used in computer vision applications for image segmentation and object tracking.

The algorithm works by iteratively shifting the mean of the image’s intensity distribution until a stable threshold value is obtained. The threshold value is then used to segment the image.

One of the advantages of mean shift thresholding is that it is robust to noise and can handle non-uniform illumination conditions. Additionally, mean shift thresholding can be used with both grayscale and color images.
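
OpenCV does not provide a single mean shift thresholding function, but one way to sketch the idea (an assumed pipeline, not the only possible formulation) is to flatten the image with pyrMeanShiftFiltering() and then threshold the smoothed result, for example with Otsu’s method:

import cv2

img = cv2.imread('example_image.jpg')  # color image in BGR order

# Mean shift filtering merges regions of similar color
# (spatial window radius 21 and color window radius 51 are illustrative values)
shifted = cv2.pyrMeanShiftFiltering(img, 21, 51)

# Threshold the smoothed image; Otsu's method picks the cutoff automatically
gray = cv2.cvtColor(shifted, cv2.COLOR_BGR2GRAY)
ret, thresh_img = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)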

Conclusion

Advanced image thresholding techniques have become increasingly important in computer vision and image processing applications. Otsu’s method, adaptive thresholding, and mean shift thresholding are just a few examples of the many advanced image thresholding techniques that are available.

When selecting an advanced thresholding technique for a particular application, it is important to consider factors such as the illumination conditions of the image, the complexity of the image, and the desired level of accuracy. With the right choice of thresholding technique, it is possible to achieve highly accurate and reliable image segmentation for a wide range of applications.

In the next article, I am going to discuss Introduction to Video Basics. Here, in this article, I try to explain Image Basics with OpenCV. I hope you enjoy this Image Basics with OpenCV article. Please post your feedback, suggestions, and questions about this article.
