OpenCV CheatSheet & Comprehensive Guide
OpenCV (Open Source Computer Vision Library) is an open-source computer vision and
machine learning software library. OpenCV was built to provide a common infrastructure for
computer vision applications and to accelerate the use of machine perception in commercial
products. The library contains more than 2500 optimized algorithms, including a
comprehensive set of both classic and state-of-the-art computer vision and machine learning
algorithms. These algorithms can be used for various purposes such as detecting and
recognizing faces, identifying objects, classifying human actions in videos, tracking camera
movements, tracking moving objects, extracting 3D models of objects, stitching images
together to produce a high-resolution image of an entire scene, finding similar images from an
image database, removing red-eye, following eye movements, and much more.
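OpenCV's Python bindings ship as the opencv-python package; the extra modules used later in this guide (cv2.face, cv2.bgsegm, cv2.xfeatures2d) come with opencv-contrib-python. A minimal sketch to verify the installation before working through the examples:
# In a shell, install one of:
#   pip install opencv-python          # core modules only
#   pip install opencv-contrib-python  # core plus extra modules (cv2.face, cv2.bgsegm, ...)
import cv2
# Print the library version to confirm the bindings import correctly
print(cv2.__version__)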
Cheat Sheet Table
Method Name Definition
cv2.imread Reads an image from a file.
cv2.imwrite Writes an image to a file.
cv2.imshow Displays an image in a window.
cv2.cvtColor Converts an image from one color space to another.
cv2.GaussianBlur Applies a Gaussian blur to an image.
cv2.Canny Applies the Canny edge detector to an image.
cv2.HoughLines Detects lines in a binary image using the Hough Transform.
cv2.HoughCircles Detects circles in a grayscale image using the Hough Transform.
cv2.findContours Finds contours in a binary image.
cv2.drawContours Draws contours on an image.
cv2.rectangle Draws a rectangle on an image.
cv2.circle Draws a circle on an image.
cv2.line Draws a line on an image.
cv2.putText Puts text on an image.
cv2.resize Resizes an image to the specified dimensions.
cv2.warpAffine Applies an affine transformation to an image.
cv2.warpPerspective Applies a perspective transformation to an image.
cv2.getRotationMatrix2D Computes the affine matrix for rotating an image by a specified angle.
cv2.getAffineTransform Computes the affine transformation matrix from three pairs of points.
cv2.getPerspectiveTransform Computes the perspective transformation matrix from four pairs of points.
cv2.dilate Applies the dilation operation to an image.
cv2.erode Applies the erosion operation to an image.
cv2.morphologyEx Applies advanced morphological transformations to an image.
cv2.threshold Applies a fixed-level threshold to an image.
cv2.adaptiveThreshold Applies an adaptive threshold to an image.
cv2.equalizeHist Equalizes the histogram of a grayscale image.
cv2.calcHist Computes the histogram of an image.
cv2.compareHist Compares two histograms.
cv2.matchTemplate Compares a template image with a source image using a specific method.
cv2.VideoCapture Opens a video file or a capturing device.
cv2.VideoWriter Writes video frames to a video file.
cv2.calcOpticalFlowPyrLK Computes sparse optical flow for a set of points using the Lucas-Kanade method.
cv2.calcOpticalFlowFarneback Computes dense optical flow using the Farneback method.
cv2.dnn.readNet Reads a deep learning network model from a file.
cv2.dnn.blobFromImage Converts an image to a blob for input into a deep learning network.
cv2.dnn_Net.forward Runs a forward pass of the deep learning network.
cv2.CascadeClassifier Detects objects using a cascade classifier.
cv2.face.LBPHFaceRecognizer_create Creates an LBPH face recognizer.
cv2.face.FisherFaceRecognizer_create Creates a Fisher face recognizer.
cv2.face.EigenFaceRecognizer_create Creates an Eigen face recognizer.
cv2.bgsegm.createBackgroundSubtractorMOG Creates a MOG background subtractor.
cv2.createBackgroundSubtractorMOG2 Creates a MOG2 background subtractor.
cv2.createBackgroundSubtractorKNN Creates a KNN background subtractor.
cv2.bgsegm.createBackgroundSubtractorGMG Creates a GMG background subtractor.
cv2.FastFeatureDetector_create Creates a FAST feature detector.
cv2.ORB_create Creates an ORB feature detector and descriptor extractor.
cv2.SIFT_create Creates a SIFT feature detector and descriptor extractor.
cv2.xfeatures2d.SURF_create Creates a SURF feature detector and descriptor extractor (opencv-contrib, non-free).
cv2.BRISK_create Creates a BRISK feature detector and descriptor extractor.
cv2.drawKeypoints Draws keypoints on an image.
Explanation and Usage
1. cv2.imread
Reads an image from a file.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Display the image
cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
2. cv2.imwrite
Writes an image to a file.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Write the image to a new file
cv2.imwrite('path/to/new_image.jpg', image)
3. cv2.imshow
Displays an image in a window.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Display the image
cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
4. cv2.cvtColor
Converts an image from one color space to another.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Display the grayscale image
cv2.imshow('Gray Image', gray_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
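Note that cv2.imread returns images in BGR channel order, so a conversion to RGB is usually needed before displaying them with other libraries. A small sketch, assuming matplotlib is available:
import cv2
import matplotlib.pyplot as plt
# Read an image (OpenCV loads it in BGR order)
image = cv2.imread('path/to/image.jpg')
# Convert BGR to RGB so the colors appear correctly in matplotlib
rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(rgb_image)
plt.show()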
5. cv2.GaussianBlur
Applies a Gaussian blur to an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Apply Gaussian blur
blurred_image = cv2.GaussianBlur(image, (15, 15), 0)
# Display the blurred image
cv2.imshow('Blurred Image', blurred_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
6. cv2.Canny
Applies the Canny edge detector to an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Apply Canny edge detector
edges = cv2.Canny(image, 100, 200)
# Display the edges
cv2.imshow('Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()
7. cv2.HoughLines
Detects lines in a binary image using the Hough Transform.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Apply Canny edge detector
edges = cv2.Canny(image, 50, 150, apertureSize=3)
# Detect lines using Hough Transform
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)
# Draw the lines on the image (HoughLines returns None if nothing is found)
if lines is not None:
    for line in lines:
        rho, theta = line[0]
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a * rho
        y0 = b * rho
        x1 = int(x0 + 1000 * (-b))
        y1 = int(y0 + 1000 * (a))
        x2 = int(x0 - 1000 * (-b))
        y2 = int(y0 - 1000 * (a))
        cv2.line(image, (x1, y1), (x2, y2), (0, 0, 255), 2)
# Display the result
cv2.imshow('Hough Lines', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
8. cv2.HoughCircles
Detects circles in a grayscale image using the Hough Transform.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Apply Gaussian blur
blurred_image = cv2.GaussianBlur(image, (9, 9), 2)
# Detect circles using Hough Transform
circles = cv2.HoughCircles(blurred_image, cv2.HOUGH_GRADIENT, 1, 20,
                           param1=50, param2=30, minRadius=0, maxRadius=0)
if circles is not None:
    # Convert the circles to integers
    circles = np.uint16(np.around(circles))
    # Draw the circles on the image
    for i in circles[0, :]:
        cv2.circle(image, (i[0], i[1]), i[2], (0, 255, 0), 2)
        cv2.circle(image, (i[0], i[1]), 2, (0, 0, 255), 3)
# Display the result
cv2.imshow('Hough Circles', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
9. cv2.findContours
Finds contours in a binary image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Apply threshold to get a binary image
_, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
# Find contours
contours, hierarchy = cv2.findContours(binary_image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Draw the contours on the image
cv2.drawContours(image, contours, -1, (0, 255, 0), 3)
# Display the result
cv2.imshow('Contours', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
10. cv2.drawContours
Draws contours on an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Apply threshold to get a binary image
_, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
# Find contours
contours, hierarchy = cv2.findContours(binary_image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Draw the contours on the image
cv2.drawContours(image, contours, -1, (0, 255, 0), 3)
# Display the result
cv2.imshow('Drawn Contours', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
11. cv2.rectangle
Draws a rectangle on an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Draw a rectangle on the image
cv2.rectangle(image, (50, 50), (200, 200), (255, 0, 0), 3)
# Display the result
cv2.imshow('Rectangle', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
12. cv2.circle
Draws a circle on an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Draw a circle on the image
cv2.circle(image, (150, 150), 50, (0, 255, 0), 3)
# Display the result
cv2.imshow('Circle', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
13. cv2.line
Draws a line on an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Draw a line on the image
cv2.line(image, (100, 100), (300, 300), (0, 0, 255), 3)
# Display the result
cv2.imshow('Line', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
14. cv2.putText
Puts text on an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Put text on the image
cv2.putText(image, 'OpenCV', (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
# Display the result
cv2.imshow('Text', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
15. cv2.resize
Resizes an image to the specified dimensions.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Resize the image
resized_image = cv2.resize(image, (400, 300))
# Display the result
cv2.imshow('Resized Image', resized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
16. cv2.warpAffine
Applies an affine transformation to an image.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg')
# Define the transformation matrix (translate 100 px right, 50 px down)
M = np.float32([[1, 0, 100], [0, 1, 50]])
# Apply affine transformation
transformed_image = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))
# Display the result
cv2.imshow('Affine Transformation', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
17. cv2.warpPerspective
Applies a perspective transformation to an image.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg')
# Define the points for perspective transformation
pts1 = np.float32([[50, 50], [200, 50], [50, 200], [200, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250], [300, 200]])
# Compute the perspective transformation matrix
M = cv2.getPerspectiveTransform(pts1, pts2)
# Apply perspective transformation
transformed_image = cv2.warpPerspective(image, M, (image.shape[1], image.shape[0]))
# Display the result
cv2.imshow('Perspective Transformation', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
18. cv2.getRotationMatrix2D
Computes the affine matrix for rotating an image by a specified angle.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Compute the rotation matrix (rotate 45 degrees about the image center, scale 1)
M = cv2.getRotationMatrix2D((image.shape[1]/2, image.shape[0]/2), 45, 1)
# Apply rotation
rotated_image = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))
# Display the result
cv2.imshow('Rotated Image', rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
19. cv2.getAffineTransform
Computes the affine transformation matrix from three pairs of points.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg')
# Define the points for affine transformation
pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250]])
# Compute the affine transformation matrix
M = cv2.getAffineTransform(pts1, pts2)
# Apply affine transformation
transformed_image = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))
# Display the result
cv2.imshow('Affine Transformation', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
20. cv2.getPerspectiveTransform
Computes the perspective transformation matrix from four pairs of points.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg')
# Define the points for perspective transformation
pts1 = np.float32([[50, 50], [200, 50], [50, 200], [200, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250], [300, 200]])
# Compute the perspective transformation matrix
M = cv2.getPerspectiveTransform(pts1, pts2)
# Apply perspective transformation
transformed_image = cv2.warpPerspective(image, M, (image.shape[1], image.shape[0]))
# Display the result
cv2.imshow('Perspective Transformation', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
21. cv2.dilate
Applies the dilation operation to an image.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Define the kernel
kernel = np.ones((5, 5), np.uint8)
# Apply dilation
dilated_image = cv2.dilate(image, kernel, iterations=1)
# Display the result
cv2.imshow('Dilated Image', dilated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
22. cv2.erode
Applies the erosion operation to an image.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Define the kernel
kernel = np.ones((5, 5), np.uint8)
# Apply erosion
eroded_image = cv2.erode(image, kernel, iterations=1)
# Display the result
cv2.imshow('Eroded Image', eroded_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
23. cv2.morphologyEx
Applies advanced morphological transformations to an image.
Usage:
import cv2
import numpy as np
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Define the kernel
kernel = np.ones((5, 5), np.uint8)
# Apply morphological transformations (opening: erosion followed by dilation)
morph_image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
# Display the result
cv2.imshow('Morphological Transformations', morph_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
24. cv2.threshold
Applies a fixed-level threshold to an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Apply threshold
_, thresh_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
# Display the result
cv2.imshow('Threshold Image', thresh_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
25. cv2.adaptiveThreshold
Applies an adaptive threshold to an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Apply adaptive threshold
adaptive_thresh_image = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                              cv2.THRESH_BINARY, 11, 2)
# Display the result
cv2.imshow('Adaptive Threshold Image', adaptive_thresh_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
26. cv2.equalizeHist
Equalizes the histogram of a grayscale image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Equalize histogram
equalized_image = cv2.equalizeHist(image)
# Display the result
cv2.imshow('Equalized Image', equalized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
27. cv2.calcHist
Computes the histogram of an image.
Usage:
import cv2
import matplotlib.pyplot as plt
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Compute histogram
hist = cv2.calcHist([image], [0], None, [256], [0, 256])
# Plot the histogram
plt.plot(hist)
plt.show()
28. cv2.compareHist
Compares two histograms.
Usage:
import cv2
# Read two images
image1 = cv2.imread('path/to/image1.jpg', cv2.IMREAD_GRAYSCALE)
image2 = cv2.imread('path/to/image2.jpg', cv2.IMREAD_GRAYSCALE)
# Compute histograms
hist1 = cv2.calcHist([image1], [0], None, [256], [0, 256])
hist2 = cv2.calcHist([image2], [0], None, [256], [0, 256])
# Compare histograms (correlation: 1.0 means identical)
comparison = cv2.compareHist(hist1, hist2, cv2.HISTCMP_CORREL)
print('Histogram Comparison Result:', comparison)
29. cv2.matchTemplate
Compares a template image with a source image using a specific method.
Usage:
import cv2
# Read the source image
source_image = cv2.imread('path/to/source_image.jpg')
# Read the template image
template_image = cv2.imread('path/to/template_image.jpg')
# Perform template matching
result = cv2.matchTemplate(source_image, template_image, cv2.TM_CCOEFF_NORMED)
# Get the location of the best match
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
# Draw a rectangle around the matched region
top_left = max_loc
bottom_right = (top_left[0] + template_image.shape[1], top_left[1] + template_image.shape[0])
cv2.rectangle(source_image, top_left, bottom_right, (0, 255, 0), 2)
# Display the result
cv2.imshow('Matched Template', source_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
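One caveat: for the squared-difference methods (cv2.TM_SQDIFF and cv2.TM_SQDIFF_NORMED) the best match is the minimum of the result, so min_loc is used instead of max_loc. A minimal sketch of that variant:
import cv2
source_image = cv2.imread('path/to/source_image.jpg')
template_image = cv2.imread('path/to/template_image.jpg')
# With a squared-difference method, lower values mean a better match
result = cv2.matchTemplate(source_image, template_image, cv2.TM_SQDIFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
top_left = min_loc  # best match location for TM_SQDIFF / TM_SQDIFF_NORMED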
30. cv2.VideoCapture
Opens a video file or a capturing device.
Usage:
import cv2
# Open a video file
cap = cv2.VideoCapture('path/to/video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('Video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
31. cv2.VideoWriter
Writes video frames to a video file.
Usage:
import cv2
# Define the codec and create VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (640, 480))
# Open a video file
cap = cv2.VideoCapture('path/to/video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    out.write(frame)
    cv2.imshow('Video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
out.release()
cv2.destroyAllWindows()
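A common pitfall with cv2.VideoWriter is that frames whose size does not match the (width, height) passed to the constructor are typically not written, leaving an unplayable file. A small sketch of the resize guard that is often added before out.write, assuming the same 640x480 writer as above:
import cv2
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (640, 480))
cap = cv2.VideoCapture('path/to/video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Resize each frame to the exact size the writer was created with
    frame = cv2.resize(frame, (640, 480))
    out.write(frame)
cap.release()
out.release()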
32. cv2.calcOpticalFlowPyrLK
Computes sparse optical flow for a set of feature points using the iterative Lucas-Kanade method.
Usage:
import cv2
import numpy as np
# Read the first frame
cap = cv2.VideoCapture('path/to/video.mp4')
ret, old_frame = cap.read()
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)
# Parameters for Lucas-Kanade optical flow
lk_params = dict(winSize=(15, 15), maxLevel=2,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))
# Detect corners to track
feature_params = dict(maxCorners=100, qualityLevel=0.3, minDistance=7, blockSize=7)
p0 = cv2.goodFeaturesToTrack(old_gray, mask=None, **feature_params)
# Create a mask image for drawing purposes
mask = np.zeros_like(old_frame)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Calculate optical flow
    p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)
    # Select good points
    good_new = p1[st == 1]
    good_old = p0[st == 1]
    # Draw the tracks
    for i, (new, old) in enumerate(zip(good_new, good_old)):
        a, b = new.ravel()
        c, d = old.ravel()
        mask = cv2.line(mask, (int(a), int(b)), (int(c), int(d)), (0, 255, 0), 2)
        frame = cv2.circle(frame, (int(a), int(b)), 5, (0, 255, 0), -1)
    img = cv2.add(frame, mask)
    cv2.imshow('Optical Flow', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    # Update the previous frame and previous points
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1, 1, 2)
cap.release()
cv2.destroyAllWindows()
33. cv2.calcOpticalFlowFarneback
Computes dense optical flow using the Farneback method.
Usage:
import cv2
import numpy as np
# Open a video file
cap = cv2.VideoCapture('path/to/video.mp4')
ret, frame1 = cap.read()
prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
while cap.isOpened():
    ret, frame2 = cap.read()
    if not ret:
        break
    next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    # Calculate optical flow
    flow = cv2.calcOpticalFlowFarneback(prvs, next, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    # Convert the flow to HSV color space for visualization
    hsv = np.zeros_like(frame1)
    hsv[..., 1] = 255
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang * 180 / np.pi / 2
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    cv2.imshow('Optical Flow', rgb)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    prvs = next
cap.release()
cv2.destroyAllWindows()
34. cv2.dnn.readNet
Reads a deep learning network model from a file.
Usage:
import cv2
# Load a pre-trained model
net = cv2.dnn.readNet('path/to/model.caffemodel', 'path/to/model.prototxt')
# Read an image
image = cv2.imread('path/to/image.jpg')
# Prepare the image for the model
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(224, 224), mean=(104, 117, 123))
# Set the input to the model
net.setInput(blob)
# Perform forward pass
output = net.forward()
print(output)
35. cv2.dnn.blobFromImage
Converts an image to a blob for input into a deep learning network.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg')
# Convert the image to a blob
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(224, 224), mean=(104, 117, 123))
print(blob.shape)
36. cv2.dnn_Net.forward
Runs a forward pass of the deep learning network.
Usage:
import cv2
# Load a pre-trained model
net = cv2.dnn.readNet('path/to/model.caffemodel', 'path/to/model.prototxt')
# Read an image
image = cv2.imread('path/to/image.jpg')
# Prepare the image for the model
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(224, 224), mean=(104, 117, 123))
# Set the input to the model
net.setInput(blob)
# Perform forward pass
output = net.forward()
print(output)
37. cv2.CascadeClassifier
Detects objects using a cascade classifier.
Usage:
import cv2
# Load the cascade classifier
face_cascade = cv2.CascadeClassifier('path/to/haarcascade_frontalface_default.xml')
# Read an image
image = cv2.imread('path/to/image.jpg')
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Detect faces
faces = face_cascade.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5)
# Draw rectangles around the faces
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x+w, y+h), (255, 0, 0), 2)
# Display the result
cv2.imshow('Detected Faces', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
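The Haar cascade XML files ship with the opencv-python package, so the path does not have to be hard-coded; cv2.data.haarcascades points at the bundled directory. A small sketch:
import cv2
# Load the bundled frontal-face cascade via cv2.data.haarcascades
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
# empty() returns False when the cascade file was found and loaded
print(face_cascade.empty())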
38. cv2.face.LBPHFaceRecognizer_create
Creates an LBPH face recognizer (part of the cv2.face module in opencv-contrib-python).
Usage:
import cv2
# Create an LBPH face recognizer
recognizer = cv2.face.LBPHFaceRecognizer_create()
# Train the recognizer with training data
# recognizer.train(training_images, training_labels)
# Save the trained model
# recognizer.save('lbph_model.yml')
# Load the trained model
recognizer.read('lbph_model.yml')
# Recognize faces
# label, confidence = recognizer.predict(test_image)
39. cv2.face.FisherFaceRecognizer_create
Creates a Fisher face recognizer.
Usage:
import cv2
# Create a Fisher face recognizer
recognizer = cv2.face.FisherFaceRecognizer_create()
# Train the recognizer with training data
# recognizer.train(training_images, training_labels)
# Save the trained model
# recognizer.save('fisher_model.yml')
# Load the trained model
recognizer.read('fisher_model.yml')
# Recognize faces
# label, confidence = recognizer.predict(test_image)
40. cv2.face.EigenFaceRecognizer_create
Creates an Eigen face recognizer.
Usage:
import cv2
# Create an Eigen face recognizer
recognizer = cv2.face.EigenFaceRecognizer_create()
# Train the recognizer with training data
# recognizer.train(training_images, training_labels)
# Save the trained model
# recognizer.save('eigen_model.yml')
# Load the trained model
recognizer.read('eigen_model.yml')
# Recognize faces
# label, confidence = recognizer.predict(test_image)
41. cv2.bgsegm.createBackgroundSubtractorMOG
Creates a MOG background subtractor (part of the cv2.bgsegm module in opencv-contrib-python).
Usage:
import cv2
# Create a MOG background subtractor
background_subtractor = cv2.bgsegm.createBackgroundSubtractorMOG()
# Open a video file
cap = cv2.VideoCapture('path/to/video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fg_mask = background_subtractor.apply(frame)
    cv2.imshow('Foreground Mask', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
42. cv2.createBackgroundSubtractorMOG2
Creates a MOG2 background subtractor.
Usage:
import cv2
# Create a MOG2 background subtractor
background_subtractor = cv2.createBackgroundSubtractorMOG2()
# Open a video file
cap = cv2.VideoCapture('path/to/video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fg_mask = background_subtractor.apply(frame)
    cv2.imshow('Foreground Mask', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
43. cv2.createBackgroundSubtractorKNN
Creates a KNN background subtractor.
Usage:
import cv2
# Create a KNN background subtractor
background_subtractor = cv2.createBackgroundSubtractorKNN()
# Open a video file
cap = cv2.VideoCapture('path/to/video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fg_mask = background_subtractor.apply(frame)
    cv2.imshow('Foreground Mask', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
44. cv2.bgsegm.createBackgroundSubtractorGMG
Creates a GMG background subtractor.
Usage:
import cv2
# Create a GMG background subtractor
background_subtractor = cv2.bgsegm.createBackgroundSubtractorGMG()
# Open a video file
cap = cv2.VideoCapture('path/to/video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fg_mask = background_subtractor.apply(frame)
    cv2.imshow('Foreground Mask', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
45. cv2.FastFeatureDetector_create
Creates a FAST feature detector.
Usage:
import cv2
# Create a FAST feature detector
fast = cv2.FastFeatureDetector_create()
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Detect keypoints
keypoints = fast.detect(image, None)
# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))
# Display the result
cv2.imshow('FAST Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
46. cv2.ORB_create
Creates an ORB feature detector and descriptor extractor.
Usage:
import cv2
# Create an ORB detector
orb = cv2.ORB_create()
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Detect keypoints and descriptors
keypoints, descriptors = orb.detectAndCompute(image, None)
# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))
# Display the result
cv2.imshow('ORB Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
47. cv2.SIFT_create
Creates a SIFT feature detector and descriptor extractor.
Usage:
import cv2
# Create a SIFT detector
sift = cv2.SIFT_create()
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Detect keypoints and descriptors
keypoints, descriptors = sift.detectAndCompute(image, None)
# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))
# Display the result
cv2.imshow('SIFT Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
48. cv2.xfeatures2d.SURF_create
Creates a SURF feature detector and descriptor extractor (SURF is patented and only available in opencv-contrib builds compiled with the non-free modules).
Usage:
import cv2
# Create a SURF detector
surf = cv2.xfeatures2d.SURF_create()
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Detect keypoints and descriptors
keypoints, descriptors = surf.detectAndCompute(image, None)
# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))
# Display the result
cv2.imshow('SURF Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
49. cv2.BRISK_create
Creates a BRISK feature detector and descriptor extractor.
Usage:
import cv2
# Create a BRISK detector
brisk = cv2.BRISK_create()
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Detect keypoints and descriptors
keypoints, descriptors = brisk.detectAndCompute(image, None)
# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))
# Display the result
cv2.imshow('BRISK Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
50. cv2.drawKeypoints
Draws keypoints on an image.
Usage:
import cv2
# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
# Create a feature detector
orb = cv2.ORB_create()
# Detect keypoints
keypoints = orb.detect(image, None)
# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))
# Display the result
cv2.imshow('Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
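Keypoints and descriptors from the detectors above are usually fed to a matcher rather than only drawn. Matchers are not listed in the cheat sheet table, but a minimal sketch using cv2.BFMatcher with ORB descriptors (assuming two images of the same scene) looks like this:
import cv2
# Read two images of the same scene
image1 = cv2.imread('path/to/image1.jpg', cv2.IMREAD_GRAYSCALE)
image2 = cv2.imread('path/to/image2.jpg', cv2.IMREAD_GRAYSCALE)
# Detect ORB keypoints and descriptors in both images
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(image1, None)
kp2, des2 = orb.detectAndCompute(image2, None)
# Match the binary descriptors with Hamming distance and sort by distance
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
# Draw the 20 best matches
matched_image = cv2.drawMatches(image1, kp1, image2, kp2, matches[:20], None)
cv2.imshow('Matches', matched_image)
cv2.waitKey(0)
cv2.destroyAllWindows()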
Mini Computer Vision Project
Use Case 1: Object Detection using YOLO
import cv2
import numpy as np
# Load YOLO
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
output_layers = net.getUnconnectedOutLayersNames()
# Load the class names distributed with YOLOv3 (assumed to be in coco.names)
with open("coco.names") as f:
    classes = [line.strip() for line in f]
colors = np.random.uniform(0, 255, size=(len(classes), 3))
# Load image
img = cv2.imread("image.jpg")
height, width, channels = img.shape
# Detecting objects
blob = cv2.dnn.blobFromImage(img, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
net.setInput(blob)
outs = net.forward(output_layers)
# Collect detections above the confidence threshold
class_ids = []
confidences = []
boxes = []
for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.5:
            # Object detected
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            # Rectangle coordinates
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)
# Apply non-maximum suppression and draw the kept boxes
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
font = cv2.FONT_HERSHEY_PLAIN
for i in range(len(boxes)):
    if i in indexes:
        x, y, w, h = boxes[i]
        label = str(classes[class_ids[i]])
        color = colors[class_ids[i]]
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
        cv2.putText(img, label, (x, y - 5), font, 1, color, 2)
cv2.imshow("Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 2: Semantic Segmentation using DeepLabV3
import cv2
import numpy as np
# Load DeepLabV3 model
net = cv2.dnn.readNetFromTensorflow("deeplabv3.pb")
# Load image
image = cv2.imread("image.jpg")
# Prepare the image (standard Caffe/ImageNet channel means)
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(513, 513),
                             mean=(104.00698793, 116.66876762, 122.67891434))
# Set the input
net.setInput(blob)
# Perform forward pass
output = net.forward()
# Process the output: per-pixel class with the highest score
output = output.squeeze().argmax(axis=0).astype(np.uint8)
output = cv2.resize(output, (image.shape[1], image.shape[0]), interpolation=cv2.INTER_NEAREST)
# Apply color map
output_colored = cv2.applyColorMap(output, cv2.COLORMAP_JET)
# Display the result
cv2.imshow("Semantic Segmentation", output_colored)
cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 3: Gesture Recognition using MediaPipe
import cv2
import mediapipe as mp
mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils
# Initialize MediaPipe Hands
hands = mp_hands.Hands(static_image_mode=False, max_num_hands=2,
                       min_detection_confidence=0.5, min_tracking_confidence=0.5)
# Open video capture
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # Process the frame
    result = hands.process(rgb_frame)
    # Draw hand landmarks
    if result.multi_hand_landmarks:
        for hand_landmarks in result.multi_hand_landmarks:
            mp_drawing.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)
    # Display the result
    cv2.imshow('Hand Gesture Recognition', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 4: Image Recognition using InceptionV3
import cv2
import numpy as np
# Load InceptionV3 model
net = cv2.dnn.readNetFromTensorflow("inceptionv3.pb")
# Load image
image = cv2.imread("image.jpg")
# Prepare the image
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(299, 299), mean=(104, 117, 123))
# Set the input
net.setInput(blob)
# Perform forward pass
output = net.forward()
# Get the predicted class
class_id = np.argmax(output)
# Display the result
print("Predicted Class ID:", class_id)
Use Case 5: Face Detection using Haar Cascades
import cv2
# Load the cascade classifier
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# Open video capture
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)
    # Draw rectangles around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
    # Display the result
    cv2.imshow('Face Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 6: Real-time Object Tracking using GOTURN Tracker
import cv2
# Load the GOTURN tracker (requires goturn.prototxt and goturn.caffemodel in the working directory)
tracker = cv2.TrackerGOTURN_create()
# Open video capture
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
# Define the initial bounding box
bbox = cv2.selectROI(frame, False)
# Initialize the tracker
tracker.init(frame, bbox)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Update the tracker
    success, bbox = tracker.update(frame)
    # Draw the bounding box
    if success:
        p1 = (int(bbox[0]), int(bbox[1]))
        p2 = (int(bbox[0] + bbox[2]), int(bbox[1] + bbox[3]))
        cv2.rectangle(frame, p1, p2, (0, 255, 0), 2, 1)
    else:
        cv2.putText(frame, "Tracking failure detected", (100, 80),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), 2)
    # Display the result
    cv2.imshow('Object Tracking', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 7: Background Subtraction using KNN
import cv2
# Create a KNN background subtractor
background_subtractor = cv2.createBackgroundSubtractorKNN()
# Open video capture
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Apply background subtraction
    fg_mask = background_subtractor.apply(frame)
    # Display the result
    cv2.imshow('Background Subtraction', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 8: Lane Detection in a Video
import cv2
import numpy as np
def region_of_interest(img, vertices):
    mask = np.zeros_like(img)
    cv2.fillPoly(mask, vertices, 255)
    masked = cv2.bitwise_and(img, mask)
    return masked
def draw_lines(img, lines):
    for line in lines:
        for x1, y1, x2, y2 in line:
            cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 3)
# Open video capture
cap = cv2.VideoCapture('path/to/video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Apply Gaussian blur
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    # Apply Canny edge detector
    edges = cv2.Canny(blur, 50, 150)
    # Define a triangular region of interest covering the lower part of the frame
    height, width = frame.shape[:2]
    roi_vertices = [(0, height), (width / 2, height / 2), (width, height)]
    roi = region_of_interest(edges, np.array([roi_vertices], np.int32))
    # Detect lines using the probabilistic Hough Transform
    lines = cv2.HoughLinesP(roi, 1, np.pi / 180, 100, minLineLength=40, maxLineGap=5)
    # Draw the lines on the frame
    if lines is not None:
        draw_lines(frame, lines)
    # Display the result
    cv2.imshow('Lane Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 9: Real-time Face Recognition using LBPH
import cv2
# Load the LBPH face recognizer
recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read('lbph_model.yml')
# Load the cascade classifier
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# Open video capture
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Recognize the face
        label, confidence = recognizer.predict(gray_frame[y:y+h, x:x+w])
        # Draw a rectangle around the face and annotate it
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
        cv2.putText(frame, f'ID: {label}, Confidence: {confidence}', (x, y-10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)
    # Display the result
    cv2.imshow('Face Recognition', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 10: Real-time Emotion Detection using FER2013
import cv2
import numpy as np
from keras.models import load_model
# Load the pre-trained emotion detection model
emotion_model = load_model('fer2013_model.h5')
emotion_labels = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']
# Load the cascade classifier
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# Open video capture
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Extract the face region of interest
        roi_gray = gray_frame[y:y+h, x:x+w]
        roi_gray = cv2.resize(roi_gray, (48, 48))
        roi_gray = roi_gray / 255.0
        roi_gray = np.expand_dims(roi_gray, axis=0)
        roi_gray = np.expand_dims(roi_gray, axis=-1)
        # Predict the emotion
        emotion_prediction = emotion_model.predict(roi_gray)
        max_index = np.argmax(emotion_prediction)
        emotion = emotion_labels[max_index]
        # Draw a rectangle around the face
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
        cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)
    # Display the result
    cv2.imshow('Emotion Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 11: Road Sign Detection using HOG and SVM
import cv2
# Load pre-trained HOG + SVM model for road sign detection
hog = cv2.HOGDescriptor()
svm = cv2.ml.SVM_load('road_sign_svm_model.yml')
# Load image
image = cv2.imread('road_sign.jpg')
# Convert image to grayscale and resize to the default HOG window size (64x128)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.resize(gray, (64, 128))
# Compute the HOG descriptor and classify it with the SVM
hog_features = hog.compute(gray)
result = svm.predict(hog_features.reshape(1, -1))
# Display result
if result[1][0] == 1:
    cv2.putText(image, 'Road Sign Detected', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Road Sign Detection', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 12: Person Re-identification using Deep Learning
import cv2
import numpy as np
# Load pre-trained deep learning model for person re-identification
net = cv2.dnn.readNet('person_reid_model.onnx')
# Load images of the person to be re-identified
image1 = cv2.imread('person1.jpg')
image2 = cv2.imread('person2.jpg')
# Prepare images for the model
blob1 = cv2.dnn.blobFromImage(image1, scalefactor=1.0, size=(128, 256), mean=(0, 0, 0), swapRB=True)
blob2 = cv2.dnn.blobFromImage(image2, scalefactor=1.0, size=(128, 256), mean=(0, 0, 0), swapRB=True)
# Run both images through the network to get feature vectors
net.setInput(blob1)
output1 = net.forward()
net.setInput(blob2)
output2 = net.forward()
# Compute the cosine similarity between the two feature vectors
similarity = np.dot(output1, output2.T) / (np.linalg.norm(output1) * np.linalg.norm(output2))
# Display the result
print('Similarity:', similarity)
Use Case 13: Scene Text Detection using EAST Detector
import cv2
import numpy as np
from imutils.object_detection import non_max_suppression  # pip install imutils
# Load the pre-trained EAST text detector
net = cv2.dnn.readNet('frozen_east_text_detection.pb')
# Load image (EAST expects input dimensions that are multiples of 32)
image = cv2.imread('scene_text.jpg')
orig = image.copy()
(H, W) = image.shape[:2]
# Prepare the image
blob = cv2.dnn.blobFromImage(image, 1.0, (W, H), (123.68, 116.78, 103.94), swapRB=True, crop=False)
net.setInput(blob)
(scores, geometry) = net.forward(['feature_fusion/Conv_7/Sigmoid', 'feature_fusion/concat_3'])
# Decode the results
(num_rows, num_cols) = scores.shape[2:4]
rects = []
confidences = []
for y in range(0, num_rows):
    scores_data = scores[0, 0, y]
    x_data0 = geometry[0, 0, y]
    x_data1 = geometry[0, 1, y]
    x_data2 = geometry[0, 2, y]
    x_data3 = geometry[0, 3, y]
    angles_data = geometry[0, 4, y]
    for x in range(0, num_cols):
        if scores_data[x] < 0.5:
            continue
        (offset_x, offset_y) = (x * 4.0, y * 4.0)
        angle = angles_data[x]
        cos = np.cos(angle)
        sin = np.sin(angle)
        h = x_data0[x] + x_data2[x]
        w = x_data1[x] + x_data3[x]
        end_x = int(offset_x + (cos * x_data1[x]) + (sin * x_data2[x]))
        end_y = int(offset_y - (sin * x_data1[x]) + (cos * x_data2[x]))
        start_x = int(end_x - w)
        start_y = int(end_y - h)
        rects.append((start_x, start_y, end_x, end_y))
        confidences.append(scores_data[x])
# Apply non-maxima suppression to suppress weak, overlapping bounding boxes
boxes = non_max_suppression(np.array(rects), probs=confidences)
# Draw the bounding boxes
for (start_x, start_y, end_x, end_y) in boxes:
    cv2.rectangle(orig, (start_x, start_y), (end_x, end_y), (0, 255, 0), 2)
# Display the result
cv2.imshow('Text Detection', orig)
cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 14: Real-time Head Pose Estimation
import cv2
import numpy as np
# Load the pre-trained face detection model
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# Load the pre-trained head pose estimation model
net = cv2.dnn.readNet('head_pose_estimation.onnx')
# Open video capture
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Extract the face region of interest
        roi = frame[y:y+h, x:x+w]
        blob = cv2.dnn.blobFromImage(roi, 1.0, (64, 64), (0, 0, 0), swapRB=True, crop=False)
        # Perform head pose estimation
        net.setInput(blob)
        output = net.forward()
        yaw, pitch, roll = output[0]
        # Draw the head pose on the frame
        cv2.putText(frame, f'Yaw: {yaw:.2f}, Pitch: {pitch:.2f}, Roll: {roll:.2f}', (x, y-10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
    # Display the result
    cv2.imshow('Head Pose Estimation', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 15: Image Inpainting using Deep Learning
import cv2
import numpy as np
# Load the pre-trained inpainting model
net = cv2.dnn.readNet('image_inpainting.onnx')
# Load image with damaged areas and the corresponding mask
image = cv2.imread('damaged_image.jpg')
mask = cv2.imread('mask.jpg', 0)
# Prepare the image and mask for the model
blob_image = cv2.dnn.blobFromImage(image, 1.0, (512, 512), (0, 0, 0), swapRB=True, crop=False)
blob_mask = cv2.dnn.blobFromImage(mask, 1.0, (512, 512), (0, 0, 0), swapRB=True, crop=False)
# Set the inputs
net.setInput(blob_image, 'input_image')
net.setInput(blob_mask, 'input_mask')
# Perform inpainting
output = net.forward()
# Post-process the output
output = output.squeeze().transpose(1, 2, 0)
output = np.clip(output, 0, 255).astype(np.uint8)
# Display the result
cv2.imshow('Inpainted Image', output)
cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 16: Optical Character Recognition (OCR) using Tesseract
import cv2
import pytesseract
# Load image
image = cv2.imread('document.jpg')
# Convert image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Apply thresholding
_, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY_INV)
# Apply dilation
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
dilated = cv2.dilate(thresh, kernel, iterations=1)
# Extract text using Tesseract OCR
text = pytesseract.image_to_string(dilated)
print('Extracted Text:', text)
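pytesseract is only a wrapper; the Tesseract binary has to be installed separately, and if it is not on the PATH (common on Windows) its location must be set explicitly. The path below is an illustrative assumption, not a fixed requirement:
import pytesseract
# Point pytesseract at the Tesseract executable (adjust the path for your system)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'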
Use Case 17: Real-time Drowsiness Detection using Eye Aspect Ratio
import cv2
import dlib
from scipy.spatial import distance
def eye_aspect_ratio(eye):
    A = distance.euclidean(eye[1], eye[5])
    B = distance.euclidean(eye[2], eye[4])
    C = distance.euclidean(eye[0], eye[3])
    ear = (A + B) / (2.0 * C)
    return ear
# Load pre-trained models
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')
# Define thresholds
EYE_AR_THRESH = 0.3
EYE_AR_CONSEC_FRAMES = 48
# Initialize counters
counter = 0
# Open video capture
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    for face in faces:
        landmarks = predictor(gray, face)
        left_eye = []
        right_eye = []
        for i in range(36, 42):
            left_eye.append((landmarks.part(i).x, landmarks.part(i).y))
        for i in range(42, 48):
            right_eye.append((landmarks.part(i).x, landmarks.part(i).y))
        left_ear = eye_aspect_ratio(left_eye)
        right_ear = eye_aspect_ratio(right_eye)
        ear = (left_ear + right_ear) / 2.0
        if ear < EYE_AR_THRESH:
            counter += 1
            if counter >= EYE_AR_CONSEC_FRAMES:
                cv2.putText(frame, "DROWSINESS DETECTED", (10, 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
        else:
            counter = 0
    cv2.imshow('Drowsiness Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 18: Real-time Fire Detection using Color Thresholding
import cv2
import numpy as np
# Open video capture
cap = cv2.VideoCapture('fire_video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to HSV color space
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Define fire color range in HSV
    lower_fire = np.array([18, 50, 50])
    upper_fire = np.array([35, 255, 255])
    # Apply color thresholding
    mask = cv2.inRange(hsv, lower_fire, upper_fire)
    # Display the result
    cv2.imshow('Fire Detection', mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 19: Real-time Smoke Detection using Color and Motion
import cv2
import numpy as np
# Open video capture
cap = cv2.VideoCapture('smoke_video.mp4')
# Initialize background subtractor
background_subtractor = cv2.createBackgroundSubtractorMOG2()
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Apply background subtraction
    fg_mask = background_subtractor.apply(gray)
    # Define smoke intensity range in the foreground mask
    lower_smoke = np.array([100])
    upper_smoke = np.array([255])
    # Apply thresholding
    mask = cv2.inRange(fg_mask, lower_smoke, upper_smoke)
    # Display the result
    cv2.imshow('Smoke Detection', mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Use Case 20: Real-time Vehicle Detection using HOG and SVM
import cv2
# Load pre-trained HOG + SVM model for vehicle detection
hog = cv2.HOGDescriptor()
svm = cv2.ml.SVM_load('vehicle_svm_model.yml')
# Open video capture
cap = cv2.VideoCapture('vehicle_video.mp4')
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to grayscale and resize to the default HOG window size (64x128)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (64, 128))
    # Compute the HOG descriptor and classify it with the SVM
    hog_features = hog.compute(gray)
    result = svm.predict(hog_features.reshape(1, -1))
    # Annotate the frame if a vehicle was detected
    if result[1][0] == 1:
        cv2.putText(frame, 'Vehicle Detected', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    # Display the result
    cv2.imshow('Vehicle Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Follow Sumit Khanna for more updates.