Image processing file
Image preprocessing is the process of manipulating raw image data into a usable and
meaningful format. It allows you to eliminate unwanted distortions and enhance
specific qualities essential for computer vision applications. Preprocessing is a crucial
first step to prepare your image data before feeding it into machine learning models.
To get started with image processing in Python, you’ll need to load and convert your
images into a format the libraries can work with. The two most popular options for this
are OpenCV and Pillow.
Loading images with OpenCV: OpenCV can load images in formats like PNG, JPG,
TIFF, and BMP. You can load an image with:
import cv2
image = cv2.imread('path/to/image.jpg')
This will load the image as a NumPy array. The image is in the BGR color space, so
you may want to convert it to RGB.
Loading images with Pillow: Pillow is a friendly fork of PIL (the Python Imaging Library). It
reads formats that OpenCV does not, such as PSD and ICO, in addition to common ones like WEBP. You can
load an image with:
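A minimal sketch of loading with Pillow, using Image.open. So that it runs without a file on disk, the example first writes a small PNG into an in-memory buffer; that buffer and its contents are assumptions for illustration, and in practice you would pass a file path instead.

```python
import io
from PIL import Image

# Build a small PNG in memory so the example runs without a file on disk.
# In practice you would call Image.open('path/to/image.jpg') directly.
buffer = io.BytesIO()
Image.new("RGB", (64, 48), color=(255, 0, 0)).save(buffer, format="PNG")
buffer.seek(0)

image = Image.open(buffer)  # returns a PIL Image object, not a NumPy array
print(image.size, image.mode)  # size is (width, height); mode is the color mode
```

Unlike OpenCV, Pillow keeps the image in RGB order, so no channel swap is needed before display.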
Converting between color spaces: You may need to convert images between color
spaces like RGB, BGR, HSV, and Grayscale. This can be done with OpenCV or
Pillow. For example, to convert BGR to Grayscale in OpenCV, use:
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
With Pillow, convert an RGB image to HSV with:
image = image.convert('HSV')
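To make clear what a BGR-to-grayscale conversion computes, here is a NumPy sketch of the weighted sum Y = 0.299 R + 0.587 G + 0.114 B that OpenCV documents for this conversion. The pixel data is synthetic, and the rounding step here is an assumption that may differ slightly from OpenCV's internal fixed-point arithmetic.

```python
import numpy as np

# Synthetic BGR pixels (illustrative): pure blue, pure green, pure red.
bgr = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.float64)

# Luma weights listed in B, G, R order to match OpenCV's channel layout.
weights = np.array([0.114, 0.587, 0.299])

# The weighted sum over the channel axis collapses (H, W, 3) to (H, W).
gray = np.rint(bgr @ weights).astype(np.uint8)
print(gray)
```

Green contributes most to perceived brightness, which is why it carries the largest weight.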
Resizing and cropping your images is an important first step in image preprocessing.
Images come in all shapes and sizes, but machine learning algorithms typically require
a standard size. You’ll want to resize and crop your images to square dimensions, often
224x224 or 256x256 pixels.
In Python, you can use the OpenCV or Pillow library for resizing and cropping. With
OpenCV, use the resize() function. For example:
import cv2
img = cv2.imread('original.jpg')
resized = cv2.resize(img, (224, 224))
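To crop an image to a square, you can calculate the center square crop box and use
crop() in Pillow with the box coordinates. For example:
Code
from PIL import Image
# Open an image
img = Image.open("example.jpg")
# Center square crop box: (left, upper, right, lower)
width, height = img.size
side = min(width, height)
left, upper = (width - side) // 2, (height - side) // 2
crop_box = (left, upper, left + side, upper + side)
cropped_img = img.crop(crop_box)
cropped_img.show()
# Save the cropped image
cropped_img.save("cropped_example.jpg")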
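Images loaded with OpenCV are NumPy arrays, so there a center crop is usually done with array slicing rather than a crop() call. The sketch below assumes a hypothetical helper named center_crop_square and a synthetic array in place of a loaded image.

```python
import numpy as np

def center_crop_square(img):
    """Return the largest centered square region of an (H, W[, C]) array.

    Hypothetical helper for illustration; not part of OpenCV or NumPy.
    """
    height, width = img.shape[:2]
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    # Rows are sliced first, then columns.
    return img[top:top + side, left:left + side]

img = np.zeros((300, 400, 3), dtype=np.uint8)  # synthetic landscape image
cropped = center_crop_square(img)
print(cropped.shape)  # (300, 300, 3)
```

The slice is a view on the original array; call .copy() on the result if the crop must be modified independently.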
Import NumPy as np
import numpy as np
arr = np.array(img)  # assumption: arr is the image opened above, as a NumPy array
print(arr)
Definition of cv2.waitKey(0)
cv2.waitKey(0) is an OpenCV function that waits indefinitely for a key press before
continuing execution.
When working with image data, it’s important to normalize the pixel values to have a
consistent brightness and improve contrast. This makes the images more suitable for
analysis and allows machine learning models to learn patterns independent of lighting
conditions.
Rescaling Pixel Values: The most common normalization technique is rescaling the
pixel values to range from 0 to 1. This is done by dividing all pixels by the maximum
pixel value (255 for 8-bit images). For example:
import cv2
img = cv2.imread('image.jpg')
normalized = img / 255.0
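A small sketch of the same rescaling on a synthetic array, to show the effect on data type and value range. The explicit cast to float32 is an assumption about what a downstream model expects; plain division as above yields float64.

```python
import numpy as np

# Synthetic 8-bit pixel values covering the full range (illustrative data).
img = np.array([[0, 128, 255]], dtype=np.uint8)

# Cast before dividing so the result is float32 rather than the default float64.
normalized = img.astype(np.float32) / 255.0

print(normalized.dtype, normalized.min(), normalized.max())
```

The original uint8 array is left unchanged, so it can still be displayed or saved as an ordinary image.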
Gaussian Blur:
The Gaussian blur filter reduces detail and noise in an image. It “blurs” the image by
replacing each pixel with a Gaussian-weighted average of itself and its surrounding
pixels. This helps suppress noise and fine detail before edge detection or other
processing techniques.
code
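import cv2
img = cv2.imread("image.jpg")
# Apply Gaussian blur with a 5x5 kernel; sigmaX=0 derives sigma from the kernel size
blurred = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imshow("Gaussian Blurred Image", blurred)
cv2.waitKey(0)
cv2.destroyAllWindows()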
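explain
cv2.GaussianBlur(src, ksize, sigmaX)
src is the input image, ksize is the kernel size as (width, height) using positive odd
numbers, and sigmaX is the standard deviation in the x direction. Passing 0 for sigmaX
lets OpenCV compute it from the kernel size.
# cv2.GaussianBlur(img, (5,5), 0)
Median Blur:
Median blur replaces each pixel with the median of its neighborhood, which is effective
against salt-and-pepper noise. ksize is a single odd integer greater than 1.
cv2.medianBlur(src, ksize)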
Example:
import cv2
# Load the image
img = cv2.imread("image.jpg")
# Apply Median Blur with a 5x5 kernel
median_blurred = cv2.medianBlur(img, 5)
# Show the blurred image
cv2.imshow("Median Blurred Image", median_blurred)
cv2.waitKey(0)
cv2.destroyAllWindows()