Deep Learning Computer Vision
• Deep learning is a branch of artificial intelligence (AI) that aims to simulate the activity of
the human brain, specifically pattern recognition, by passing input through successive layers
of a neural network.
Deep-learning architectures such as deep neural networks, deep belief networks, recurrent neural
networks and convolutional neural networks have been applied to fields including
• computer vision,
• machine vision,
• speech recognition,
• natural language processing,
• audio recognition,
• social network filtering,
• machine translation,
• bioinformatics,
• drug design,
• medical image analysis,
• material inspection, and
• board game programs.
HISTORICAL BACKGROUND OF DEEP LEARNING
• All the algorithms that are used in deep learning are largely inspired by the way neurons
and neural networks function and process data in the brain.
• Deep learning is considered an evolution of machine learning.
• It uses a programmable neural network that enables machines to make accurate decisions
without help from humans.
• It is possible to mimic certain parts of neurons, such as dendrites, cell bodies, and axons,
using simplified mathematical models based on our limited knowledge of their inner workings:
• signals are received from the dendrites, and
• sent down the axon once enough signals have been received.
An artificial neuron is modeled on a biological neuron, so it consists of
• a soma (cell body, for processing information),
• dendrites (input), and
• an axon terminal to pass on the output of this neuron to other neurons.
The end of the axon can branch off to connect to many other neurons.
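The neuron model described above can be sketched in a few lines. This is only an illustration under simple assumptions: the function name artificial_neuron, the weights and the threshold are hypothetical, and a hard threshold stands in for "fires once enough signals have been received".

```python
def artificial_neuron(inputs, weights, threshold):
    """Return 1 (fire) when the weighted sum of the inputs reaches the threshold."""
    # "Dendrites": each incoming signal is scaled by the strength of its connection.
    total = sum(x * w for x, w in zip(inputs, weights))
    # "Soma": compare the accumulated signal with the firing threshold.
    # "Axon": the returned value is what downstream neurons would receive.
    return 1 if total >= threshold else 0


# Hypothetical values: three inputs with equal connection strength.
weak_signals = artificial_neuron([1, 0, 0], [0.4, 0.4, 0.4], threshold=1.0)
strong_signals = artificial_neuron([1, 1, 1], [0.4, 0.4, 0.4], threshold=1.0)
print(weak_signals, strong_signals)
```

With one active input the neuron stays silent; with all three active the weighted sum passes the threshold and it fires.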
NEURAL NETWORK ELEMENTS
• Whether it’s biological or artificial, the power of a neural network stems from the way
simple neurons are linked to form a complex system greater than the sum of its parts.
• Each neuron can make simple decisions based on mathematical calculations.
• Together, many neurons can analyze complex problems and provide accurate answers. A
shallow network is composed of an input layer, a single hidden layer and an output layer.
• A deep neural network has more than one hidden layer, which increases the complexity of
the problems it can analyze.
• A neural network learns to complete a task by examining labeled training examples. The
samples must be labeled so the network can learn to distinguish between items using visual
patterns correlated with the labels.
A neural network has three functions:
• Scoring input
• Calculating loss
• Updating the model, which begins the process over again
A neural network is a corrective feedback loop, giving more weight to data that supports
correct guesses and less weight to data that leads to mistakes. A technique known as
backpropagation trains the network by passing the error backward through the layers, so
that the weights behind correct responses are strengthened and those behind incorrect
responses are weakened.
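The three functions (scoring, calculating loss, updating) can be sketched as a loop over a single neuron. This is a minimal sketch, not a full network: the toy data, the learning rate and the number of epochs are hypothetical, and with only one neuron the "backward pass" reduces to a single gradient step.

```python
import math

# Hypothetical toy data: one input value per example, label 1 when the value is "large".
examples = [0.0, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 1]

weight, bias = 0.0, 0.0
learning_rate = 0.5
loss_history = []


def score(x, weight, bias):
    """Scoring: turn an input into a probability between 0 and 1 (sigmoid)."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


for epoch in range(5000):
    # 1. Scoring input
    predictions = [score(x, weight, bias) for x in examples]

    # 2. Calculating loss (mean cross-entropy between predictions and labels)
    loss = -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for p, y in zip(predictions, labels)
    ) / len(examples)
    loss_history.append(loss)

    # 3. Updating the model: move the parameters against the gradient of the loss
    grad_w = sum((p - y) * x for p, y, x in zip(predictions, labels, examples)) / len(examples)
    grad_b = sum(p - y for p, y in zip(predictions, labels)) / len(examples)
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

predicted_labels = [1 if score(x, weight, bias) >= 0.5 else 0 for x in examples]
print(predicted_labels, round(loss_history[0], 3), round(loss_history[-1], 3))
```

Each pass through the loop is one turn of the feedback loop: the loss shrinks as the weight and bias settle on values that separate the two labels.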
• Neural networks are primarily used to classify and cluster raw, unlabeled, real-world data.
• They work behind the scenes of familiar technology such as online image comparison or
financial decision-making tools for large corporations.
• A neural network can also look for patterns in web browsing histories to develop
recommendations for users.
CLASSIFICATION
• Neural networks typically excel at classification tasks, which require labeled datasets for
supervised learning.
• For example, neural networks can find visual patterns in thousands of photos and
consistently apply labels at a fast rate.
• A neural network can learn to classify any data with a label that correlates to information
the network can analyze.
CLUSTERING
• While they excel at identifying differences, neural networks also work well for clustering
or detecting similarities.
• A learning neural network can analyze millions of data points and cluster them according
to similarities. This can be applied to images, emails, voice messages or news articles.
• This capability is likewise useful for identifying anomalies, or things that don’t
correspond with group characteristics.
• For example, clustering is used to identify unusual behavior, such as fraud, by identifying
data that doesn't correspond with the most common actions.
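The idea of clustering by similarity and then flagging what does not fit can be sketched as follows. Note the substitution: this sketch uses plain k-means rather than a neural network, and the points, the starting centroids and the distance threshold of 3.0 are all hypothetical.

```python
import math

# Hypothetical 2-D data points: two tight groups and one point that fits neither.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 0.5),
]

# Assumption: start from one point of each visible group.
centroids = [points[0], points[3]]


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


for iteration in range(10):
    # Assign every point to its most similar (closest) cluster.
    clusters = [[] for unused in centroids]
    for p in points:
        clusters[nearest(p, centroids)].append(p)
    # Move each centroid to the mean of its members (keep it if the cluster is empty).
    centroids = [
        tuple(sum(coords) / len(members) for coords in zip(*members)) if members else old
        for members, old in zip(clusters, centroids)
    ]

# A point far from every cluster centre does not share the group characteristics.
anomalies = [p for p in points if math.dist(p, centroids[nearest(p, centroids)]) > 3.0]
print(anomalies)
```

In a fraud setting the points would be features of transactions, and the flagged ones would be passed on for review.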
PREDICTIVE ANALYTICS: REGRESSIONS
• Classification and clustering create a static prediction, such as an image correlating to the
label of a dog. That identification won’t change over time.
• Regression analysis gives neural networks the power to predict future states based on past
events. A future event becomes just another data point.
• For example, a neural network is able to read a string of numbers and predict the next
number most likely to occur.
• It can apply the same analysis to more complex events, such as predicting when a
customer may leave a store or when a piece of manufacturing equipment is likely to fail.
• Regression analysis forms the basis for predictive analytics. By using regression analysis,
a data scientist can model the relationship between a dependent variable (the outcome)
and one or more independent variables (the input).
• Regression analysis will reveal any significant relationships between the independent
variables and the dependent variable, as well as the strength or weight of that impact. In
other words, when the independent variables change, how much and in what way will the
dependent variable change?
• A basic neural network uses linear regression to manage one input and one output. Multiple
linear regression comes into play with many input variables. In this case, each node of the
network performs multiple linear regression, weighing each data point as it moves through
the layers. The net tests the inputs as it tries to reduce error.
• Each node acts as a switch to allow or block the input of the nodes around it through the
network. Non-linear regression moves the input through the network until it reaches the
final layer of the net.
• Neural networks use techniques such as gradient descent and backpropagation to refine
their algorithms and find the optimal model for the regression.
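As a rough sketch of regression with gradient descent, the example below fits a straight line to a short string of numbers and predicts the next one. The sequence, the learning rate and the step count are hypothetical, and a single linear unit stands in for a full network.

```python
# Hypothetical string of numbers; the independent variable is the position in the sequence.
sequence = [3.0, 5.0, 7.0, 9.0, 11.0]
positions = list(range(len(sequence)))

weight, bias = 0.0, 0.0  # slope and intercept of the fitted line
learning_rate = 0.05
n = len(sequence)

for step in range(5000):
    # Error of the current line at every known position.
    errors = [(weight * x + bias) - y for x, y in zip(positions, sequence)]
    # Gradient of the mean squared error with respect to weight and bias.
    grad_w = 2 * sum(e * x for e, x in zip(errors, positions)) / n
    grad_b = 2 * sum(errors) / n
    # Gradient descent: step against the gradient to reduce the error.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

# Predict the value at the next, unseen position.
next_value = weight * len(sequence) + bias
print(round(weight, 3), round(bias, 3), round(next_value, 3))
```

The fitted weight is the "strength" of the relationship: it says how much the output changes when the position advances by one. For this sequence the prediction should come out close to 13.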
APPLICATION OF NEURAL NETWORKS
• Neural networks are integral to the development of machine learning and artificial
intelligence applications.
• They are used in self-driving cars, facial recognition, language translation and even
artistic endeavors such as creating new colors.
• The growth of artificial intelligence has been fueled by the lower cost of cloud computing
and graphics processing units to manage the flow of images for training. The widespread
availability of electronic images and other data already tagged with information makes
training easier and faster.
• The capabilities to classify, cluster and make predictive decisions have boosted the
integration of neural networks in science research, advertising, e-commerce, customer
service, preventive maintenance and many other disciplines.
• Neural networks scan images of the night sky looking for new astronomical details.
Messaging filters intelligently separate useful and unwanted emails and voice messages.
• Linked with sensors, a predictive analytics system can predict when a hydraulic pump on
a manufacturing machine will need to be serviced before it fails.
VARIOUS APPLICATIONS OF DEEP LEARNING
1. Color restoration, where a given greyscale image is automatically turned into a colored one.
2. Recognizing handwritten messages.
3. Adding sound to a silent video that matches the scene taking place.
4. Self-driving cars.
5. Computer vision: for applications like vehicle number plate identification and facial recognition.
6. Information retrieval: for applications like search engines, both text search and image search.
7. Marketing: for applications like automated email marketing and target identification.
8. Medical diagnosis: for applications like cancer identification and anomaly detection.
9. Natural language processing: for applications like sentiment analysis and photo tagging.
10. Online advertising.
COMPUTER VISION
• Computer vision is a field of artificial intelligence (AI) that enables computers and systems
to derive meaningful information from digital images, videos, and other visual inputs, and
to take actions or make recommendations based on that information.
• Computer vision enables computers to see, observe and understand by applying machine
learning (ML) models to images.
• For example, a system can classify objects and respond, like unlocking your smartphone
when it recognizes your face. Applications range from enhancing your selfies with a fun fox
filter to detecting lung lesions in medical images.
Two essential technologies for computer vision are
• Deep learning
• Convolutional neural network (CNN)
Algorithms enable the machine to learn by itself, rather than someone programming it to
recognize an image.
A CNN helps a machine learning or deep learning model “look” by breaking images down
into pixels that are given tags or labels.
It uses the labels to perform convolutions (a mathematical operation on two functions to
produce a third function) and makes predictions about what it is “seeing.”
• The neural network runs convolutions and checks the accuracy of its predictions in a
series of iterations until the predictions start to come true. It is then recognizing or seeing
images in a way similar to humans.
• A CNN first discerns hard edges and simple shapes, then fills in information as it runs
iterations of its predictions.
• A CNN is used to understand single images.
• A recurrent neural network (RNN) is used in a similar way for video applications to help
computers understand how pictures in a series of frames are related to one another.
• Image classification assigns an image to a class based on its content.
• Object detection can use image classification to identify a certain class of object and
then detect and tabulate its appearances in an image or video.
• Object tracking follows or tracks an object once it is detected. This task is often
executed with images captured in sequence or real-time video feeds.
• Content-based image retrieval uses computer vision to browse, search and retrieve
images from large data stores, based on the content of the images rather than metadata
tags associated with them.
COMPONENTS OF A COMPUTER VISION SYSTEM
COMPUTER VISION VS HUMAN VISION
COMPUTER VISION IS A FIELD THAT INCLUDES METHODS FOR
• Acquiring
• Processing
• Analyzing
• Understanding images
• Producing numerical information
COMPUTER VISION TASKS
• Recognition
• Motion analysis
• Scene reconstruction
• Image restoration
The organization of a computer vision system is highly application dependent. There are
functions which are found in many computer vision systems:
• Image acquisition
• Preprocessing
• Feature extraction
• Detection/segmentation
• High level processing
• Decision
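The stages listed above can be chained into a toy pipeline. Everything in this sketch is hypothetical: the function names, the hard-coded greyscale frame and the thresholds are illustrative stand-ins, and real systems use far richer methods at each stage.

```python
def acquire_image():
    """Image acquisition: a hard-coded 4x6 greyscale frame (0-255), bright object on the right."""
    return [
        [10, 12, 11, 200, 210, 205],
        [9, 11, 10, 198, 207, 202],
        [11, 10, 12, 201, 209, 204],
        [10, 11, 11, 199, 208, 203],
    ]


def preprocess(image):
    """Preprocessing: rescale pixel values to the range 0..1."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def extract_features(image):
    """Feature extraction: horizontal intensity differences highlight vertical edges."""
    return [[abs(row[c + 1] - row[c]) for c in range(len(row) - 1)] for row in image]


def segment(features, threshold=0.5):
    """Detection/segmentation: mark positions whose edge strength exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in features]


def high_level_processing(mask):
    """High-level processing: fraction of rows that contain at least one edge pixel."""
    return sum(1 for row in mask if any(row)) / len(mask)


def decide(edge_fraction):
    """Decision: report a boundary when most rows contain an edge."""
    return "object boundary found" if edge_fraction > 0.5 else "no object"


result = decide(high_level_processing(segment(extract_features(preprocess(acquire_image())))))
print(result)
```

Each function consumes the output of the previous one, which mirrors the order of the list: acquisition, preprocessing, feature extraction, detection/segmentation, high-level processing, decision.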
A CONVOLUTIONAL NEURAL NETWORK ( CNN )
• is a type of neural network for working with images, This type of neural network takes
input from an image and extract features from an image and provide learnable parameters
to efficiently do the classification, detection and a lot more tasks.
• We extract the features from the images using something called “filters”, we have
different filters used to extract different features from the images.
WE CONVOLVE OUR IMAGE WITH FILTERS USING CONVOLUTION OPERATIONS
• We take our image (5 x 5), here a greyscale image, and our learnable filter (3 x 3), and
then we perform the convolution operation.
• Take the element-wise product of the filter and the image patch it covers, sum all the
products, and write the result into the first cell of the output:
[ 4 * 0 + 1 * 2 + 1 * 3 + 0 * 0 + 1 * 1 + 2 * 1 + 3 * 1 + 2 * 0 + 5 * 1 = 16 ]
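The sum-of-products step can be sketched in code. The slide's image and filter are not reproduced in the text, so the values below are assumptions: the top-left 3 x 3 patch and the filter are chosen so that the first cell matches the arithmetic above (taking the first factor of each product as the pixel and the second as the filter weight), and the remaining pixels are made up.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over the image (stride 1, no padding, no kernel flip)."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Element-wise product of the kernel and the patch under it, then sum.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            row.append(total)
        output.append(row)
    return output


# Hypothetical 5 x 5 greyscale image (only the top-left 3 x 3 patch follows the slide's sum).
image = [
    [4, 1, 1, 0, 2],
    [0, 1, 2, 1, 0],
    [3, 2, 5, 1, 1],
    [1, 0, 2, 3, 0],
    [2, 1, 0, 1, 4],
]
# Hypothetical 3 x 3 filter; in a CNN these weights would be learned.
kernel = [
    [0, 2, 3],
    [0, 1, 1],
    [1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map[0][0], len(feature_map), len(feature_map[0]))
```

A 5 x 5 image and a 3 x 3 filter give a 3 x 3 output, and its first cell is 16.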
THE CONVOLVED FEATURES
When you define the network, the convolved features are controlled by three parameters:
• Depth: the number of filters to apply during the convolution.
• Stride: the number of pixels the filter jumps between two slices.
• Zero-padding: the operation of adding a given number of rows and columns of zeros on
each side of the input feature maps.
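How the three parameters shape the output can be sketched with the usual size formula, floor((W - F + 2P) / S) + 1 per spatial dimension, with the depth giving the number of output channels. The formula is standard but is not stated in the source, and the function name conv_output_shape is hypothetical.

```python
def conv_output_shape(input_size, filter_size, depth, stride=1, padding=0):
    """Return (height, width, channels) of the convolved features for a square input."""
    span = input_size - filter_size + 2 * padding
    if span < 0:
        raise ValueError("filter is larger than the padded input")
    # Number of positions the filter can take along one dimension.
    side = span // stride + 1
    # One feature map per filter, so depth becomes the channel count.
    return (side, side, depth)


no_padding = conv_output_shape(5, 3, depth=1)                      # 5x5 image, 3x3 filter
same_size = conv_output_shape(5, 3, depth=8, padding=1)            # zero-padding keeps 5x5
strided = conv_output_shape(5, 3, depth=8, stride=2, padding=1)    # stride 2 roughly halves it
print(no_padding, same_size, strided)
```

Padding of one pixel keeps a 5 x 5 input at 5 x 5 for a 3 x 3 filter, while a stride of 2 reduces it to 3 x 3.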
STRIDED CONVOLUTIONS: PADDING
• A convolutional neural network (CNN) is a neural network that has one or more
convolutional layers and is used mainly for image processing, classification and
segmentation.
CNN
IMAGE CLASSIFICATION
• Image classification: the process of extracting information from images and labelling or
categorizing them. There are two types of classification:
1. Binary classification: the output is a binary value, either 0 or 1. For example, you are
given an image and you have to detect whether it shows a cat or not.
2. Multi-class classification: the output is one of several classes. For example, you are
given an image and you have to detect the breed of dog among 37 classes.
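The difference between the two output types can be sketched as follows. A sigmoid for the binary case and a softmax for the multi-class case are common choices but are not stated in the source, and the raw scores below are hypothetical stand-ins for what a trained network's last layer would produce.

```python
import math


def sigmoid(score):
    """Binary output: squash one raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def softmax(scores):
    """Multi-class output: turn one raw score per class into probabilities summing to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# Binary classification: cat (1) or non-cat (0), from a hypothetical raw score.
cat_probability = sigmoid(2.0)
is_cat = 1 if cat_probability >= 0.5 else 0

# Multi-class classification: 37 breeds, with class 12 given the highest hypothetical score.
breed_scores = [0.0] * 37
breed_scores[12] = 3.0
breed_probabilities = softmax(breed_scores)
predicted_breed = breed_probabilities.index(max(breed_probabilities))

print(is_cat, predicted_breed, round(sum(breed_probabilities), 6))
```

The binary case needs a single output thresholded at 0.5; the multi-class case needs one output per class, and the prediction is the class with the highest probability.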
STEPS FOR IMAGE CLASSIFICATION USING CNN: