
A Project report

On

OBJECT DETECTION USING MATLAB

Submitted in partial fulfillment of the requirement for the award of degree of

BACHELOR OF TECHNOLOGY
in
ELECTRONICS & COMMUNICATION ENGINEERING

Submitted by

R. SADANAND (16311A04V3)
CH. AMAN PRASAD (16311A04W3)
Y. MUKESH REDDY (16311A04W4)

Under the Guidance of

Dr. C. N. Sujatha, Professor, Department of ECE
Ms. E. Lavanya, Assistant Professor, Department of ECE

Department of Electronics and Communication Engineering
SREENIDHI INSTITUTE OF SCIENCE AND TECHNOLOGY
(Affiliated to Jawaharlal Nehru Technological University, Hyderabad)

Yamnampet (V), Ghatkesar (M), Hyderabad – 501301, A.P.


2019-2020

CERTIFICATE

This is to certify that the project entitled “OBJECT DETECTION USING MATLAB” is being submitted by

R. SADANAND (16311A04V3)
CH. AMAN PRASAD (16311A04W3)
Y. MUKESH REDDY (16311A04W4)

in partial fulfillment of the requirements for the award of BACHELOR OF TECHNOLOGY to JNTU, Hyderabad. This record is a bonafide work carried out by them under my guidance and supervision. The results embodied in this project report have not been submitted to any other university or institute for the award of any degree or diploma.

Internal Guide: Ms. E. Lavanya, Assistant Professor, Department of ECE
Project Co-ordinator: Dr. C. N. Sujatha, Professor, Department of ECE

Head of Department
Dr. S. P. V. Subba Rao
Professor & Head
Department of ECE

DECLARATION

This is to certify that the work reported in the present thesis titled “OBJECT DETECTION USING MATLAB” is a record of work done by us in the Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar.
No part of the thesis is copied from books, journals, or the internet, and wherever a portion is taken, the same has been duly referenced in the text. The report is based on project work done entirely by us and not copied from any other source.

R. SADANAND – 16311A04V3
CH. AMAN PRASAD-16311A04W3
Y. MUKESH REDDY-16311A04W4

ACKNOWLEDGMENT
We would like to express our sincere gratitude and thanks to Ms. E. Lavanya, Internal Guide, Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, for allowing us to take up this project.

We would especially like to express our sincere gratitude and thanks to Dr. C. N. Sujatha, Project Coordinator, Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, for guiding us throughout the project.

We are very grateful to Dr. S. P. V. Subba Rao, Head of the Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, for allowing us to take up this project.

We are very grateful to Dr. Ch. Shiva Reddy, Principal of Sreenidhi Institute of Science and Technology, for having provided the opportunity to take up this project.

We are very grateful to Dr. P. Narasimha Reddy, Executive Director of Sreenidhi Institute of Science and Technology, for having provided the opportunity to take up this project.

We also extend our sincere thanks to our parents and friends for their moral support throughout the project work.

ABSTRACT

Object detection is the most prevalent step of video analytics. Performance at higher levels depends greatly on the accuracy of object detection. Various platforms are used for designing and implementing object detection algorithms, including C programming, MATLAB and Simulink, OpenCV, etc. Among these, MATLAB programming is the most popular with students and researchers owing to its extensive features. These features include matrix-based data processing, a set of toolboxes and Simulink blocks covering all technology fields, easy programming, and help topics with numerous examples. This paper presents the implementation of object detection and tracking using MATLAB. It demonstrates the basic block diagram of object detection and explains various predefined functions and objects from different toolboxes that can be useful at each stage of object detection. Useful toolboxes include Image Acquisition, Image Processing, and Computer Vision. This study helps new researchers in the object detection field to design and implement algorithms using MATLAB.

CONTENTS

Abstract

Contents

Chapter 1  Introduction
    1.1 Introduction to the Project
    1.2 What is Object Detection?
    1.3 Why Object Detection Matters?
    1.4 How is it Currently Being Used?
    1.5 What Potential Does it Have?

Chapter 2  Literature Survey

Chapter 3  Block Diagram
    3.1 Significance of Block Diagram
        3.1.1 Video Input
        3.1.2 Pre-Processing
        3.1.3 Object Detection
        3.1.4 Post-Processing

Chapter 4  Introduction to MATLAB
    4.1 MATLAB
    4.2 History
    4.3 Why to use MATLAB?
    4.4 Syntax
    4.5 Variables
    4.6 Functions
        4.6.1 Anonymous Functions
        4.6.2 Primary and Sub-Functions
        4.6.3 Nested Functions
        4.6.4 Private Functions
        4.6.5 Global Variables
        4.6.6 Function Handlings

Chapter 5  MATLAB Toolboxes
    5.1 Computer Vision
        5.1.1 Applications
        5.1.2 CV Toolbox
    5.2 Image Processing
    5.3 Image Sensors
    5.4 Image Compression
    5.5 Digital Signal Processing (DSP)
        5.5.1 Medical Imaging
    5.6 Image Acquisition Toolbox

Chapter 6  MATLAB Implementation

Chapter 7  Source Code

Chapter 8  Result

Chapter 9  Applications

Chapter 10 Conclusion
    10.1 Conclusion
    10.2 Future Scope

Chapter 11 References

Chapter 1
Introduction to Object Detection
1.1 INTRODUCTION:
Video analytics is a popular segment of computer vision. It has enormous applications such as traffic monitoring, parking lot management, crowd detection, object recognition, unattended baggage detection, secure area monitoring, etc. Object detection is a critical step in video analytics; the performance at this step is important for scene analysis, object matching and tracking, and activity recognition. Over the years, research has flowed towards innovating new concepts and improving or extending established research to improve the performance of object detection and tracking. Various object detection approaches have been developed based on statistics, fuzzy logic, neural networks, etc. Most approaches involve complex theory, and they can be evolved further by thorough understanding, implementation and experimentation. All these approaches can be learned by reading, reviewing, and taking a professor's expert guidance; however, implementation and experimentation require a good programmer. Various platforms are used for the design and implementation of object detection and tracking algorithms. These platforms include C programming, OpenCV, MATLAB, etc. An object detection system to be used in real time should satisfy two conditions. First, the system code must be short in terms of execution time. Second, it must use memory efficiently. However, the programmer must have good programming skill to achieve this in C or OpenCV, and it is time-intensive for a new researcher to develop such efficient code for real-time use. Considering all these facts, MATLAB is found to be a better platform for the design and implementation of the algorithm. It contains more than seventy toolboxes covering all possible fields in technology. All toolboxes are rich with predefined functions, System objects and Simulink blocks. This helps to write short code and saves time in logic development at various steps in the system. MATLAB supports matrix operations, which is a huge advantage when processing an image or a frame of a video sequence. MATLAB coding is simple and easily learned by any new researcher. This paper presents the implementation of an object detection system using MATLAB and its toolboxes. The study explored various toolboxes and identified useful functions and objects that can be used at various levels in object detection and tracking. These toolboxes mainly include Computer Vision, Image Processing, and Image Acquisition. MATLAB version 2012 is used for this study. The paper is organized as follows: the second section describes the general block diagram of object detection; the third section covers MATLAB functions and objects useful in implementing an object detection system; sample code for object detection and tracking is presented in the fourth section; and the fifth section concludes the paper.

1.2 What is Object Detection?


Object detection is the task of finding and identifying objects in an image or video sequence. The goal of instance-level recognition is to recognize a specific object or scene. It is a computer technology, related to computer vision and image processing, that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos.
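As a concrete illustration, the Computer Vision Toolbox ships pretrained cascade detectors. The following is a minimal sketch, not the system developed in this report; it assumes the toolbox's sample image visionteam.jpg is available:

```matlab
% Detect faces in an image using a pretrained Viola-Jones cascade detector.
% Assumes the Computer Vision Toolbox is installed; 'visionteam.jpg' is a
% sample image that ships with it.
detector = vision.CascadeObjectDetector();   % default model detects faces
img   = imread('visionteam.jpg');
boxes = step(detector, img);                 % one [x y w h] row per detection

% Draw the bounding boxes and display the result.
out = insertShape(img, 'Rectangle', boxes, 'LineWidth', 3);
figure, imshow(out), title('Detected faces');
```

The detector returns one bounding box per detected instance, which insertShape then draws on the image.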
1.3 Why object detection matters?
Object detection is a key technology behind advanced driver assistance systems (ADAS) that
enable cars to detect driving lanes or perform pedestrian detection to improve road safety. Object
detection is also useful in applications such as video surveillance or image retrieval systems.

Today, images and video are everywhere; online photo-sharing sites and social networks have them in the billions. The field of vision research [1] has been dominated by machine learning and statistics: using images and video to detect, classify, and track objects or events in order to ”understand” a real-world scene. Programming a computer and designing algorithms for understanding what is in these images is the field of computer vision. Computer vision powers applications like image search, robot navigation, medical image analysis, photo management and many more. From a computer vision point of view, the image is a scene consisting of objects of interest and a background represented by everything else in the image. The relations and interactions among these objects are the key factors for scene understanding. Object detection and recognition are two important computer vision tasks. Object detection determines the presence of an object and/or its scope and location in the image. Object recognition identifies the object class in the training database to which the object belongs. Object detection typically precedes object recognition. It can be treated as two-class object recognition, where one class represents the object class and the other represents the non-object class. Object detection can be further divided into soft detection, which only detects the presence of an object, and hard detection, which detects both the presence and the location of the object. Object detection is typically carried out by searching each part of an image to localize parts. This can be accomplished by scanning an object template across an image at different locations, scales, and rotations; a detection is declared if the similarity between the template and the image is sufficiently high. The similarity between a template and an image region can be measured by their correlation or by the sum of squared differences (SSD). Over the last several years it has been shown that image-based object detectors are sensitive to the training data.
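The template-scanning idea described above can be sketched with the Image Processing Toolbox function normxcorr2, which computes normalized cross-correlation. This is a single-scale illustration only; the file names and the 0.8 acceptance threshold are assumptions, not values from this report:

```matlab
% Single-scale template matching by normalized cross-correlation.
% 'scene.png' and 'template.png' are placeholder file names (RGB images).
scene    = rgb2gray(imread('scene.png'));
template = rgb2gray(imread('template.png'));

c = normxcorr2(template, scene);       % correlation surface; peak = best match
[peak, idx] = max(c(:));               % strongest response
[ypeak, xpeak] = ind2sub(size(c), idx);

% normxcorr2 pads the result, so subtract the template size
% to recover the top-left corner of the match in the scene.
yoff = ypeak - size(template, 1);
xoff = xpeak - size(template, 2);

if peak > 0.8                          % "sufficiently high" similarity
    fprintf('Match at (%d, %d), score %.2f\n', xoff + 1, yoff + 1, peak);
else
    fprintf('No confident match (best score %.2f)\n', peak);
end
```

A full detector would repeat this scan over scales and rotations, as the text notes.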

1.4 How is it currently being used?


Object detection is breaking into a wide range of industries, with use cases ranging from personal security to productivity in the workplace. Facial detection is one form of it, which can be used as a security measure to let only certain people into a highly classified area of a government building, for example. It can be used to count the number of people present in a business meeting and automatically adjust other technical tools that help streamline the time dedicated to that particular meeting. It can also be used within a visual search engine to help consumers find a specific item they are hunting for; Pinterest is one example of this, as the entire social and shopping platform is built around this technology. These features use people and object detection to create big data for a variety of applications in the workplace.

1.5 What potential does it have?


The possibilities are endless when it comes to future use cases for object detection.
Sports broadcasting will use this technology in instances such as detecting when a football team is about to score a touchdown and notifying fans via their mobile phones or at-home virtual reality setups in highly creative ways.
In video collaboration, business leaders will be able to count the number of participants in a meeting to automate the process further, and to monitor room usage to ensure spaces are being used properly. A relatively new “people counting” method that detects heads rather than bodies and motion will allow more accurate detection, specifically in crowded places (IEEE), which will enable even more applications for the security industry.

The future of object detection has massive potential across a wide range of industries. Real-time intelligent vision, high-performance computing, artificial intelligence and machine learning are together enabling solutions that do not distort video and that support a variety of AI capabilities that were not previously possible.

CHAPTER 2
LITERATURE SURVEY
The object detection task can be addressed by considering the video as an unrelated sequence of frames and performing static object detection. In 2009, Felzenszwalb et al. [1] described an object detection system based on mixtures of multiscale deformable part models. Their system was able to represent highly variable object classes and achieved state-of-the-art results in the PASCAL object detection challenges. They combined a margin-sensitive approach for data-mining hard negative examples with a formalism they call latent SVM. This led to an iterative training algorithm
that alternates between fixing latent values for positive examples and optimizing the latent SVM
objective function. Their system relied heavily on new methods for discriminative training of
classifiers that make use of latent information. It also relied heavily on efficient methods for
matching deformable models to images. The described framework allows for exploration of
additional latent structure. For example, one can consider deeper part hierarchies (parts with parts)
or mixture models with many components. Leibe et al. [2] in 2007, presented a novel method for
detecting and localizing objects of a visual category in cluttered real-world scenes. Their approach
considered object categorization and figure-ground segmentation as two interleaved processes that
closely collaborate towards a common goal. The tight coupling between those two processes
allows them to benefit from each other and improve the combined performance. The core part of
their approach was a highly flexible learned representation for object shape that could combine the
information observed on different training examples in a probabilistic extension of the Generalized
Hough Transform. As they showed, the resulting approach can detect categorical objects in novel
images and automatically infer a probabilistic segmentation from the recognition result. This
segmentation was then in turn used to again improve recognition by allowing the system to focus
its efforts on object pixels and to discard misleading influences from the background. Their
extensive evaluation on several large data sets showed that the proposed system was applicable to
a range of different object categories, including both rigid and articulated objects. In addition, its
flexible representation allowed it to achieve competitive object detection performance already
from training sets that were between one and two orders of magnitude smaller than those used in
comparable systems. Recently in last decade, methods based on local image features have shown
promise for texture and object recognition tasks. Zhang et al. [3] in 2006, presented a large-scale
evaluation of an approach that represented images as distributions (signatures or histograms) of
features extracted from a sparse set of key-point locations and learnt a Support Vector Machine
classifier with kernels based on two effective measures for comparing distributions. They first
evaluated the performance of the proposed approach with different key-point detectors and
descriptors, as well as different kernels and classifiers. Then, they conducted a comparative
evaluation with several modern recognition methods on 4 texture and 5 object databases. On most
of those databases, their implementation exceeded the best reported results and achieved
comparable performance on the rest. Additionally, they also investigated the influence of background correlations on recognition performance. In 2001, Viola and Jones [4], in a conference
on pattern recognition described a machine learning approach for visual object detection which
was capable of processing images extremely rapidly and achieving high detection rates. Their work
was distinguished by three key contributions. The first was the introduction of a new image
representation called the "integral image" which allowed the features used by their detector to be
computed very quickly. The second was a learning algorithm, based on AdaBoost, which was used to
select a small number of critical visual features from a larger set and yield extremely efficient
classifiers. The third contribution was a method for combining increasingly more complex
classifiers in a "cascade" which allowed background regions of the image to be quickly discarded
while spending more computation on promising object-like regions. The cascade could be viewed
as an object specific focus-of-attention mechanism which unlike some of the previous approaches
provided statistical guarantees that discarded regions were unlikely to contain the object of interest.
They tested the system on face detection, where it yielded detection rates comparable to the best of previous systems. Used in real-time applications, the detector ran at 15 frames per second without resorting to image differencing or skin color detection. In 2000, Weber
et al. [5] proposed a method to learn heterogeneous models of object classes for visual recognition.
The training images, that they used, contained a preponderance of clutter and the learning was
unsupervised. Their models represented objects as probabilistic constellations of rigid parts
(features). The variability within a class was represented by a joint probability density function on
the shape of the constellation and the appearance of the parts. Their method automatically
identified distinctive features in the training set. The set of model parameters was then learned
using expectation maximization. When trained on different, unlabeled and non-segmented views
of a class of objects, each component of the mixture model could adapt to represent a subset of the
views. Similarly, different component models could also specialize on sub-classes of an object
class. Experiments on images of human heads, leaves from different species of trees, and motorcars demonstrated that the method works well over a wide variety of objects.

CHAPTER 3
BLOCK DIAGRAM

This section explains the general block diagram of object detection and the significance of each block in the system. Common object detection mainly includes video input, pre-processing, object segmentation, and post-processing. It is shown in Fig.

The significance of each block is as follows.

Video input: This can be stored video or real-time video.

Pre-processing: This mainly involves temporal and spatial smoothing, such as intensity adjustment and noise removal. For real-time systems, frame-size and frame-rate reduction are commonly used; this greatly reduces computational cost and time [1].
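A pre-processing step of this kind can be sketched as follows. This is a minimal illustration: the half-size scaling factor and the 3-by-3 median filter are example choices, and peppers.png (a sample image shipped with MATLAB) stands in for a captured frame:

```matlab
% Pre-process one video frame: reduce its size, convert to grayscale,
% and suppress speckle noise before detection.
frame = imread('peppers.png');    % stand-in for a captured video frame

small = imresize(frame, 0.5);     % frame-size reduction (half scale)
gray  = rgb2gray(small);          % one channel is cheaper to process
clean = medfilt2(gray, [3 3]);    % spatial smoothing / noise removal

imshow(clean);
```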

Object detection: This is the process of detecting change and extracting the appropriate changes for further analysis and qualification. Pixels are classified as foreground if they have changed; otherwise, they are considered background. This process is called background subtraction. The degree of "change" is a key factor in segmentation and can vary depending on the application. The result of segmentation is one or more foreground blobs, a blob being a collection of connected pixels [1].
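A minimal background-subtraction sketch in this spirit, using simple frame differencing against a fixed background frame (the 25-gray-level threshold and 50-pixel blob minimum are illustrative choices, not values from this report):

```matlab
% Classify pixels as foreground by differencing the current frame
% against a stored background frame (both grayscale, same size).
bg    = imread('cameraman.tif');   % stand-in background frame
frame = bg;
frame(100:140, 100:140) = 255;     % simulate a bright moving object

delta = imabsdiff(frame, bg);      % per-pixel change magnitude
fg    = delta > 25;                % changed pixels => foreground

% Clean up speckle noise and extract connected foreground blobs.
fg    = bwareaopen(fg, 50);        % drop blobs smaller than 50 pixels
blobs = bwconncomp(fg);
fprintf('Detected %d foreground blob(s)\n', blobs.NumObjects);
```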

Post-processing: This removes false detections caused by dynamic conditions in the background, using morphological operations and speckle-noise removal.

Several benchmark datasets are commonly used with such systems. BMC 2012 dataset [6]: this dataset includes real and synthetic video and is mainly used for comparing different background subtraction techniques. Fish4Knowledge dataset [7]: the Fish4Knowledge dataset is an underwater benchmark dataset for target detection against complex backgrounds. Carnegie Mellon dataset [8]: this CMU sequence by Sheikh and Shah involves a camera mounted on a tall tripod; the wind caused the tripod to sway back and forth, causing vibration in the scene, so the dataset is useful when studying camera-jitter background situations.

Stored video needs to be read in an appropriate format before processing. Various related functions from the Image Processing (IP) and Computer Vision (CV) toolboxes can be used for this purpose.
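For example, a stored video can be read frame by frame with the built-in VideoReader class (a sketch; traffic.avi is a sample file shipped with the Image Processing Toolbox, so substitute your own file name):

```matlab
% Read a stored video frame by frame using VideoReader.
v = VideoReader('traffic.avi');

for k = 1:v.NumberOfFrames
    frame = read(v, k);        % H-by-W-by-3 uint8 image
    gray  = rgb2gray(frame);   % hand each frame to pre-processing here
end
fprintf('Read %d frames at %.1f fps\n', v.NumberOfFrames, v.FrameRate);
```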

Chapter 4
Introduction to MATLAB
4.1 MATLAB:(matrix laboratory) is a multi-paradigm numerical computing environment and
proprietary programming language developed by MathWorks. MATLAB allows matrix
manipulations, plotting of functions and data, implementation of algorithms, creation of user
interfaces, and interfacing with programs written in other languages.
Although MATLAB is intended primarily for numerical computing, an optional toolbox uses the
MuPAD symbolic engine allowing access to symbolic computing abilities. An additional package,
Simulink, adds graphical multi-domain simulation and model-based design for dynamic and
embedded systems.
As of 2018, MATLAB has more than 3 million users worldwide. MATLAB users come from various backgrounds in engineering, science, and economics.

4.2 HISTORY:
Cleve Moler, the chairman of the computer science department at the University of New Mexico, started developing MATLAB in the late 1970s. He designed it to give his students access to LINPACK and EISPACK without them having to learn Fortran. It soon spread to other universities
and found a strong audience within the applied mathematics community. Jack Little, an engineer,
was exposed to it during a visit Moler made to Stanford University in 1983. Recognizing its
commercial potential, he joined with Moler and Steve Bangert. They rewrote MATLAB in C and
founded MathWorks in 1984 to continue its development. These rewritten libraries were known
as JACKPAC. In 2000, MATLAB was rewritten to use a newer set of libraries for matrix
manipulation, LAPACK.
MATLAB was first adopted by researchers and practitioners in control engineering, Little's
specialty, but quickly spread to many other domains. It is now also used in education, in particular
the teaching of linear algebra and numerical analysis, and is popular amongst scientists involved
in image processing.

4.3 Why to use MATLAB?


 Fast development: fast programming with fewer bugs compared with OpenCV, since a wide range of functions is available, together with support for displaying and manipulating data. Fast coding is a strength of MATLAB that lets you develop vision applications quickly, but MATLAB is slower at execution time, which is a disadvantage.
 Fast debugging: MATLAB does not have low-level programming problems like manual memory allocation, and it can stop the script automatically when it encounters a problem. It also allows users to execute code from the command line even after an error occurs, and to fix the error while the code is still in execution mode. The fact that MATLAB can execute code during debugging is an advantage compared with other IDE tools.
 Clear code: MATLAB code is concise, which makes it easier to write, understand, and debug.
 Documentation: MATLAB has comprehensive documentation with many examples and explanations.

4.4 SYNTAX
The MATLAB application is built around the MATLAB programming language. Common usage of the MATLAB application involves using the "Command Window" as an interactive mathematical shell or executing text files containing MATLAB code.

4.5 VARIABLES
Variables are defined using the assignment operator, =. MATLAB is a weakly typed programming language because types are implicitly converted. It is an inferred-typed language because variables can be assigned without declaring their type, except if they are to be treated as symbolic objects, and their type can change. Values can come from constants, from computation involving values of other variables, or from the output of a function. For example:
>> x = 17
x =
    17

>> x = 'hat'
x =
hat

>> x = [3*4, pi/2]
x =
   12.0000    1.5708

>> y = 3*sin(x)
y =
   -1.6097    3.0000

A simple array is defined using the colon syntax: initial:increment:terminator. For instance:
>> array = 1:2:9
array =
     1     3     5     7     9

defines a variable named array (or assigns a new value to an existing variable with the name array)
which is an array consisting of the values 1, 3, 5, 7, and 9. That is, the array starts at 1 (the initial
value), increments with each step from the previous value by 2 (the increment value), and stops
once it reaches (or to avoid exceeding) 9 (the terminator value).
>> array = 1:3:9
array =
     1     4     7

The increment value can be left out of this syntax (along with one of the colons) to use a default value of 1.
>> ari = 1:5
ari =
     1     2     3     4     5

assigns to the variable named ari an array with the values 1, 2, 3, 4, and 5, since the default value
of 1 is used as the increment.
Indexing is one-based, which is the usual convention for matrices in mathematics, unlike zero-based indexing commonly used in other programming languages such as C, C++, and Java.
Matrices can be defined by separating the elements of a row with blank space or comma and using
a semicolon to terminate each row. The list of elements should be surrounded by square brackets
[]. Parentheses () are used to access elements and subarrays (they are also used to denote a function
argument list).
>> A = [16 3 2 13; 5 10 11 8; 9 6 7 12; 4 15 14 1]
A =
    16     3     2    13
     5    10    11     8
     9     6     7    12
     4    15    14     1

>> A(2,3)
ans =
    11

Sets of indices can be specified by expressions such as 2:4, which evaluates to [2, 3, 4]. For
example, a submatrix taken from rows 2 through 4 and columns 3 through 4 can be written as:
>> A(2:4,3:4)
ans =
    11     8
     7    12
    14     1

A square identity matrix of size n can be generated using the function eye, and matrices of any size
with zeros or ones can be generated with the functions zeros and ones, respectively.
>> eye(3,3)
ans =
     1     0     0
     0     1     0
     0     0     1

>> zeros(2,3)
ans =
     0     0     0
     0     0     0

>> ones(2,3)
ans =
     1     1     1
     1     1     1

Transposing a vector or a matrix is done either by the function transpose or by adding dot-prime
after the matrix (without the dot, prime will perform conjugate transpose for complex arrays):
>> A = [1 ; 2], B = A.', C = transpose(A)
A =
     1
     2
B =
     1     2
C =
     1     2

>> D = [0 3 ; 1 5], D.'
D =
     0     3
     1     5
ans =
     0     1
     3     5
Most MATLAB functions accept arrays as input and operate element-wise on each element. For example, mod(2*J,n) will multiply every element in J by 2, and then reduce each element modulo n. MATLAB does include standard for and while loops, but (as in other similar applications such as R), using the vectorized notation is encouraged and is often faster to execute. The following code, excerpted from the function magic.m, creates a magic square M for odd values of n (the MATLAB function meshgrid is used here to generate square matrices I and J containing 1:n).
[J,I] = meshgrid(1:n);
A = mod(I + J - (n + 3) / 2, n);
B = mod(I + 2 * J - 2, n);
M = n * A + B + 1;
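Running the excerpt above with n = 3 produces the classic 3-by-3 magic square:

```matlab
n = 3;
[J,I] = meshgrid(1:n);
A = mod(I + J - (n + 3) / 2, n);
B = mod(I + 2 * J - 2, n);
M = n * A + B + 1
% M =
%      8     1     6
%      3     5     7
%      4     9     2
```

Every row, column, and diagonal sums to 15.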

Structures
MATLAB supports structure data types. Since all variables in MATLAB are arrays, a more adequate name is "structure array", where each element of the array has the same field names. In addition, MATLAB supports dynamic field names (field look-ups by name, field manipulations, etc.).
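A short sketch of a structure array with a dynamic field look-up (the field names here are illustrative):

```matlab
% Build a 1-by-2 structure array; every element shares the same field names.
s(1).name = 'blob A'; s(1).area = 120;
s(2).name = 'blob B'; s(2).area = 45;

% Dynamic field names: access a field chosen at run time.
field = 'area';
total = s(1).(field) + s(2).(field);
fprintf('total %s = %d\n', field, total);   % prints: total area = 165
```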

4.6 FUNCTIONS
A function is a group of statements that together perform a task. In MATLAB, functions are
defined in separate files. The name of the file and of the function should be the same.
Functions operate on variables within their own workspace, which is also called the local
workspace, separate from the workspace you access at the MATLAB command prompt which is
called the base workspace.
Functions can accept more than one input argument and may return more than one output argument.
Syntax of a function statement is −
function [out1,out2, ..., outN] = myfun(in1,in2,in3, ..., inN)

Example
The following function named mymax should be written in a file named mymax.m. It takes five numbers as arguments and returns the maximum of the numbers.
Create a function file, named mymax.m and type the following code in it −

function max = mymax(n1, n2, n3, n4, n5)
% This function calculates the maximum of the
% five numbers given as input
max = n1;
if (n2 > max)
    max = n2;
end
if (n3 > max)
    max = n3;
end
if (n4 > max)
    max = n4;
end
if (n5 > max)
    max = n5;
end

The first line of a function starts with the keyword function. It gives the name of the function and
order of arguments. In our example, the mymax function has five input arguments and one output
argument.
The comment lines that come right after the function statement provide the help text. These lines
are printed when you type −
help mymax

MATLAB will execute the above statement and return the following result −
This function calculates the maximum of the
five numbers given as input

You can call the function as −
mymax(34, 78, 89, 23, 11)

MATLAB will execute the above statement and return the following result −
ans = 89

4.6.1 Anonymous Functions


An anonymous function is like an inline function in traditional programming languages, defined
within a single MATLAB statement. It consists of a single MATLAB expression and any number
of input and output arguments.
You can define an anonymous function right at the MATLAB command line or within a function
or script.
This way you can create simple functions without having to create a file for them.
The syntax for creating an anonymous function from an expression is
f = @(arglist)expression

Example
In this example, we will write an anonymous function named power, which will take two numbers as input and return the first number raised to the power of the second number.
Create a script file and type the following code in it −

power = @(x, n) x.^n;


result1 = power(7, 3)
result2 = power(49, 0.5)
result3 = power(10, -10)
result4 = power (4.5, 1.5)

When you run the file, it displays −


result1 = 343

result2 = 7
result3 = 1.0000e-10
result4 = 9.5459

4.6.2 Primary and Sub-Functions


Any function other than an anonymous function must be defined within a file. Each function file contains a required primary function that appears first and any number of optional sub-functions that come after the primary function and are used by it.
Primary functions can be called from outside of the file that defines them, either from the command line or from other functions, but sub-functions cannot be called from the command line or from other functions outside the function file.
Sub-functions are visible only to the primary function and other sub-functions within the function
file that defines them.

Example
Let us write a function named quadratic that calculates the roots of a quadratic equation. The function takes three inputs: the quadratic coefficient, the linear coefficient and the constant term. It returns the roots.
The function file quadratic.m will contain the primary function quadratic and the sub-function disc, which calculates the discriminant.
Create a function file quadratic.m and type the following code in it −
Create a function file quadratic.m and type the following code in it −
function [x1,x2] = quadratic(a,b,c)
%this function returns the roots of
% a quadratic equation.
% It takes 3 input arguments:
% the coefficients of x^2, x and the
% constant term.
% It returns the roots.
d = disc(a,b,c);
x1 = (-b + d) / (2*a);
x2 = (-b - d) / (2*a);
end % end of quadratic

function dis = disc(a,b,c)


%function calculates the discriminant
dis = sqrt(b^2 - 4*a*c);
end % end of sub-function

You can call the above function from command prompt as −


quadratic(2,4,-4)

MATLAB will execute the above statement and return the following result −
ans = 0.7321
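Calling quadratic without output arguments displays only the first root. To capture both roots, request both output arguments (standard MATLAB syntax):

```matlab
[x1, x2] = quadratic(2, 4, -4)
```

This displays x1 = 0.7321 and x2 = -2.7321.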

4.6.3 Nested Functions


You can define functions within the body of another function. These are called nested functions.
A nested function can contain any or all of the components of any other function.
Nested functions are defined within the scope of another function and they share access to the
containing function's workspace.
A nested function follows the following syntax −
function x = A(p1, p2)
...
B(p2)
function y = B(p3)
...
end
...
end

Example
Let us rewrite the function quadratic from the previous example; this time, however, the disc
function will be a nested function.
Create a function file quadratic2.m and type the following code in it −
function [x1,x2] = quadratic2(a,b,c)
function disc % nested function
d = sqrt(b^2 - 4*a*c);
end % end of function disc

disc;
x1 = (-b + d) / (2*a);
x2 = (-b - d) / (2*a);
end % end of function quadratic2

You can call the above function from command prompt as −


quadratic2(2,4,-4)

MATLAB will execute the above statement and return the following result −
ans = 0.73205

4.6.4 Private Functions


A private function is a primary function that is visible only to a limited group of other functions.
If you do not want to expose the implementation of a function, you can create it as a private
function.
Private functions reside in subfolders with the special name private.
They are visible only to functions in the parent folder.

Example
Let us rewrite the quadratic function. This time, however, the disc function calculating the
discriminant, will be a private function.
Create a subfolder named private in working directory. Store the following function file disc.m in
it −
function dis = disc(a,b,c)
%this function calculates the discriminant
dis = sqrt(b^2 - 4*a*c);
end % end of disc
Create a function quadratic3.m in your working directory and type the following code in it −
function [x1,x2] = quadratic3(a,b,c)
%this function returns the roots of
% a quadratic equation.
% It takes 3 input arguments:
% the coefficients of x^2, x and the
% constant term.
% It returns the roots.
d = disc(a,b,c);
x1 = (-b + d) / (2*a);
x2 = (-b - d) / (2*a);
end % end of quadratic3

You can call the above function from command prompt as −


quadratic3(2,4,-4)

MATLAB will execute the above statement and return the following result −
ans = 0.73205

4.6.5 Global Variables

Global variables can be shared by more than one function. For this, you need to declare the
variable as global in all the functions.
If you want to access that variable from the base workspace, then declare the variable at the
command line.
The global declaration must occur before the variable is actually used in a function. It is a good
practice to use capital letters for the names of global variables to distinguish them from other
variables.

Example
Let us create a function file named average.m and type the following code in it −
function avg = average(nums)
global TOTAL
avg = sum(nums)/TOTAL;
end
Create a script file and type the following code in it −
global TOTAL;
TOTAL = 10;
n = [34, 45, 25, 45, 33, 19, 40, 34, 38, 42];
av = average(n)

When you run the file, it will display the following result −
av = 35.500

4.6.6 Function handles


MATLAB supports elements of lambda calculus by introducing function handles, or function
references, which are implemented either in .m function files or as anonymous/nested functions.
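A brief illustration of function handles (standard MATLAB syntax):

```matlab
f = @sin;                  % handle to the built-in sin function
y = f(pi/2);               % y = 1
g = @(x) x.^2 + 1;         % handle to an anonymous function
z = arrayfun(g, [1 2 3]);  % apply g element-wise: z = [2 5 10]
```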

4.6.7 Classes and object-oriented programming


MATLAB supports object-oriented programming including classes, inheritance, virtual dispatch,
packages, pass-by-value semantics, and pass-by-reference semantics. However, the syntax and
calling conventions are significantly different from other languages. MATLAB has value classes
and reference classes, depending on whether the class has handle as a superclass (for reference
classes) or not (for value classes).[31]
Method call behavior is different between value and reference classes. For example, a call to a
method
object.method();

can alter any member of object only if object is an instance of a reference class; otherwise, a value
class method must return a new instance if it needs to modify the object.
An example of a simple class is provided below.
classdef Hello

methods
function greet(obj)
disp('Hello!')

end
end
end

When put into a file named hello.m, this can be executed with the following commands:
>> x = Hello();
>> x.greet();
Hello!

4.6.8 Interface with other languages


MATLAB can call functions and subroutines written in the programming languages C or Fortran.
A wrapper function is created allowing MATLAB data types to be passed and returned. MEX files
(MATLAB executables) are the dynamically loadable object files created by compiling such
functions. Since 2014, increasing two-way interfacing with Python has been added.
Libraries written in Perl, Java, ActiveX or .NET can be directly called from MATLAB, and many
MATLAB libraries (for example XML or SQL support) are implemented as wrappers around Java
or ActiveX libraries. Calling MATLAB from Java is more complicated, but can be done with a
MATLAB toolbox which is sold separately by MathWorks, or using an undocumented mechanism
called JMI (Java-to-MATLAB Interface), which should not be confused with the unrelated Java
Metadata Interface that is also called JMI. An official MATLAB API for Java was added in 2016.

As alternatives to the MuPAD-based Symbolic Math Toolbox available from MathWorks,
MATLAB can be connected to Maple or Mathematica.
Libraries also exist to import and export MathML.

Chapter 5
MATLAB Toolboxes

5.1 Computer Vision


Computer vision is an interdisciplinary scientific field that deals with how computers can be made
to gain high-level understanding from digital images or videos. From the perspective of
engineering, it seeks to automate tasks that the human visual system can do.
Computer vision tasks include methods for acquiring, processing, analyzing and understanding
digital images, and extraction of high-dimensional data from the real world in order to produce
numerical or symbolic information, e.g. in the form of decisions. Understanding in this context
means the transformation of visual images (the input of the retina) into descriptions of the world
that can interface with other thought processes and elicit appropriate action. This image
understanding can be seen as the disentangling of symbolic information from image data using
models constructed with the aid of geometry, physics, statistics, and learning theory.
The scientific discipline of computer vision is concerned with the theory behind artificial systems
that extract information from images. The image data can take many forms, such as video
sequences, views from multiple cameras, or multi-dimensional data from a medical scanner. The
technological discipline of computer vision seeks to apply its theories and models to the
construction of computer vision systems.
Sub-domains of computer vision include scene reconstruction, event detection, video tracking,
object recognition, 3D pose estimation, learning, indexing, motion estimation, and image
restoration.
5.1.1 Applications
Applications range from tasks such as industrial machine vision systems which, say, inspect bottles
speeding by on a production line, to research into artificial intelligence and computers or robots
that can comprehend the world around them. The computer vision and machine vision fields have
significant overlap. Computer vision covers the core technology of automated image analysis
which is used in many fields. Machine vision usually refers to a process of combining automated
image analysis with other methods and technologies to provide automated inspection and robot
guidance in industrial applications. In many computer-vision applications, the computers are pre-
programmed to solve a particular task, but methods based on learning are now becoming
increasingly common. Examples of applications of computer vision include systems for:
● Automatic inspection, e.g., in manufacturing applications;
● Assisting humans in identification tasks, e.g., a species identification system;
● Controlling processes, e.g., an industrial robot;

● Detecting events, e.g., for visual surveillance or people counting, e.g., in the restaurant
industry;
● Interaction, e.g., as the input to a device for computer-human interaction;
● Modeling objects or environments, e.g., medical image analysis or topographical
modeling;
● Navigation, e.g., by an autonomous vehicle or mobile robot; and
● Organizing information, e.g., for indexing databases of images and image sequences.
The classical problem in computer vision, image processing, and machine vision is that of
determining whether or not the image data contains some specific object, feature, or activity.
Different varieties of the recognition problem are described in the literature:
● Object recognition (also called object classification) – one or several pre-specified
or learned objects or object classes can be recognized, usually together with their 2D
positions in the image or 3D poses in the scene. Blippar, Google Goggles and LikeThat
provide stand-alone programs that illustrate this functionality.

● Identification – an individual instance of an object is recognized. Examples include


identification of a specific person's face or fingerprint, identification of handwritten
digits, or identification of a specific vehicle.

● Detection – the image data are scanned for a specific condition. Examples include
detection of possible abnormal cells or tissues in medical images or detection of a
vehicle in an automatic road toll system. Detection based on relatively simple and fast
computations is sometimes used for finding smaller regions of interesting image data
which can be further analyzed by more computationally demanding techniques to
produce a correct interpretation.
Currently, the best algorithms for such tasks are based on convolutional neural networks. An
illustration of their capabilities is given by the ImageNet Large Scale Visual Recognition
Challenge; this is a benchmark in object classification and detection, with millions of images and
hundreds of object classes. Performance of convolutional neural networks, on the ImageNet tests,
is now close to that of humans.[26] The best algorithms still struggle with objects that are small or
thin, such as a small ant on a stem of a flower or a person holding a quill in their hand. They also
have trouble with images that have been distorted with filters (an increasingly common
phenomenon with modern digital cameras). By contrast, those kinds of images rarely trouble
humans. Humans, however, tend to have trouble with other issues. For example, they are not good
at classifying objects into fine-grained classes, such as the particular breed of dog or species of
bird, whereas convolutional neural networks handle this with ease.
Several specialized tasks based on recognition exist, such as:
● Content-based image retrieval – finding all images in a larger set of images which
have a specific content. The content can be specified in different ways, for example in
terms of similarity relative a target image (give me all images similar to image X), or
in terms of high-level search criteria given as text input (give me all images which
contain many houses, are taken during winter, and have no cars in them).

(Figure: computer vision used for people counting in public places, malls and shopping centres)

● Pose estimation – estimating the position or orientation of a specific object relative to


the camera. An example application for this technique would be assisting a robot arm
in retrieving objects from a conveyor belt in an assembly line situation or picking parts
from a bin.
● Optical character recognition (OCR) – identifying characters in images of printed or
handwritten text, usually with a view to encoding the text in a format more amenable
to editing or indexing (e.g. ASCII).

● 2D code reading – reading of 2D codes such as data matrix and QR codes.

● Facial recognition
● Shape Recognition Technology (SRT) in people counter systems differentiating
human beings (head and shoulder patterns) from objects
The aim of image restoration is the removal of noise (sensor noise, motion blur, etc.) from images.
The simplest possible approach for noise removal is various types of filters such as low-pass filters
or median filters. More sophisticated methods assume a model of how the local image structures
look, to distinguish them from noise. By first analysing the image data in terms of the local image
structures, such as lines or edges, and then controlling the filtering based on local information from
the analysis step, a better level of noise removal is usually obtained compared to the simpler
approaches.
An example in this field is inpainting.
5.1.2 Computer Vision Toolbox in MATLAB

Computer Vision Toolbox™ provides algorithms, functions, and apps for designing and testing
computer vision, 3D vision, and video processing systems. You can perform object detection and
tracking, as well as feature detection, extraction, and matching. For 3D vision, the toolbox supports
single, stereo, and fisheye camera calibration; stereo vision; 3D reconstruction; and lidar and 3D
point cloud processing. Computer vision apps automate ground truth labeling and camera
calibration workflows. You can train custom object detectors using deep learning and machine
learning algorithms such as YOLO v2, Faster R-CNN, and ACF. For semantic segmentation you
can use deep learning algorithms such as SegNet, U-Net, and DeepLab. Pretrained models let you
detect faces, pedestrians, and other common objects.

You can accelerate your algorithms by running them on multicore processors and GPUs. Most
toolbox algorithms support C/C++ code generation for integrating with existing code, desktop
prototyping, and embedded vision system deployment.
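As a brief sketch of the toolbox in use, a pretrained cascade face detector can be applied in a few lines (visionteam.jpg is a demo image that ships with MATLAB; substitute any image file):

```matlab
% Detect faces using the toolbox's pretrained Viola-Jones cascade detector
detector = vision.CascadeObjectDetector();   % default model detects frontal faces
img      = imread('visionteam.jpg');
bboxes   = detector(img);                    % one [x y width height] row per face
out      = insertShape(img, 'Rectangle', bboxes);
imshow(out);
```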

5.2 Image processing


In computer science, digital image processing is the use of computer algorithms to perform image
processing on digital images. As a subcategory or field of digital signal processing, digital image
processing has many advantages over analog image processing. It allows a much wider range of
algorithms to be applied to the input data and can avoid problems such as the build-up of noise
and signal distortion during processing. Since images are defined over two dimensions (perhaps
more) digital image processing may be modeled in the form of multidimensional systems. The
generation and development of digital image processing are mainly affected by three factors: first,
the development of computers; second, the development of mathematics (especially the creation
and improvement of discrete mathematics theory); third, the demand for a wide range of
applications in environment, agriculture, military, industry and medical science has increased.

5.3 Image sensors


The basis for modern image sensors is metal-oxide-semiconductor (MOS) technology, which
originates from the invention of the MOSFET (MOS field-effect transistor) by Mohamed M. Atalla
and Dawon Kahng at Bell Labs in 1959. This led to the development of digital semiconductor
image sensors, including the charge-coupled device (CCD) and later the CMOS sensor.
The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in
1969. While researching MOS technology, they realized that an electric charge was the analogy of
the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly
straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage
to them so that the charge could be stepped along from one to the next. The CCD is a
semiconductor circuit that was later used in the first digital video cameras for television
broadcasting
The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s.
This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling
reaching smaller micron and then sub-micron levels. The NMOS APS was fabricated by Tsutomu
Nakamura's team at Olympus in 1985. The CMOS active-pixel sensor (CMOS sensor) was later
developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993. By 2007, sales
of CMOS sensors had surpassed CCD sensors.
5.4 Image compression
An important development in digital image compression technology was the discrete cosine
transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972. DCT
compression became the basis for JPEG, which was introduced by the Joint Photographic Experts
Group in 1992. JPEG compresses images down to much smaller file sizes, and has become the
most widely used image file format on the Internet. Its highly efficient DCT compression algorithm
was largely responsible for the wide proliferation of digital images and digital photos, with several
billion JPEG images produced every day as of 2015.

5.5 Digital signal processor (DSP)


Electronic signal processing was revolutionized by the wide adoption of MOS technology in the
1970s. MOS integrated circuit technology was the basis for the first single-chip microprocessors
and microcontrollers in the early 1970s, and then the first single-chip digital signal processor
(DSP) chips in the late 1970s. DSP chips have since been widely used in digital image processing.
The discrete cosine transform (DCT) image compression algorithm has been widely implemented
in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are
widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals,

signaling, analog-to-digital conversion, formatting luminance and color differences, and color
formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as
motion estimation, motion compensation, inter-frame prediction, quantization, perceptual
weighting, entropy encoding, variable encoding, and motion vectors, and decoding operations such
as the inverse operation between different color formats (YIQ, YUV and RGB) for display
purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder
chips.

5.5.1 Medical imaging


In 1972, Godfrey Hounsfield, an engineer at the British company EMI, invented the X-ray
computed tomography device for head diagnosis, which is what we usually call CT (Computed
Tomography). The CT method is based on projections of sections of the human head, which are
processed by computer to reconstruct the cross-sectional image; this process is called image
reconstruction. In 1975, EMI successfully developed a CT device for the whole body, which
obtained clear tomographic images of various parts of the human body. In 1979, this diagnostic
technique won the Nobel Prize. Digital image processing technology for medical applications was
inducted into the Space Foundation Space Technology Hall of Fame in 1994.
Image Processing Toolbox in MATLAB
Image Processing Toolbox™ provides a comprehensive set of reference-standard algorithms and
workflow apps for image processing, analysis, visualization, and algorithm development. You can
perform image segmentation, image enhancement, noise reduction, geometric transformations,
image registration, and 3D image processing.

Image Processing Toolbox apps let you automate common image processing workflows. You can
interactively segment image data, compare image registration techniques, and batch-process large
data sets. Visualization functions and apps let you explore images, 3D volumes, and videos; adjust
contrast; create histograms; and manipulate regions of interest (ROIs).

You can accelerate your algorithms by running them on multicore processors and GPUs. Many
toolbox functions support C/C++ code generation for desktop prototyping and embedded vision
system deployment.
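For example, a minimal enhancement workflow with this toolbox (pout.tif is a low-contrast demo image that ships with MATLAB):

```matlab
img = imread('pout.tif');        % low-contrast grayscale image
adj = imadjust(img);             % stretch the intensity range
eq  = histeq(img);               % histogram equalization
imshowpair(adj, eq, 'montage');  % compare the two results side by side
```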

5.6 Image Acquisition toolbox in matlab


Image Acquisition Toolbox™ provides functions and blocks for connecting cameras and lidar
sensors to MATLAB® and Simulink®. It includes a MATLAB app that lets you interactively
detect and configure hardware properties. You can then generate equivalent MATLAB code to
automate your acquisition in future sessions. The toolbox enables acquisition modes such as
processing in-the-loop, hardware triggering, background acquisition, and synchronizing
acquisition across multiple devices.

Image Acquisition Toolbox supports all major standards and hardware vendors, including USB3
Vision, GigE Vision®, and GenICam™ GenTL. You can connect to Velodyne LiDAR® sensors,
machine vision cameras, and frame grabbers, as well as high-end scientific and industrial devices.

CHAPTER 6
MATLAB IMPLEMENTATION

Different toolboxes have been explored for functions and objects which can be useful at various
levels in object detection. All such functions/objects are described in this section.

6.1 Video Input
Input video can come from two sources: stored video and real-time video. Stored video can be
obtained from standard datasets available on the internet. Real-time video involves a camera
continuously monitoring a specific area and producing a live video stream. These videos must be
read into MATLAB before they can be processed.

6.1.1 Stored Video


Some commonly used standard video datasets are as follows.

Wallflower Dataset [4]: Provided by Toyama et al. and contains seven canonical sequences with
different background situations.

PETS Dataset: "Performance Evaluation of Tracking and Surveillance" (PETS) consists of various
datasets like PETS 2001, PETS 2003 and PETS 2006. They are more useful for tracking evaluation
than for background evaluation.

ChangeDetection.net Dataset [5]: The CDW dataset presents a realistic video dataset consisting of
31 video sequences categorized into 6 different challenges. Both color and thermal-IR video are
included in the dataset.

BMC 2012 Dataset [6]: This dataset includes real and synthetic video. It is mainly used for
comparison of different background subtraction techniques.

Fish4Knowledge Dataset [7]: The Fish4Knowledge dataset is an underwater benchmark dataset for
target detection against complex backgrounds.

Carnegie Mellon Dataset [8]: The CMU sequence by Sheikh and Shah involves a camera mounted
on a tall tripod. The wind caused the tripod to sway back and forth, causing vibration in the scene.
This dataset is useful for studying camera-jitter background situations.

Stored video needs to be read in an appropriate format before processing. Various related functions
from the Image Processing (IP) and Computer Vision (CV) toolboxes can be used for this purpose.

Toolbox            Object/Function   Name                     Use
Image Processing   Function          imread                   Read image from graphics file
Image Processing   Function          imfinfo                  Get information about graphics file
Image Processing   Function          imwrite                  Write image to graphics file
Image Processing   Function          imshow                   Display image
Computer Vision    Object            vision.VideoFileReader   Read video frames and audio samples from video file
Computer Vision    Object            vision.VideoFileWriter   Write video frames and audio samples to video file
Computer Vision    Object            vision.VideoPlayer       Play video or display image
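The video reader and player objects in the table above can be combined into a short playback loop (a sketch; myvideo.avi is a hypothetical file name):

```matlab
% Read a stored video file frame by frame and display it
reader = vision.VideoFileReader('myvideo.avi');  % hypothetical file name
player = vision.VideoPlayer;
while ~isDone(reader)
    frame = reader();   % next video frame
    player(frame);      % display it
end
release(reader);
release(player);
```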

6.1.2 Real Time Video

Image Acquisition is a widely used toolbox which allows real-time acquisition of video from a
video acquisition device. Some commonly used functions are explained below.

imaqtool: Launches an interactive GUI and allows users to explore, configure, and acquire data
from image acquisition devices.

videoinput: Creates a video input object. This object can further be used to acquire and display
image sequences.

propinfo: Captures all the property information about an image acquisition object. This information
can be useful in further video processing.

getsnapshot: Immediately returns one single image frame from the video input object. This
function is useful for capturing an image at a critical moment.

trigger: Initiates data logging for the video input object. It can be used to initialize video at an
appropriate moment and collect video data.

triggerconfig: Lets the user configure trigger properties of a video input object.
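These functions can be combined into a short acquisition sketch (the adapter name 'winvideo' and device ID 1 are assumptions; they vary by system and installed hardware):

```matlab
vid = videoinput('winvideo', 1);  % create a video input object (adapter/ID assumed)
preview(vid);                     % open a live preview window
frame = getsnapshot(vid);         % immediately grab one frame
imshow(frame);
delete(vid);                      % release the device when done
```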

6.2 PreProcessing :
Data preprocessing is an important step in the data mining process. The phrase "garbage in,
garbage out" is particularly applicable to data mining and machine learning projects. Data-
gathering methods are often loosely controlled, resulting in out-of-range values (e.g., Income:
−100). Analyzing data that has not been carefully screened for such problems can produce
misleading results. Thus, the representation and quality of data come first and foremost before
running an analysis. Often, data preprocessing is the most important phase of a machine learning
project, especially in computational biology.
If there is much irrelevant and redundant information present, or noisy and unreliable data, then
knowledge discovery during the training phase is more difficult. Data preparation and filtering
steps can take a considerable amount of processing time. Data preprocessing includes cleaning,
instance selection, normalization, transformation, feature extraction and selection, etc. The product
of data preprocessing is the final training set.
Data pre-processing may affect the way in which outcomes of the final data processing can be
interpreted. This aspect should be carefully considered when interpretation of the results is a key
point, such as in the multivariate processing of chemical data (chemometrics).

Tasks of preprocessing:
● Data cleansing
● Data editing
● Data reduction

● Data wrangling

Preprocessing may include a series of operations, as shown below.

6.2.1 Video Type Conversion

The video needs to be converted to an appropriate data type after reading. Useful objects and
functions are listed below.
Useful functions/objects for video data type conversion:

Toolbox   Object/Function   Name                           Use
CV        Object            vision.ImageDataTypeConverter  Converts and scales an input image to a specified
                                                           output data type (double, single, int8, uint8,
                                                           int16, uint16, boolean, or custom)
IP        Function          im2double, im2single,          Convert an image to the specified data type
                            im2uint8, im2uint16
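For example, im2double rescales uint8 intensities from the range 0–255 to 0–1:

```matlab
img8 = imread('pout.tif');  % uint8 demo image, values 0..255
imgD = im2double(img8);     % double image, values scaled to 0..1
class(imgD)                 % 'double'
```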

6.2.2 Video Enhancement :

This step may include noise removal, contrast adjustment, and image correction. Useful functions
and objects are summarized below.

Toolbox   Object/Function   Name                       Use
CV        Object            vision.MedianFilter        2-D median filtering (to remove salt-and-pepper noise)
CV        Object            vision.ImageFilter         Perform 2-D FIR filtering of input matrix
CV        Object            vision.ContrastAdjuster    Adjust image contrast by linear scaling
CV        Object            vision.HistogramEqualizer  Enhance contrast of images using histogram equalization
IP        Function          imadjust                   Adjust image intensity values or colormap
IP        Function          imcontrast                 Adjust Contrast tool
IP        Function          histeq                     Enhance contrast using histogram equalization
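A short enhancement sketch using these functions (imnoise is used here only to simulate salt-and-pepper noise on a demo image):

```matlab
img   = imread('pout.tif');                   % grayscale demo image
noisy = imnoise(img, 'salt & pepper', 0.05);  % simulate impulse noise
den   = medfilt2(noisy, [3 3]);               % 3x3 median filter removes it
enh   = histeq(den);                          % then equalize the histogram
montage({noisy, den, enh});                   % compare the three stages
```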

6.2.3 Feature Extraction

Any object detection system performs segmentation based on one or more features of the scene.
These may include color, corner, edge, shape, gradient, texture, or DCT/DFT coefficients.
Different functions are available to extract these features.
Useful function/object for feature extraction

Toolbox   Object/Function   Name
IP        Function          rgb2gray
IP        Function          rgb2ycbcr
IP        Function          ycbcr2rgb
IP        Function          corner
IP        Function          edge
IP        Function          imgradient
IP        Function          entropyfilt
IP        Function          rangefilt
IP        Function          stdfilt
CV        Object            vision.ColorSpaceConverter
CV        Object            vision.DCT
CV        Object            vision.FFT
CV        Object            vision.EdgeDetector
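A few of these feature extractors in action (cameraman.tif is a demo image that ships with MATLAB):

```matlab
img      = imread('cameraman.tif');  % grayscale demo image
E        = edge(img, 'canny');       % binary edge map
[Gm, Gd] = imgradient(img);          % gradient magnitude and direction
C        = corner(img);              % corner locations as [x y] rows
imshow(E); hold on;
plot(C(:,1), C(:,2), 'r*');          % overlay detected corners on the edges
```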

6.3 Step by Step Process:



STEP 1: Input. Stored input needs to be read in an appropriate format before processing. Various
related functions from the Image Processing and Computer Vision toolboxes can be used for this
purpose. Example functions are imread, imfinfo, imwrite and imshow, which are used to read, get
information about, write, and display an image.
STEP 2: Preprocessing. RGB-to-gray conversion and noise removal using a median filter. It
includes a series of operations:

 Input type conversion
 Enhancement
 Feature extraction
STEP 3: Object detection. Various object detection methods are used to detect objects. These
methods are classified based on template, motion, classifier, and feature. The Computer Vision
toolbox includes some predefined objects which can be useful for implementing these object
detection methods. Some of them are:

 vision.CascadeObjectDetector
 vision.OpticalFlow
 vision.PeopleDetector
STEP 4: Post-processing. This is required to remove unwanted portions of the foreground mask,
which may arise from false detections caused by dynamic scenes, and may include speckle noise,
small holes in the scene, etc. Detected objects can be annotated for proper display. Some of the
useful functions in this step are:

 imclose
 imopen
 imfill
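The four steps above can be sketched end to end using the toolbox's pretrained people detector (a hedged illustration; myvideo.avi is a hypothetical file name):

```matlab
reader   = VideoReader('myvideo.avi');  % Step 1: input (hypothetical file)
detector = vision.PeopleDetector;       % Step 3: pretrained detector
while hasFrame(reader)
    frame = readFrame(reader);
    gray  = rgb2gray(frame);            % Step 2: RGB-to-gray conversion
    gray  = medfilt2(gray);             % Step 2: median filtering
    [bboxes, scores] = detector(gray);  % Step 3: detect people
    out = insertObjectAnnotation(frame, 'rectangle', bboxes, scores);  % Step 4: annotate
    imshow(out); drawnow;
end
```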

CHAPTER 7
SOURCE CODE

clc
%% Test Two
%%Histogram of Orientated Gradients
%%Histogram of Pixel Orientation
%%Histogram of curvatures
%% Eccentricity
%clear all
close all
%% Area Ratios Weight
tic
load ('newData.mat')%read video
%load('FinalHog.mat')
depth=6;
Params = [9 3 2 1 0.2];

video=mmreader('F:\Thesis\Testing Videos\T4.h64'); %\Other Datasets\test_videos\test_videos\3.avi');
% Note: mmreader is deprecated in recent MATLAB releases; VideoReader is the replacement.
%[7.5 18.5 345 224]));
for k=3501:5:4000
%%Read an image
figure(1);
image=imcrop(read(video,k),[7.5 18.5 345 224]);
%%Gray Scale image
img=(rgb2gray(image));
BW=edge(sqrt(double(img)),'canny',0.29);
% [x,y]=find(BW);
% deri= diff([x y],2) ;
% ind=find(deri(:,1)~=0&deri(:,2)~=0);
img1=sqrt(double(img))-sqrt(double(fi));
fore=zeros(size(img1));
ind=find(img1>max(max(img1)*0.6));
fore(ind)=255;
%BW=abs(edgelinking2_C(BW,3,3));
[BW AngleLeft AngleRight]= edgelinking2_C(BW,3,3);
BW=abs(BW);
st=strel('disk',3);
BW=imopen(BW,st);
fore=imdilate(fore,st);
LabelsList=unique(BW(ind));
toc
hold off;
figure(2);
subplot(2,2,1),imshow(BW);
subplot(2,2,2),imshow(fore);
subplot(2,2,3:4),imshow(image);hold on;
maximumLabel=max(max(BW));
%Find Properties of Connected Components
% for all those contours whose area is less than 20 and greater the 150
for i=2:numel(LabelsList)
[x_A,y_A]=find(BW==LabelsList(i));
px=x_A;
py=y_A;
if(~isempty(px))

box= boundingBox([y_A x_A]);


height=box(4)-box(3);
width=box(2)-box(1);

subImage=imcrop(img,[box(1) box(3) (box(2)-box(1)) box(4)-box(3)]);


subImage=imresize(subImage,[35 20]);

if(~isempty(subImage))

hogs = HoG(double(subImage),Params);
% subImage=edge(subImage,'canny');
% r = regionstat(double(subImage), 1, 'Extent');
%
Z=[x_A-median(x_A),y_A-median(y_A)];
C=cov(Z);
[E,V]=eig(C);
V=sort(diag(V));
stra=V(2)/sum(V);
ell = inertiaEllipse([x_A y_A]);
OtherFeatures=[ell(4)/ell(3);ell(5);stra];%;AngleRight(i);stra];
H=[hogs;OtherFeatures];%;length(unique(x_A));r;((box(2)-box(1))/height);stra];

[predict_label, accuracy, prob_estimates] = svmpredict(1,H', svmmodel);

drawBox(box,'g');

end

%%Result
end
end
saveas(figure(2),strcat('Results\T4_',num2str(k),'.jpg'));
time=toc;
fi=rgb2gray(image);
end
% title('subImage');
%saveit = close(saveit);

CHAPTER 8
RESULT

Object detection is observed on the test video: each processed frame is displayed with the detected regions marked by green bounding boxes, and the annotated figures are saved to the Results folder.

CHAPTER 9
APPLICATIONS

Some current and future applications of object detection are discussed below.

1. OPTICAL CHARACTER RECOGNITION

Optical character recognition or optical character reading, often abbreviated as OCR, is the
mechanical or electronic conversion of images of typed, handwritten or printed text into machine-
encoded text. The source may be a scanned document, a photo of a document, a scene photo (for
example, the text on signs and billboards in a landscape photo) or subtitle text superimposed on an
image; in each case, characters are extracted from the image or video.

OCR is widely used as a form of information entry from printed paper data records, whether passport
documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of
static data, or any suitable documentation. It is a common method of digitizing printed text so that
it can be electronically edited, searched, stored more compactly, displayed online, and used in
machine processes such as cognitive computing, machine translation and (extracted) text-to-speech.

2. SELF DRIVING CARS

Autonomous driving is one of the best examples of why object detection is needed. For a car to
decide what to do next, whether to accelerate, apply the brakes or turn, it needs to know where all
the objects around it are and what those objects are. That requires object detection, and the car is
essentially trained to detect a known set of objects such as cars, pedestrians, traffic lights, road
signs, bicycles and motorcycles.

3. TRACKING OBJECTS

An object detection system is also used to track objects, for example tracking a ball during a
football match, tracking the movement of a cricket bat, or tracking a person in a video.

Object tracking has a variety of uses, some of which are surveillance and security, traffic
monitoring, video communication, robot vision and animation.
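A minimal form of such tracking is frame-to-frame association of detected object centroids. The sketch below (the function name and the distance threshold are illustrative assumptions, not from any particular library) greedily matches each previous centroid to its nearest unclaimed centroid in the current frame.

```python
import numpy as np

def associate(prev_centroids, curr_centroids, max_dist=50.0):
    """Greedy nearest-neighbour association: for each previous centroid,
    pick the closest current centroid within max_dist that is not already
    claimed. Returns a dict {prev_index: curr_index} of matched tracks."""
    matches, taken = {}, set()
    for i, p in enumerate(prev_centroids):
        dists = [np.hypot(c[0] - p[0], c[1] - p[1]) for c in curr_centroids]
        for j in np.argsort(dists):
            if dists[j] <= max_dist and int(j) not in taken:
                matches[i] = int(j)
                taken.add(int(j))
                break
    return matches
```

Unmatched previous indices correspond to objects that left the scene (or were missed by the detector); unclaimed current indices start new tracks.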

4. FACE DETECTION AND FACE RECOGNITION


Face detection and face recognition are widely used computer vision tasks. We notice, for example,
how Facebook detects our faces when a photo is uploaded; this is a simple application of object
detection seen in daily life. Face detection can be regarded as a specific case of object-class
detection, where the task is to find the locations and sizes of all objects in an image that belong
to a given class. Examples include upper torsos, pedestrians and cars.

Face detection is a computer technology, used in a variety of applications, that identifies human
faces in digital images. Face recognition describes a biometric technology that goes beyond
recognizing when a human face is present: it attempts to establish whose face it is.

Face recognition has many applications. It is already used to unlock phones and specific
applications, and it is also used for biometric surveillance: banks, retail stores, stadiums,
airports and other facilities use facial recognition to reduce crime and prevent violence.
5. SMILE DETECTION

Facial expression analysis plays a key role in analyzing emotions and human behaviors. Smile
detection is a special task in facial expression analysis with various potential applications such as
photo selection, user experience analysis and patient monitoring.

6. PEDESTRIAN DETECTION

Pedestrian detection is an essential and significant task in any intelligent video surveillance
system, as it provides the fundamental information for semantic understanding of the video footage.
It has an obvious extension to automotive applications due to its potential for improving safety
systems.
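A common pedestrian detection pipeline scans each frame with a fixed-size window and scores every window with a trained classifier (for example HOG features fed to an SVM, as in this report's code). The sketch below is a hypothetical skeleton of that scan: `score_fn` stands in for the trained classifier, which is not implemented here.

```python
import numpy as np

def sliding_window_detect(image, score_fn, win=(8, 4), stride=4, thresh=0.5):
    """Scan `image` with a fixed (h, w) window at the given stride and keep
    the windows whose classifier score exceeds `thresh`. Returns a list of
    (x, y, w, h) boxes. `score_fn` maps a patch to a detection score."""
    H, W = image.shape
    h, w = win
    hits = []
    for y in range(0, H - h + 1, stride):
        for x in range(0, W - w + 1, stride):
            if score_fn(image[y:y + h, x:x + w]) > thresh:
                hits.append((x, y, w, h))
    return hits
```

Real detectors repeat this scan over an image pyramid to handle pedestrians at different scales, and merge overlapping hits with non-maximum suppression.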

7. BALL TRACKING IN SPORTS

The growing number of sport lovers in games like football and cricket has created a need for
digging out, analyzing and presenting ever more multidimensional information to them. Different
classes of people require different kinds of information, which expands the space and scale of the
information required. Tracking the ball's movement is of utmost importance for extracting any
information from ball-based sports video sequences, and the video frame can be recorded
automatically according to the movement of the ball.

8. OBJECT RECOGNITION AS IMAGE SEARCH

By recognizing the objects in an image and passing the detected object labels in the URL, the
object detection system can be turned into an image search.

9. AUTOMATIC TARGET RECOGNITION

Automatic target recognition (ATR) is the ability of an algorithm or device to recognize targets
or other objects based on data obtained from sensors.

Target recognition was initially done using an audible representation of the received signal: a
trained operator would decipher that sound to classify the target illuminated by the radar. While
these trained operators had success, automated methods have been developed, and continue to be
developed, that allow greater accuracy and speed in classification. ATR can be used to identify
man-made objects such as ground and air vehicles, as well as biological targets such as animals,
humans and vegetative clutter. This can be useful for everything from recognizing an object on a
battlefield to filtering out the interference caused by large flocks of birds on Doppler weather
radar.

CHAPTER 10
CONCLUSION AND FUTURE SCOPE

10.1 DISCUSSION AND CONCLUSION


This report presents a basic object detection system. The MATLAB platform (MATLAB 2012) is used to
implement the system. Different toolboxes have been explored, and useful MATLAB functions and
objects that can be used at various stages have been collected; the toolboxes mainly include image
acquisition, image processing and computer vision. Sample MATLAB code is presented for object
detection. Each stage of the system has been implemented with the functions and objects available
in these toolboxes, which shows that implementation is easy and the code is kept short by using
predefined MATLAB objects and functions. This study may help new students and researchers in this
field to study, implement and experiment with established research.

10.2 FUTURE SCOPE:


Some steps in the object tracking process are still mostly done manually; feature selection is one
example. The accuracy of object tracking could potentially increase by developing methods for a
more automatic feature selection process. We know from experience that a human tends to make more
mistakes than a computer program optimized for a certain purpose. Automatic feature selection has
received attention in the area of pattern recognition, where methods for this purpose are divided
into filter methods and wrapper methods [48]. However, these have not received the same attention
in the area of object tracking, where feature selection is still mostly done manually. There could
be room for improvement in object tracking by developing fast and accurate methods for automatic
feature selection. A suitable continuation of a work like this thesis would be an easy,
comprehensible summary of the most common object tracking algorithms, as an extension to this work.
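As a concrete illustration of the filter methods mentioned above, the sketch below scores each feature independently of any classifier by its absolute Pearson correlation with the class labels and keeps the top k. This is an illustrative example only, not the specific method of reference [48]; a wrapper method would instead retrain a classifier for each candidate feature subset.

```python
import numpy as np

def filter_select(X, y, k):
    """Filter-style feature selection: rank each column of X (samples x
    features) by |Pearson correlation| with the labels y, return the
    indices of the k highest-scoring features (best first)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = np.abs(Xc.T @ yc)
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    scores = num / den
    return np.argsort(scores)[::-1][:k]
```

Because each feature is scored once, the cost is linear in the number of features, which is what makes filter methods attractive for the real-time constraints of object tracking.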

CHAPTER 11
REFERENCES

 Video Analytics: http://www.dspdesignline.com/videoanalytics.html
 Object Detection, MATLAB Central File Exchange: https://in.mathworks.com/matlabcentral/fileexchange/54092-object-detection
 Object detection, Wikipedia: https://en.wikipedia.org/wiki/Object_detection
 Shireen Y. Elhabian, Khaled M. El-Sayed, "Moving Object Detection in Spatial Domain using Background Removal Techniques".
 Jun-Wei Hsieh, Shih-Hao Yu, Yung-Sheng Chen, "An Automatic Traffic Surveillance System for Vehicle Tracking and Classification", IEEE Transactions on Intelligent Transportation Systems, Vol.
