Table of Contents
Introduction
Abstract
Objective
Chapter One
Introduction
In recent years, face recognition has attracted much attention, and research on it has expanded rapidly, carried out not only by engineers but also by neuroscientists, since it has many potential applications in computer vision, communication, and automatic access control systems. In particular, face detection is an important part of face recognition, being the first step of automatic face recognition. However, face detection is not straightforward, because face images exhibit many variations in appearance, such as pose variation (frontal, non-frontal), occlusion, image orientation, illumination conditions, and facial expression.
Many novel methods have been proposed to deal with each of the variations listed above. For example, template-matching methods [1], [2] are used for face localization and detection by computing the correlation of an input image with a standard face pattern. Feature-invariant approaches are used for detecting features [3], [4] such as eyes, mouth, ears, and nose. Appearance-based methods are used for face detection with eigenfaces [5], [6], [7], neural networks [8], [9], and information-theoretic approaches [10], [11]. Nevertheless, implementing all of these methods together remains a great challenge. Fortunately, the images used in this project have some degree of uniformity, so the detection algorithm can be simpler: first, all the faces are vertical and frontal; second, they are taken under almost the same illumination conditions. This project presents a face detection technique based mainly on color segmentation, image segmentation, and template matching.
ABSTRACT
Face recognition from video is a standard topic in biometrics research. Face recognition technology has attracted wide attention because of its great application value and market potential, for instance in continuous video surveillance systems. It is widely recognized that face recognition plays an important role in surveillance systems, since it does not require the subject's cooperation. We design a real-time face recognition system based on an IP camera and an image set, implemented with OpenCV and Python. The system consists of three modules: a detection module, a training module, and a recognition module. This report presents efficient and robust algorithms for real-time face detection and recognition in complex backgrounds. The algorithms are implemented using a series of signal processing techniques, including Local Binary Patterns (LBP) and Haar cascade features. The LBPH algorithm is used to extract facial features for fast face identification. An eye detection step reduces the false face detection rate. The detected facial image is then processed to correct its orientation and increase its resolution, which maintains high facial recognition accuracy. Large databases of face and non-face images are used to train and validate the face detection and facial recognition algorithms. The algorithms achieve an overall true positive rate of 98.8% for face detection and 99.2% for correct facial recognition.
OBJECTIVE:
Whenever a new system is implemented, it is developed to remove the shortcomings of the existing system. A computerized mechanism has a clear edge over a manual one. The existing system is manual and takes a lot of time to carry out the work. The proposed system is a web application that maintains a centralized repository of all related information, allowing a user to easily access the software and detect what he wants.
Detection of skin color in color images is a very popular and useful technique for face detection. Many techniques [12], [13] have been reported for locating skin-color regions in an input image. While the input color image is typically in the RGB format, these techniques usually use color components in other color spaces, such as HSV or YIQ. This is because the RGB components are sensitive to lighting conditions, so face detection may fail if the lighting changes. Among the many color spaces, this project uses the YCbCr components, since the conversion is provided by an existing Matlab function and thus saves computation time. In the YCbCr color space, the luminance information is contained in the Y component, and the chrominance information is in Cb and Cr. Therefore, the luminance information can easily be separated out. The RGB components were converted to YCbCr components using the following formulas:
Y = 0.299R + 0.587G + 0.114B
Cb = -0.169R - 0.332G + 0.500B
Cr = 0.500R - 0.419G - 0.081B
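The project relies on Matlab's rgb2ycbcr for this conversion, but the transform itself is just the linear map above. As a minimal illustrative sketch (not the project's code), the same conversion in Python with NumPy:

import numpy as np

def rgb_to_ycbcr(rgb):
    # rgb: H x W x 3 array; applies the Y/Cb/Cr formulas given above to each pixel
    M = np.array([[ 0.299,  0.587,  0.114],
                  [-0.169, -0.332,  0.500],
                  [ 0.500, -0.419, -0.081]])
    return rgb @ M.T   # last axis becomes (Y, Cb, Cr)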
In the skin-color detection process, each pixel was classified as skin or non-skin based on its color components. The detection window for skin color was determined from the mean and standard deviation of the Cb and Cr components, obtained from 164 training faces in 7 input images. The Cb and Cr components of the 164 faces are plotted in the color space in Fig. 1; their histogram distributions are shown in Fig. 2.
Fig. 2. (a) Histogram distribution of Cb. (b) Histogram distribution of Cr.
The color segmentation was applied to a training image, and the result is shown in Fig. 3. Some non-skin objects are inevitably present in the result, as their colors fall into the skin-color range.
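The classification rule itself is simple: a pixel is accepted as skin when its Cb and Cr values lie within a window around the training means. A rough NumPy sketch of this rule is shown below; the width factor and the statistics are placeholders, not the values actually measured from the 164 training faces, and the project's Matlab function (ee368YCbCrbin, listed later) shapes the window with a Gaussian envelope rather than a hard cut-off.

import numpy as np

def skin_mask(ycbcr, mean_cb, std_cb, mean_cr, std_cr, factor=2.5):
    # True where a pixel's (Cb, Cr) is within `factor` standard deviations of the skin means
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return (np.abs(cb - mean_cb) < factor * std_cb) & (np.abs(cr - mean_cr) < factor * std_cr)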
Chapter Two
2.1 Image segmentation
The next step is to separate the image blobs in the color-filtered binary image into individual regions. The process consists of three steps. The first step is to fill in isolated black holes and to remove isolated white regions that are smaller than the minimum face area found in the training images. The threshold (170 pixels) is set conservatively. The filtered image, followed by an initial erosion, leaves only the white regions with reasonable areas, as illustrated in Fig. 4.
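The project carries out this filtering in Matlab; purely as an illustration, the same fill-and-filter idea in Python (the 170-pixel area threshold follows the text above, while the erosion kernel size is an assumed placeholder):

import numpy as np
import cv2
from scipy.ndimage import binary_fill_holes

def clean_mask(mask, min_area=170, erode_size=3):
    # fill black holes, drop white regions smaller than min_area, then erode
    filled = binary_fill_holes(mask).astype(np.uint8)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(filled, connectivity=8)
    keep = np.zeros_like(filled)
    for i in range(1, n):                       # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            keep[labels == i] = 1
    return cv2.erode(keep, np.ones((erode_size, erode_size), np.uint8))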
Secondly, to separate merged regions into individual faces, the Roberts Cross edge detection algorithm is used. The Roberts Cross operator performs a simple, quick-to-compute, 2-D spatial gradient measurement on an image. It thus highlights regions of high spatial gradient that often correspond to edges (Fig. 5). The highlighted regions are converted into black lines and eroded to connect the separated pixels.
Finally, the previous images are combined into one binary image, and relatively small black and white areas are removed. The difference between this step and the initial small-area elimination is that the edges connected to black areas remain even after filtering, and those edges play an important role as boundaries between face areas after erosion. Fig. 6 shows the final binary images, and some candidate spots that will be compared with the representative face templates in the next step are introduced in Fig. 7.
2.2 Image matching
Eigenimage Generation
A set of eigenimages was generated using 106 test images which were manually cropped from 7 test images and edited in Photoshop to capture the exact locations of faces within a square shape. The cropped test images were converted to grayscale, and the eigenimages were then computed from those 106 images. In order to obtain a generalized face shape, the 10 largest eigenimages in terms of their energy densities were obtained, as shown in Fig. 8. To save computing time, the information in the eigenimages was compacted into a single image, obtained by averaging eigenimages 2 through 10, i.e., excluding eigenimage 1, the highest-energy one. The first eigenimage was excluded because its excessive energy concentration would eliminate the details of face shape that can be seen in eigenimages 2 through 10. The averaged eigenimage is shown in Fig. 9.
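The eigenimages come from a standard principal component analysis of the cropped grayscale faces. A compact NumPy sketch of the computation described above (illustrative only; the project's implementation is in Matlab):

import numpy as np

def average_eigenimage(faces):
    # faces: N x H x W grayscale crops; returns the average of eigenimages 2-10
    n, h, w = faces.shape
    X = faces.reshape(n, -1).astype(float)
    X -= X.mean(axis=0)                               # remove the mean face
    _, _, Vt = np.linalg.svd(X, full_matrices=False)  # rows of Vt: eigenimages by decreasing energy
    top10 = Vt[:10].reshape(10, h, w)
    return top10[1:].mean(axis=0)                     # skip eigenimage 1, average eigenimages 2-10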
Fig. 8. Eigenimages.
Fig. 9. Average image using eigenimages.
If Fig. 10 is examined closely, some faces are divided into several pieces, for example a face being separated into its upper part and neck part, as seen in Fig. 11(a). This is due to the erosion process, which was applied to avoid occlusion. To merge these separate areas into one, a box-merge algorithm was used, which simply merges two or more adjacent square boxes into one. Since this phenomenon occurs between the face and neck parts most of the time, the distance threshold was set small in the horizontal direction and large in the vertical direction. The result after merging the two boxes in Fig. 11(a) is shown in Fig. 11(b). After applying this algorithm, only one box is placed per face in most cases, as seen in Fig. 12.
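The merge rule is purely geometric: two boxes are combined when their centers are closer than a small horizontal gap and a larger vertical gap. The following Python sketch is a simplified version of that rule (the gap values echo the rGapTh/cGapTh constants in the ee368boxMerge listing later, but the merging heuristic itself is an assumption, not the project's exact logic):

def merge_boxes(boxes, row_gap=70, col_gap=25):
    # boxes: list of [center_row, center_col, half_width]; greedily merge nearby boxes
    merged = []
    for r, c, hw in boxes:
        for b in merged:
            if abs(b[0] - r) <= row_gap and abs(b[1] - c) <= col_gap:
                new_hw = max(b[2], hw) + abs(b[0] - r) / 2     # grow to cover both boxes
                b[0], b[1], b[2] = (b[0] + r) / 2, (b[1] + c) / 2, new_hw
                break
        else:
            merged.append([r, c, hw])
    return merged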
2.3 Correlation
The test images selected by an appropriate square window can be passed to the image matching algorithm. Before image matching, the test image needs to be converted to grayscale and divided by the average brightness of the image, in order to eliminate the effect of the brightness of the test image during matching. The average brightness was defined as the 2-norm of the skin-colored area of the test image. Note that it is not the 2-norm over the total area of the test image, since the value we are looking for is not the average brightness of the whole test image but the average brightness of the skin-colored parts only.
With the normalized test image, image matching can be accomplished simply by loading the corresponding eigenimage file from the database and then computing the correlation of the test image with the loaded eigenimage. The results of image matching are illustrated in Fig. 13. The number inside each window indicates the ranking of the correlation value.
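Putting the two steps together, brightness normalization followed by correlation, the matching score can be sketched as below (illustrative; the project's version is the Matlab function ee368imgMatch listed later):

import numpy as np

def match_score(test_img, skin_mask, eigenimage):
    # normalize by the 2-norm of the skin-colored pixels only, then correlate
    norm = np.linalg.norm(test_img[skin_mask])
    if norm == 0:
        return 0.0
    return float((test_img / norm).ravel() @ eigenimage.ravel())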
Image            Faces   Correct hits   Repeats   False hits   Run time (s)
Training_1.jpg     21         19           0          0            111
Training_2.jpg     24         24           0          1            101
Training_3.jpg     25         23           0          1             89
Training_4.jpg     24         21           0          1             84
Training_5.jpg     24         22           0          0             93
Training_6.jpg     24         22           0          3            100
Training_7.jpg     22         22           0          1             95
The face detection algorithm achieves a 93.3% correct hit rate, a 0% repeat rate, and a 4.2% false hit rate. The average run time is 96 seconds.
In order to see whether this algorithm works on images other than the 7 training images, last year's sample picture was tested, and the result is shown in Fig. 17. The results show that 20 out of 24 faces were successfully located, and there were no repeated or false detections.
Chapter Three
3.1 Gender Recognition
A gender recognition algorithm has been implemented to detect at most 3 females in a test photo. Since the number of females to detect is only 3, average faces of the three females were calculated, as shown in Fig. 18, and image matching was performed for each of these average faces.
The test images obtained from the test image selection algorithm explained in the previous section were matched with these three female average faces. The information about the average faces was stored in the database using the same method as was used for saving the eigenfaces earlier.
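Matching against the three average female faces is the same correlation step repeated over three templates, keeping the best score; a small illustrative sketch (the function name and return convention are assumptions):

import numpy as np

def best_female_match(test_img, avg_faces):
    # avg_faces: the three average female face images, same size as test_img
    scores = [float(test_img.ravel() @ f.ravel()) for f in avg_faces]
    best = int(np.argmax(scores))
    return best, scores[best]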
After running the algorithm on the 7 training image sets, the result was that the average-face image matching method did not detect any female faces at all, but instead detected the face which had the largest correlation value for general face detection. However, the inaccuracy of the average-face image matching was expected, because in order for this algorithm to be selective, a test image would have to be cropped very precisely so that it exactly overlaps the eigenface in terms of its center location and box size. In practice, the box which defined the contour of a face was bigger or smaller than it should be, and the centers hardly ever overlapped either.
A more sophisticated algorithm will be required to accomplish gender recognition or, further, the recognition of a particular person.
In theory, the operator consists of a pair of 2×2 convolution masks as shown in Figure 1. One mask
is simply the other rotated by 90°.
These masks are designed to respond maximally to edges running at 45° to the pixel grid, one
mask for each of the two perpendicular orientations. The masks can be applied separately to the
input image, to produce separate measurements of the gradient component in each orientation (call
these Gx and Gy). These can then be combined together to find the absolute magnitude of the
gradient at each point and the orientation of that gradient. The gradient magnitude is given by:
|G| = (Gx^2 + Gy^2)^(1/2)
Typically, an approximate magnitude is computed using
|G| = |Gx| + |Gy|
which is much faster to compute.
The angle of orientation of the edge giving rise to the spatial gradient (relative to the pixel grid
orientation) is given by: θ = arctan (Gy /Gx ) - 3π/4
In this case, orientation 0 is taken to mean that the direction of maximum contrast from black to
white runs from left to right on the image, and other angles are measured anti-clockwise from this.
Often, the absolute magnitude is the only output the user sees. The two components of the
gradient are conveniently computed and added in a single pass over the input image using the
pseudo-convolution operator shown in Figure 2.
|G | = |P1 – P4 | + |P2 – P3 |
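To make this concrete, a short NumPy sketch of the Roberts Cross computation described above (illustrative only; the project's edge detection is done in Matlab, as in the listings that follow):

import numpy as np
from scipy.signal import convolve2d

def roberts_cross(img):
    # approximate gradient magnitude |Gx| + |Gy| of a grayscale image
    mask_x = np.array([[1.0, 0.0],
                       [0.0, -1.0]])
    mask_y = np.array([[0.0, 1.0],
                       [-1.0, 0.0]])
    gx = convolve2d(img, mask_x, mode='same')
    gy = convolve2d(img, mask_y, mode='same')
    return np.abs(gx) + np.abs(gy)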
% coefficients
effect_num = 3;      % structuring-element scale for the erosions
min_face   = 170;    % minimum face area (pixels) kept after filtering
small_area = 15;     % threshold for removing small isolated regions

imgSize  = size(img);
uint8Img = uint8(img);
gray_img = rgb2gray(uint8Img);

% first erosion: removes thin connections between regions
filtered = imerode(filtered, ones(2*effect_num));
% second erosion: refines the remaining regions
filtered = imerode(filtered, ones(effect_num));

% loop over the candidate boxes found in the binary image
outFaces = [];
for k = 1:num_box
    ctr   = boxInfo(k, 1:2);   % box center (row, column)
    hWdth = boxInfo(k, 3);     % half-width of the square box
    % each box is then cut out and matched (see ee368imgCut / ee368imgMatch)
end
ee368YCbCrseg.m
function ee368YCbCrseg
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%  a function for color component analysis  %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
clear all;                      % start from a clean workspace
folder_num = size(image_q);     % number of training image folders
n = 0;                          % counter for processed face segments
ee368YCbCrbin.m
function result = ee368YCbCrbin(RGBimage, meanY, meanCb, meanCr, stdY, stdCb, stdCr, factor)
% ee368YCbCrbin returns a binary image with the skin-colored area white.
%
% Example:
%   result = ee368YCbCrbin(RGBimage, meanY, meanCb, meanCr, stdY, stdCb, stdCr, factor)
%
%   RGBimage: double-formatted RGB image
%   meanY:    mean value of Y of skin color
%   meanCb:   mean value of Cb of skin color
%   meanCr:   mean value of Cr of skin color
%   stdY:     standard deviation of Y of skin color
%   stdCb:    standard deviation of Cb of skin color
%   stdCr:    standard deviation of Cr of skin color
%   factor:   determines the width of the Gaussian envelope
%
% All the parameters are based on the training facial segments taken from the 7 training images.

YCbCrimage = rgb2ycbcr(RGBimage);           % convert to the YCbCr color space
binImage   = zeros(imag_row, imag_col);     % output binary mask
Cb         = zeros(imag_row, imag_col);
Cr         = zeros(imag_row, imag_col);
result     = binImage;
ee368boxInfo.m
hStp = 5;                        % step size used when building candidate boxes

ee368boxMerge.m
rGapTh = 70;                     % row-gap threshold for merging (vertical direction)
cGapTh = 25;                     % column-gap threshold for merging (horizontal direction)
hStp   = 5;
rThr   = 200;

adjBoxCor = [];                  % list of adjacent box pairs to merge
numAdj  = size(adjBoxCor, 1);
sndPnt  = adjBoxCor(j, 2);       % index of the second box in the pair
sndCtrR = boxInfo(sndPnt, 1);    % its center row
sndCtrC = boxInfo(sndPnt, 2);    % its center column
sndHwd  = boxInfo(sndPnt, 3);    % its half-width
adjBoxCor2 = adjBoxCor;
boxInfo(adjBoxCor2(:, 1), :) = [];                       % remove boxes that were merged away
boxInfo(find(boxInfo(:, 1) > nRow - rThr), :) = [];      % discard boxes whose center row is within rThr of the image bottom
ee368imgCut.m
nRow  = imgSize(1);              % image height
nCol  = imgSize(2);              % image width
ctr   = boxInfo(1:2);            % box center (row, column)
hWdth = boxInfo(3);              % half-width of the square box

ee368imgMatch.m
lowThr = 30;                     % minimum allowed box width
higThr = 220;                    % maximum allowed box width
wdth   = 2*hWdth;
if wdth < lowThr
    corr = 0;                    % box too small: reject
elseif wdth > higThr
    corr = 0;                    % box too large: reject
else
    eval(['load eigFace', num2str(wdth)]);          % load the eigenimage of matching size
    corr = reshape(testImg, wdth^2, 1)' * eigFace;  % correlation with the eigenimage
end
ee368imgMatchFe.m

abss.m
function val = abss(inp)
% Function 'abss' prevents the coordinate of the test image
% from falling below the boundary of the original image.
if (inp > 1)
    val = inp;
else
    val = 1;
end

abss2.m
function val = abss2(inp, thr)
% Function 'abss2' prevents the coordinate of the test image
% from exceeding the boundary of the original image.
if (inp > thr)
    val = thr;
else
    val = inp;
end
Chapter Four
4.1 TRAINING IN OPENCV
In OpenCV, training refers to providing a recognizer algorithm with training data to learn from. The trainer uses the LBPH algorithm to convert the image cells to histograms; it then computes the values of all cells and, by concatenating the histograms, obtains feature vectors. Training images are processed with an ID attached so that they can be classified. Input images are processed in the same way, compared with the dataset, and a distance is obtained; by setting a threshold, a face can be identified as known or unknown. Eigenface and Fisherface compute the dominant features of the whole training set, while LBPH analyses the images individually. To do so, a dataset is first created: you can either build your own dataset or start with one of the available face databases, such as the Yale Face Database or the AT&T Face Database. The .xml or .yml configuration file is built from the features extracted from the dataset with the help of the FaceRecognizer class and stored in the form of feature vectors.
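As a minimal sketch of the known/unknown decision just described, assuming opencv-contrib-python is installed and a model has already been trained and saved (the file name trainer.yml and the threshold value below are placeholders):

import cv2

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read('trainer.yml')            # load a previously trained LBPH model

DIST_THRESHOLD = 70.0                     # tune on validation data

def classify(gray_face):
    # predict returns (label, distance); a smaller distance means a closer match
    label, distance = recognizer.predict(gray_face)
    return label if distance < DIST_THRESHOLD else None   # None = unknown face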
4.2 TRAINING THE CLASSIFIERS
OpenCV enables the creation of XML files to store features extracted from datasets using the
FaceRecognizer Class. The stored images are imported, converted to Grayscale and saved with IDs
in two lists with same indexes. Face Recognizer obj ects are created using FaceRecognizer class.
Each recognizer can take in parameters described below. cv2.face.createEigenFaceRecognizer()
1.Takes in the number of components for the PCA for crating Eigenfaces. OpenCV documentation
mentions 80 can provide satisfactory reconstruction capabilities. 21 2. Takes in the threshold in
recognising faces. If the distance to the likeliest Eigenface is above this threshold, the function will
return a -1, that can be used state the face is unrecognisable cv2.face.createFisherFaceRecognizer()
1. The first argument is the number of components for the LDA for the creation of Fisherfaces.
OpenCV mentions it to be kept 0 if uncertain. 2. Similar to Eigenface threshold. -1 if the threshold
is passed. cv2.face.createLBPHFaceRecognizer() 1. The radius from the centre pixel to build the
local binary pattern. 2. The Number of sample points to build the pattern. Having a considerable
number will slow down the computer. 3. The Number of Cells to be created in X axis. 4. The
number of cells to be created in Y axis. 5. A threshold value similar to Eigen face and Fisherface.
if the threshold is passed the object will return 1.Recogniser objects are created and images are
imported, resized, converted into numpy arrays and stored in a vector. The ID of the image is
gathered from splitting the file name, and stored in another vector.By using
FaceRecognizer.train(NumpyImage, ID) all three of the objects are trained. It must be noted that
resizing the images were required only for Eigenface and Fisherface, not for LBPH. The
configuration model is saved as XML using the function: FaceRecognizer.save(FileName).
cognizer class. The stored images are imported, converted to grayscale and saved with IDs in two
listswith same indexes. FaceRecognizer objects are created using face recogniser class.
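Purely as an illustration of the training flow described above (not the project's exact script), the sketch below assumes a recent opencv-contrib-python build, where the factory functions are named LBPHFaceRecognizer_create and so on (older releases used the createLBPHFaceRecognizer names quoted above); the dataset folder and file-naming scheme are hypothetical:

import os
import cv2
import numpy as np

DATASET_DIR = 'dataset'                       # hypothetical: files named like user.<ID>.<n>.jpg

faces, ids = [], []
for name in os.listdir(DATASET_DIR):
    img = cv2.imread(os.path.join(DATASET_DIR, name), cv2.IMREAD_GRAYSCALE)
    if img is None:
        continue
    faces.append(img)                         # Eigenface/Fisherface would also need resizing; LBPH does not
    ids.append(int(name.split('.')[1]))       # the ID is taken from the file name

recognizer = cv2.face.LBPHFaceRecognizer_create(1, 8, 8, 8)   # radius, neighbours, grid_x, grid_y
recognizer.train(faces, np.array(ids, dtype=np.int32))        # train on the grayscale images and their IDs
recognizer.write('trainer.yml')                               # save the trained model as a configuration file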
4.3 CONCLUSION
This report describes the mini-project for the visual perception and autonomy module. It then explains the technologies used in the project and the procedure followed. Finally, it presents the results, discusses the challenges encountered and how they were resolved, and ends with a discussion. Using Haar cascades for face detection worked extremely well, even when subjects wore spectacles. Real-time video speed was satisfactory as well, with no noticeable frame lag. Considering all factors, LBPH combined with Haar cascades can be implemented as a cost-effective face recognition platform. An example is a system that identifies known troublemakers in a shopping mall or a market to warn the owner and keep him alert, or one that takes attendance automatically in a class.
References
[1] I. Craw, D. Tock, and A. Bennett, "Finding face features," Proc. 2nd European Conf. Computer Vision, pp. 92-96, 1992.
[2] A. Lanitis, C. J. Taylor, and T. F. Cootes, "An automatic face identification system using flexible appearance models," Image and Vision Computing, vol. 13, no. 5, pp. 393-401, 1995.
[3] T. K. Leung, M. C. Burl, and P. Perona, "Finding faces in cluttered scenes using random labeled graph matching," Proc. 5th IEEE Int'l Conf. Computer Vision, pp. 637-644, 1995.
[4] B. Moghaddam and A. Pentland, "Probabilistic visual learning for object recognition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 696-710, July 1997.
[5] M. Turk and A. Pentland, "Eigenfaces for recognition," J. of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[6] M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 103-108, Jan. 1990.
[7] I. T. Jolliffe, Principal Component Analysis, New York: Springer-Verlag, 1986.
[8] T. Agui, Y. Kokubo, H. Nagashi, and T. Nagao, "Extraction of face recognition from monochromatic photographs using neural networks," Proc. 2nd Int'l Conf. Automation, Robotics, and Computer Vision, vol. 1, pp. 18.8.1-18.8.5, 1992.
[9] O. Bernier, M. Collobert, R. Feraud, V. Lemaried, J. E. Viallet, and D. Collobert, "MULTRAK: A system for automatic multiperson localization and tracking in real time," Proc. IEEE Int'l Conf. Image Processing, pp. 136-140, 1998.
[10] A. J. Colmenarez and T. S. Huang, "Face detection with information-based maximum discrimination," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 782-787, 1997.
[11] M. S. Lew, "Information theoretic view-based and modular face detection," Proc. 2nd Int'l Conf. Automatic Face and Gesture Recognition, pp. 198-203, 1996.
[12] H. Martin Hunke, Locating and Tracking of Human Faces with Neural Networks, Master's thesis, University of Karlsruhe, 1994.
[13] H. A. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23-38, 1998.