circuits such as the Central Processing Unit (CPU), the Graphics Processing Unit (GPU), input, and output are carried on a single circuit board. The GPIO pins are an essential element that makes the RPi accessible to hardware programming, for controlling electronic circuits and for data processing on input/output devices. A power adapter, keyboard, mouse, and a monitor compatible with the HDMI connector can be added to the Raspberry Pi. Newer models can also connect to the internet over WiFi. The RPi can be run using the Raspbian operating system.
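Although no code is given for this, the role of the GPIO pins can be illustrated with a short Python sketch using the RPi.GPIO library; the pin number and the attached LED or buzzer are assumptions made only for illustration.

```python
import time
import RPi.GPIO as GPIO

ALERT_PIN = 18                        # assumed BCM pin wired to an LED or buzzer

GPIO.setmode(GPIO.BCM)                # address pins by their BCM numbers
GPIO.setup(ALERT_PIN, GPIO.OUT)       # configure the pin as an output

try:
    GPIO.output(ALERT_PIN, GPIO.HIGH) # switch the connected circuit on
    time.sleep(2)
    GPIO.output(ALERT_PIN, GPIO.LOW)  # and off again
finally:
    GPIO.cleanup()                    # release the pin on exit
```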
The first step of the real-time facemask recognition system is image acquisition. High-quality images of persons wearing and not wearing a facemask are obtained through digital cameras, cellphone cameras, or scanners. A knowledge-based dataset is created by properly labeling the captured images with unique classes. The obtained images then go through a preprocessing step in which the image features used during later processing are enhanced. The segmentation process divides the images into several segments and is used to extract the facemask-covered area of the person's face from the background. The feature-extraction stage consists of convolutional layers that obtain image features from the resized images, with each convolution followed by a ReLU activation; max and average pooling after feature extraction reduce the spatial size. Ultimately, the convolutional and pooling layers together act as filters that generate the image characteristics. The final step is classification, in which a deep learning model is trained on the labeled images so that it learns to recognize and classify images according to the learned visual patterns.
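The convolution, ReLU, and pooling pattern described above can be sketched in a few lines of tf.keras. The block below is only an illustrative stack; the filter counts and the two-stage depth are assumptions, not the network used in [1].

```python
import tensorflow as tf

# Illustrative feature extractor: each convolution is followed by a ReLU
# activation, and each max-pooling layer halves the spatial size.
feature_extractor = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu",
                           input_shape=(224, 224, 3)),   # resized RGB input
    tf.keras.layers.MaxPooling2D(),                      # 224 -> 112
    tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(),                      # 112 -> 56
])
feature_extractor.summary()
```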
The authors used an open-source implementation based on the TensorFlow module, using Python and OpenCV, together with the VGG-16 CNN model. It is a supervised learning model, with the data divided into a training set of 80% and a test set of 20%. Three metrics were used to measure the model's performance: accuracy, training time, and learning error. In the experiments, the input parameters were set to 224 for the input image width and height, the batch size during training was set to 64 images, and the number of epochs was set to 100. The ADAM optimizer with a learning rate of 0.0001 was used for optimization. The study uses 12,500 images per class, which is enough data to train a deep learning model. A validation accuracy of 96% was achieved during training of the CNN model.
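The reported training configuration can be summarized as a hedged tf.keras sketch: VGG-16 as the backbone, 224 × 224 inputs, batch size 64, 100 epochs, the ADAM optimizer with a learning rate of 0.0001, and an 80/20 split. The two-class head, the frozen ImageNet-initialized backbone, and the data arrays are illustrative assumptions rather than the authors' released code.

```python
import tensorflow as tf

# VGG-16 backbone; freezing it and using ImageNet weights are illustrative assumptions.
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),   # mask / no-mask classes
])

# ADAM optimizer with learning rate 0.0001, as reported in the study.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# x_train, y_train stand for the labeled mask / no-mask images (hypothetical arrays);
# batch size 64, 100 epochs, and an 80/20 split mirror the reported configuration.
# model.fit(x_train, y_train, batch_size=64, epochs=100, validation_split=0.2)
```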
Toshanlal Meenpal, Ashutosh Balakrishnan, and Amit Verma [2] proposed the project "Facial Mask Detection using Semantic Segmentation". This paper proposes a model for face detection using semantic segmentation of an image, classifying each pixel as face or non-face, i.e. effectively creating a binary classifier, and then detecting the segmented area. The model works very well not only for images with frontal faces but also for non-frontal faces, and the paper also focuses on removing the erroneous predictions that are bound to occur. Semantic segmentation of the human face is performed with the help of a fully convolutional network (FCN). The paper has the twin objectives of creating a binary face classifier that can detect faces in any orientation irrespective of alignment, and of training it in an appropriate neural network to get accurate results. The model takes an RGB image of any arbitrary size as input; its basic function is feature extraction and class prediction. The output of the model is a feature vector that is optimized using gradient descent, and the loss function used is binomial cross-entropy.

The method obtains segmentation masks directly from images containing one or more faces in different orientations. The input image of arbitrary size is resized to 224 × 224 × 3 and fed to the FCN, which consists of a total of 17 convolutional layers and 5 max-pooling layers, for feature extraction and prediction. The output of the network is then subjected to post-processing: the pixel values of face and background are first subjected to global thresholding, the result is passed through a median filter to remove high-frequency noise and then through a closing operation to fill the gaps in the segmented area, and finally a bounding box is drawn around the segmented area. Through this post-processing the gaps in the segmented region are filled and most of the unwanted, erroneous false predictions are removed. All experiments were performed on the Multi-Human Parsing dataset, which contains about 5000 images, each with at least two persons; 2500 of these images were used for training and validation while the remaining were used for testing the model. The model has shown great results in recognizing non-frontal faces and is also able to detect multiple facial masks in a single frame. The post-processing provides a large boost to pixel-level accuracy; the mean pixel-level accuracy for facial masks is 93.884%.
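The post-processing chain described for [2] (global thresholding, median filtering, a closing operation, and a bounding box) maps directly onto OpenCV primitives. The sketch below is a minimal interpretation that assumes a 224 × 224 probability map mask_prob with values in [0, 1]; the threshold value and kernel sizes are illustrative choices, not values given in the paper.

```python
import cv2
import numpy as np

def postprocess(mask_prob: np.ndarray) -> list:
    """Return bounding boxes (x, y, w, h) around the segmented face-mask regions."""
    mask = (mask_prob * 255).astype(np.uint8)
    # Global thresholding separates face pixels from the background.
    _, binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
    # Median filtering removes high-frequency noise in the predicted mask.
    binary = cv2.medianBlur(binary, 5)
    # Morphological closing fills small gaps inside the segmented area.
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, np.ones((7, 7), np.uint8))
    # Finally, a bounding box is drawn around each remaining segmented region.
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours]
```

A larger closing kernel fills bigger gaps in the segmented area but risks merging nearby faces into a single box, so the kernel size is a tuning choice.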
Joshua Gisemba Okemwa and Victor Mageto [3] proposed the project "Using CNN and HOG Classifier to Improve Facial Expression Recognition". Facial expression recognition (FER) is growing in scope due to the diversification of its fields of application; it is now applicable in crime prevention and smart city development, as well as in other economic sectors such as transportation, advertising, and health. Many feature extraction methods and classification techniques have previously been developed to give better accuracy and performance in face recognition. A convolutional neural network (CNN) is an unsupervised deep learning algorithm with the ability to learn image characteristics and differentiate one aspect from another. The authors applied a HOG classifier for feature extraction and a CNN to detect and classify the expressions, achieving an overall accuracy of 77.2%. This is higher than previous work done using the SVM algorithm with a HOG classifier, which reached an accuracy of 55%. The dataset selected for analysis contained 35,887 images; this open-source dataset was created by Pierre-Luc Carrier and Aaron Courville for a Kaggle competition in 2013. It is made up of 48 × 48 grayscale images covering 7 different emotions: anger, sadness, happiness, surprise, neutral, disgust, and fear. The emotions Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral are given the indexes 0, 1, 2, 3, 4, 5, and 6. The images were further divided into two groups, a training set and a testing set, in order to verify the performance and accuracy of the results.
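HOG feature extraction on the 48 × 48 expression images can be sketched with OpenCV's HOGDescriptor; the cell, block, and bin settings below are common defaults chosen for illustration, since the exact HOG parameters are not given here.

```python
import cv2
import numpy as np

# HOG descriptor over a 48x48 window: 16x16 blocks, 8x8 stride and cells, 9 bins.
hog = cv2.HOGDescriptor((48, 48), (16, 16), (8, 8), (8, 8), 9)

image = np.random.randint(0, 256, (48, 48), dtype=np.uint8)  # stand-in for one dataset image
features = hog.compute(image)                                 # flattened HOG feature vector
print(features.shape)                                         # 900 values for these settings
```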
3. PROPOSED SYSTEM
The proposed system can be integrated with CCTV cameras, and the camera data can be analyzed to see whether people are wearing masks. When a person appears at the entrance, he or she might be wearing a mask or not. The CCTV camera looks for faces and detects persons who are not wearing masks. Such a person is denied access and sees an alert message appearing on a screen or panel; access is denied until he or she wears a mask. The authorities are alerted via email in real time if a person is not wearing a mask.
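A minimal sketch of this flow, assuming OpenCV's bundled Haar cascade for face localization, a hypothetical trained classifier saved as mask_model.h5, and placeholder SMTP details for the email alert, is given below; it illustrates the described behaviour rather than reproducing the actual implementation.

```python
import smtplib
from email.message import EmailMessage

import cv2
import numpy as np
import tensorflow as tf

# Haar cascade shipped with OpenCV for face localization.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
# Hypothetical trained mask / no-mask classifier (file name is an assumption).
model = tf.keras.models.load_model("mask_model.h5")

def frame_is_compliant(frame: np.ndarray) -> bool:
    """Return True only if every detected face in the CCTV frame wears a mask."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.1, 4):
        face = cv2.resize(frame[y:y + h, x:x + w], (224, 224)) / 255.0
        if model.predict(face[np.newaxis])[0].argmax() != 0:   # assumed: index 0 = "mask"
            return False
    return True

def alert_authorities() -> None:
    """Send the real-time email alert (addresses and SMTP server are placeholders)."""
    msg = EmailMessage()
    msg["Subject"] = "Face mask violation at entrance"
    msg["From"], msg["To"] = "camera@example.com", "security@example.com"
    msg.set_content("A person without a face mask was detected at the entrance.")
    with smtplib.SMTP("smtp.example.com") as server:
        server.send_message(msg)
```

In deployment, frame_is_compliant would be called on each frame read from the CCTV stream, and alert_authorities invoked whenever it returns False.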
4. FLOW CHART
5. TEST REPORTS/COMPARISON REPORTS AND ACCURACY

The experimental results of the system performance are evaluated with the following classifiers and optimizer.

Table 1: Results of the proposed system with Haar cascade classifier

Classifier      Filter           Optimizer   Train Loss   Train Gain   Test Loss   Test Gain
Haar Cascade    No filter used   ADAM        0.2033       0.9960       0.0138      0.9869

From Table 1 it is observed that the performance of the ADAM optimizer is good in both training and testing.

Table 2: Results of the proposed system with Haar cascade classifier and Bilateral filter

Classifier      Filter           Optimizer   Train Loss   Train Gain   Test Loss   Test Gain

6. METHODOLOGY