Drowsiness Detection 2

SPECIAL SECTION ON ARTIFICIAL INTELLIGENCE (AI)-EMPOWERED INTELLIGENT
TRANSPORTATION SYSTEMS
Received November 26, 2019, accepted December 5, 2019, date of publication December 10, 2019,
date of current version December 23, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2958667
A Real-time Driving Drowsiness Detection Algorithm With Individual Differences Consideration
FENG YOU 1, XIAOLONG LI 1, YUNBO GONG 1, HAIWEI WANG 2, AND HONGYI LI 3,4
1 School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510640, China
2 School of Transportation and Economic Management, Guangdong Communication Polytechnic, Guangzhou 510650, China
3 Xinjiang Quality of Products Supervision and Inspection Institute of Technology, Urumqi 830011, China
4 Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
Corresponding authors: Haiwei Wang (whw2046@126.com) and Hongyi Li (lihy@ms.xjb.ac.cn)

This work was supported in part by the National Natural Science Foundation of China under Grant 51808151 and Grant 51408237, in part
by the Guangdong Natural Science Foundation 2020, in part by the Guangdong Provincial Public Welfare Research and Capacity Building
Special Project under Grant 2016A020223002, in part by the Guangdong Provincial Science and Technology Plan Project under Grant
2017A040405021, and in part by the Fundamental Research Funds for Guangdong Communication Polytechnic under Grant 20181014.
ABSTRACT Research on driving drowsiness detection algorithms is of great significance for improving traffic safety. There is already a substantial body of results and literature on driving drowsiness detection methods. However, most of it is devoted to finding a universal drowsiness detection method and ignores individual differences between drivers. This paper proposes a real-time driving drowsiness detection algorithm that considers the individual differences of drivers. A deep cascaded convolutional neural network is constructed to detect the face region, which avoids the poor accuracy caused by hand-crafted feature extraction. Based on the Dlib toolkit, the facial landmarks of the frontal driver face in a frame are located. From the eye landmarks, a new parameter, called Eyes Aspect Ratio, is introduced to evaluate the drowsiness of the driver in the current frame. To account for differences in the size of drivers' eyes, the proposed algorithm consists of two modules: offline training and online monitoring. In the first module, a unique fatigue state classifier, based on Support Vector Machines, is trained with the Eyes Aspect Ratio as input. Then, in the second module, the trained classifier is applied to monitor the state of the driver online. Because the fatigued driving state develops gradually, a variable calculated from the number of drowsy frames per unit time is introduced to assess the drowsiness of the driver. Through comparative experiments, we demonstrate that this algorithm outperforms current driving drowsiness detection approaches in both accuracy and speed. In simulated driving applications, the proposed algorithm detects the drowsy state of the driver quickly from 640×480 resolution images at over 20 fps and 94.80% accuracy. The research results can serve intelligent transportation systems, help ensure driver safety, and reduce the losses caused by drowsy driving.
INDEX TERMS Traffic safety, driving drowsiness detection, CNN, individual differences, SVM.
The associate editor coordinating the review of this manuscript and approving it for publication was Amr Tolba.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/ VOLUME 7, 2019

I. INTRODUCTION
With the increasing number of vehicles all over the world, traffic accidents have become one of the primary causes of human death. According to a report of the World Health Organization (WHO), traffic accidents were one of the top ten causes of human death in 2015 [1]. The driver, as the core of the road traffic system, is the most significant factor affecting road traffic safety. According to the records of the National Sleep Foundation, about 32% of drivers have at least one drowsy driving experience per month [2]. About 100,000 accidents are caused by drowsy driving, and about 25% of traffic accidents involve drowsy driving every year [3]. Driving drowsiness refers to the objective decline of driving skills due to the imbalance of physiological functions after driving continuously for a long time [4]. It may affect driving behavior and pose a serious safety threat to drivers and other traffic participants. To prevent drowsy driving, a series of laws and regulations have been enacted all over the world. For example, the Road Traffic Safety Law of China stipulates that a driver must not drive a vehicle continuously for more than 4 hours [5]. Judging the state of drivers by measuring driving time can reduce the traffic accidents caused by driving drowsiness to some extent. However, this method cannot detect in real time whether a driver is fatigued. With the development of information technology, detection systems for driving drowsiness have become an alternative means of solving the problem. Therefore, research on the intelligent identification of drowsy driving has important practical significance.
In recent years, thanks to their non-invasiveness and low cost, methods of driving drowsiness detection based on driver behavior have become a research hotspot. However, the performance of these algorithms may be limited by the face detection and fatigue assessment technologies in complex and changing surroundings [6], [7]. Reference [8] uses a mask to obtain the face of the driver and evaluates the driving state with PERCLOS. Experiments show that the method performs well in ideal conditions, but its generalization performance is affected by the fabrication of the mask. The cascade face detector proposed by Viola and Jones [9] uses Haar-like features and Adaboost to train a cascaded classifier, which achieves good performance in face detection. Reference [10] extracts multi-scale facial features using the Gabor wavelet transform and trains an Adaboost cascaded classifier to select the most recognizable features for discriminating the drowsy state of the driver. However, quite a few works [11]–[13] indicate that methods based on Adaboost may degrade significantly in real-world applications with larger visual variations. Reference [14] uses an active near-infrared light source to obtain a stable image of the eyes and tracks them using a finite state machine. A fuzzy system is applied to evaluate the state of the driver. However, the method requires the installation of more sophisticated sensors, which increases the cost. With the development of artificial intelligence technology, computing capability has been greatly improved, and driving drowsiness detection algorithms based on deep learning are attracting increasing attention. Reference [15] proposed a fatigue detection algorithm based on the Convolutional Neural Network (CNN). The algorithm classified human eyes and non-human eyes by training the first network, detected the positions of the eye feature points with the second network, calculated the degree of eye opening according to the feature point positions, and judged the fatigue state of the driver by PERCLOS. Reference [16] introduced a detection method based on facial behavior analysis. It combined Adaboost and the kernelized correlation filter (KCF) for face detection and tracking in facial images captured by an infrared acquisition device. Cascaded regression was exploited to locate the feature points and extract the eye and mouth regions, and a CNN was employed to recognize the status of the eyes and mouth. However, deep learning methods are still in the exploration stage for driving drowsiness detection. Many works [17]–[21] focus on exploring a universal algorithm based on deep learning technology and sometimes ignore the specific application.
Percentage of Eyelid Closure Over the Pupil Over Time (PERCLOS) [22] is one of the most popular parameters for driving drowsiness detection based on computer vision. It first assesses whether the driver's eyes are open or closed in the current frame according to the proportion of the pupil covered by the eyelid. Then, PERCLOS is calculated from the number of eyes-closed frames over a period of time. The characteristics of human faces, especially the size of the eyes, differ between individuals, which introduces a new variable when it comes to PERCLOS. In the past literature [23], PERCLOS-80 (P80) was used to judge whether the eyes of the driver are open or closed. That is, when the eyelid covers more than 80 percent of the pupil, the eyes are identified as closed in the current frame. However, this method does not take into account the individual differences between drivers, especially the differences in eye size, which may cause misjudgment in practical applications.
In this paper, we propose a new algorithm to detect driving drowsiness that considers the individual differences between drivers. It consists of an offline training module and an online monitoring module. First, we design a deep cascaded convolutional neural network (DCCNN) to detect the face region from live video. Based on this, the eye landmarks are obtained with the Dlib toolkit. A new parameter, Eyes Aspect Ratio (EAR), which is calculated from the coordinates of the landmarks, is introduced to identify the state of the eyes (open or closed). In the offline training module, for a specific driver, two sets of EAR values, representing eyes-open and eyes-closed, are obtained from the face detected by DCCNN, and a unique support vector machine (SVM) classifier is constructed with the two sets of data as input. In the online monitoring module, the driver's face is first detected by DCCNN from live video. Then the landmarks of the driver's face are obtained with the Dlib toolkit as well, and the EAR is calculated from the coordinates of the eye landmarks. Finally, the state of the eyes in the current frame is classified by the unique SVM. Thanks to the offline training and online monitoring modules, the performance of the algorithm can be notably improved.
The major contributions of this paper are summarized as follows:
1. We design a new deep cascaded convolutional neural network to detect the face of a driver, which effectively improves the performance.
2. We introduce a new parameter based on the Dlib toolkit to assess whether the eyes of a driver are open or closed.
3. Extensive experiments are conducted to show that the EAR works for different drivers and that the proposed approach significantly improves performance in both accuracy and speed.
4. Comparative experimental results show that the proposed algorithm, which takes individual differences into consideration, is reasonable and more accurate.
The rest of this paper is organized as follows. In Section II, we describe the proposed methods in detail, including the deep learning model for face detection and the acquisition of facial landmarks. The core approaches, offline training and online monitoring, are introduced in this section as well. In Section III, we conduct extensive experiments to verify the rationality of the proposed algorithm in both accuracy and speed. The conclusion is given in Section IV.

II. APPROACH
The overall pipeline of our approach is shown in Figure 1. The algorithm consists of the following two modules:
Offline Training: This is equivalent to an initialization process. The characteristics of drivers' faces, especially the size of the eyes, differ noticeably between individuals. Thus, we first obtain two sets of data by asking the driver to keep his/her eyes open and then closed for a while on a simulator. For each image, the deep cascaded convolutional neural network we designed, called DCCNN, is applied to detect the face of the driver in the current frame. Then, based on the Dlib toolkit, the facial landmarks are obtained. Next, two types of EAR are calculated, as shown in Figure 1: EAR1 is computed from the eyes-open frames and EAR2 is computed from the frames in which the driver's eyes are closed. Finally, taking the two types of EAR as positive and negative samples, we train a unique SVM classifier for the specific driver to judge whether the eyes are open or closed.
Online Monitoring: This is a real-time module that detects driving drowsiness from live video. All frames are fed to DCCNN during driving to find the face of the driver. If the face is captured, the eye landmarks are obtained with the Dlib toolkit. Then, the unique SVM classifier of the specific driver, which was trained in the offline module, is applied to judge whether the eyes are open or closed in the current frame. Finally, we assess whether the driver is drowsy according to the ratio of the number of drowsy frames to the total number of frames in a period of time. In addition, if DCCNN does not detect a face, we judge the current frame to be drowsy, because the failure may be caused by an improper head posture.

FIGURE 1. Pipeline of proposed approach.

A. DEEP CASCADED CONVOLUTIONAL NEURAL NETWORK
Face detection is one of the most important technologies for driving drowsiness detection based on computer vision. In practical applications, a driving drowsiness detection system requires not only high accuracy but also high speed. Deep learning methods, especially convolutional neural network models, greatly improve the accuracy of image recognition. However, a complex network structure reduces the speed of the algorithm. In [24], Multi-Task Cascaded Convolutional Networks (MTCNN) were designed
for face detection. The architecture consists of three sub-networks instead of one complex network. Each sub-network has fewer filters but more discriminative ones, which effectively improves the speed of the algorithm. However, we noticed that its performance might be limited by the information about the five facial landmarks, which is useless here because we only need the location of the face in video frames. Thus, we design a new convolutional neural network to detect the face of a driver. It is more efficient thanks to its lightweight architecture and the removal of the five facial landmarks. The architecture of the Deep Cascaded Convolutional Neural Network (DCCNN) is shown in Figure 2.

FIGURE 2. The architecture of DCCNN, where "conv" means convolutional layer, "MP" means max pooling layer and "fc" means fully connected layer.

In practical applications, the proportion of the frame occupied by the face is uncertain. Thus, an effective approach is to resize the original image to different scales. Similar to MTCNN, we build an image pyramid by resizing the original image to different scales, which is the input for the three cascaded networks. Network-1 is a fully convolutional network [25] that produces a large number of candidate windows and their bounding box regression vectors. We then use the estimated bounding box regression vectors to calibrate the candidates. Finally, non-maximum suppression (NMS) [26] is employed to merge highly overlapped candidates. After Network-1, all candidates are fed to the next CNN, called Network-2, which further rejects a large number of false candidates and performs calibration with bounding box regression and NMS candidate merging. Network-3 is a CNN similar to Network-2, but it has more convolutional layers and pooling layers so that it can describe the face in more detail. The output of Network-3 is the facial bounding box with high confidence. The three sub-networks are cascaded: in DCCNN, the output of the previous sub-network acts as the input to the next sub-network. By building the image pyramid, the face of a driver can be detected accurately regardless of the proportion of the frame that the face occupies.
In the training phase of DCCNN, the WIDER_FACE data set [27] and the AFLW data set [28] are used as the training data. The WIDER_FACE data set includes more than 30,000 pictures and 400,000 faces. It is the largest and most complex public face detection data set in the world. The AFLW face database includes 25,000 hand-labeled face images and is widely used in face recognition, face detection, face alignment, and other tasks. In this paper, the WIDER_FACE and AFLW data sets are used to crop face images and non-face images based on the manually labeled face regions. A total of 190,000 face images, 600,000 partial face images, and 900,000 non-face images are obtained.
The training of the DCCNN aims to obtain the optimal model by adjusting the parameters dynamically. To express the influence of different parameters on the performance of the DCCNN, a loss function is introduced during training. The loss function measures the difference between the predicted output and the actual label, and different loss functions are used for different tasks. The training process includes two tasks: face/non-face classification and face bounding box regression.
The first task, face versus non-face, is a classification task, so we apply the cross-entropy [29] loss function for training.
FIGURE 3. Diagram of CPR.
FIGURE 4. Facial landmarks obtained with Dlib.
FIGURE 5. Ellipse fitting. Left: original eye image, middle: segmented image, right: fitted ellipse.

For any sample $x_i$, the cross-entropy loss function is:

$$L_i^{1} = -\left(y_i^{1}\log(p_i) + (1 - y_i^{1})\log(1 - p_i)\right) \tag{1}$$

where $p_i$ is the output predicted by the network and $y_i^{1}$ is the true label of $x_i$ (face/non-face).
The second task predicts the coordinates of the face bounding box, which is a regression problem. Therefore, the Euclidean loss function is applied for training. The loss function is shown in equation (2):

$$L_i^{2} = \left\| p_i - y_i^{2} \right\|_2^2 \tag{2}$$

where $p_i$ is the vector of face region coordinates predicted by the network, and $y_i^{2}$ is the true coordinate vector of the face region in the image.
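As a rough illustration of the two training losses, the sketch below evaluates the standard binary cross-entropy for the classification task and the squared Euclidean distance for the box regression task on a single sample. The function names, the 0/1 label encoding, and the `eps` clamp are assumptions made for this example and are not taken from the paper's implementation.

```python
import math

def cross_entropy_loss(p, y, eps=1e-12):
    """Binary cross-entropy for one sample, in the spirit of equation (1).

    p: predicted probability that the sample is a face (between 0 and 1).
    y: true label, assumed here to be 1 for face and 0 for non-face.
    eps: hypothetical clamp that keeps log() away from zero.
    """
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def euclidean_loss(pred_box, true_box):
    """Squared Euclidean distance between two coordinate vectors, as in equation (2)."""
    return sum((a - b) ** 2 for a, b in zip(pred_box, true_box))
```

A confident correct prediction gives a small classification loss, and a predicted box that matches the label exactly gives a regression loss of zero.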
Through the training of the deep cascaded convolutional neural network, the face of the driver can be detected accurately, which provides a stable face image for the subsequent algorithms.

B. FACIAL LANDMARKS AND EAR
1) FACIAL LANDMARKS OBTAINED BASED ON CPR
In this paper, driving drowsiness is monitored through the state of the driver's eyes, so it is crucial to obtain the corners and shape of the eyes. Cascaded Pose Regression (CPR) [30] is a regression algorithm that estimates the pose of an object. In particular, the pose of a face can be represented by feature points or landmarks. Therefore, we introduce the CPR algorithm to estimate the pose of the driver's face by locating facial landmarks.
The diagram of CPR is shown in Figure 3. Given an initial pose $S^0$ for the first weak regressor, CPR returns a final estimated pose $S^T$ after $T$ iterations, and each predicted pose is processed by a different weak regressor $R^t$, $t = 1, 2, \ldots, T$. Notice that the input of each weak regressor depends on the output of the previous one, and each weak regressor extracts the pose-indexed feature $x$, which is crucial for expressing the pose of the face. During training, all weak regressors learn automatically from training samples that were labeled in advance. The labeled samples are 2D face images annotated with the landmark coordinates, that is, $S = \{x_1, y_1; x_2, y_2; \ldots; x_n, y_n\}$, where $(x_i, y_i)$ is the coordinate of one landmark.
The training process of CPR is as follows:
Given a 2D face image $I$, for $t = 1$ to $T$ do:
Calculate the pose-indexed feature by:

$$x^t = h^t(S^{t-1}, I) \tag{3}$$

where $h^t$ is a pose-indexed feature as in [30].
Evaluate the regressor by:

$$S_\delta = R^t(x^t) \tag{4}$$

Update the pose $S^t$ by:

$$S^t = S^{t-1} + S_\delta \tag{5}$$

Finally, the output of CPR is the estimated pose of the face $S^T$.
Dlib [31] is an open source toolkit that includes machine learning algorithms and tools for creating complex algorithms to solve real-world problems. It is widely used in industry and academia, including robotics, embedded devices, mobile phones, and large high-performance computing environments. In the Dlib toolkit, the pose of a face is represented by 68 landmarks, as shown in Figure 4. The toolkit produces a regressor that locates the driver's facial landmarks after training the CPR described above on samples labeled with 68 landmarks.

2) EAR: A STABLE PARAMETER FOR EYES STATE EVALUATION
As mentioned, the state of the eyes indicates, to a certain extent, whether the driver is drowsy, because the duration of eye closure differs significantly between the awake and drowsy states. In [32], an ellipse fitting method was proposed to describe the shape of the pupil. As shown in Figure 5, the method first segments the pupil with traditional image processing. Then, an ellipse is fitted to the white pixels, which represent the shape of the eye. Lastly, the ratio of the major and minor axes of the ellipse is used to evaluate the eye state.
We noticed that its performance might be limited by the following facts: (1) The pixel values are sensitive to the environment, and a changing environment easily degrades the image segmentation. (2) In practical applications, the pixel values of pupils and glasses are very close, which leads to false ellipse fitting. In this paper, we design a new parameter based on the Dlib toolkit to evaluate the state of the driver's eyes. It is more stable and precise than the ellipse fitting method because it avoids traditional image processing.
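Since only the eye points of the 68 landmarks are needed for eye state evaluation, a small helper can separate them from the rest. The sketch below assumes the widely used 68-point annotation layout, in which zero-based indices 36 to 41 and 42 to 47 outline the two eyes; this index convention and the helper name `split_eye_landmarks` are assumptions for illustration and are not stated in the paper.

```python
# Hypothetical helper: slices the eye points out of a 68-landmark list.
# Index ranges assume the common 68-point layout (zero-based):
# 36-41 outline one eye and 42-47 outline the other.
FIRST_EYE = slice(36, 42)
SECOND_EYE = slice(42, 48)

def split_eye_landmarks(landmarks):
    """Return two lists of six (x, y) points, one list per eye.

    landmarks: sequence of 68 (x, y) tuples in landmark order.
    """
    if len(landmarks) != 68:
        raise ValueError("expected 68 landmarks, got %d" % len(landmarks))
    return list(landmarks[FIRST_EYE]), list(landmarks[SECOND_EYE])
```

Each returned list holds the six points around one eye, which is the input that an eye-based parameter needs for the current frame.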
FIGURE 6. Eyes landmarks. Upper: the distribution of eyes landmarks has significant differences. Bottom: the values of EAR at open and closed state.

In Section B-1), we obtained the facial landmarks based on the Dlib toolkit. As shown in Figure 6, for each eye, six points are distributed around the eye to locate its position. The distribution of the eye landmarks differs significantly between the open and closed states. In [33], the Eye Aspect Ratio was applied to record the blink frequency.
The EAR can be computed from the positions of the eye landmarks by:

$$\mathrm{EAR} = \frac{\|P_2 - P_6\| + \|P_3 - P_5\|}{2\,\|P_1 - P_4\|} \tag{6}$$

where $P_i$, $i = 1, 2, \ldots, 6$, are the coordinates of the eye landmarks.
As shown in Figure 6, when the eyes of the driver are open, the EAR is over 0.2. In contrast, when the eyes are closed, the EAR is less than 0.2. Thanks to the stable facial landmark localization model based on CPR, the new parameter, EAR, is much more robust than the method in [32].

C. DROWSINESS ASSESSMENT MODEL
1) OFFLINE TRAINING
As mentioned, P80 uses a hard threshold (0.8) to judge whether the eyes of the driver are open or closed. This is not accurate across drivers because of individual differences in eye size, as discussed in the Experiments section. In this paper, we take the individual differences into consideration to construct a unique classifier. This is equivalent to finding a soft threshold during the initialization process: not a fixed threshold, but an adaptive one. To make this work, we ask the driver to keep his/her eyes open and then closed for a while, and obtain two sets of EAR values via DCCNN and the Dlib toolkit. An SVM classifier is then trained with the two sets of data as input.
SVM [34] is a machine learning model that uses the structural risk minimization criterion. It is a linear classifier with the largest margin defined in the feature space. Suppose we are given a training data set $S = \{(x_i, y_i),\ i = 1, 2, \ldots, N\}$, where $x_i \in \mathbb{R}^d$ is an input sample and $y_i \in \{+1, -1\}$ is the label corresponding to $x_i$. $x_i$ is a positive sample if $y_i = +1$; otherwise, it is a negative sample.
Generally, there is a linear discriminant function $f(x) = w^T x + b$ in the $d$-dimensional space that distinguishes the two types of data, and the classification hyperplane can be described as:

$$w^{*T} \cdot x + b^{*} = 0 \tag{7}$$

The normal vector $w$ and the intercept $b$ determine the discriminant function, so the hyperplane parameters $w$ and $b$ need to be calculated from the training data set. According to the basic idea of SVM, the constrained optimization problem of the linearly separable support vector machine is:

$$\min_{w,b}\ J(w) = \frac{1}{2}\|w\|_2^2 \quad \text{s.t.}\quad y_i(w^T \cdot x_i + b) \ge 1,\ i = 1, 2, \ldots, N \tag{8}$$

In the offline training module, for a specific driver, we collect two types of data, with the driver's eyes open and closed. The eyes-open data are marked as positive samples and the eyes-closed data as negative samples. In a live video, the state of the driver's eyes cannot change suddenly; that is, the eyes almost never change from open (closed) to closed (open) within one frame. Therefore, we combine the EAR values of six consecutive frames into one feature vector. According to formula (8), we obtain a unique classifier that judges whether the eyes are open or closed.

2) ONLINE MONITORING
The online monitoring module performs real-time detection during driving and accounts for the individual differences of the driver by applying the uniquely trained classifier. As shown in Figure 7, when the system starts up, a camera in front of the driver captures the live video, which is processed frame by frame. First, the DCCNN is used to detect the face of the driver. If the face region is obtained, it is passed on to locate the landmarks with the Dlib toolkit. Otherwise, we directly assess the current frame as drowsy, because the driver's head may not be facing forward. Next, the facial landmarks are obtained using the Dlib toolkit. Similarly, if landmark acquisition fails, the current frame is judged to be a drowsy frame. Finally, the EAR is calculated from the eye landmarks, and the SVM classifier, which was trained in the offline training module, is applied to decide whether the eyes of the driver are open or closed with the EAR as input. The number of drowsy frames is stored in a parameter $N_{drowsy}$.
As mentioned, driving drowsiness is a dynamic process. In [35], PERCLOS proved to be an effective indicator for driving drowsiness detection. The original PERCLOS method consists of two parts: (1) Judge whether the eyes are open or closed using P80, P70 or EM: 1) P70: if the eyelid covers more than 70% of the pupil area, the eyes are judged to be closed; 2) P80: if the eyelid covers more than 80% of the pupil area, the eyes are judged to be closed; 3) EM: if the eyelid covers more than 50% of the pupil area, the eyes are judged to be closed. (2) Calculate the ratio of drowsy frames to total frames over time. In this paper, we have constructed a unique classifier to
judge the eye state instead of using a hard threshold such as P80. Similar to part 2 of the original PERCLOS method, we also calculate the ratio of drowsy frames to total frames over time. PERCLOS can be computed by:

$$\mathrm{PERCLOS} = \frac{N_{drowsy}}{N_{total}} \times 100\% \tag{9}$$

where $N_{drowsy}$ is the number of drowsy frames judged by the classifier and $N_{total}$ is the total number of frames in a specific time.
In the online monitoring module, we set $N_{total}$ to 100 frames. If PERCLOS > Th, where Th is a threshold set in a similar manner to [36], the driver is assessed as drowsy.

FIGURE 7. Flow chart of online monitoring.

III. EXPERIMENTS
In this section, we first evaluate the effectiveness of the proposed DCCNN on the Face Detection Data Set and Benchmark (FDDB) [37]. Then, we discuss the correlation between the EAR and the size of the eyes to describe the individual differences of drivers. Finally, a series of comparative experiments are conducted to evaluate the performance of our proposed algorithm in both accuracy and speed.

A. ENVIRONMENT AND DATA SET
The experimental platform is an Intel Core i5-7500 (main frequency: 3.4 GHz) with x86 architecture, a GTX 1070 Ti (CUDA: 9.0; cuDNN: 7.0) with Pascal architecture, and 16 GB of DDR4 memory. The image library is OpenCV 3.3.0, and the deep learning framework is TensorFlow 1.7.
In this paper, we use two types of data sets to conduct experiments. The first one is FDDB (http://vis-www.cs.umass.edu/fddb/index.html), which contains the annotations for 5171 faces in a set of 2845 images. The second is our own video data set, called the Simulated Driving Video Data set (SDVD), which was collected in simulated driving. For safety, we built the SDVD by asking the drivers to perform driving operations on the simulator, including awake driving and drowsy driving. SDVD contains 50 videos with different drivers, surroundings, times, etc. The subjects differ in gender, age, and eye size. Each video is 2 minutes long and includes manually marked awake and drowsy periods, as shown in Figure 8. Besides, in the training of DCCNN, the WIDER_FACE and AFLW data sets are used. The information about the public data sets is shown in Table 1.

FIGURE 8. SDVD collected on simulator. Left: video clip of SDVD. Right: an annotated data. The driver is awake at 0 ∼ t1 and t2 ∼ t3. And he/she is drowsy at t1 ∼ t2 and t3 ∼ 120 (simulated states).
TABLE 1. Data sets.

B. FACE DETECTION
1) QUALITATIVE DESCRIPTION
To evaluate the performance of the proposed face detection networks, we first conduct experiments in laboratory and real driving scenarios. Head posture, lighting, background, facial decoration, etc. have a great impact on face detection. Thus, we take all of these factors into consideration when designing the qualitative experiment. As shown in Figure 9, the face can be detected accurately in the laboratory (the first row of Figure 9) because of the ideal lighting and simple surroundings. In practical driving scenarios (the second and third rows of Figure 9), the DCCNN can also obtain the face of the driver, even though surroundings such as the lighting are much more complicated than in the laboratory.

FIGURE 9. Face detection in different scenarios.

2) QUANTITATIVE EVALUATION
In this section, we evaluate the effectiveness of the proposed DCCNN on FDDB. In machine learning, Intersection over Union (IOU) [38] and accuracy [39] are indicators for evaluating the performance of deep learning models.
IOU is a standard for measuring the accuracy of detecting corresponding objects in a specific data set. In object detection based on image processing, the Ground-truth Bounding Box (GB) is the area labeled in the data set, which represents the
true position of the object, and the Predicted Bounding Box (PB) is the region predicted by the deep learning model. IOU is computed by:

$$\mathrm{IOU} = \frac{S(GB \cap PB)}{S(GB \cup PB)} \tag{10}$$

where $S(GB \cap PB)$ is the area of overlap and $S(GB \cup PB)$ is the area of union.
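For axis-aligned boxes, the ratio in equation (10) can be computed directly from the corner coordinates. The following is a minimal sketch; the `(x_min, y_min, x_max, y_max)` box format and the function name `iou` are assumptions chosen for the example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this corner format is an
    assumption made for the sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Area of union = sum of the two areas minus the overlap.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

Identical boxes give an IOU of 1, disjoint boxes give 0, and a detection can then be accepted or rejected by comparing the returned value with a chosen threshold.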
We use the IOU to indicate the similarity between the labeled position of the face (Ground-truth Bounding Box) in the FDDB data set and the predicted face position (Predicted Bounding Box), as shown in Figure 10. The higher the overlap of the two bounding boxes, the higher the IOU value. Ideally, IOU = 1 when the predicted bounding box of the face is the same as the ground-truth bounding box in the FDDB label. Generally speaking, an object is considered detected if the IOU is above 0.5. In this paper, we assume that the face is detected correctly when IOU > 0.7.

FIGURE 10. Demonstration of IOU in face detection.

To verify the performance of the face detection networks, we use the multi-threaded data input framework provided by TensorFlow to combine and shuffle the training data. In the training process, we set the batch size to 384 and the initial learning rate to 0.01. When the evaluation index no longer improves, the learning rate is reduced by a factor of 10, and the final learning rate is set to 0.00001. An epoch means that all training data are trained once. In this paper, we set the number of epochs to 50. After each epoch of training, a batch of test data from the FDDB data set is fed to the network to evaluate the performance of the model.
As mentioned, the DCCNN is a multi-task cascaded CNN. The three sub-networks, Network-1, Network-2 and Network-3, can be trained and validated independently. The test accuracy curves of the three sub-networks are shown in Figure 11, and the test loss curves are shown in Figure 12. As the number of training steps increases, the test accuracy improves gradually. After 50 epochs, the final test accuracies of the three sub-networks are 94.6%, 97.4% and 98.8%, respectively, increasing from one stage to the next because of the progressive architecture of DCCNN.
In summary, the DCCNN we designed can capture the face of the driver in various surroundings.

FIGURE 11. Accuracy of the DCCNN.
FIGURE 12. Loss of the DCCNN.

C. CORRELATION OF EAR AND THE SIZE OF EYES
Face detection and eye landmark localization are the basis of driving drowsiness detection. A new parameter, EAR, is proposed in this paper to assess the state of the driver's eyes. In order
to verify the correlation of EAR and the size of eyes, which

reflects the individual differences of driver, 4 subjects with
different characteristics were selected. The size of eyes in
case of natural open are measured in advance and size:1 ∼
size:4 are represent the 4 types of driver. Respectively, the size
of eyes is 12mm, 16mm, 7mm and 20mm (size 1∼size 4).
In the process of experiments, each subject was asked to open
and close the eyes during the specific of time. The camera
in front of driver capture the frames in real-time and the
DCCNN is application to detect the face of driver. Then,
calculate the EAR in the condition of opening and closure
using Dlib toolkit.
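To make the EAR computation concrete, the following minimal sketch assumes six eye landmarks P1 to P6 ordered as in the common eye aspect ratio definition (P1 and P4 at the eye corners, P2 and P3 on the upper lid, P6 and P5 on the lower lid), with the sum of the two vertical distances normalized by twice the horizontal distance. The landmark ordering, the normalization, the helper name `eye_aspect_ratio`, and the sample coordinates are assumptions for illustration; in the experiments the landmarks are obtained with the Dlib toolkit.

```python
import math


def eye_aspect_ratio(eye):
    """Compute the EAR from six (x, y) eye landmarks ordered P1..P6."""
    p1, p2, p3, p4, p5, p6 = eye

    # Vertical openings between upper-lid and lower-lid landmarks.
    vertical = math.dist(p2, p6) + math.dist(p3, p5)

    # Horizontal eye width between the two corners.
    horizontal = math.dist(p1, p4)
    if horizontal == 0:
        raise ValueError("eye corners coincide; EAR is undefined")

    return vertical / (2.0 * horizontal)


# Hypothetical landmark coordinates (in pixels) for an open and a closed eye.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0), (2, 0), (3, 0), (2, 0), (1, 0)]
```

Because the ratio is normalized by the eye width, it changes little with the distance between the camera and the face, but it still depends on how wide a given driver's eyes open, which is the individual difference examined in this section.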
Figure 13 shows the EAR curves of the 4 subjects with different eye sizes. It can be seen from the figure that when subject 1 (size 1) has the eyes open, the EAR is about 0.18, and the value is about 0.13 when the eyes of subject 1 are closed. In this case, the threshold of the SVM-based eye state classifier is about 0.16 according to the optimal classifier principle. For subject 2 (size 2), the EAR values in the two states are 0.3 and 0.15, respectively. Therefore, the threshold of the classifier is about 0.25. In particular, the eye sizes of subject 3 and subject 4 differ significantly, which supports our hypothesis. The eyes of subject 3 are small in the naturally open condition, so the EAR values of the two states are close to each other. In contrast, the EAR values of subject 4 in the two states differ significantly.
FIGURE 13. EAR curve of 4 subjects.
TABLE 2. Driver fatigue detection accuracy comparison.
Therefore, the correlation between the EAR and the size of the eyes is strong, as we assumed above. Generally, the larger the driver's eyes are, the higher the EAR is, and the smaller the eyes are, the lower the EAR is. It is necessary to develop the driving drowsiness detection model with consideration of the driver's individual differences, especially the size of the eyes, to improve the performance of the algorithm.
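The per-driver threshold idea can be sketched as follows. This is a simplified stand-in, not the SVM training used in the paper: it relies on the fact that for linearly separable one-dimensional samples, the maximum-margin decision boundary is the midpoint between the lowest open-eye value and the highest closed-eye value. The function names `fit_ear_threshold` and `eye_state` and the sample EAR values (chosen near the 0.18 and 0.13 levels reported for subject 1) are hypothetical.

```python
def fit_ear_threshold(open_ears, closed_ears):
    """Return a driver-specific EAR threshold from calibration samples.

    Assumes the two sample sets are separable (every open-eye EAR is
    larger than every closed-eye EAR); otherwise no hard-margin boundary
    exists and a ValueError is raised.
    """
    lowest_open = min(open_ears)
    highest_closed = max(closed_ears)
    if highest_closed >= lowest_open:
        raise ValueError("open and closed EAR samples overlap")
    # Midpoint between the two closest samples of opposite classes.
    return (lowest_open + highest_closed) / 2.0


def eye_state(ear, threshold):
    """Classify a single EAR value with the driver-specific threshold."""
    return "open" if ear > threshold else "closed"


# Hypothetical calibration samples for one driver.
threshold = fit_ear_threshold([0.18, 0.19, 0.20], [0.13, 0.12, 0.11])
```

A driver with larger eyes would yield higher open-eye samples and therefore a higher threshold, which is why a single fixed threshold for all drivers can misclassify the eye state.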
D. COMPARATIVE EXPERIMENTS
To verify the rationality of the driving drowsiness detection algorithm with consideration of individual differences, a series of comparative experiments is conducted. Firstly, we compare the accuracy of driving fatigue detection using the hard threshold (0.8) with that of the method proposed in this paper. Then, we ask a driver to simulate the normal driving and drowsy driving states on the simulator, so that the changes of eye state and PERCLOS are clearly visible. Finally, we conduct another comparative experiment to compare the performance of our method with the relevant literature in both accuracy and speed. For a fair comparison, we use the same data and platform for the compared methods.
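The experiments in this section judge the driving state from PERCLOS, the ratio of closed-eye frames within a window of frames. A minimal sketch of that computation is given here, using the values quoted in this section (eyes treated as closed when EAR < 0.2 in the fixed-threshold experiment, a 100-frame window, and a PERCLOS threshold of 0.4). The function names, the sliding-window formulation, and the strict "greater than" comparison are assumptions made for this example.

```python
from collections import deque


def perclos_stream(ear_values, ear_threshold=0.2, window=100):
    """Yield the PERCLOS value for each full sliding window of frames."""
    recent = deque(maxlen=window)  # 1 = closed frame, 0 = open frame
    for ear in ear_values:
        recent.append(1 if ear < ear_threshold else 0)
        if len(recent) == window:
            yield sum(recent) / window


def is_drowsy(perclos, perclos_threshold=0.4):
    """Flag the driver as drowsy when PERCLOS exceeds the threshold."""
    return perclos > perclos_threshold


# Hypothetical EAR sequences: mostly open eyes versus eyes closed half the time.
awake_ears = [0.30] * 70 + [0.10] * 30
drowsy_ears = [0.10] * 50 + [0.30] * 50
```

With a driver-specific classifier, the fixed `ear_threshold` comparison would be replaced by that driver's own eye state decision, while the windowed ratio stays the same.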
1) COMPARISON WITH P80
We randomly select 10 segments of simulated driving video from the SDVD data set. The DCCNN is applied to detect the face of the driver, and the facial landmarks are then obtained using the Dlib toolkit. On this basis, a comparative experiment was conducted.
Experiment 1: Calculate the EAR according to the landmarks of the eyes. P80 is used to judge whether the eyes of the driver are open or closed. That is, the eyes of the driver are assessed to be closed if EAR < 0.2. Then, the driving state of the driver is determined by the ratio of closed frames over a specific period (100 frames).
Experiment 2: First, initialize the system by training a unique SVM classifier with the two sets of EAR values as input. Next, the driving state is assessed using the online monitoring model proposed in this paper.
As mentioned above, for safety, we conduct the comparative experiments on the simulator. The SDVD data set contains driving video with labeled information. For each sample, the driving state has been marked in advance, as shown in Figure 8. The comparative experiments detect the driving state and record the periods of awake and drowsy driving. If the result obtained by the driving drowsiness model is consistent with the labeled information, the model is considered to have evaluated the driving state correctly. Therefore, the accuracy can be computed by:
$$\mathrm{accuracy} = \frac{N_c}{N} \tag{11}$$
where $N_c$ is the number of video clips correctly evaluated and $N$ is the total number of experiment samples.
The comparative experiment is repeated 10 times. Table 2 shows the accuracy of driving drowsiness detection. It can be seen from the table that the proposed algorithm effectively improves the accuracy.
2) STATE OF EYES AND PERCLOS UNDER DIFFERENT DRIVING CONDITIONS
We have described the correlation between EAR and the size of the eyes with four types of driver. To visually evaluate the changes of the driver's eye state and the PERCLOS values under different driving conditions, we ask a driver (size 1) to simulate two different driving states, as shown in Figure 14. It can be seen from Figure 14 that when the driver is in the normal driving state, the duration of eye opening is much longer than that of closure, and the PERCLOS values are all below the threshold (0.4). Conversely, the duration of closure is longer than that of opening when the driver is in the drowsy state, and most of the PERCLOS values are over the threshold.
3) COMPARISON WITH RELEVANT LITERATURE
Figure 15 shows the calculation speed of each module of the algorithm during the experiment with the four subjects. The experimental video stream has a resolution of 640 × 480 and a frame rate of 30 fps. It can be seen from the figure that the average speed of the face detection module is 35.95 ms/f, and the average speed of the landmark acquisition module is 13.80 ms/f. The average speed of the proposed fatigue driving detection algorithm is 49.75 ms/f.
To further verify the performance of the proposed algorithm, we use the self-built data set, SDVD, to compare our algorithm with the drowsy detection algorithms [40], [41]

FIGURE 14. Eyes state and PERCLOS under different driving conditions.
proposed in recent years. The detection accuracies and speeds are shown in Table 3.
FIGURE 15. Algorithm speed.
TABLE 3. Comparison.
IV. CONCLUSION
Research on drowsy driving detection algorithms is one of the most important ways to reduce traffic accidents. As we know, there are significant individual differences, especially in the size of the eyes, between different people. It is crucial to take these individual differences into consideration when studying algorithms based on computer vision.
In this paper, we propose a new driving drowsiness detection algorithm with consideration of individual differences. Firstly, we design a deep cascaded convolutional neural network model named DCCNN, which avoids the manual feature extraction of traditional face detection algorithms, to obtain the face of a driver in live video. The performance of the model is tested by qualitative description and quantitative evaluation. Experimental results show that the accuracy of face detection can reach 98.8%. Secondly, we propose a new parameter, EAR, based on the Dlib toolkit, to assess the state of the driver's eyes. Compared with the traditional methods, the EAR is more stable thanks to the Cascaded Pose Regression algorithm. Experimental results show that there is a very strong correlation between the EAR and the size of a driver's eyes, which proves the rationality of our ideas. Finally, taking the individual differences of drivers into consideration, we construct the offline training module and the online monitoring module. A unique SVM-based classifier is trained for a specific driver, and the state of the eyes is judged by applying the pre-trained classifier during driving. Experimental results demonstrate that our methods consistently outperform the state-of-the-art

methods on the public data set and the self-built data set while keeping real-time performance.
In the future, we will focus on the following topics: 1) exploring fusion methods for multiple features, such as mouth and head posture, to further improve the algorithm performance; 2) conducting driving drowsiness detection research at nighttime, because drowsy driving is more likely at night.
ACKNOWLEDGMENT
This work was partially supported by "Guangdong Natural Science Foundation 2020 - Front Vehicles Perception and Cooperative Control of Safety Distance for Driverless Vehicle at Nighttime".
REFERENCES
[1] A. Amodio, M. Ermidoro, D. Maggi, S. Formentin, and S. M. Savaresi, "Automatic detection of driver impairment based on pupillary light reflex," IEEE Trans. Intell. Transp. Syst., vol. 20, no. 8, pp. 3038–3048, Aug. 2019, doi: 10.1109/TITS.2018.2871262.
[2] X. Li, X. Lian, and F. Liu, "Rear-end road crash characteristics analysis based on Chinese in-depth crash study data," in Proc. Cota Int. Conf. Transp. Prof., 2016, pp. 1536–1545.
[3] G. Zhang, K. K. Yau, X. Zhang, and Y. Li, "Traffic accidents involving fatigue driving and their extent of casualties," Accident Anal. Prevention, vol. 87, pp. 34–42, Feb. 2016, doi: 10.1016/j.aap.2015.10.033.
[4] D. Mollicone, K. Kan, C. Mott, R. Bartels, S. Bruneau, M. van Wollen, A. R. Sparrow, and H. P. A. Van Dongen, "Predicting performance and safety based on driver fatigue," Accident Anal. Prevention, vol. 126, pp. 142–145, May 2019, doi: 10.1016/j.aap.2018.03.004.
[5] X. Sun, H. Zhang, W. Meng, R. Zhang, K. Li, and T. Peng, "Primary resonance analysis and vibration suppression for the harmonically excited nonlinear suspension system using a pair of symmetric viscoelastic buffers," Nonlinear Dyn., vol. 94, no. 2, pp. 1243–1265, 2018, doi: 10.1007/s11071-018-4421-9.
[6] Z. Ning, P. Dong, X. Wang, M. S. Obaidat, X. Hu, L. Guo, Y. Guo, J. Huang, B. Hu, and Y. Li, "When deep reinforcement learning meets 5G vehicular networks: A distributed offloading framework for traffic big data," IEEE Trans. Ind. Informat., to be published, doi: 10.1109/TII.2019.2937079.
[7] F. You, Y.-H. Li, L. Huang, K. Chen, R.-H. Zhang, and J.-M. Xu, "Monitoring drivers' sleepy status at night based on machine vision," Multimed. Tools Appl., vol. 76, no. 13, pp. 14869–14886, 2017, doi: 10.1007/s11042-016-4103-x.
[8] G. Sikander and S. Anwar, "Driver fatigue detection systems: A review," IEEE Trans. Intell. Transp. Syst., vol. 20, no. 6, pp. 2339–2352, Jun. 2019, doi: 10.1109/TITS.2018.2868499.
[9] X. Li, L. Hong, J. Wang, and X. Liu, "Fatigue driving detection model based on multi-feature fusion and semi-supervised active learning," IET Intell. Transp. Syst., vol. 13, no. 9, pp. 1401–1409, Sep. 2019.
[10] G. Niu and C. Wang, "Driver fatigue features extraction," Math. Problems Eng., vol. 2014, pp. 1–10, Jun. 2014.
[11] Z. Ning, Y. Feng, M. Collotta, X. Kong, X. Wang, L. Guo, X. Hu, and B. Hu, "Deep learning in edge of vehicles: Exploring trirelationship for data transmission," IEEE Trans. Ind. Informat., vol. 15, no. 10, pp. 5737–5746, Oct. 2019.
[12] M. Khan and S. Lee, "A comprehensive survey of driving monitoring and assistance systems," Sensors, vol. 19, no. 11, p. 2574, 2019.
[13] A. S. Zandi, A. Quddus, L. Prest, and F. J. E. Comeau, "Non-intrusive detection of drowsy driving based on eye tracking data," Transp. Res. Rec., J. Transp. Res. Board, vol. 2673, no. 6, pp. 247–257, May 2019.
[14] L. M. Bergasa, J. Nuevo, M. A. Sotelo, R. Barea, and M. E. Lopez, "Real-time system for monitoring driver vigilance," IEEE Trans. Intell. Transp. Syst., vol. 7, no. 1, pp. 63–77, Mar. 2006.
[15] X. Zhao, "Eye feature point detection based on single convolutional neural network," IET Comput. Vis., vol. 12, no. 4, pp. 453–457, 2018.
[16] F. Zhang, J. Su, L. Geng, and Z. Xiao, "Driver fatigue detection based on eye state recognition," in Proc. Int. Conf. Mach. Vis. Inf. Technol., Feb. 2017, pp. 105–110, doi: 10.1109/CMVIT.2017.25.
[17] Z. Xiao, Z. Hu, L. Geng, F. Zhang, J. Wu, and Y. Li, "Fatigue driving recognition network: Fatigue driving recognition via convolutional neural network and long short-term memory units," IET Intell. Transp. Syst., vol. 13, no. 9, pp. 1410–1416, Sep. 2019.
[18] W. Liu, "Convolutional two-stream network using multi-facial feature fusion for driver fatigue detection," Future Internet, vol. 11, no. 5, p. 115, 2019.
[19] Z. Ning, P. Dong, X. Wang, J. Rodrigues, and F. Xia, "Deep reinforcement learning for vehicular edge computing: An intelligent offloading system," ACM Trans. Intell. Syst. Technol., vol. 10, no. 6, 2019, Art. no. 60.
[20] X. Wang and C. Xu, "Driver drowsiness detection based on non-intrusive metrics considering individual specifics," Accident Anal. Prevention, vol. 95, pp. 350–357, Oct. 2016.
[21] L. Zhang, F. Liu, and J. Tang, "Real-time system for driver fatigue detection by RGB-D Camera," ACM Trans. Intell. Syst. Technol. (TIST), vol. 6, no. 2, pp. 1–17, 2015.
[22] R. Fu, H. Wang, and W. Zhao, "Dynamic driver fatigue detection using hidden Markov model in real driving condition," Expert Syst. Appl., vol. 63, pp. 397–411, Nov. 2016.
[23] Y. Jiang, S. Guo, and S. Deng, "Denoising and chaotic feature extraction of electrocardial signals for driver fatigue detection by kolmogorov entropy," J. Dyn. Syst., Meas. Control, Trans. ASME, vol. 141, no. 2, 2019, Art. no. 0210131, doi: 10.1115/1.4041355.
[24] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, "Joint face detection and alignment using multitask cascaded convolutional networks," IEEE Signal Process. Lett., vol. 23, no. 10, pp. 1499–1503, Oct. 2016.
[25] Z. Ning, J. Huang, X. Wang, J. Rodrigues, and L. Guo, "Mobile edge computing-enabled Internet of Vehicles: Toward energy-efficient scheduling," IEEE Netw., vol. 33, no. 5, pp. 198–205, Sep./Oct. 2019.
[26] H. Xiong, X. Zhu, and R. Zhang, "Energy recovery strategy numerical simulation for dual axle drive pure electric vehicle based on motor loss model and big data calculation," Complexity, vol. 2018, Aug. 2018, Art. no. 4071743, doi: 10.1155/2018/4071743.
[27] S. Yang, P. Luo, C. C. Loy, and X. Tang, "WIDER FACE: A face detection benchmark," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2016, pp. 5525–5533, doi: 10.1109/CVPR.2016.596.
[28] M. Köstinger, P. Wohlhart, P. M. Roth, and H. Bischof, "Annotated facial landmarks in the wild: A large-scale, real-world database for facial landmark localization," in Proc. IEEE Int. Conf. Comput. Vis. Workshops, Nov. 2011, pp. 2144–2151, doi: 10.1109/ICCVW.2011.6130513.
[29] Z. Ning, J. Huang, and X. Wang, "Vehicular fog computing: Enabling real-time traffic management for smart cities," IEEE Wireless Commun., vol. 26, no. 1, pp. 87–93, Feb. 2019.
[30] P. Dollár, P. Welinder, and P. Perona, "Cascaded pose regression," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Jun. 2010, pp. 1078–1085, doi: 10.1109/CVPR.2010.5540094.
[31] D. E. King, "Dlib-ml: A machine learning toolkit," J. Mach. Learn. Res., vol. 10, pp. 1755–1758, Jan. 2009.
[32] A. Tolba, "Content accessibility preference approach for improving service optimality in Internet of Vehicles," Comput. Netw., vol. 152, pp. 78–86, Apr. 2019.
[33] X. Kong, F. Xia, J. Li, M. Hou, M. Li, and Y. Xiang, "A shared bus profiling scheme for smart cities based on heterogeneous mobile crowdsourced data," IEEE Trans. Ind. Informat., to be published, doi: 10.1109/TII.2019.2947063.
[34] J. Xu, W. Zeng, Y. Lan, J. Guo, and X. Cheng, "Modeling the parameter interactions in ranking SVM with low-rank approximation," IEEE Trans. Knowl. Data Eng., vol. 31, no. 6, pp. 1181–1193, Jun. 2019.
[35] A. Tolba, "Trust-based distributed authentication method for collision attack avoidance in VANETs," IEEE Access, vol. 6, pp. 62747–62755, 2018.
[36] R.-H. Zhang, Z.-C. He, H.-W. Wang, F. You, and K.-N. Li, "Study on self-tuning tyre friction control for developing main-servo loop integrated chassis control system," IEEE Access, vol. 5, pp. 6649–6660, 2017.
[37] X. Sun, P. Wu, and S. C. H. Hoi, "Face detection using deep learning: An improved faster RCNN approach," Neurocomputing, vol. 299, pp. 42–50, Jul. 2018.
[38] X. Kong, X. Liu, B. Jedari, M. Li, L. Wan, and F. Xia, "Mobile crowdsourcing in smart cities: Technologies, applications, and future challenges," IEEE Internet Things J., vol. 6, no. 5, pp. 8095–8113, Oct. 2019.
[39] A. Tolba, O. Said, and Z. Al-Makhadmeh, "MDS: Multi-level decision system for patient behavior analysis based on wearable device information," Comput. Commun., vol. 147, pp. 180–187, Nov. 2019.

[40] L. Geng, "Real-time driver fatigue detection based on morphology infrared features and deep learning," Hongwai Yu Jiguang Gongcheng/Infrared Laser Eng., vol. 47, no. 2, 2018, Art. no. 203009.
[41] J. Guo and H. Markoni, "Driver drowsiness detection using hybrid convolutional neural network and long short-term memory," Multimedia Tools Appl., vol. 78, no. 20, pp. 29059–29087, 2019.

FENG YOU was born in Zunyi, Guizhou, in 1977. He received the B.S. degree in automobile transportation engineering from Guizhou University, Guizhou, in 1995, and the M.S. and Ph.D. degrees in automobile transportation engineering from Jilin University, Changchun, in 2005. From 2005 to 2011, he was a Lecturer with the School of Civil Engineering and Transportation, South China University of Technology. Since 2012, he has been an Associate Professor with the School of Civil Engineering and Transportation, South China University of Technology. He is the author of more than 50 articles and more than ten inventions. His research interests include connected vehicle, advanced driver assistance, traffic information and safety, cooperative safety control, and vehicle active safety and driver behavior. He is a Reviewer of the Journal of Safety Science, Accident Analysis & Prevention, and Jilin University Engineering and Technology Edition.

XIAOLONG LI was born in Bengbu, Anhui, in 1993. He received the B.E. degree from Southeast University, Nanjing, China, in 2016. He is currently pursuing the master's degree in communication and transportation engineering with the South China University of Technology, Guangzhou, China. His interests include intelligent vehicles, computer vision, and driving behavior analysis.

YUNBO GONG was born in Tieling, Liaoning, in 1995. He received the B.E. degree from the South China University of Technology, Guangzhou, China, in 2018, where he is currently pursuing the master's degree in traffic information engineering and control. His interests include intelligent vehicles, computer vision, and 3D laser radar.

HAIWEI WANG received the B.E. and M.S. degrees from Jilin University, Changchun, China, and the Ph.D. degree from the South China University of Technology. She is currently a Lecturer with the School of Transportation and Economic Management, Guangdong Communication Polytechnic, Guangzhou, China. She has published four articles in international journals. Her current research interests include ITS and vehicle control.

HONGYI LI was born in Urumqi, Xinjiang, China, in 1986. He received the B.S. degree in mechanics from the Beijing Institute of Technology, Beijing, in 2008, and the Ph.D. degree in physics and chemistry materials from the Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, in 2013. From 2011 to 2012, he was a dual culture Ph.D. with Northwestern University, Chicago, USA. From 2013 to 2014, he was a Senior Engineer. Since 2015, he has been a Professor with the Xinjiang Quality of Products Supervision and Inspection Institute of Technology. Besides, he was a Visiting Research Fellow with the Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences. His research interests include materials science and information technology.