Advanced Driver Assistant System

Zihui Liu zihui6@stanford.edu
Chen Zhu chen0908@stanford.edu
Department of Electrical Engineering

Abstract

In this project, we designed an advanced driver assistant system with three functions: traffic sign recognition, lane deviation detection, and car make identification. The traffic sign recognition implements the Viola and Jones detector to accurately detect the traffic signs ahead. Hough transforms and edge detection are used in the lane deviation detection. For vehicle make identification, two models, a 2-layer neural network and a convolutional neural network, have been constructed; both achieve high testing accuracy after training.

Keywords: Object Detection, Deep Learning, Edge Detection, Hough Transform

1. Introduction

Advanced driver assistant systems (ADAS) have been implemented in many vehicles to help increase the safety of both drivers and pedestrians. The related technology is also used to develop self-driving cars. Our team built three ADAS features: traffic sign recognition, lane deviation detection, and car make identification. Traffic sign recognition is crucial for reminding drivers of the traffic signs ahead, preventing accidents caused by signs being overlooked in bad weather. It can be seen as another pair of eyes to guarantee driving safety on the road. The lane deviation detection system provides lane detection and stability determination, warning drivers when the car drifts out of its lane. Car make identification is a useful feature when drivers are interested in the make and model of the car in front of them. We review these three features in detail in the following sections.

2. Related Work

2.1 Traffic Sign Recognition

D. M. Gavrila [1] implemented a template-based correlation method using distance transforms to detect potential traffic signs in images, followed by a radial basis function network for classification. Miguel A. García-Garrido et al. [2] detected the shape of traffic signs using the Hough transform and used several SVM classifiers with a Gaussian kernel and probability estimate outputs to determine the information on the traffic signs.

2.2 Lane Deviation Detection

Jia He et al. [3] used the Canny detector to find the image gradient and then set a threshold for edge detection. They then implemented the Hough transform as linear model fitting and set a region of interest to achieve lane detection. Anik Saha et al. [4] implemented steps including reading the RGB image, converting RGB to grayscale, extracting features, and subtracting unwanted regions to recognize the road and road lanes for real-time automated road lane detection. Prof. Sachin Sharma et al. [5] used an algorithm with the following steps: Top-Hat Transform, Dynamic Threshold, ROI Segmentation, Hough Transform, and Lane Departure decision.

2.3 Car Make Identification

Sparta Cheung et al. [6] implemented steps including interest point detection using the Scale Invariant Feature Transform (SIFT) and Harris corner detection, interest point matching using fast normalized cross correlation, and inlier extraction using RANSAC.

3. Algorithm and Implementation

3.1 Lane Detection

Lane deviation detection includes the functions of lane detection and stability determination. The algorithm contains two main parts: edge detection, and the Hough transform with Hough peak detection. For each frame of the video, we first select the area of the image used for further processing; our team typically selected the lower area of each frame in order to pick up less noise from the background. The edge detection algorithm is then applied to the selected region. Using the Hough transform, the angles of the lanes can be found in each frame and the lanes can be successfully detected. For the stability determination feature, we match the lane markers found in the current video frame with the lane markers detected in previous frames, and the algorithm warns the driver if the vehicle moves across a lane marker. The algorithm is fairly robust across multiple road conditions. For the parameter settings, processing starts from row 800 of the image, since the size of each frame is 1080 x 1920. The maximum allowable change in the lane distance metric between two frames is 50 pixels. The minimum number of frames in which a lane must be detected to be considered valid is 10, and the maximum number of frames a lane can be missed without being marked invalid is also 10. With these parameters, the algorithm performs well across different driving circumstances. Our work is based on the MATLAB Computer Vision System Toolbox. Below are the results when testing on real-time driving video.

Figure 1.a: Result 1 in real time video

Figure 1.b: Result 2 in real time video
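Our implementation is built on the MATLAB Computer Vision System Toolbox; as an illustration only, the Python/OpenCV sketch below mirrors the same pipeline (lower-region crop, edge detection, Hough transform, and a frame-to-frame stability check). The row-800 crop and 50-pixel threshold come from the parameters above; the function name and the Canny/Hough thresholds are our own assumptions.

```python
import cv2
import numpy as np

def detect_lanes(frame, prev_lanes=None, roi_top=800, max_jump_px=50):
    """Find lane segments in the lower region of a 1080x1920 frame and
    flag a possible lane departure against the previous frame's lanes."""
    roi = frame[roi_top:, :]                      # keep the road region only
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)              # edge detection
    # Probabilistic Hough transform: peaks become line segments.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=60,
                            minLineLength=80, maxLineGap=20)
    lanes = []
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
            if 20 < abs(angle) < 70:              # keep lane-like angles only
                lanes.append((x1, y1 + roi_top, x2, y2 + roi_top))
    # Stability check: a lane endpoint moving more than max_jump_px
    # between consecutive frames suggests the car is crossing a marker.
    departure = False
    if prev_lanes and lanes:
        for x1, _, _, _ in lanes:
            if min(abs(x1 - p[0]) for p in prev_lanes) > max_jump_px:
                departure = True
    return lanes, departure
```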
3.2 Traffic Sign Detection

We use the Viola and Jones detector [7] to detect traffic signs.

3.2.1 Algorithm

1) Features

Viola and Jones propose to use Haar-like feature extractors to extract features:

feature = Σ(pixels in white) − Σ(pixels in black)

Then we normalize each feature by the area of its feature window so that it has norm 1:

feature = feature / (area of the feature window)

According to Viola et al., for a 24 by 24 feature window, we can extract approximately 160,000 features.

Figure 2: Haar-like Feature Templates
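The feature definition above is simple to compute directly. The short numpy sketch below evaluates one two-rectangle Haar-like feature with the area normalization described, on a synthetic vertical edge; the helper name and template layout are our own illustration.

```python
import numpy as np

def haar_two_rect(img, y, x, h, w):
    """Two-rectangle Haar-like feature at window (y, x, h, w): white half
    on the left, black half on the right, normalized by the window area."""
    win = img[y:y + h, x:x + w].astype(np.float64)
    white = win[:, : w // 2].sum()
    black = win[:, w // 2:].sum()
    return (white - black) / (h * w)   # normalization by feature window area

# A vertical edge gives a strong positive response:
img = np.zeros((24, 24))
img[:, :12] = 255
print(haar_two_rect(img, 0, 0, 24, 24))   # 127.5
```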
2) Classification

160,000 feature dimensions is a large number. Viola et al. assume in the paper that only a small fraction of these features is useful, and that the main problem is how to find this small useful fraction. The AdaBoost classifier is a suitable choice, because it has feature selection capabilities. The principle of AdaBoost is to construct a strong classifier from multiple weak learners connected in parallel. Each weak learner multiplies its classification result by a weight according to its own accuracy, and the final output is the weighted sum of all the weak classifiers' outputs. The classification accuracy of each weak classifier can be very low, yet the accuracy of the whole strong classifier is very high.

A traditional AdaBoost classifier is still too time-consuming. Viola et al. creatively modified it into a cascade of several AdaBoost classifiers. Each stage has a high true positive rate (about 99%) but also a high false positive rate (about 50%). However, if we cascade 20 such small AdaBoost classifiers, the overall false positive rate becomes (50%)^20 ≈ 9.5 × 10^−7, while the true positive rate remains high (0.99^20 ≈ 82%).
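Since a window must pass every stage to be accepted, the cascade's overall rates are simply the per-stage rates raised to the number of stages; a two-line check of the figures above:

```python
# Per-stage rates quoted above: ~99% true positives, ~50% false positives.
tpr_stage, fpr_stage, stages = 0.99, 0.50, 20
print(f"cascade FPR: {fpr_stage ** stages:.2e}")   # 9.54e-07
print(f"cascade TPR: {tpr_stage ** stages:.2f}")   # 0.82
```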
3) Optimizing efficiency by computing the integral image

Viola et al. propose to use the integral image to improve computational efficiency:

I_img(x, y) = Σ_{i ≤ x, j ≤ y} img(i, j)

This can be computed in O(mn) for an m × n image. Once we have the integral image, we can compute the pixel sum of any rectangle, and hence any feature, in constant time. For a rectangle with top-left corner (a, b) and bottom-right corner (c, d):

sum(a, b, c, d) = I_img(c, d) + I_img(a, b) − I_img(a, d) − I_img(c, b)
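A minimal numpy sketch of both steps, assuming inclusive rectangle corners (the border cases use the (a−1, b−1) offsets that the closed-form expression above glosses over):

```python
import numpy as np

def integral_image(img):
    """I_img(x, y) = sum of img(i, j) over all i <= x, j <= y."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(I, a, b, c, d):
    """Constant-time pixel sum over the rectangle with inclusive
    top-left corner (a, b) and bottom-right corner (c, d)."""
    total = I[c, d]
    if a > 0:
        total -= I[a - 1, d]
    if b > 0:
        total -= I[c, b - 1]
    if a > 0 and b > 0:
        total += I[a - 1, b - 1]
    return total

img = np.arange(16).reshape(4, 4)
I = integral_image(img)
assert rect_sum(I, 1, 1, 2, 2) == img[1:3, 1:3].sum()   # 30
```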


3.2.2 Implementation Detail and Result

In our project, we chose three traffic signs to detect: the stop sign, the keep right sign, and the pedestrian crossing sign. The LISA Traffic Sign Dataset was selected to train our feature detector. For each traffic sign, we use 60~80 training samples, and we cascade 10 AdaBoost classifiers for classification. Our code is built upon the MATLAB vision toolbox.

Figure 3: Selected Traffic Signs

Below are some of our results. Once the feature detectors are trained, they can detect our areas of interest very efficiently. Using this method, we can do real-time traffic sign detection.

Figure 4: Traffic Sign Detection Results
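Our detectors were trained and run with the MATLAB vision toolbox; for readers working in Python, OpenCV ships an equivalent cascade runtime. The sketch below is a hypothetical analogue: "stop_sign.xml" stands in for a cascade trained on the LISA samples, and the scan parameters are illustrative.

```python
import cv2

# "stop_sign.xml" is a placeholder for a cascade trained on LISA samples.
detector = cv2.CascadeClassifier("stop_sign.xml")

def detect_signs(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Slide the detector over an image pyramid; minNeighbors requires
    # several overlapping hits before a region counts as a detection.
    boxes = detector.detectMultiScale(gray, scaleFactor=1.1,
                                      minNeighbors=5, minSize=(24, 24))
    return boxes   # (x, y, w, h) candidate sign regions
```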

3.3 Car Make Identification

Car make identification based on the rear view has been a challenging problem. Since pictures of cars are projections of 3-dimensional objects, traditional image matching methods such as the SIFT or SURF descriptors cannot give accurate results. Besides, the differences among different sedans are subtle, so it is also difficult to achieve accurate results using PCA or Fisher LDA. To achieve more accurate results, we propose to use deep learning methods. In our project, we tried two methods: one is a regular 2-layer neural network, and the other is a convolutional neural network. Both give better results compared to traditional image matching methods.

3.3.1 SIFT Descriptor

For car make identification, we first came up with using the SIFT descriptor and SIFT matching with RANSAC. The matching results for two images of a Hyundai Sonata are shown in Figure 5.

Figure 5.a: Original images of Hyundai Sonata

Figure 5.b: SIFT key points overlaid on images

Figure 5.c: Feature correspondences after distance ratio test

Figure 5.d: Feature correspondences after RANSAC

From Figure 5, we observe that only very few points get matched, and those points do not correspond to the same feature on the vehicle. SIFT matching is robust when matching between planes; however, it cannot handle matching between views of a 3D object.

3.3.2 Data Collection

To train the 2-layer neural network and convolutional neural network models, 120 rear view photos were collected for each vehicle model investigated in this project (Acura ILX, Honda Civic, Hyundai Sonata), giving 360 photos in the entire data set. A portion of the photos were obtained from Google Images, while the rest we collected ourselves on I-101 and I-280. There are some constraints on gathering data. One constraint is that there is no comprehensive database of vehicle rear views on the web, and the number of rear view pictures on Google Images is limited. There are also time and road condition limitations on our team taking rear view pictures of vehicles on the road. Our team chose these three sedans as the targets of investigation not only because they are commonly seen on the road, but also because they have similar appearances and are not easy to tell apart even for human eyes. In the training process, 100 of the 120 photos (83.3%) of each car make are randomly chosen from the data set; the test data is the remaining 20 of the 120 photos (16.7%) of each vehicle make.
3.3.3 2-layer Neural Networks

1) Architecture

Figure 6: Architecture of our neural networks

Our input images are 64 by 64 by 3 RGB images. The neural network we use has two fully connected layers with 64 neurons each; each neuron in the first layer has a weight vector with the same dimension as the input image. In the output layer, we use the softmax loss function to produce class scores, and we select the class label with the largest score as the output.

2) Regularization

To prevent overfitting, we apply L2 regularization to our network: we penalize the squared magnitude of all parameters by adding the term (1/2)λω² for every weight ω of the network, where λ is the regularization strength.

3) Results

The weights of the second FC layer of our network are shown in Figure 7, and the training loss history in Figure 8. As can be seen, the loss converges after around 5000 iterations, and the accuracy on the testing set can be up to 75%.

Figure 7: Weights of the 2nd layer

Figure 8: Loss History of Training
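Below is a minimal numpy sketch of one plausible reading of this model: a 64-neuron hidden FC layer, a 3-way FC output, softmax loss, and the (1/2)λω² penalty described above. The ReLU activation and the initialization scheme are our assumptions, not details given in the text.

```python
import numpy as np

def two_layer_loss(X, y, W1, b1, W2, b2, lam):
    """Softmax loss of a 2-layer FC net plus 0.5 * lam * w^2 L2 penalty.
    X: (N, 64*64*3) flattened RGB images, y: (N,) integer class labels."""
    h = np.maximum(0, X @ W1 + b1)                # FC layer 1 + ReLU (assumed)
    scores = h @ W2 + b2                          # FC layer 2 -> class scores
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    data_loss = -np.log(probs[np.arange(len(y)), y]).mean()
    reg_loss = 0.5 * lam * (np.sum(W1 * W1) + np.sum(W2 * W2))
    return data_loss + reg_loss, probs.argmax(axis=1)

# Our setup: 64x64x3 inputs, 64 hidden neurons, 3 car makes.
D, H, C = 64 * 64 * 3, 64, 3
rng = np.random.default_rng(0)
W1, b1 = 0.01 * rng.standard_normal((D, H)), np.zeros(H)
W2, b2 = 0.01 * rng.standard_normal((H, C)), np.zeros(C)
X = rng.standard_normal((5, D))
y = rng.integers(0, C, 5)
loss, preds = two_layer_loss(X, y, W1, b1, W2, b2, lam=1e-3)
```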

3.3.4 Convolutional Neural Networks

1) Architecture

We implement a simple convolutional neural network based on TensorFlow. Our network has 3 convolutional layers, and each convolutional layer comes with a max pooling layer to do down-sampling. The filters we use in each layer are specified in Figure 10. We also use a softmax classifier to output class scores.

2) Regularization

This time we use the dropout [9] method to prevent overfitting. While training, neurons are dropped out (set to zero) with a certain probability; no dropout is applied during testing.

Figure 9: Dropout. Figure taken from "Dropout: A Simple Way to Prevent Neural Networks from Overfitting" [9], which illustrates the idea.

Figure 10: Architecture of Convolutional Neural Networks

3) Result

Our loss function converges after around 6000 iterations and achieves up to 78% accuracy on our testing set.

Figure 11: Loss History of CNN Training
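Our network was written against the TensorFlow APIs of the time; below is a minimal modern tf.keras sketch of the same shape: three conv + max-pool blocks, dropout before the classifier, and a 3-way softmax output. The filter counts and kernel sizes here are placeholders, since our actual choices are those specified in Figure 10.

```python
import tensorflow as tf

# Filter counts and kernel sizes are placeholders; ours are in Figure 10.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(),                  # down-sample after conv
    tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),                    # active only in training
    tf.keras.layers.Dense(3, activation="softmax"),  # 3 car makes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```

Keras disables the Dropout layer automatically at inference time, matching the no-dropout-at-test behavior described above.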

4. Discussion and Future Work

4.1 Accuracy of Car Make Identification

For this project, we only implemented two simple neural networks. They give relatively good results compared to SIFT matching. Since we only have 100 training images per car make, the trained networks might be biased, which is why the accuracy stays around 75%~78%; we believe it can be improved with larger datasets. In our results, the regular neural network and the convolutional neural network both achieve satisfying results. However, we still consider the convolutional neural network the better method for car make identification: regular neural networks fail to handle the huge number of parameters and lead to overfitting when input images are large [8]. With larger datasets and deeper convolutional neural networks, our proposed method can be robust and accurate for car make recognition.

4.2 Removing False Positives

Although cascading several AdaBoost classifiers effectively reduces the false positive rate, our results still suffer from false positive detections when doing real-time tracking, as shown in Figure 12. False positives confuse drivers and can be dangerous, so we applied several optimizations to reduce false detections.
First, we observed that false positive areas are not stable: they may appear in one frame but disappear in the next. So we only keep detected areas that appear in several frames as "true positive areas". Second, most traffic signs are on the right side of the video, so our detector only targets the areas that are most likely to contain traffic signs. This not only reduces the false positive rate, but improves efficiency as well.
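A sketch of the first filter in Python; the persistence count and matching radius are illustrative values, not the ones we tuned:

```python
def stable_detections(history, min_frames=3, max_dist=30):
    """Keep only boxes in the newest frame that reappear (within max_dist
    pixels) in at least min_frames of the recent frames; one-frame
    flickers, the typical false positives, are dropped.
    history: per-frame lists of (x, y, w, h) boxes, oldest first."""
    stable = []
    for (x, y, w, h) in history[-1]:
        hits = sum(
            any(abs(x - px) < max_dist and abs(y - py) < max_dist
                for (px, py, _, _) in frame_boxes)
            for frame_boxes in history)
        if hits >= min_frames:
            stable.append((x, y, w, h))
    return stable
```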
However, there are still some false positives that cannot be removed by our proposed methods; Figure 12 is an example. It can be seen that the detected areas are very similar to the keep right sign. We find that traffic signs with relatively simple patterns are more likely to be falsely detected, since their features are not distinct enough. Our method works quite well for stop signs, but for keep right signs and pedestrian crossing signs we get a few false positives. Here lies a paradox: when people design a traffic sign, they want to make it simple and easy to remember, yet what is simple for humans can sometimes confuse computers, because such signs do not distinguish themselves well enough from other objects.

Figure 12: False Positive Detection

4.3 Future Work

Future improvement will mainly focus on the robustness of the system in the areas of traffic sign detection and vehicle make identification. To increase the robustness and accuracy of traffic sign detection for any sign under different weather conditions, more data needs to be collected and trained with a more complex algorithm model, especially for signs like "keep on right lane". For vehicle make identification, a deeper convolutional neural network will be implemented and a larger training data set will be collected in order to substantially increase the accuracy of real-time vehicle make detection on the road.

5. Acknowledgement

We would like to thank Professor Gordon Wetzstein for his support and instruction throughout the course. We also want to thank Hershed Tilak for his mentorship on our project. Finally, many thanks to Yiran Deng and Haoxuan Chen for their help in teaching us to use Python and TensorFlow.
6. References

[1] Gavrila, D. M. (1999). Traffic sign recognition revisited. In Mustererkennung 1999 (pp. 86-93). Springer Berlin Heidelberg.

[2] García-Garrido, M. A., Ocaña, M., Llorca, D. F., Arroyo, E., Pozuelo, J., & Gavilán, M. (2012). Complete vision-based traffic sign recognition supported by an I2V communication system. Sensors, 12(2), 1148-1169.

[3] He, J., Rong, H., Gong, J., & Huang, W. (2010, November). A lane detection method for lane departure warning system. In Optoelectronics and Image Processing (ICOIP), 2010 International Conference on (Vol. 1, pp. 28-31). IEEE.

[4] Saha, A., Roy, D. D., Alam, T., & Deb, K. (2012). Automated road lane detection for intelligent vehicles. Global Journal of Computer Science and Technology, 12(6).

[5] Sharma, S., & Shah, D. J. (2003). A much advanced and efficient lane detection algorithm for intelligent highway safety. Computer Science & Information Technology, 51.

[6] Cheung, S., & Chu, A. Make and Model Recognition of Cars. https://cseweb.ucsd.edu/classes/wi08/cse190-a/reports/scheung.pdf

[7] Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on (Vol. 1, pp. I-511). IEEE.

[8] CS231n lecture notes.

[9] Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1), 1929-1958.

7. Appendix

Traffic Sign Recognition: Chen Zhu
Vehicle Deviation Detection: Zihui Liu
Vehicle Make Detection: Chen Zhu & Zihui Liu
Proposal: Chen Zhu & Zihui Liu
Poster: Chen Zhu & Zihui Liu
Final Report: Chen Zhu & Zihui Liu
