Real-Time Traffic Video Analysis Using Intel Viewmont Coprocessor
Abstract. Vision-based traffic flow analysis is attracting increasing attention due to its non-intrusive nature. However, real-time video processing techniques are CPU-intensive, so the accuracy of the traffic flow data extracted by such techniques may be sacrificed in practice. Moreover, the traffic measurements extracted from cameras have hardly been validated against real datasets due to the limited availability of real-world traffic data. This study provides a case study that demonstrates the performance enhancement of a vision-based traffic flow data extraction algorithm using a hardware device, the Intel Viewmont video analytics coprocessor, and evaluates the accuracy of the extracted data by comparing it to real data from traffic loop detector sensors in Los Angeles County. Our experimental results show that traffic flow data comparable to existing sensor data can be obtained in a cost-effective way with the Viewmont hardware.
1 Introduction
As robust traffic monitoring systems become an urgent need for improved traffic control and management [1], many techniques have been proposed for traffic flow data extraction. Traffic flow data such as the count of passing vehicles and their speeds can be obtained from various devices such as loop detector sensors, radars, and infrared detectors, to name a few [2]. The most widely used sensor type is the loop detector, installed in the road surface to detect the movement of vehicles passing over it. One of the main shortcomings of these under-pavement traffic loop detectors is that they are expensive to install and maintain. Moreover, they cannot be replaced or repaired without disturbing traffic. Therefore, researchers have been studying computer vision-based techniques to extract traffic flow data from traffic monitoring cameras. Video sensors require lower installation and maintenance costs. Furthermore, they can monitor a large area across multiple lanes and are also useful for vehicle classification and accident detection, in addition to extracting vehicle counts and speeds.
However, there are two challenges with this approach. First, since image-processing techniques are CPU-intensive, accuracy may be sacrificed when extracting traffic flow data from traffic videos in real time. Second, the traffic measurements extracted from cameras have not been fully validated against real datasets.
This paper investigates solutions to the above two challenges. First, we propose a vision-based traffic flow data extraction algorithm that supports multi-channel live traffic video streams in real time, utilizing the newly developed Viewmont video analytics prototype coprocessor from Intel Corp. Second, to validate the effectiveness of the vision-based algorithm, we compare our results with real data from loop detector sensors in Los Angeles County. The Integrated Media Systems Center at the University of Southern California is working with the Los Angeles Metropolitan Transportation Authority (LA-Metro) to develop the data management and analytics systems that will form the basis for a large-scale transportation data warehouse of heterogeneous transportation data (e.g., traffic flow data recorded by loop detectors, videos from CCTV cameras, etc.). Through LA-Metro, we are acquiring real traffic flow data from thousands of loop detector sensors throughout Southern California.
Our experimental results demonstrate that our proposed algorithm with the dedicated video coprocessor is capable of processing multiple video channels simultaneously in real time, so a cost-effective implementation of a vision-based traffic flow data extraction system is feasible. The results also show that our vehicle counts and speed estimates are comparable with those from loop detectors in Los Angeles County.
The remainder of this paper is organized as follows. In Section 2, we present background and related work. Section 3 describes our real-time video analysis algorithm using Intel Viewmont. Experimental results and efficiency analysis are reported in Section 4. Finally, we conclude our work with future directions in Section 5.
The tripline framework [8] is followed to process the traffic video. Our algorithm needs to define a region of interest (ROI), which is the portion of the image where the actual analysis happens, in order to reduce the overhead of the computing-intensive video analytics process. Lane separation is also defined in this region. The ROI is a rectangular area in the video frame where the traffic can be seen most clearly. Within the ROI, virtual lines are also defined, which are used to detect how a vehicle passes each lane. Defining the ROI and virtual lines must be done manually by a human operator, but it is a one-time job at the beginning since traffic monitoring cameras are fixed.
Figure 1 shows an example of the ROI and virtual line definition. The rectangular region is the ROI, and for each lane there are two virtual lines. One of them is used for counting vehicles as they cross. The algorithm also records the time when a vehicle hits each line, and based on this information the vehicle speed is calculated from the passing time between the two virtual lines.
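For concreteness, this one-time manual setup can be captured in a small configuration structure. The following is an illustrative sketch only; the structure and field names are our assumptions, not part of the authors' implementation.

from dataclasses import dataclass, field
from typing import List

@dataclass
class VirtualLine:
    # Endpoints of the line segment in frame coordinates.
    x1: int
    y1: int
    x2: int
    y2: int

@dataclass
class LaneConfig:
    count_line: VirtualLine    # line used for vehicle counting
    speed_line: VirtualLine    # second parallel line, for speed estimation
    line_distance_ft: float    # known real-world distance between the lines

@dataclass
class ROIConfig:
    x: int                     # top-left corner of the rectangular ROI
    y: int
    width: int
    height: int
    lanes: List[LaneConfig] = field(default_factory=list)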
After applying background extraction and foreground motion detection techniques to the incoming video frames, moving vehicles are represented as moving blocks. Over each virtual line, moving blocks (i.e., vehicles) are detected by examining the variation of motion pixel values over time.
The background extraction technique follows the idea of the frame average method [9]. The background is initially defined as the first frame of the video. Afterwards, it is compared with every new frame and updated according to the pixel value difference. Specifically, a background pixel value is moved towards the current frame by a certain amount ∆ if there is a pixel value difference at that location. In this way, a stable background is obtained by continuous updating over a certain number of frames. This number may vary according to the selection of ∆. A typical background updating process is demonstrated in Figure 3.
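As an illustration, the following is a minimal sketch of such a frame-average background update, assuming 8-bit grayscale frames held in NumPy arrays; the step parameter delta corresponds to ∆ above, and its default value here is our assumption, not a parameter reported for the actual system.

import numpy as np

def update_background(background, frame, delta=1):
    # Move each background pixel toward the current frame by at most
    # `delta` wherever the pixel values differ (frame-average method).
    bg = background.astype(np.int16)
    fr = frame.astype(np.int16)
    step = np.clip(fr - bg, -delta, delta)
    return (bg + step).astype(np.uint8)

With a small delta, the background converges slowly but is robust to transient objects; a larger delta adapts faster but risks absorbing slow-moving vehicles, which is why the number of frames needed for a stable background varies with the choice of ∆.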
For foreground motion detection, the pixel-wise difference between the current frame and the background is compared with a predefined value T to construct the binary Motion Image (MI). T should be chosen wisely to identify the moving objects while suppressing background noise. Figure 4 shows an example of this process.
Morphological operations, such as open and close, are sequentially applied to the ROI of the obtained motion image in order to suppress the noise and render the moving vehicles as simple white blocks, making the detection of moving blocks easier. The intermediate steps of this process are shown in Figure 5.
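A minimal sketch of this thresholding and cleanup stage, using OpenCV for illustration (the paper does not name a library); the threshold T = 30 and the 5x5 structuring element are assumed values, not those used by the authors.

import cv2
import numpy as np

def motion_image(frame_gray, background_gray, T=30):
    # Threshold the absolute frame/background difference to obtain
    # the binary Motion Image (MI).
    diff = cv2.absdiff(frame_gray, background_gray)
    _, mi = cv2.threshold(diff, T, 255, cv2.THRESH_BINARY)
    return mi

def clean_motion_image(mi, kernel_size=5):
    # Open removes small noise specks; close fills holes so that each
    # vehicle appears as a single solid white block.
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    opened = cv2.morphologyEx(mi, cv2.MORPH_OPEN, kernel)
    return cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)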
The percentage of motion points (white pixels in the binary images in Figure 5c) over each virtual line is examined to determine whether a vehicle is passing by. For instance, in the case shown in Figure 5, a car is passing by, indicated by the high percentage of motion points over the corresponding virtual line. Each lane has one flow status indicator to monitor whether there is a car over the virtual line at a specific moment. To count vehicles, the algorithm tracks the percentage value p (i.e., the portion of the virtual line covered by motion points) and uses its temporal variation over a series of consecutive frames to determine the exact time when a vehicle hits a certain line. Two predefined percentages p1 and p2 (p1 < p2) are used to control the flow status over the virtual lines. For each virtual line, p initially equals zero. When p rises above p2, the flow status is set to 1, meaning that a vehicle is passing by. When p drops below p1, the flow status is set back to 0, meaning that the vehicle has left the virtual line, and the vehicle count is increased by one. In our algorithm and experiments, p1 and p2 are set to 20% and 50%, respectively.
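The counting logic amounts to a two-threshold hysteresis per virtual line. The following sketch assumes the per-frame coverage value p has already been computed from the motion image; the class and variable names are ours.

class VirtualLineCounter:
    # Hysteresis-based vehicle counter for one virtual line (p1 < p2).
    def __init__(self, p1=0.20, p2=0.50):
        self.p1, self.p2 = p1, p2
        self.status = 0        # 1 while a vehicle is over the line
        self.count = 0
        self.hit_frames = []   # frame indices at which vehicles arrived

    def update(self, p, frame_index):
        # p: fraction of the virtual line covered by motion points.
        if self.status == 0 and p > self.p2:
            self.status = 1    # vehicle has arrived on the line
            self.hit_frames.append(frame_index)
        elif self.status == 1 and p < self.p1:
            self.status = 0    # vehicle has left the line
            self.count += 1
        return self.count

The gap between p1 and p2 prevents a single vehicle from being counted multiple times when p fluctuates around a single threshold.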
To estimate the speed of moving vehicles, the algorithm adopts a two-line approach. For each lane on the road, two parallel virtual lines are manually positioned so that the distance between them is known. This distance is inferred according to the Manual on Uniform Traffic Control Devices [10] published by the U.S. Federal Highway Administration. For each line, vehicle detection is performed independently, and the frame index in the video at which a vehicle hits the virtual line is recorded. This process yields two independent records for the two parallel virtual lines. The difference between the frame indices of the two virtual lines then gives the time difference, using the video frame rate, which is usually fixed (e.g., 30 frames per second). Given both the actual distance and the time difference, the speed of the moving vehicle is calculated. The overall speed is obtained by averaging all detected speeds within a statistics window over all lanes. Figure 6 illustrates the two-line approach.
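Given the frame indices at which the same vehicle crosses the two lines, the speed computation reduces to distance over elapsed time. A sketch, with the distance in feet and the result in MPH to match the units reported in Section 4 (the helper name and example values are ours):

def estimate_speed_mph(hit_frame_a, hit_frame_b, line_distance_ft, fps=30.0):
    # Elapsed time between the two virtual-line crossings, derived
    # from frame indices and the (fixed) video frame rate.
    dt = abs(hit_frame_b - hit_frame_a) / fps   # seconds
    ft_per_s_to_mph = 3600.0 / 5280.0           # ft/s -> miles/hour
    return (line_distance_ft / dt) * ft_per_s_to_mph

# e.g., lines 20 ft apart, crossed 9 frames apart at 30 fps:
# estimate_speed_mph(100, 109, 20.0) ~= 45.45 MPH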
When handling multiple channels, a thread is created for each input video channel in order to utilize parallel CPU execution. Currently, Viewmont has four physical connections for up to four concurrent input video streams.
4 Experimental Results
In the experiments, we extract traffic flow data from real-time traffic monitoring videos captured along California freeways. To evaluate our approach, the extracted results are compared with those from the actual loop detector sensors installed on the freeways, provided by LA-Metro. The experiments include 16 test cases at different locations where loop detectors are positioned close to traffic monitoring cameras. One difficulty in testing was the fact that the current traffic monitoring cameras were installed not for video analysis but for human monitoring purposes. Thus, most cameras do not have angles appropriate for video analysis. Moreover, the locations of cameras and loop detectors can differ considerably, so it is not easy to directly compare traffic flows at different locations. We carefully selected 16 locations where the camera and the loop detector are close to each other to make the comparison meaningful. Furthermore, these cases include a variety of situations and environments (i.e., cloudy, rainy, or sunny weather; sparse and heavy traffic) to evaluate our algorithm. The test videos were recorded during daytime. One example test case is illustrated in Figure 7.
Fig. 7. Example of a good test case: (a) camera (blue dot) and loop detector sensor (red dot) are located very close; (b) good camera angle and low traffic condition
A positive error rate means that the count value is greater than the ground truth value. For the average speed output, there is no ground truth value; thus, the difference between the output of our approach and that of the loop detector sensor is reported. A smaller difference indicates that our approach can produce output comparable to the sensors.
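Although the formula is not stated explicitly, a signed relative error of this kind is presumably computed as

    error rate (%) = (extracted count - ground-truth count) / ground-truth count x 100

so, for example, counting 103 vehicles against a ground truth of 100 gives an error rate of +3%.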
Currently, the vision algorithm in our approach works well when the camera angle is good and there is no shadow. Under these two conditions, when traffic is sparse, the average counting errors of our approach and the loop detector sensors are 3.04% and 6.44%, respectively, as shown in Figure 8. When traffic is heavy, however, the average counting errors are 14.69% and 8.06%, respectively. This shows that the results of our approach are good when traffic is sparse and get worse when traffic is congested. Overlapping vehicles and their connected shadows due to back-to-back traffic can cause larger errors in vision-based approaches. The speed results are reasonable because they are very similar to the sensor results. Under sparse traffic conditions, the speed difference is 5.59 MPH, while it is 4.82 MPH when traffic is heavy, as shown in Figure 9.
Fig. 8. Averages of counting error rates (%) of the proposed approach and loop detector sensors under sparse and dense traffic
Fig. 9. Averages of speed difference between our results and sensor outputs
Table 3. Time cost to process a frame at each channel using CPU alone
References
1. Cheung, S.-C., Kamath, C.: Robust Techniques for Background Subtraction in Urban Traffic Video. In: Proc. SPIE, vol. 5308, pp. 881–892 (2004)
2. Harvey, B.A., Champion, G.H., Deaver, R.: Accuracy of traffic monitoring equipment
field tests. In: IEEE Vehicle Navigation and Information Systems Conference (1993)
3. Fishbain, B., Ideses, I., Mahalel, D., Yaroslavsky, L.: Real-Time Vision-Based Traffic
Flow Measurements and Incident Detection. In: Real-Time Image and Video Processing
(2009)
4. Federal Highway Administration, Traffic Detector Handbook, 3rd edn., vol. I (2006)
5. Sanders-Reed, J.N.: Multi-target, Multi-Sensor Closed Loop Tracking. In: Proc. SPIE,
vol. 5430 (2004)
6. Hsu, Y.C., Jenq, N.H.: Multiple-Target Tracking for Crossroad Traffic Utilizing Modified
Probabilistic Data Association. In: Acoustics, Speech and Signal Processing (2007)
7. Gartner, N.H., Rathi, A.J., Messer, C.J. (eds.): Revised Monograph on Traffic Flow
Theory: A State-of-the-Art Report. Special Report by the Transportation Research Board
of the National Research Council (2005)
8. Vandervalk-Ostrander, A.: AASHTO Guidelines for Traffic Data Programs, 2nd edn.
(2009)
9. Wang, G., Xiao, D., Gu, J.: A Robust Traffic State Parameters Extract Approach Based on Video for Traffic Surveillance. In: IEEE International Conference on Automation and Logistics (2008)
10. U.S. Federal Highway Administration: Manual on Uniform Traffic Control Devices, http://mutcd.fhwa.dot.gov/htm/2003r1r2/part3/part3a.htm#section3A05
11. Lin, C.-P., Tai, J.-C., Song, K.-T.: Traffic Monitoring Based on Real-time Image Tracking.
In: IEEE International Conference on Robotics and Automation (2003)
12. Batista, J., Peixoto, P., Fernandes, C., Ribeiro, M.: A Dual-Stage Robust Vehicle Detection and Tracking for Real-time Traffic Monitoring. In: IEEE Intelligent Transportation Systems Conference (2006)