Optimizing The Embedded Platform Using OpenCV


February 17, 2012


Matt Weber
(mlweber1@rockwellcollins.com)

• Goals
• Project Approach & Results
• Future Ideas
• References

Goals
• To quantify the effects of the many optimizations available and see what effect, if any, power management has
• Most Important Requirements (MIRs)
  - Minimal startup and low-latency processing time
  - On-demand Power Management
• Background
  - Utilized an OMAP3 processor for image processing
  - Linux 2.6.39.4 kernel with OMAP PM patches
  - Buildroot w/ Crosstool-ng toolchain

Project Approach
• Cost/Benefit
  - Compiler, Co-Processor, Power Management, Specialized Cores
  - Supporting software (which kernel, packages, vendor libraries, etc.)
• Define benchmarking tool
• Gather metrics for optimization methods applied to
  - Platform (Kernel/rootfs)
  - Application
  - With power management active

Project Approach: Compiler/Toolchain
Know your toolchain!
• Gotchas
  - Are binary compatibility & architecture (armv5, v6, v7a, ...) masking a problem?
  - Are your Platform & App using the same toolchain?
  - Are features like VFP (Vector Floating Point) & the Advanced SIMD extension (aka NEON) enabled?
• Building your own has some additional benefits
  - Source control & ability to recreate/fix issues
  - Geared towards your CPU arch & hardware FPU
  - Could tailor kernel headers to get a newer feature
  - Possibly incorporate the latest Linaro GCC
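
One way to answer the "are VFP and NEON enabled?" gotcha is to ask the compiler itself. The sketch below is not from the original slides; it assumes a GCC-style ARM toolchain, where the predefined macros `__ARM_NEON` / `__ARM_NEON__`, `__SOFTFP__` and `__ARM_PCS_VFP` reflect the `-mfpu` and `-mfloat-abi` settings. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* 1 if the compiler was told to target NEON (Advanced SIMD), else 0. */
int built_with_neon(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if this is an ARM build that may use hardware floating point
 * (i.e. not -mfloat-abi=soft), else 0. */
int built_with_hw_float(void)
{
#if defined(__arm__) && !defined(__SOFTFP__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if the hard-float calling convention (-mfloat-abi=hard) is in use. */
int built_with_hard_float_abi(void)
{
#if defined(__ARM_PCS_VFP)
    return 1;
#else
    return 0;
#endif
}

/* Write a one-line summary of the build configuration into buf. */
void describe_build(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "neon=%d hw_float=%d hard_float_abi=%d",
             built_with_neon(), built_with_hw_float(),
             built_with_hard_float_abi());
}
```

Compiling this once with the platform flags and once with the application flags, then comparing the two summaries, is a cheap check that both were built with matching settings.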

Project Approach: Benchmarking Tool
• OpenCV 2.1
  - cvMatchTemplate() algorithm as the test case
      cvMatchTemplate( img, tpl, res, CV_TM_CCORR_NORMED );
  - Lots of matrix math
  - Each time measurement covers only the algorithm execution, not the image load time
  - A 5.5MB image is searched for the image of a small boat
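
The timing approach described above can be sketched as follows. This is an illustration, not the author's benchmark source: it assumes the tool brackets the algorithm with the standard `clock()` call (suggested by the "Clockspersec" line in its output later in the deck), and `dummy_workload` is a hypothetical stand-in for the `cvMatchTemplate()` call so the example has no OpenCV dependency.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Convert two clock() readings into elapsed CPU seconds. */
double elapsed_seconds(clock_t t1, clock_t t2)
{
    assert(t2 >= t1);
    return (double)(t2 - t1) / CLOCKS_PER_SEC;
}

/* Time only the call to work(); setup such as image loading is expected
 * to happen before this function is called. */
double time_workload(void (*work)(void *), void *arg)
{
    clock_t t1 = clock();
    work(arg);
    clock_t t2 = clock();

    double secs = elapsed_seconds(t1, t2);
    printf("%f seconds of processing\n", secs);
    printf("t1: %ld t2: %ld\n", (long)t1, (long)t2);
    printf("Clockspersec: %ld\n", (long)CLOCKS_PER_SEC);
    return secs;
}

/* Hypothetical stand-in for the real cvMatchTemplate() call. */
void dummy_workload(void *arg)
{
    volatile unsigned long long *sum = arg;
    for (unsigned long long i = 1; i <= 1000000ULL; i++)
        *sum += i;
}
```

Note that `clock()` reports processor time used by the program rather than wall-clock time, which matters when comparing runs where the CPU frequency changes underneath the process.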

Project Approach: Metrics Test 1
• Test: Compiler Optimization
• Description: Kernel and rootfs are built with the same flags below and executed off an SD card.
• Flags:
    CFLAGS += -pipe -O3
• Result: ~19.35 sec @ 800 MHz

Project Approach: Metrics Test 2
• Test: Compiler Optimization & use of hardware co-processors
• Description: Kernel and rootfs are built with the same flags below and executed off an SD card.
• Flags:
    CFLAGS += -pipe -O3 -mfpu=neon -ftree-vectorize -mfloat-abi=softfp
• Result: ~4.91 sec @ 800 MHz
  ~75% reduction in processing time versus Test 1 (roughly a 3.9x speedup)
[Chart: processing time in seconds (axis 0 to 25) for O3 and O3 w/Neon]
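
To make the effect of `-ftree-vectorize` with `-mfpu=neon` more concrete, here is the kind of loop the auto-vectorizer targets. It is a sketch only, not OpenCV's implementation: it accumulates the three sums that a `CV_TM_CCORR_NORMED`-style score needs at a single template position (the score is the cross term divided by the square root of the product of the two energy terms). The struct and function names are hypothetical. GCC's documentation notes that floating-point auto-vectorization for NEON is not generated unless `-funsafe-math-optimizations` is also given, so it is worth checking what your compiler version actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Running sums for one normalized cross-correlation score. */
struct ccorr_sums {
    float cross;       /* sum of img[i] * tpl[i] */
    float img_energy;  /* sum of img[i] * img[i] */
    float tpl_energy;  /* sum of tpl[i] * tpl[i] */
};

/* A simple, dependency-free reduction loop over contiguous floats:
 * the shape of code that a vectorizing compiler can map onto SIMD lanes. */
struct ccorr_sums ccorr_accumulate(const float *img, const float *tpl, size_t n)
{
    struct ccorr_sums s = {0.0f, 0.0f, 0.0f};

    assert(img != NULL && tpl != NULL);
    for (size_t i = 0; i < n; i++) {
        s.cross      += img[i] * tpl[i];
        s.img_energy += img[i] * img[i];
        s.tpl_energy += tpl[i] * tpl[i];
    }
    return s;
}
```

A template match repeats this accumulation for every candidate position in the image, which is why the "lots of matrix math" workload responds so strongly to SIMD.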

Project Approach: Metrics Test 3
• Test: Compiler Optimization & Power Management
• Description: Kernel and rootfs are built with the same flags below. Power management is enabled to idle and frequency-scale the CPU on demand between 300 and 800 MHz. It uses the default scaling trigger threshold for the 2.6.39.4 kernel. (Note: Purely ARM core instructions.)
• Flags:
    -pipe -O3
• Result: ~19.39 sec @ 300-800 MHz
  ~40 msec (~0.2%) increase in processing time w/ PM
• Comment: Running solely ARM instructions makes the scheduler demand a higher clock speed earlier, so the additional processing time required is small.
[Chart: Time (Seconds), axis 19.33 to 19.4, for O3 and O3 w/PM]

Project Approach: Metrics Test 4
• Test: Compiler Optimization, co-processors and Power Management
• Description: Kernel and rootfs are built with the same flags below. Power management is enabled to idle and frequency-scale the CPU on demand between 300 and 800 MHz. It uses the default scaling trigger threshold for the 2.6.39.4 kernel. (Note: ARM core and Neon instructions.)
• Flags:
    -pipe -O3 -mfpu=neon -ftree-vectorize -mfloat-abi=softfp
• Result: ~5.12 sec @ 300-800 MHz
  ~210 msec (4%) increase in processing time w/ PM
• Comment: Less time is spent executing ARM instructions because the Neon core offloads some of the processing. This causes more execution at 300 MHz and a slight increase in processing time.
[Chart: Time (Seconds), axis 4.8 to 5.15, for O3 w/Neon and O3 w/Neon & PM]

Project Approach: Future Tests
• Finish testing with DSP and TI Codec Engine
  - Initial tests with CMEM, LPM, DSPLINK, TI Codec Engine are working
  - Issues were found with the C6Accel used in SoC OpenCV DSP work (newer TI libraries, kernel and compiler issues...)
  - TI measurements with Integra SOC (floating point DSP) show an 86% speed up for the match template algorithm

Project Approach: Performance Metric Summary
The key to the next step is controlling offloading overhead

Test  Configuration              Result (sec)
#1    -O3                        19.35
#2    -O3 & Neon                 4.91
#3    -O3 w/ PM                  19.39
#4    -O3 & Neon w/PM            5.12
#5    -O3 & Neon w/PM & DSP      Est. ~3.07
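
The relationships between the rows of the summary can be checked with a few lines of arithmetic. The helpers below are hypothetical names introduced for illustration; the input numbers are the measured results from tests #1 to #4.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage change in processing time relative to a baseline.
 * Negative means the new configuration is faster. */
double percent_change(double baseline_sec, double new_sec)
{
    assert(baseline_sec > 0.0);
    return (new_sec - baseline_sec) / baseline_sec * 100.0;
}

/* How many times faster the new configuration is than the baseline. */
double speedup(double baseline_sec, double new_sec)
{
    assert(new_sec > 0.0);
    return baseline_sec / new_sec;
}

/* Print the comparisons discussed in the metric slides. */
void print_summary(void)
{
    printf("Neon vs -O3:        %.1f%% time, %.2fx speedup\n",
           percent_change(19.35, 4.91), speedup(19.35, 4.91));
    printf("PM overhead, -O3:   %.1f%%\n", percent_change(19.35, 19.39));
    printf("PM overhead, Neon:  %.1f%%\n", percent_change(4.91, 5.12));
}
```

With these inputs, NEON cuts processing time by about 75% (close to a 3.9x speedup), while power management costs about 0.2% without NEON and a little over 4% with it.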

Project Approach: Power Management Test
• Tools: bench power supply and data-logging multimeter
• Startup board (power supply is set to a 1 A limit at 5 V)
• First test is on-demand
    [root@buildroot ~]# echo "800000" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
    [root@buildroot ~]# echo "ondemand" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    cpufreq-omap: transition: 800000 --> 300000
    [root@buildroot ~]# ./opencv_templatematch
    WORKING>>>
    cpufreq-omap: transition: 300000 --> 800000
    5.120000 seconds of processing
    t1: 320000 t2: 5B00000
    Clockspersec: 1000000
    cpufreq-omap: transition: 800000 --> 300000
    [root@buildroot ~]#
• Second test is userspace set frequency
    [root@buildroot ~]# echo "userspace" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    [root@buildroot ~]# echo "800000" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
    cpufreq-omap: transition: 300000 --> 800000
    [root@buildroot ~]# ./opencv_templatematch
    WORKING>>>
    4.910000 seconds of processing
    t1: 110000 t2: 5020000
    Clockspersec: 1000000
    [root@buildroot ~]#
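
The two governor setups above can also be driven from a test program instead of the shell. The sketch below is an illustration under assumptions: the sysfs paths are the ones shown in the transcript, while `write_setting`, `select_ondemand` and `select_fixed_800mhz` are hypothetical helper names. Writing to these files normally requires root.

```c
#include <assert.h>
#include <stdio.h>

#define CPUFREQ_DIR "/sys/devices/system/cpu/cpu0/cpufreq/"

/* Write a string to a sysfs-style file. Returns 0 on success, -1 on failure. */
int write_setting(const char *path, const char *value)
{
    assert(path != NULL && value != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = (fputs(value, f) >= 0);
    if (fclose(f) != 0)   /* sysfs write errors can surface at close */
        ok = 0;
    return ok ? 0 : -1;
}

/* First test: cap at 800 MHz and let the ondemand governor scale. */
int select_ondemand(void)
{
    if (write_setting(CPUFREQ_DIR "scaling_max_freq", "800000") != 0)
        return -1;
    return write_setting(CPUFREQ_DIR "scaling_governor", "ondemand");
}

/* Second test: userspace governor pinned at 800 MHz. */
int select_fixed_800mhz(void)
{
    if (write_setting(CPUFREQ_DIR "scaling_governor", "userspace") != 0)
        return -1;
    return write_setting(CPUFREQ_DIR "scaling_setspeed", "800000");
}
```

Calling one of the two selectors before the timed run keeps the power configuration and the measurement in a single repeatable step.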

Project Approach: Initial Power Measurements
• Note: the DSP adds an additional ~375 mW (shown in yellow) and prevents the ARM from scaling up to 800 MHz. The chart shows only an estimate of DSP power draw [5] and an approximate timeline from TI whitepaper findings.
• If an OMAP GPU option was added, the approximate power draw would increase by ~90 mW. We're not sure yet how much overhead this would cause on the ARM...
[Chart: BeagleBoard-xM - OpenCV Template Match Power Draw. Power (watts, 0 to 3) versus Time (Seconds, 1 to 12) for ARM@800MHz, ARM@300-800MHz, and Est. ARM@300-800MHz & DSP, with the "Orig. Processing" interval marked]

Future Ideas
• Investigate the new issues of Power Management in a multi-core world
  - How could load statistics be maintained for dynamic power control across cores?
  - Maybe add hooks into the existing CPUFreq framework for on-demand based on anticipated completion from other cores?
  - What if Linux on the primary CPU(s) suspended while the offloaded task is being processed?

Future Ideas (continued)
• GSoC project: OpenCV DSP Acceleration (2010)
  - Investigate OpenCV code issues (lots of floating point and STL)
  - Gather power, timing and latency/IPC overhead numbers using the TI Codec Engine approach
  - Possibly implement a custom DSP approach based on results
• GPU
  - Investigate (future) SGX Graphics SDK with OpenCL support
  - Currently the only published vendors supporting OpenCL are ZiiLABS (ZMS SOC) and TI (OMAP5)

Project Information
• Hardware
  - BeagleBoard-xM
  - (optional) LI-5M03 camera
• Repository & Wiki
  Includes xloader, uboot, sdcard scripts, kernel & rootfs, test sequences
  git://github.com/matthew-l-weber/buildroot.git
  https://github.com/matthew-l-weber/buildroot/wiki
• Buildroot Overview
  http://free-electrons.com/pub/conferences/2011/elce/using-buildroot-real-project.pdf

Credits/References
[1] http://www.ti.com/lit/wp/spry175/spry175.pdf
[2] http://www.ti.com/lit/wp/spry144/spry144.pdf
[3] https://code.google.com/p/opencv-dsp-acceleration/wiki/GettingStarted1
[4] http://old.nabble.com/Request-for-comments-on-packages-for-TI%27s-OMAP3-and-DM365-processors-td29741226.html
[5] http://processors.wiki.ti.com/index.php/OMAP3530_Power_Estimation_Spreadsheet
[6] http://www.sakoman.com/OMAP/an-overiew-of-omap3-power-management-with-2639-pm.html
[7] http://www.ti.com/general/docs/wtbu/wtbugencontent.tsp?templateId=6123&navigationId=11988&contentId=4638
