Visual Algorithms For Autonomous Navigation: April 1985
Visual Algorithms For Autonomous Navigation: April 1985
net/publication/3979221
CITATIONS READS
21 228
4 authors, including:
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Larry S. Davis on 08 February 2014.
Fred P. Andresen
Larry S. Davis
Roger D. Eastman
Subbarao Kambhampati
856
CH2152-7/85/0000/0856$01.OO 0 1985 IEEE
through a generally unfamiliar environment with the aid 2. LANDMARK BASED VEHICLE POSI-
of a low resolution map (of thesortthatonemight TIONING
obtainuponentrance to a nationalpark).Atthe so-
An autonomous vehicle must have the capability of
called long range, one would first generate a plan for the
computing its current position for a variety of reasons -
day’s outing,identifyingthestartinglocation,the goal,
e.g., to be abletoanticipate, on the basis of available
anda low resolution path formovingfrom the start to
cartographicdata,importnnteventsthatare likely to
the goal. This path would be chosen based on gross con- occur (road intersections, bends, changes in terrain, etc.)
siderations of the terrain to be crossed and the capabili- and to use these expectations to guide its sensory process-
ties of the “vehicle”.Fromtimetotime,duringthe ing. The position of the vehicle can be determined using
course of the outing, one may want to establish his posi- many different approaches.The vehicle will have an
tionwithrespecttothe longra.nge plan. This couldbe
onboardinertialnavigationsystem.Althoughsuch sys-
accomplished by visually identifying landmarks of known
tems are very accurate, they do suffer from some degree
location,andthentriangulating to determinecurrent of drift which can build up to substantial errors over long
position. Section 2 of this paper describes a vision system
distances. The vehicle might also have access to a satel-
for position determination that we have developed as part
litepositioningsystemsuch as theGlobalPositioning
of thisproject.Amoredetaileddescription is available
System. While such a system does not suffer from drift,
in Andresen and Davis[l].
its accuracy is not as high as an inertial system; further-
At the intermediate range, one would look ahead to more, since such systems depend on components that are
determine generallysafedirections of travel. This would external to the vehicle, there is no guaranteethatthey
involveassessing thenature of theterrainin one’s will be available when needed.
immediate visual environment and identifying those por- It is also possible tocomputethe position of the
tions of that environment through which it is feasible to
vehiclebyvisually identifyingobjects of knownscale,
move. We refer to those navigable
portions of the
location and appearance and then simply trizngdating to
environment as corridors o f f r e e space. In the context of determine the vehicle’s
position. Qrdinarily,one has
thecurrentset of projectmilestones,corridors of free
some initial estimate of the vehicle position (perhaps from
spacenaturallycorrespond to segments of road in the
the inertial system) and one wishes to improve the accu-
vehicle’s field of view. Wehave developedasystem of racy of that estimate using visual landmark recognition.
algorithmsthatcan, in many cases, detectand follow
roads in sequences of imagery.Theyare basedprinci- A collection of algorithms for such a system has been
pally on an analysis of dominant linear features extracted designed and partially implemented in a research environ-
from the individual frames in the image sequence. These ment.Thesystem uses the knowledge of the vehicle’s
algorithms are currently being evaluated on a large data- approximate position to visually locate known landmarks.
base of images acquired at Martin Marrietta’s test site in It then triangulates using the bearings of the known land-
Denver. The details of the algorithm design are included marks to acquire a new position with a reduceduncer-
in Waxman et. a@. tainty.
Finally,shortrangenavigation is the process that, Weassumethe vehicle’scamera is mounted on a
based on a detailed topographic analysis of one’s immedi- computercontrolledpanandtiltmechanismand has a
ateenvironment,enables us to determinesafepositions computeradjustable focallength.Wealsoassume esti-
for footplacement,and to navigatearoundobstacles in mates are available for the heading of the vehicle, as well
thecurrent corridor of freespace.Humannavigators as the current settings of the pan, tilt, and local length of
probably use stereo and motion vision to derive the three the camera. A database of landmarks exists that includes
dimensional information needed to solve these problems. all pertinent landmark qualities, such as size and position,
The vehicle,which is a wheeledvehicle as opposed to a and at leastonerepresentation of eachlandmarkfrom
legged vehicle, will be equipped with a laser range sensor, which it could be recognized in an image.
currently
being
constructed at theEnvironmental The system is composed of three modules, called the
Research Institute of Michigan(Zuk and Dell’eva[3]), MATCHER,theFINDER,andtheSELECTOR,that
which is capable of acquiringtwo 100 x 100 arrays of interacttoestablishthe vehicle’sposition witha new
range data per second. This range data can, for example, level of uncertainty.
be converted into a simple array in which regular patches
of terrainare classified as either“navigable” or “non- I) The MATCHER locates likely positions for one or
navigable” andthenappropriatepathplanning algo- morelandmarks in animage,andratesthesepositions
rithms can determine what we refer to as a track of safe according to some measure of confidence.
passage andgeneratethecorrespondingmotioncontrol 2) The FINDER controls the pointing direction and
algorithms to the pilot of the vehicle to negotiate that focal lengih of the camera to acquire specified images for
track. In Section 3 of this paper we describe a quadtree a set of landmarksanddirectstheMATCHERto find
based path planning algorithm whichcould serve as the possiblepositionsfortheselandmarksinthe images. It
basis for identifying such hacks of safe passage. A more theneliminatespossiblepositionsforindividualland-
detailed description with extensive examples is presented markswhicharenotconsistentwiththe possible posi-
in Kambhampati and DavisIl]. tionsfoundfor otherlandmarks.TheFINDERthen
857
evaluates the remainingpossibleposit,ions to determine make best use of the most informative points. The mew
the actual positions of the given landmarks.
3 ) The SELECTOR identifies a set of landmarks
ure used is ~ ''
PI li2
le where P[ G It is the probability that,
whose recognition in images of appropriate angular reso- gradient direction 6 occurs in the template and k[ G 1; is
lution would improve the position estimate of the vehicle theprobabilitythatgradientdirection G occursin the
by the desired amount. It thendirectstbeFINDER to image. The actual probabilities are extracted from histo-
establish likely positions in such images for subsets of gramsofthetemplateandthe image.Basedon this
thoselandmarks.Withthese positions, the SELEC'H'OR measure, only the mostinformative 15 percent of the
then computes new estimates of the vehicle position and edge points in the image are used in the matching. Ccnse-
position uncertaintyanddirectstheFINDER, if neces- quently, boundary points in bhe template whose gradient
sary, to locate additional subsets of landmarks. directions are not selected will not be used.
Section 2.1 - 2.3 describe the MATCHER, FINDER, It canbeseenthatgradientdirectionsthat occur
and SELECTOR, repectively. often in the template but infrequently in the image would
rate veryhigh on this scale. Also, gradientdirections
with few occurencesin thetemplatehutmany in the
A generalized Hougb transform (51 is employed to imagewould rate very low. Pointswith such gradient
locatelandmarks of knownimageorientationand scale. directions would yield a bigh number of unrelated votes,
The landmarksarerepresented by lists of boundary cluttering the Hough space and creating false peaks.
points; these points are then individually matched to edge
points in the image. The algorithm consists of three
mainphases:edgepoint detection, matching of the text- Thissectiondescribe a strategy (the FINDER) for
piate to the edge points,andinterpretingtheresults of determiningbearings to B given set oi landmarks. The
thematching. Edges aredetected as points where the FINDER is also given specifications for images in which it
Laplacimchanges sign and the iocak greylevels have a canexpect to find theselandmarks. It thencontrols the
high symmetric diberence. Matching is doneusing the camera to obtain these images and uses the hMTCHER
generalizedHough transform and is restrictedin two to establish likely positionsfor thelandmarks intheir
ways. First,templatepointsmatch only pointshaving respective images. Since 'she search for any specific land-
close tothesamegradientdirection.Second, only those mark mayresult in severat
possible
image
positions
template points are used whose gradient directions have a (which we will refer to from now on as ''peaks''] for that
high measure of informativeness;thismeasure is dis- landmark (at most one of which can be correct), a simple
cussed in more detail below. geometricconstraintpropagationalgorithm is employed
to eliminat,e many of the false peaks.
It can oft,en occur that one or several gradient direc-
tionsare so prevalent in the image thattheyproduce The geometric constraint propagation algorithm eon-
strongvotingclusters in Houglir, space at, incorrect loca- siderspossible peaks for a pair of landmarks and deter-
tions. If agradientdirectionoccurs xt N edgepointsin mines if they could bothbethecorrectpeaks fortheir
the imageand at M templateboundarypoints,then respectivelandmarks. Two possible peaks are called
M x N increments are made for that gradient direction in consistent if they meet this criterion. The details of this
the Hough space.Sinceusually only a smallfraction of consistency computationare described below. With con-
the edge points in the image arepart of the object's sistency determined for all pairs ob peaks, a graph is then
boundary,theremainder of thematchescanpotentially constructed in which nodes are ?eaks, and arcs represent
contribute to fake peaks in the Hough space. themutual consistencybetween twopeaks.Analysis of
thisgraphcandetermineconsistencyamonggroups of
Also, if a gradient direction is prevalent in the tem-
more than two peaks and therefore eliminate peaks based
plate, then, as agroup,pointswith that gradient direc-
on more than just pairwise inconsistency.
t,ion will contribute more correct votes than would points
withaninfrequentgradientdirection.This is because T o determine consistency between two peaks p , and
there will bemoreboundarypointswiththeprevalent p for landmarks L and L 2, we first calculate a range of
gradient. direction incrementing potential reference points. possible angular differences between L and I,, based on
Therefore, when the reference pointhappens t o be?,he the vehicie's positional nnccrtainty. We then extend this
correct
location
for the
object,
more of t h ev o t e range by the pointing error and check thst the measured
contributing to its peak will comefrom points with the angular differencebetween p 1 and p z fallswithinthis
prevalentgradientdirectionthanfrompointswithan range.
infrequent gradient direction. The angular difference between L and E is deter-
To usetheseobservations to best advantage, we minedbysimply taking the difference of their bearings.
developed a measure of gradient direction informativeness The range of angular differences is then obtained by let-
(CDI) to rate the gradient directions. Only those points ting the value for the current position vary according to
whose gradientdirections rate highly are used in the the position uncertainty of the vehicle. For the purposes
matching.Inthisway, we caneliminate the uninforma- of this analysis, we assume that the position uncertainty
tive sources of spurious patterns in the Ilough space and canberepresented by a soliddisc onthe local ground
858
plane. If we makethe reasonableassumption that nei- Givenadatabase of visual landmarks, a variety ob
ther landmark lies inside the disc, then it is easy to show strategies can beemployedto select a subset of those
that
the positions
which
give the maximum and landmarks for identification. The implementation of any
minimumangular differences will always lie on the cir- of these strategies requires the abilities t o determine both
cumference of the disc. From these positions we can then the ease of identification oi any given landmark and the
calculatedirectlythemaximumandminimumangular eifect of its identification on the vehicle’s position uncer-
difference between L and L 2. An abbreviated derivation tainty. Here, we consider only the latter; see [I]for a dis-
of an analytic solution for thesepointscanbefound in cussion of the former.
111. Given a pair of bearings ( B 1 , B 2for ) two landmarks
As mentioned above, the consistency
graph with known positions (z,,yl) and (z2,y2),we can find the
represents consistency relations between the possible loca- actual vehiclelocationbyintersecting the lines passing
tions for different landmarks. Ideally, we would want to through (zl,yl) with angle B , and (z2,y2)with angle B,.
determinethemaximalcompletesubgraphs (MCS’s) of See Figure la. If the bearing Bi to landmark 15,. is only
this graph because they would represent the largest sets known to within f B i , thenthe possible lines passing
of landmarklocations thatare allmutuallyconsistent. through ( z i ,yi 1 would sweep out a wedge Wi of angular
For small graphs this is practical, but for large graphs we width 26, on the groundplane.SeeFigure l b . Sincefor
mightbeforced,duetotimeconstraints,toperform a each landmark, Li , found the vehicle is constrained to lie
simpler analysis. in the planar wedge Wi, then the vehicle must lie in the
Wecan,forexample,applycertainsimpleiterative convexpolygon formed by the intersection of these
tests to thegraphthat would eliminateanylandmark wedges.See Figure IC.
location not part of a t least a k-clique. In what follows, The size and shape of this convexpolygon is deter-
we identifytwosimpletestsforeliminatingnodesnot minedby the width of eachwedge at theirintersection
part of k-cliques. These processes are similar t o so-called and the angles at which they intersect. The width Vi of
“discreterelaxation”algorithms - see, e.g., Naralick and wedge
a iq at a distance di from Ei is givenby
Shapiro [SI. Vi = 2.d; .tanOi, where is theuncertainty of the
First, wc can iteratively eliminate all nodes which do landmarkbearing.Therefore, the effect of finding Li’s
not have arcs to nodes representing at least k other dis- bearing on the vehicle location uncertainty is proportional
tinctlandmarks.Afterthis process is complete, we can to the angular uncertainty 6; of the bearing and the dis-
then eliminate all nodes which are not the center of what tancefrom Li totheactual vehiclelocation.Since the
we refer to as a k-fan. A node n is the center of a k-fan actual vehiclelocation is notknown at thispoint, we
if there exists a connected chain of nodes o? distinct land- approximate it by the assumed current position.
marks of length k-1 in which each element of the chain is To express one
in parameter the
uncertainty
connected to PI. Finally, we find all MCSs for this pruned represented by an arbitrary convexpolygon, wefind the
graph. twoverticeswhicharefurthestapart. Half of the dis-
tance between these two vertices is a reasonable approxi-
Sincewecouldendupwithseveral MCSs, we now
mation of the “radius” of this polygon.
need a way to determine which is the actual set of land-
mark locations. To do this, we define an evaluation func- 3. QUADTREE PATH PLANNING
tion to operate on the MCSs andthen pick the MCS
whichrespondsbest to the evaluationfunction. In our We are developing a system of algorithms for mobile
currentsystem, we
use a simplesummation of the robot path planning based on a multiresolution represen-
confidences for each of the possible peaks. tation of therobot’simmediateenvironment.The mul-
tiresolution
representation used is quadtree
the
(Samet[7]).Figure 2 illustratesthequadtreereprcsenta-
tion
for a simple binary
array
whereblack
points
2.3. The SELECTOR represent obstacle points and white points represent lree
This sectiondescribes a strategy(theSELECTOR) space. The quadtree is a recursive decomposition of that
for selectinga set of landmarks whoseidentification in arrayintouniformly colored (i.e., either black or white)
appropriate imageswouldimprove thecurrentestimate 2’ x2’ blocks. Thus, il there are large areas of free space
of the vehicle’sposition. The SELECTOR supplies sub- (or of obstacles) then those areas can be represented by a
sets of these landmarks, with
appropriate image few large blocks in the quadtree and can be dealt with as
specifications, totheFINDER which returnsthemost units by the planning algorithms.
likely relative positions for each landmark in each subset. The quadtree representation thus offers a comprom-
The SELECTOR then computes the Tlehicle’s actual loca- ise between a simplehomogeneousarrzyrepresentation
tionand the new uncertaintyassociatedwithit. If this (which is straightforward to construct but then computa-
new uncertainty is insufficient, then the SELEC‘1C.R can tionally
costly to
analyze)and
free
a space region
eithersimplyaccept the new uncertainty as the best representation (e.g., Brooks[8])which is morecostly to
achievable result, or try to further improve the position construct,but on the otherhandmoreeficient to
estimate using other landmarks. analyze.
859
Before discussing the planning alprithm, we should the optimal path through these blocks, or we can simply
point outthatthereare several important differences connect the center points of consecutive blocks 011 the list
between pathplanningrequirementsfora mobile robot to compute a path.
and for the more familiar manipulators (see also the dis- Figure 3 contains a simple example. Figure 3a is a
cussion in Thorpe[S]). For example: binaryarraywithstartand goal pointsmarked, along
1) A mobile robot may have only an incomplete model with anindication of thepathdetermined by the a l p
of itsenvironment,perhaps because itconstructsthis rithm.Figure 3b containsthetreedatastructurethat
modelusing vision and thus cannot determine what lies represents the quadtree, with the blocks on the computed
in the shadow of an object. path marked with P’s.
2) A mobile robot will ordinarily only negotiate m y
given pathonce (as opposed t o amanipulator which An important extension of this simple path planning
might perform the same Epecific task thousands of times). algorithm involves the ability to deal with grey nodes in
Therefore,it is more importantto develop a negotiable the quadtree (nonterminal nodes which always have both
path quickly thanit is to develop an“optimal”path, black andwhitedescendants). Dealing with grey nodes
which is usually a costly operation. can greatlyreducethenumber of blocks that the plan-
3) A mobile robot will be moving according t o a previ-
ning algorithm needs to consider in building an ioitial
ously computed path while it is computing an extension estimate of a path. Such an algorithm is presented in [4].
or modification to that path - Le., pathplanning for a
4. CONCLUSIONS
mobile robot is a continuous, online process rather than a
single, offline process. Theframework for visualnavigation that we have
presented in this. paper must itself beincorporatedidto
The algorithm that follows only addresses the second an even more comprehensive navigationframeworkthat
of the above threepoints.Apathgeneratedfrom a considers analyses of many other sources of information -
quadtree is a sequcnce of blocks through which it is possi- e.g., other sensors, maps, reconnaissance data, etc. Plan-
ble for therobotto move. Thedetailed motion within ning and executing navigationtasksatthis level will
any single block is not determined at this level; a default requirevery complex models for data fusion,resource
assumption of straight line motion through the block is allocation and problem representationand solving.
assumed. Although this will not ordinarily be an optimal important goal of theautonomousnavigationproject is
path, it will be a negotiable path. to structure this framework in such a way that it is possi-
Given the quadtree representation in which blocks of ble for a broad spectrum of research groups to contribute
0’s represent free space and blocks of 1’s represent obsta- to,andexperimentwith,the vehicle (or an appropriate
cles, we Erst compute the distance transjorm of the set of simulation of the vehicle).Achieving this goal would,
0’s. Thisdetermines, for each block of free space,the hopefully, lead toabetterunderstanding of notjust
minimal distance between thecenter of that block and vision as an isolated activity, but of vision as part of a
the boundary of a block of obstacles. Samet[7] describes more comprehensive intelligent activity.
an algorithm for computing the distance transform for a
quadtree. ACKNOWLEDGEMENTS: The
framework for
visual
The path planning algorithm itself is a simple A * navigation described in Section 1 of &hi3paper evolved
search algorithm with the evaluation function, f, defined duringmanymonths of discussion in theComputer
as follows: Vision Laboratory. Allen Waxman and Jacqneline
f = g + h LeMoigne made significant contributions to that work.
where g is the distance of the current node in the search
from the start node, and h is the heuristic estimate of the REFERENCES
goodness of theremainder of thepathpassingthrough
that node. The heuristic h is the difference of two com- 1. Andresen, F. and Davis, L., Vehicle positioning using landmark
ponents, k, and h d , where hd is thedistance of the identification, Wniversity of Maryland Center for Automation
Research Technical Report, in preparation.
nearestobstaclefromthecurrent node (determinedby
the distance transform algorithm) and hG is the straight 2. Waxman, A,, LeMoigne, J., andSrinivasnn, B., Visual navigation
line distance between the current node and the goal. of roadways, this proceedings.
In developing thesearch, only thehorizontaland
neighbors of any block are considered for 3. Zuk, D. and Dell’eva, M., Three-dimensional vision system for the
adaptive suspension vehicle, Environmental Research Institute Of
extensions of paths. Diagonalneighbors,which share Mchigan Technical Report, January, 1983.
only single points with the currer2 node, would result in
inflexible paths which clip comers. 4. Kambhampati, S. and Davis, L., Path planning in quadlrees,
University of Maryland Center for Automation Researeb Technical
Theresult of applyingthe A * algorithm to the Report, in preparation,
quadtree is a list of nodes from the quadtree (ordinarily
of varying Sizes) which define a set of paths between the 5. Ballard, D.,Generalizing the Hough transform to detect arbitrary
startnodeandthe goal. Wecan, if desired, determine shapes, Pattern Recognition, 19, 111-122, 1981.
860
6. Haralick, R. andShapiro, L., The consistent labe!ing problem,
IEEE Transactions on Pattern Analysie and Machine Intelligence, 1,
173-183,1979.
Figure 3 .
Flgure l a . Triangulation
GI
B.
Li
37383940 57585960
(d) p u a d t r e s r e p r e s e n t a t i o n a t theblocks .
in ( c )
Figure 2.
86 I