Decision-Making For Automated Vehicles Using A Hierarchical Behavior-Based Arbitration Scheme


Piotr F. Orzechowski¹,², Christoph Burger², and Martin Lauer²
¹ Mobile Perception Systems, FZI Research Center for Information Technology, Karlsruhe, Germany, orzechowski@fzi.de
² Institute of Measurement and Control Systems, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany, {christoph.burger, martin.lauer}@kit.edu

arXiv:2003.01149v4 [cs.RO] 5 Feb 2021
DOI 10.1109/IV47402.2020.9304723 © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, collecting new collected works for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract: Behavior planning and decision-making are some of the biggest challenges for highly automated systems. A fully automated vehicle (AV) is faced with numerous tactical and strategical choices. Most state-of-the-art AV platforms implement tactical and strategical behavior generation using finite state machines. However, these usually result in poor explainability, maintainability and scalability. Research in robotics has produced many architectures to mitigate these problems, most interestingly behavior-based systems and hybrid derivatives.

Inspired by these approaches, we propose a hierarchical behavior-based architecture for tactical and strategical behavior generation in automated driving. It is a generalizing and scalable decision-making framework, utilizing modular behavior blocks to compose more complex behaviors in a bottom-up approach. The system is capable of combining a variety of scenario- and methodology-specific solutions, like POMDPs, RRT* or learning-based behavior, into one understandable and traceable architecture. We extend the hierarchical behavior-based arbitration concept to address scenarios where multiple behavior options are applicable, but have no clear priority among each other. Then, we formulate the behavior generation stack for automated driving in urban and highway environments, incorporating parking and emergency behaviors as well. Finally, we illustrate our design in an explanatory evaluation.

I. INTRODUCTION

Recent years have shown significant progress in the field of automated driving and advanced driver assistance systems. While considerable improvements have been achieved in perception due to advances in deep learning and other AI technologies, behavior planning and decision-making remains one of the biggest challenges for highly automated systems. In urban driving, traffic participants are faced with numerous tactical and strategical choices. Humans decide reactively in most of these situations, like stopping at a zebra crossing, choosing an appropriate gap when merging or yielding at intersections. Long-term decisions, like goal and route selection or the choice of driving style and behavior preferences, consider longer time horizons, though.

For some scenarios, considerable results in behavior and trajectory planning have already been achieved [1]–[4]. However, no generalizing and scalable decision-making framework has been found that is capable of combining a variety of such scenario- and methodology-specific approaches into one understandable and traceable architecture.

How and when should an AV switch from a regular ACC controller to a lane change, cooperative zip merge or parking planner? How can we support POMDPs, hybrid A* and any other planning method in our behavior generation?

Most state-of-the-art AVs that have at least proven successful in the DARPA Urban Challenge [5]–[7] or during test rides on public roads [8], [9] have used finite state machines (FSMs) for tactical and/or strategical behavior generation. FSMs are a useful tool for simple systems with a small number of behavior options and maneuvers where each state represents one maneuver or driving mode. In practice FSMs, even hierarchical FSMs, turn out to be unsuitable for more complex tasks due to their poor explainability (about the reason why a certain behavior is executed), maintainability (the effort to refine existing behavior) and scalability (the effort to achieve a high number of behaviors). These shortcomings motivate the search for other architectures that can be used for tactical and strategical behavior generation.

Decision-making is a well-known research field in robotics, also referred to as “robot control” or “action selection” [10]. Generally, the various approaches can be classified into knowledge- or behavior-based systems.

Knowledge-based systems, like FSMs, typically perform the action selection in a centralized, top-down manner using a knowledge database that contains a fused and abstracted representation of all available sensor data. As a result, the engineer designing the action selection module (in FSMs the state transitions) has to be aware of the conditions, effects and possible interactions of all behaviors at hand.

Behavior-based systems, on the other hand, decouple actions into atomic simple behavior blocks that should be aware of their conditions and effects themselves. These modular behavior blocks are then combined to more complex behaviors in a bottom-up approach. Many architectures for behavior coordination have been proposed. The most prominent are the subsumption architecture [11], voting systems [12] and activation networks [13].

In this publication, we propose a hybrid approach combining the best from both worlds: A hierarchical behavior-based architecture for tactical and strategical behavior generation in automated driving. We combine atomic behavior blocks to more complex behaviors using generic arbitrators. Arbitrators can again be combined with other arbitrators or behavior blocks to generate an even more complex system behavior.
We explain the promising concept in detail and show early simulation results. Our approach has been inspired by a similar, very successful, approach in robot soccer [14].

The main contributions of this publication are:
• an architectural design for AV behavior generation using a hierarchical behavior-based arbitration scheme, by
  – extending the existing arbitration approach,
  – developing a suitable maneuver representation,
  – defining a set of fundamental driving behaviors and
  – combining these to an overall system behavior using arbitrators.
• early experimental results in the CoInCar-Sim [15].

II. FUNDAMENTALS

A first concept of hierarchical behavior-based arbitration schemes for behavior generation has been presented in detail in [14]. This chapter highlights the main ideas.

The concept is based on simple modular behavior blocks and generic arbitrators.

A. Behavior blocks: How to do things

Behavior blocks are the fundamental building blocks of a behavior-based architecture. They describe how and when things can be done.

A behavior block provides three main functionalities:
  invocation condition: Indicates if this behavior is applicable in the current situation.
  commitment condition: Signals if a currently active behavior could be continued.
  command: Generates the actual behavior output that can be passed on to a subsequent execution pipeline or the actuators. This could be a trajectory, turn signal, gripping target, etc.

Only if either the invocation or commitment condition is true, the behavior can be selected and its command function can be called.

B. Arbitrators: Which thing to do

Arbitrators hierarchically combine behaviors to produce more complex behavior strategies. They decide which thing to do.

An arbitrator contains a list of behavior options to choose from. Each behavior option offers abstract information like the invocation and commitment condition, which the arbitrator uses to decide which option to execute.

Any problem-specific knowledge and environment interpretation is completely encapsulated inside the behavior block itself. As a result, arbitrators do not need any knowledge about the nature of their underlying behavior options, but choose behaviors based on abstract information only. This bottom-up design approach leads to strong functional and semantic decomposition.

Arbitrators can utilize various schemes to select between their behavior options. The following have been proposed: The highest priority first arbitrator organizes its behavior options in a list ordered by priority. An applicable option with the highest priority is chosen. The sequence arbitrator executes its options based on a fixed predefined order. A random arbitrator assigns probabilities to its behavior options and selects one among all applicable options randomly.

Additionally, a novel cost-based arbitration scheme that is necessary for, but not limited to, automated driving is introduced in section III-D.

Finally, to generate even more complex behavior, an arbitrator can also be a behavior option of a hierarchically higher arbitrator.

C. Design Process

We want to briefly highlight some valuable properties of the design process when using the hierarchical behavior-based arbitration scheme.

The first step is to define a minimal set of basic behavior blocks to tackle the given task. In order to do so, one should think in a bottom-up approach, meaning a functional rather than scenario perspective. In this sense, one behavior block can be applicable in multiple scenarios, while each scenario could also be tackled by various or sequential behavior blocks. It might also make sense to design multiple behavior blocks to achieve the same behavior with different approaches, e.g. two behavior blocks for follow ego lane: one behavior block using state lattices, the other optimization. More importantly, each behavior block can be developed independently by defining its invocation and commitment conditions and implementing the command function.

In the second step, these behavior blocks are combined with an arbitrator while arbitrators can be further stacked to a hierarchical graph. Here, scenario-specific knowledge can be used to find a good behavior selection strategy. Each arbitrator can also decide if an active behavior will be interrupted in favor of a better option, even if the commitment condition is true. Nevertheless, none of the behavior blocks has to be modified to be added into the arbitration graph.

If this initial graph turns out not to be sufficient, it can be easily extended by defining a new behavior block and adding it to one of the arbitrators. None of the existing behavior blocks has to be modified to achieve this. The graph can also be reordered or new arbitrators introduced seamlessly.

Finally, the decoupling of behavior blocks confines potential errors and enables proper unit testing. Additionally, the arbitrators are so simple that a formal verification should be possible. Both are important steps towards functional safety.

III. APPLICATION TO AUTOMATED DRIVING

This chapter describes the main contribution of this publication: how a hierarchical behavior-based arbitration scheme can be utilized for decision-making in automated driving.

In contrast to classical behavior-based systems each behavior block is not directly connected to the sensors and actuators. Instead, the input is an abstract environment model that contains a fused, tracked and filtered representation of the world. The behaviors’ output is also in a more generic form that can be passed to a trajectory planner or controller. In this sense, we follow the sense-plan-act paradigm in the overall software structure [10] but employ a behavior-based approach in the decision-making module.
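
The behavior block interface and the highest-priority-first arbitrator from Section II can be illustrated with a minimal Python sketch. All class and method names here (`Behavior`, `ToyBehavior`, `PriorityArbitrator`) are hypothetical and chosen for illustration only; the sketch assumes that an arbitrator exposes the same three functions as a behavior block, which is one simple way to let it act as an option of a higher arbitrator.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional


class Behavior(ABC):
    """Shared interface of behavior blocks and arbitrators (hypothetical)."""

    @abstractmethod
    def invocation_condition(self) -> bool:
        """True if this behavior is applicable in the current situation."""

    @abstractmethod
    def commitment_condition(self) -> bool:
        """True if a currently active behavior could be continued."""

    @abstractmethod
    def command(self) -> Any:
        """Generate the behavior output, e.g. a trajectory."""


class ToyBehavior(Behavior):
    """Toy behavior block whose conditions are set from outside."""

    def __init__(self, name: str, invocation: bool = False,
                 commitment: bool = False) -> None:
        self.name = name
        self.invocation = invocation
        self.commitment = commitment

    def invocation_condition(self) -> bool:
        return self.invocation

    def commitment_condition(self) -> bool:
        return self.commitment

    def command(self) -> str:
        return self.name  # stands in for a real maneuver command


class PriorityArbitrator(Behavior):
    """Chooses the first applicable option of a priority-ordered list."""

    def __init__(self, options: List[Behavior]) -> None:
        self.options = options
        self.active: Optional[Behavior] = None

    def _applicable(self, option: Behavior) -> bool:
        # An option may run if it can be invoked, or if it is already
        # active and its commitment condition still holds.
        if option is self.active and option.commitment_condition():
            return True
        return option.invocation_condition()

    def invocation_condition(self) -> bool:
        return any(o.invocation_condition() for o in self.options)

    def commitment_condition(self) -> bool:
        return self.active is not None and self.active.commitment_condition()

    def command(self) -> Any:
        for option in self.options:  # highest priority first
            if self._applicable(option):
                self.active = option
                return option.command()
        self.active = None
        raise RuntimeError("no applicable behavior option")


# Usage: a nested arbitrator is just another option of the root arbitrator.
parking = ToyBehavior("ParkNearGoal")
follow = ToyBehavior("FollowEgoLane", invocation=True)
safe_stop = ToyBehavior("SafeStop", invocation=True)
root = PriorityArbitrator([parking, PriorityArbitrator([follow]), safe_stop])

first_choice = root.command()    # the nested arbitrator is applicable
follow.invocation = False        # e.g. the ego vehicle has left the route
second_choice = root.command()   # falls through to the fallback option
print(first_choice, second_choice)
```

Because the arbitrator only queries the two conditions and the command of its options, adding a new option does not require changes to any existing block, which mirrors the design process described in Section II-C.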
Figure 1: Maneuver corridor for a lane change, right bound in green, left bound in red, reference line in blue. The planned trajectory as circles, one circle per time step.

A. Environment Model

The environment model in our implementation contains a lanelet map [16], planned route, ego motion state and detected objects with prediction. The map describes drivable areas, distinct lanes, parking lots, traffic rules, etc. The route is provided by a routing module. The ego motion state mainly depicts the current pose and velocity of the ego vehicle.

Currently, we assume that the objects are given with a decoupled prediction. A generic decision-making framework should support both open-loop and closed-loop prediction though. Therefore, integrated planning and prediction within the behavior blocks is also possible in our approach.

B. Maneuver Representation

As we aim for a generalizing approach that is applicable to various driving environments our maneuver representation should be as task-agnostic as possible. It should fit all relevant use cases and environments of automated driving, namely highway, rural, urban and parking. However, the proposed representation and interfaces would also work for other environments like off-road driving.

Our behavior blocks represent basic driving maneuvers such as “follow the ego lane”, “merge into traffic” or “park near goal”. In general, we can distinguish between maneuvers in a structured or unstructured environment. Urban and highway scenarios provide road boundaries or even distinct lanes, while parking lots and off-road areas feature open-space-like scenarios.

Therefore, we use a twofold maneuver representation: Driving commands in structured environments use a corridor-based maneuver representation. It consists of a maneuver corridor, reference line, predicted objects and the chosen maneuver variant. The corridor is usually generated from map data [16], but could also be provided online, e.g. from semantic segmentation [17]. The reference line is an approximation of the centerline and can serve as a rough positional reference. Additionally, velocity objectives are given along this line, e.g. derived from the speed limit and curvature. The object list contains all objects relevant for this maneuver, their predictions as well as virtual objects indicating stop positions. Finally, the maneuver variant defines the chosen homotopy class, as discussed in [18]. An example of a corridor-based driving command is shown in Fig. 1.

Driving commands in unstructured environments directly use a trajectory to represent the requested maneuver. We did not choose a more abstract representation in this case, in order to support a wide variety of use cases in such environments.

Depending on the command representation type, the system following the decision-making module runs different pipelines to execute these maneuvers. Corridor-based maneuvers are passed to a trajectory planner, e.g. [19] or [20], followed by an appropriate controller. While trajectory-based driving commands are directly handed over to a trajectory controller that is tuned for slow velocities and is capable of backward driving, as needed for maneuvers like parking.

C. Driving Maneuvers: How to drive

Following the behavior-based approach we begin with designing atomic behavior blocks for simple tasks, before stacking them together in section III-D. Here, we do not attempt to present a feature-complete list with all necessary behaviors. Instead, we focus on explaining the main design concept using some hand-picked example behaviors, that should compile a decent start to develop an AV. This stack can then be extended iteratively by more specialized behavior blocks addressing specific driving situations. Furthermore, a behavior block can compute its maneuver command with any preferred state-of-the-art method. For better clarity and conciseness, the behavior blocks used in our evaluation are explained in detail while remaining behaviors will only be described briefly.

An urban environment is probably the most challenging one for automated driving. We can think of at least three basic driving maneuvers needed in an urban setting:

FollowEgoLane: As long as the ego pose is within any urban lane of our route our vehicle could follow it in ACC. That is, without traffic, also the case for intersections, so we ignore these at this point. Later on, a special higher priority intersection behavior will take care of traffic rules and all the other challenges of intersections.
  invocation condition: True, as long as the ego pose matches a lanelet along our route.
  commitment condition: Same as the invocation condition, but as executing this behavior will keep the vehicle in its lane the commitment condition should always evaluate to true.
  command: A maneuver corridor is constructed from consecutive lanelets along our route, starting at the ego lanelet. In case a lane change is necessary to follow the route, the FollowEgoLane corridor will end at the last lanelet where such a lane change would be possible, as shown in Fig. 6. Leading vehicles along this corridor (also considering predictions) are flagged as ACC objects in the maneuver variant.
Figure 2: Full arbitration graph of the proposed minimal behavior set for automated driving. Basic behavior blocks are drawn
with round corners, arbitrators have sharp corners. The vertical ordering of behaviors depicts their priority or sequence in
case of priority or sequence arbitrators. Icons by Font Awesome – CC BY 4.0 License.
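
As a rough illustration of the FollowEgoLane block described above, the following Python sketch builds a corridor-based command from a simplified route. The data types `Lanelet` and `CorridorCommand`, the flag `last_lane_change_opportunity` and the way the ego lanelet is passed in are all assumptions made for this example; they are not the lanelet map interface of [16], and bounds, reference line and velocity objectives are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Lanelet:
    """Simplified stand-in for a map lanelet along the route (hypothetical)."""
    ident: int
    # True if the route requires a lane change and this is the last
    # lanelet of the ego lane where that change is still possible.
    last_lane_change_opportunity: bool = False


@dataclass
class CorridorCommand:
    """Reduced corridor-based driving command (hypothetical)."""
    lanelet_ids: List[int]
    acc_objects: List[str] = field(default_factory=list)


class FollowEgoLane:
    """Sketch of the FollowEgoLane behavior block."""

    def __init__(self, route: Sequence[Lanelet]) -> None:
        self.route = list(route)
        self.ego_lanelet_id: Optional[int] = None  # set by the environment model

    def invocation_condition(self) -> bool:
        # Applicable while the ego pose matches a lanelet along the route.
        return any(l.ident == self.ego_lanelet_id for l in self.route)

    def commitment_condition(self) -> bool:
        # Following the lane keeps the vehicle in its lane, so the source
        # text lets this condition always evaluate to true.
        return True

    def command(self, leading_vehicles: Sequence[str] = ()) -> CorridorCommand:
        start = next(i for i, l in enumerate(self.route)
                     if l.ident == self.ego_lanelet_id)
        ids: List[int] = []
        for lanelet in self.route[start:]:
            ids.append(lanelet.ident)
            if lanelet.last_lane_change_opportunity:
                break  # corridor ends where the lane change must happen
        # Leading vehicles along the corridor are flagged as ACC objects.
        return CorridorCommand(ids, list(leading_vehicles))


route = [Lanelet(1), Lanelet(2),
         Lanelet(3, last_lane_change_opportunity=True), Lanelet(4)]
block = FollowEgoLane(route)
block.ego_lanelet_id = 2
on_route = block.invocation_condition()
corridor = block.command(leading_vehicles=["vehicle_17"])
block.ego_lanelet_id = 99  # a lanelet that is not part of the route
off_route = block.invocation_condition()
print(corridor, on_route, off_route)
```

The shortened corridor is what later makes FollowEgoLane less attractive to a cost-based arbitrator once a lane change becomes necessary.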
ChangeLane: Lane changes, on the other hand, are only possible when the current ego lane has a directly adjacent reachable lane on the left or right side with a safe distance to the following and leading vehicles. The ChangeLane component is defined w.r.t. the supposed changing direction and instantiated once for each direction to improve reusability.
  invocation condition: True as long as the current ego lanelet has a directly adjacent reachable lanelet in the respective direction with a big enough gap to safely change into: The closest leading and following objects in the target lane should have a longitudinal spatial and temporal distance greater than $d_{\min}^{\text{ahead}}$, $d_{\min}^{\text{behind}}$, $TTC_{\min}^{\text{ahead}}$ and $TTC_{\min}^{\text{behind}}$ respectively.
  commitment condition: In order to produce consistent driving behavior, the commitment condition is true until the lane change maneuver has been completed or properly aborted. The lane change is successfully completed as soon as the full ego shape is within the target lane. In case the selected gap becomes too small, the lane change is aborted with commitment condition true until the ego shape is fully within the starting lane again.
  command: Similar to FollowEgoLane a maneuver corridor is constructed along our route, but also contains directly adjacent reachable lanelets, as shown in Fig. 1 and 6. The ego lane within this corridor is cut after $d_{\max}^{\text{LaneChange}}$ to enforce a lane change within this distance. The maneuver variant contains properly flagged leading objects in the start and target lane, as well as following vehicles in the target lane.

CrossIntersection: One characteristic of urban environments are numerous signalized or unsignalized intersections that need specific behavior. An AV has to yield to super-ordinate traffic participants and take special care of vulnerable road users (VRUs) and occlusions [21].

In dense traffic it might be necessary to perform lane changes in three consecutive phases [22]. These can be designed as behavior blocks as well and put into sequence in section III-D:

ApproachGap: The most promising gap will be approached laterally by de- or accelerating.
IndicateIntention: Once the gap has been reached the vehicle will indicate its intention using the turn signals.
MergeIntoGap: As soon as the gap size is big enough, the vehicle can safely merge into it.

Another typical application for AVs is driving on highways. Many occurring behaviors are similar to those provided
for urban environments. High velocities and special traffic rules justify distinct highway behavior blocks though.

MergeOntoHighway: High relative velocities and sometimes short onramps pose a challenge when entering highways. Thus, MergeOntoHighway could also be modeled with sequential sub-behaviors, to decompose the problem.
FollowHighwayLane: The typical ACC behavior that can already be found in some of the modern series cars.
ChangeHighwayLane: Changing lanes on highways can be modeled as a multi-phase behavior or as one integrated interaction-aware behavior, using e.g. POMDPs [23].
ExitFromHighway: Exiting from highways can be as simple as changing to a new diverging lane or as challenging as crossing traffic that is meanwhile entering the highway.

In the beginning, end or even during an automated drive, the vehicle has to park in a suitable place. Usually, path or trajectory planners based on graph search methods are used in unstructured environments like parking lots [24].

LeaveGarage: When starting a ride, LeaveGarage brings the AV from the garage onto the track.
ParkNearGoal: As soon as the AV is close to its goal and a suitable parking lot is found, the vehicle can reduce its speed and park into this parking lot. Notice that the search for a parking lot is not included here. It might be modeled as another behavior block or supplied by the routing module.
  invocation condition: True if the AV is near standstill ($v_{\text{ego}} < v_{\max}^{\text{parking}}$), the parking lot closer than $r_{\max}^{\text{parking}}$ and no dynamic objects within $r_{\min}^{\text{freespace}}$.
  commitment condition: True until the parking position is reached with $r_{\min}^{\text{parking}}$ precision. An arbitrator can use this information to prevent other behaviors from taking over during a tight parking maneuver.
  command: A Hybrid Curvature trajectory is generated based on [24], assuming a static environment.

Finally, we add fail-safe emergency behaviors, in case a dangerous unforeseen traffic situation evolves or as a fallback if no other behavior block is applicable.

EmergencyStop: In case an unavoidable collision is anticipated, the EmergencyStop behavior will provide a full-stop trajectory to reduce damage and fatalities.
EvadeObject: If a collision could be avoided laterally, EvadeObject will provide an evasive maneuver like [25].
SafeStop: As a fail-safe fallback for any system failure or if no other behavior block provides feasible commands, SafeStop will bring the vehicle to a safe stop.

D. Arbitration Scheme: Which maneuver to drive

Now that we have developed a couple of basic behavior blocks, we can use them to compose the overall behavior for automated driving, as shown in Fig. 2, starting bottom-up. We follow a similar notation to [14], denoting the behavior options of an arbitrator with $O_{\text{ARBITRATORNAME}}$, using round brackets “()” for an ordered list and curly brackets “{}” for a set of options. Basic behavior blocks are highlighted with ItalicNames and arbitrators with CAPITALNAMES.

In an urban environment possible behaviors are FollowEgoLane, ChangeLane, MERGEINTOLANE and CROSSINTERSECTION. In order to clear intersections as soon as reasonably possible and not to change lanes unintentionally in an intersection, CROSSINTERSECTION has clear priority at intersections. The remaining urban behaviors typically have no clear and consistent priority over each other though, yet the most reasonable one should be chosen.

As none of the existing arbitration schemes (by priority, sequence or random) are sufficient for this task, we define a new cost-based arbitrator that selects the behavior option with the lowest expected cost. A hysteresis prevents oscillating behavior choices. By introducing cost arbitrators, the decision-making concept can be extended to dynamically changing preferences.

However, cost arbitrators should be used with care. First of all, the cost estimates of an arbitrator's behavior options have to be comparable. This could easily lead to cross-dependencies of behavior blocks. Secondly, if the cost contains too many obfuscated objectives, the selection process becomes difficult to understand. Both are properties we actually want to avoid. Therefore, we advise to use cost arbitrators rarely and with simple, generic costs. In our case, we use a simple estimate of the expected travel velocity:

$$O_{\text{URBANDRIVING}} = f\,\{\text{FollowEgoLane},\ \text{ChangeLaneLeft / -Right},\ \text{MERGEINTOLANELEFT / -RIGHT}\}$$

As discussed in section III-C lane changes in dense traffic can be decomposed into three stages. As a result, a sequence-based arbitrator is used to compose MERGEINTOLANE:

$$O_{\text{MERGEINTOLANE}} = (\text{ApproachGap},\ \text{IndicateIntention},\ \text{MergeIntoGap})$$

Highway behaviors are combined using a cost arbitrator:

$$O_{\text{HIGHWAYDRIVING}} = f\,\{\text{MergeOntoHighway},\ \text{FollowHighwayLane},\ \text{ChangeHighwayLaneLeft / -Right},\ \text{ExitFromHighway}\}$$

In case of PARKING at most one option is feasible after all, such that a trivial priority-based arbitrator can be used:

$$O_{\text{PARKING}} = (\text{LeaveGarage},\ \text{ParkNearGoal})$$

The emergency maneuvers for unavoidable collisions are grouped together using a cost-based arbitrator estimating the expected damage. In such a way, it chooses the option with the lowest expected damage:

$$O_{\text{AVOIDCOLLISIONINLASTRESORT}} = f\,\{\text{EmergencyStop},\ \text{EvadeObject}\}$$
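
To make the cost-based arbitration more tangible, here is a minimal Python sketch of a cost arbitrator with hysteresis, combined with the urban driving cost that Section IV-B defines. The class `CostArbitrator`, the function `urban_cost` and in particular the hysteresis rule and its threshold of 2 km/h are assumptions for illustration; the text only states that a hysteresis is used, not how it is realized. The penalty values of 10 km/h and 5 km/h and the inputs for point E are taken from Section IV-B.

```python
from typing import Dict, Optional


def urban_cost(v_hat: float, n_lc_needed: int, is_lane_change: bool,
               j_lc_needed: float = 10.0, j_lc_maneuver: float = 5.0) -> float:
    """Urban driving cost in km/h: negative expected average velocity,
    plus routing costs, plus a penalty for lane change maneuvers."""
    cost = -v_hat + n_lc_needed * j_lc_needed
    if is_lane_change:
        cost += j_lc_maneuver
    return cost


class CostArbitrator:
    """Selects the applicable option with the lowest expected cost.

    Assumed hysteresis rule: the active option is kept unless another
    option is cheaper by more than `hysteresis`.
    """

    def __init__(self, hysteresis: float) -> None:
        self.hysteresis = hysteresis
        self.active: Optional[str] = None

    def select(self, costs: Dict[str, float]) -> str:
        """`costs` maps each applicable option to its expected cost."""
        if not costs:
            raise ValueError("no applicable behavior option")
        best = min(costs, key=lambda name: costs[name])
        if (self.active in costs
                and costs[self.active] - costs[best] <= self.hysteresis):
            best = self.active  # improvement too small to justify a switch
        self.active = best
        return best


# Costs at point E, using the values reported in Section IV-B.
costs_at_e = {
    "FollowEgoLane": urban_cost(25.0, n_lc_needed=1, is_lane_change=False),
    "ChangeLaneRight": urban_cost(33.4, n_lc_needed=0, is_lane_change=True),
}

arbitrator = CostArbitrator(hysteresis=2.0)  # hypothetical threshold in km/h
arbitrator.active = "FollowEgoLane"          # behavior driven before point E
choice_at_e = arbitrator.select(costs_at_e)

# Made-up follow-up costs that differ by less than the threshold.
choice_after = arbitrator.select({"FollowEgoLane": -30.0,
                                  "ChangeLaneRight": -29.0})
print(costs_at_e, choice_at_e, choice_after)
```

In the second call FollowEgoLane is slightly cheaper, but the arbitrator stays with ChangeLaneRight because the difference is below the threshold. This is the kind of oscillation a hysteresis is meant to suppress.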
Finally, these arbitrators and the SafeStop fallback are composed together to the top-most priority-based arbitrator:

$$O_{\text{AUTOMATEDDRIVING}} = (\text{AVOIDCOLLISIONINLASTRESORT},\ \text{PARKING},\ \text{CROSSINTERSECTION},\ \text{URBANDRIVING},\ \text{HIGHWAYDRIVING},\ \text{SafeStop})$$

IV. EXPERIMENTS

In this section, we show the applicability of the proposed concept to utilize a hierarchical behavior-based architecture for behavior generation in automated driving.

Figure 3: Behavior choices in the experiment driving the whole test track. (Timeline plot over Time [s] from 0 to 600 with one row each for ParkNearGoal, ChangeLaneLeft, FollowEgoLane, ChangeLaneRight and SafeStop.)

A. Setup

The explanatory example performs basic urban driving behaviors on a simulated 5.7 km test track based on our real-world test route in Karlsruhe, Germany. The route, shown in Fig. 4, contains segments with speed limits of 30 km/h, 50 km/h and 60 km/h, crosses or turns at 12 intersections, traverses one roundabout and ends at a parking lot.

Figure 4: Test track running 5.7 km through Karlsruhe, Germany. Start and end position is a parking lot on the university campus. Tiles © 2020 Google, Map data © 2020 GeoBasis-DE/BKG.

We use the ROS-based open-source simulation framework CoInCar-Sim [15]. One great advantage of this framework is that it provides the same interface as our test vehicle Bertha [8]. Hence, we can develop, test and deploy the same behavior and planning pipeline in CoInCar-Sim and Bertha.

Our basic example maneuvers for this track are: ParkNearGoal, FollowEgoLane, ChangeLane (one instantiation for left, another for right lane changes) and SafeStop. Lane following and both lane change behaviors are combined within a cost-based URBANDRIVING arbitrator. Parking, urban driving and the safe stop fallback in turn constitute the overall behavior using a priority-based AUTOMATEDDRIVING arbitrator. Fig. 5 illustrates this arbitration graph.

Figure 5: Example arbitration graph, as used in our simulation experiments. Colors depict the state at point E: Grey: invocation condition false, dark green: active behavior branch, light green: utility (normalized inverse costs, see also Fig. 6).

This design has the following motivation. ParkNearGoal is only applicable in the vicinity of the goal and a nearby parking lot. Thus, as long as the ego vehicle is still on the route, FollowEgoLane is applicable and ChangeLaneLeft or ChangeLaneRight might be applicable. URBANDRIVING will select the most promising one, w.r.t. the expected average velocity, routing costs and lane change penalties. As soon as the vehicle approaches its goal, FollowEgoLane will bring it to a stop within the last lanelet. Then, ParkNearGoal will become applicable, be chosen by priority and lead the car into its parking lot. When the parking maneuver is finished, ParkNearGoal will become inapplicable again. At that point also none of the URBANDRIVING behaviors are applicable any more because the car has left the route. As a result AUTOMATEDDRIVING selects the lowest priority behavior SafeStop. This is a good illustration of how the fallback behavior prevents undefined states and keeps the vehicle in a safe position.

B. Results

Fig. 3 shows the resulting behavior selection over time. The whole route takes 9:40 min and features the expected behavior characteristics. The vehicle starts leaving the campus area by following the lane. At intersection A, it changes to the right lane in order to take a turn into a north-east direction. At point B, it takes another right turn following the ego lane and has to change to the left lane. When approaching the next intersection C, the ego vehicle changes onto the exit
lane in order to turn into south-east direction. At t = 339 s it approaches and passes the roundabout D.

Fig. 6 shows the two applicable behavior options at point E, where the route leads onto the “Adenauerring” again. The route continues with a right turn from the rightmost lane, while the ego is on the leftmost lane still. This is a suitable scenario to explain the cost-based arbitration in detail. The urban driving cost estimate incorporates the average expected travel velocity, routing costs and penalizes lane changes:

$$J = \begin{cases} -\hat{v} + n_{\text{LCNeeded}} \cdot J_{\text{LCNeeded}}, & \text{without lane change} \\ -\hat{v} + n_{\text{LCNeeded}} \cdot J_{\text{LCNeeded}} + J_{\text{LCManeuver}}, & \text{otherwise} \end{cases}$$

Figure 6: FollowEgoLane and ChangeLaneRight maneuver corridors at point E (one panel per behavior). The route continues to the right at this point. As a result, the FollowEgoLane corridor ends in 74 m, while the ChangeLaneRight corridor has a length of 243 m.

As a simple, yet effective heuristic, we estimate $\hat{v}$, the expected average velocity of this maneuver, from the maneuver corridor length and speed limit as shown in Fig. 6. For routing, we charge each lane change needed to follow the route after this command with $J_{\text{LCNeeded}} = 10$ km/h. Lane change behaviors themselves are penalized with a lower $J_{\text{LCManeuver}} = 5$ km/h. Hence, the arbitrator generally prefers the follow lane behavior as long as it matches the route. As soon as one or multiple lane changes will be necessary, this maneuver will become more favorable.

At point E, the behaviors have these costs:

$$J_{\text{FollowEgoLane}} = -25.0 + 1 \cdot 10.0 = -15.0$$
$$J_{\text{ChangeLaneRight}} = -33.4 + 0 \cdot 10.0 + 5.0 = -28.4$$

Consequently the cost-based arbitrator chooses ChangeLaneRight, which has lower cost than FollowEgoLane, as also illustrated in Fig. 5.

An interesting part is directly after taking the right turn at point E from t = 422 s to t = 436 s. Here, the vehicle performs two consecutive lane changes in order to pass this two-lane road from the rightmost lane to the exit lane. This is especially noteworthy, as no double lane change or other hand-crafted behavior has been defined for such a scenario. The behavior emerges purely because the routing has been incorporated into the cost estimate.

The road leads back to the campus again, where the vehicle slows down and stops at the end of the route. Finally, the parking behavior becomes active and brings the car into its parking lot. After finishing the parking maneuver, the safe stop behavior is the last suitable option and keeps the car at a standstill.

Please also consider our video: youtu.be/qdIwchDGA_g

V. CONCLUSIONS AND FUTURE WORK

This publication presented the following contributions:

An extension to the hierarchical behavior-based arbitration concept proposed in [14]. We introduced a cost-based arbitration scheme that is helpful when multiple behavior options are applicable but have no clear and consistent priority among each other.

We have formulated a behavior generation stack for AVs based on the hierarchical behavior-based arbitration scheme. It consists of maneuvers for urban and highway environments, contains parking and emergency behaviors, and prevents undefined states with a fallback safe stop behavior.

We have shown the usefulness and applicability of our design in an explanatory evaluation on a simulated route.

The key advantages of the approach are:
• Scenario-specific solutions can be combined easily.
  In the experiments, five different behaviors have been employed to handle various scenarios, from four-way intersections, T-junctions and a roundabout to multi-lane bypass roads and parking.
• It supports different planning approaches.
  We utilized two different trajectory planners in our experiments. Urban corridor-based maneuvers used an optimization-based planner similar to [19], while the parking maneuver generated Hybrid Curvature trajectories with an RRT* motion planner [24]. Different approaches could also be used for the same behavior.
• The resulting behavior can be well explained.
  The strongly modular design significantly improves understandability compared to FSMs or classical behavior-based systems. Each invocation condition can be well understood; the selection logic of arbitrators is comprehensible. As a result, the hierarchical decision-making process can be well explained and traced over time.
• It can be iteratively extended by more behaviors.
  In order to add the parking behavior to our behavior generation, the definition of its invocation and commitment conditions was sufficient to add it to the AUTOMATEDDRIVING arbitrator. Thanks to the strong decoupling, no changes to any other behavior block were necessary.
• The modularity supports robustness and efficiency.
  Each of the behavior blocks is self-contained, such that occurring failures are contained as well and do not affect the overall system stability. In case of a failure, the system will degrade seamlessly by ignoring this behavior option. Furthermore, the atomic structure allows behavior options to be evaluated in parallel to increase efficiency.
  Strong modularity has many more advantages, among others, reusability and maintainability.

[11] R. Brooks, “A robust layered control system for a mobile robot,” IEEE J. on Robot. and Automation, vol. 2, no. 1, Mar. 1986.
• Complex behavior emerges from simple components. [12] Julio K. Rosenblatt, “DAMN: A distributed architec-
Complex system behavior, as multiple consecutive lane ture for mobile navigation,” J. of Exp. & Theor. Artif.
changes to approach an exit lane, emerges from the Intell., vol. 9, no. 2-3, 1997.
arbitration scheme without the need for hand-crafted [13] Pattie Maes, “How to do the right thing,” Connection
decision or planning logic. Science, vol. 1, no. 3, 1989.
These benefits have led to a smooth development process [14] M. Lauer, R. Hafner, S. Lange, and M. Riedmiller,
with promising results, as outlined in section IV. Thus, “Cognitive concepts in autonomous soccer playing
we look forward to further enhance the numerous existing robots,” Cogn. Syst. Res., vol. 11, no. 3, 2010.
behavior blocks, extend the behavior stack by e.g. our MIQP [15] M. Naumann, F. Poggenhans, M. Lauer, and C. Stiller,
approach for cooperative zip merges [26] and most excitingly “CoInCar-Sim: An Open-Source Simulation Frame-
to integrate this stack on our test vehicle Bertha. work for Cooperatively Interacting Automobiles,” in
IEEE Intell. Veh. Symp., Jun. 2018.
R EFERENCES [16] F. Poggenhans, J.-H. Pauls, J. Janosovits, S. Orf,
[1] S. Hoermann, F. Kunz, D. Nuss, S. Reuter, and K. M. Naumann, F. Kuhnt, et al., “Lanelet2: A high-
Dietmayer, “Entering crossroads with blind corners. A definition map framework for the future of automated
safe strategy for autonomous vehicles,” in IEEE Intell. driving,” in Int. Conf. on Intell. Transp. Syst., 2018.
Veh. Symp., Jun. 2017. [17] A. Meyer, N. O. Salscheider, P. F. Orzechowski, and
[2] C. Hubmann, J. Schulz, M. Becker, D. Althoff, and C. Stiller, “Deep Semantic Lane Segmentation for
C. Stiller, “Automated Driving in Uncertain Envi- Mapless Driving,” in IEEE/RSJ Int. Conf. on Intell.
ronments: Planning With Interaction and Uncertain Robots and Syst., Oct. 2018.
Maneuver Prediction,” IEEE Trans. Intell. Veh., vol. 3, [18] P. Bender, Ö. Ş. Taş, J. Ziegler, and C. Stiller, “The
no. 1, Mar. 2018. combinatorial aspect of motion planning: Maneuver
[3] M. Bouton, A. Nakhaei, K. Fujimura, and M. J. variants in structured environments,” in IEEE Intell.
Kochenderfer, “Scalable Decision Making with Sensor Veh. Symp., IEEE, 2015.
Occlusions for Autonomous Driving,” in IEEE Int. [19] J. Ziegler, P. Bender, T. Dang, and C. Stiller, “Tra-
Conf. on Robot. and Automation, May 2018. jectory planning for Bertha – A local, continuous
[4] M. Naumann, H. Königshof, and C. Stiller, “Provably method,” in IEEE Intell. Veh. Symp., Jun. 2014.
Safe and Smooth Lane Changes in Mixed Trafic,” in [20] B. Gutjahr, L. Gröll, and M. Werling, “Lateral Vehi-
IEEE Intell. Transp. Syst. Conf., Oct. 2019. cle Trajectory Optimization Using Constrained Linear
[5] M. Buehler, K. Iagnemma, and S. Singh, Eds., The Time-Varying MPC,” IEEE Trans. Intell. Transp. Syst.,
DARPA Urban Challenge: Autonomous Vehicles in vol. 18, no. 6, Jun. 2017.
City Traffic, red. by B. Siciliano, O. Khatib, and F. [21] P. F. Orzechowski, A. Meyer, and M. Lauer, “Tackling
Groen, vol. 56, Springer Tracts in Advanced Robot. Occlusions & Limited Sensor Range with Set-based
Berlin, Heidelberg: Springer, 2009. Safety Verification,” in Int. Conf. on Intell. Transp.
[6] A. Bacha, C. Bauman, R. Faruque, M. Fleming, C. Ter- Syst., IEEE, Nov. 2018.
welp, C. Reinholtz, et al., “Odin: Team victortango’s [22] J. Nilsson, M. Brännström, E. Coelingh, and J.
entry in the darpa urban challenge,” J. of Field Robot., Fredriksson, “Lane Change Maneuvers for Automated
vol. 25, no. 8, 2008. Vehicles,” IEEE Trans. Intell. Transp. Syst., vol. PP,
[7] M. Montemerlo, J. Becker, S. Bhat, H. Dahlkamp, D. no. 99, 2016.
Dolgov, S. Ettinger, et al., “Junior: The Stanford entry [23] C. Hubmann, J. Schulz, G. Xu, D. Althoff, and C.
in the Urban Challenge,” J. of Field Robot., vol. 25, Stiller, “A Belief State Planner for Interactive Merge
no. 9, 2008. Maneuvers in Congested Traffic,” in Int. Conf. on
[8] J. Ziegler, P. Bender, M. Schreiber, H. Lategahn, T. Intell. Transp. Syst., IEEE, Nov. 2018.
Strauss, C. Stiller, et al., “Making Bertha Drive – [24] H. Banzhaf, M. Dolgov, J. Stellet, and J. M. Zöllner,
An Autonomous Journey on a Historic Route,” IEEE “From Footprints to Beliefprints: Motion Planning un-
Intell. Transp. Syst. Mag., vol. 6, no. 2, Sum. 2014. der Uncertainty for Maneuvering Automated Vehicles
[9] M. Aeberhard, S. Rauch, M. Bahram, G. Tanzmeister, in Dense Scenarios,” in Int. Conf. on Intell. Transp.
J. Thomas, Y. Pilat, et al., “Experience, Results and Syst., IEEE, Nov. 2018.
Lessons Learned from Automated Driving on Ger- [25] M. Werling and D. Liccardo, “Automatic collision
many’s Highways,” IEEE Intell. Transp. Syst. Mag., avoidance using model-predictive online optimiza-
vol. 7, no. 1, Spr. 2015. tion,” in IEEE Conf. on Decision and Control, 2012.
[10] B. Siciliano and O. Khatib, Eds., Springer Handbook [26] C. Burger and M. Lauer, “Cooperative Multiple Vehi-
of Robotics, Springer Handbooks, Springer Interna- cle Trajectory Planning using MIQP,” in Int. Conf. on
tional Publishing, 2016. Intell. Transp. Syst., IEEE, Nov. 2018.
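
As a supplementary illustration of the cost-based arbitration at point E in Section IV, the following is a minimal Python sketch, not the authors' implementation. It assumes that each option's cost is the sum of a base term, a weight of 10.0 per lane change still required by the route, and the lane change maneuver penalty of 5.0; this decomposition is read off the two cost equations in the text. The names `BehaviorOption`, `base_cost`, `remaining_lane_changes`, and `arbitrate` are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass

# Assumed constants, read off the cost equations at point E in Section IV.
LANE_CHANGE_WEIGHT = 10.0            # cost per lane change still required by the route
LANE_CHANGE_MANEUVER_PENALTY = 5.0   # J_LCManeuver, added for lane change behaviors


@dataclass(frozen=True)
class BehaviorOption:
    """Hypothetical container for one applicable behavior and its cost terms."""
    name: str
    base_cost: float              # first term of the cost equation (assumed given)
    remaining_lane_changes: int   # lane changes still needed after this behavior
    is_lane_change: bool          # whether the behavior itself is a lane change

    def cost(self) -> float:
        total = self.base_cost + self.remaining_lane_changes * LANE_CHANGE_WEIGHT
        if self.is_lane_change:
            total += LANE_CHANGE_MANEUVER_PENALTY
        return total


def arbitrate(options):
    """Cost-based arbitration sketch: select the applicable option with the lowest cost."""
    return min(options, key=lambda option: option.cost())


# Values taken from the cost equations at point E.
options_at_e = [
    BehaviorOption("FollowEgoLane", base_cost=-25.0,
                   remaining_lane_changes=1, is_lane_change=False),
    BehaviorOption("ChangeLaneRight", base_cost=-33.4,
                   remaining_lane_changes=0, is_lane_change=True),
]

best = arbitrate(options_at_e)
print(best.name)
```

With the values from point E, the sketch selects ChangeLaneRight, matching the choice described in the text. Because the route enters only through the count of remaining lane changes, repeated selection of the cheapest option can produce consecutive lane changes without a dedicated double lane change behavior.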
