Decision-Making For Automated Vehicles Using A Hierarchical Behavior-Based Arbitration Scheme
Abstract—Behavior planning and decision-making are some of the biggest challenges for highly automated systems. A fully automated vehicle (AV) is faced with numerous tactical and strategical choices. Most state-of-the-art AV platforms

of such scenario- and methodology-specific approaches into one understandable and traceable architecture.

How and when should an AV switch from a regular ACC controller to a lane change, cooperative zip merge or parking
arXiv:2003.01149v4 [cs.RO] 5 Feb 2021
ChangeLane: Lane changes, on the other hand, are only possible when the current ego lane has a directly adjacent reachable lane on the left or right side with a safe distance to the following and leading vehicles. The ChangeLane component is defined w.r.t. the intended lane change direction and instantiated once for each direction to improve reusability.
  invocation condition: True as long as the current ego lanelet has a directly adjacent reachable lanelet in the respective direction with a big enough gap to safely change into: the closest leading and following objects in the target lane should have a longitudinal spatial and temporal distance greater than $d_{\min}^{\text{ahead}}$, $d_{\min}^{\text{behind}}$, $TTC_{\min}^{\text{ahead}}$ and $TTC_{\min}^{\text{behind}}$, respectively.
  commitment condition: In order to produce consistent driving behavior, the commitment condition is true until the lane change maneuver has been completed or properly aborted. The lane change is successfully completed as soon as the full ego shape is within the target lane. In case the selected gap becomes too small, the lane change is aborted, with the commitment condition staying true until the ego shape is fully within the starting lane again.
  command: Similar to FollowEgoLane, a maneuver corridor is constructed along our route, but it also contains directly adjacent reachable lanelets, as shown in Fig. 1 and 6. The ego lane within this corridor is cut after $d_{\max}^{\text{LaneChange}}$ to enforce a lane change within this distance. The maneuver variant contains properly flagged leading objects in the start and target lane, as well as following vehicles in the target lane.

CrossIntersection: One characteristic of urban environments is the large number of signalized or unsignalized intersections that need specific behavior. An AV has to yield to super-ordinate traffic participants and take special care of vulnerable road users (VRUs) and occlusions [21].

MergeIntoLane: In dense traffic it might be necessary to perform lane changes in three consecutive phases [22]. These can be designed as behavior blocks as well and put into sequence in section III-D:
  ApproachGap: The most promising gap will be approached laterally by de- or accelerating.
  IndicateIntention: Once the gap has been reached, the vehicle will indicate its intention using the turn signals.
  MergeIntoGap: As soon as the gap size is big enough, the vehicle can safely merge into it.

Another typical application for AVs is driving on highways. Many occurring behaviors are similar to those provided for urban environments. High velocities and special traffic rules justify distinct highway behavior blocks though.

MergeOntoHighway: High relative velocities and sometimes short onramps pose a challenge when entering highways. Thus, MergeOntoHighway could also be modeled with sequential sub-behaviors to decompose the problem.

FollowHighwayLane: The typical ACC behavior that can already be found in some modern series cars.

ChangeHighwayLane: Changing lanes on highways can be modeled as a multi-phase behavior or as one integrated interaction-aware behavior, using e.g. POMDPs [23].

ExitFromHighway: Exiting from highways can be as simple as changing to a new diverging lane or as challenging as crossing traffic that is meanwhile entering the highway.

At the beginning, at the end or even during an automated drive, the vehicle has to park in a suitable place. Usually, path or trajectory planners based on graph search methods are used in unstructured environments like parking lots [24].

LeaveGarage: When starting a ride, LeaveGarage brings the AV from the garage onto the track.

ParkNearGoal: As soon as the AV is close to its goal and a suitable parking lot is found, the vehicle can reduce its speed and park into this parking lot. Note that the search for a parking lot is not included here. It might be modeled as another behavior block or supplied by the routing module.
  invocation condition: True if the AV is near standstill ($v_{\text{ego}} < v_{\max}^{\text{parking}}$), the parking lot is closer than $r_{\max}^{\text{parking}}$ and no dynamic objects are within $r_{\min}^{\text{freespace}}$.
  commitment condition: True until the parking position is reached with $r_{\min}^{\text{parking}}$ precision. An arbitrator can use this information to prevent other behaviors from taking over during a tight parking maneuver.
  command: A Hybrid Curvature trajectory is generated based on [24], assuming a static environment.

Finally, we add fail-safe emergency behaviors, in case a dangerous unforeseen traffic situation evolves or as a fallback if no other behavior block is applicable.

EmergencyStop: In case an unavoidable collision is anticipated, the EmergencyStop behavior provides a full-stop trajectory to reduce damage and fatalities.

EvadeObject: If a collision can be avoided laterally, EvadeObject provides an evasive maneuver like [25].

SafeStop: As a fail-safe fallback for any system failure or if no other behavior block provides feasible commands, SafeStop brings the vehicle to a safe stop.

In an urban environment possible behaviors are FollowEgoLane, ChangeLane, MergeIntoLane and CrossIntersection. In order to clear intersections as soon as reasonably possible and not to change lanes unintentionally in an intersection, CrossIntersection has clear priority at intersections. The remaining urban behaviors typically have no clear and consistent priority over each other, yet the most reasonable one should be chosen.

As none of the existing arbitration schemes (by priority, sequence or random) is sufficient for this task, we define a new cost-based arbitrator that selects the behavior option with the lowest expected cost. A hysteresis prevents oscillating behavior choices. By introducing cost arbitrators, the decision-making concept can be extended to dynamically changing preferences.

However, cost arbitrators should be used with care. First of all, the cost estimates of an arbitrator's behavior options have to be comparable. This could easily lead to cross-dependencies of behavior blocks. Secondly, if the cost contains too many obfuscated objectives, the selection process becomes difficult to understand. Both are properties we actually want to avoid. Therefore, we advise using cost arbitrators rarely and with simple, generic costs. In our case, we use a simple estimate of the expected travel velocity:

$$O_{\text{UrbanDriving}} = f\{\text{FollowEgoLane},\ \text{ChangeLaneLeft / -Right},\ \text{MergeIntoLaneLeft / -Right}\}$$

As discussed in section III-C, lane changes in dense traffic can be decomposed into three stages. As a result, a sequence-based arbitrator is used to compose MergeIntoLane:

$$O_{\text{MergeIntoLane}} = (\text{ApproachGap},\ \text{IndicateIntention},\ \text{MergeIntoGap})$$

Highway behaviors are combined using a cost arbitrator:

$$O_{\text{HighwayDriving}} = f\{\text{MergeOntoHighway},\ \text{FollowHighwayLane},\ \text{ChangeHighwayLaneLeft / -Right},\ \text{ExitFromHighway}\}$$

In the case of Parking, at most one option is feasible at a time, such that a trivial priority-based arbitrator can be used.
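The interplay of behavior blocks and the arbitration schemes named above (priority, sequence and cost with hysteresis) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `Behavior`, `Arbitrator`, `PriorityArbitrator`, `SequenceArbitrator` and `CostArbitrator`, the way a committed option keeps control, the "never step backwards" rule of the sequence arbitrator and the hysteresis margin of 1.0 are all assumptions made for illustration. The two cost values are the ones reported for point E in section IV-B.

```python
from typing import Callable, List, Optional


class Behavior:
    """Hypothetical behavior block: conditions, a cost and a command."""

    def __init__(self, name: str,
                 invocation: Callable[[], bool] = lambda: True,
                 commitment: Callable[[], bool] = lambda: False,
                 cost: Callable[[], float] = lambda: 0.0) -> None:
        self.name = name
        self._invocation = invocation
        self._commitment = commitment
        self.cost = cost  # only consulted by CostArbitrator

    def invocation_condition(self) -> bool:
        return self._invocation()

    def commitment_condition(self) -> bool:
        return self._commitment()

    def command(self) -> str:
        return self.name  # stand-in for a trajectory or maneuver corridor


class Arbitrator(Behavior):
    """A behavior that delegates to one of its options."""

    def __init__(self, name: str, options: List[Behavior]) -> None:
        super().__init__(name)
        self.options = list(options)
        self.active: Optional[Behavior] = None

    def invocation_condition(self) -> bool:
        return any(o.invocation_condition() for o in self.options)

    def commitment_condition(self) -> bool:
        return self.active is not None and self.active.commitment_condition()

    def select(self, candidates: List[Behavior]) -> Behavior:
        raise NotImplementedError

    def command(self) -> str:
        # Assumption: a committed option keeps control, otherwise re-select.
        if not self.commitment_condition():
            candidates = [o for o in self.options if o.invocation_condition()]
            if not candidates:
                raise RuntimeError(f"{self.name}: no applicable option")
            self.active = self.select(candidates)
        return self.active.command()


class PriorityArbitrator(Arbitrator):
    def select(self, candidates: List[Behavior]) -> Behavior:
        return candidates[0]  # options are listed by descending priority


class SequenceArbitrator(Arbitrator):
    def select(self, candidates: List[Behavior]) -> Behavior:
        # Assumption: never step backwards while a later step is applicable.
        start = self.options.index(self.active) if self.active in self.options else 0
        for option in self.options[start:]:
            if option in candidates:
                return option
        return candidates[0]  # otherwise restart the sequence


class CostArbitrator(Arbitrator):
    def __init__(self, name: str, options: List[Behavior],
                 hysteresis: float = 0.0) -> None:
        super().__init__(name, options)
        self.hysteresis = hysteresis

    def select(self, candidates: List[Behavior]) -> Behavior:
        best = min(candidates, key=lambda o: o.cost())
        # Keep the previous choice unless another option is clearly cheaper.
        if self.active in candidates and \
                self.active.cost() - best.cost() <= self.hysteresis:
            return self.active
        return best


# Usage: a cost-based UrbanDriving arbitrator with a SafeStop fallback.
costs = {"FollowEgoLane": -15.0, "ChangeLaneRight": -28.4}
follow = Behavior("FollowEgoLane", cost=lambda: costs["FollowEgoLane"])
right = Behavior("ChangeLaneRight", cost=lambda: costs["ChangeLaneRight"])
left = Behavior("ChangeLaneLeft", invocation=lambda: False)  # no left lane
urban = CostArbitrator("UrbanDriving", [follow, left, right], hysteresis=1.0)
root = PriorityArbitrator("AutomatedDriving", [urban, Behavior("SafeStop")])
print(root.command())  # ChangeLaneRight
```

Because every arbitrator is itself a behavior, arbitrators nest without special cases. In this sketch the hysteresis is a fixed cost margin; other realizations, such as a minimum dwell time, would serve the same purpose of suppressing oscillating choices.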
Finally, these arbitrators and the SafeStop fallback are composed into the top-most priority-based arbitrator:

$$O_{\text{AutomatedDriving}} = (\text{AvoidCollisionInLastResort},\ \text{Parking},\ \text{CrossIntersection},\ \text{UrbanDriving},\ \text{HighwayDriving},\ \text{SafeStop})$$

IV. EXPERIMENTS

In this section, we show the applicability of the proposed concept to utilize a hierarchical behavior-based architecture for behavior generation in automated driving.

A. Setup

The explanatory example performs basic urban driving behaviors on a simulated 5.7 km test track based on our real-world test route in Karlsruhe, Germany. The route, shown in Fig. 4, contains segments with speed limits of 30 km/h, 50 km/h and 60 km/h, crosses or turns at 12 intersections, traverses one roundabout and ends at a parking lot.

We use the ROS-based open-source simulation framework CoInCar-Sim [15]. One great advantage of this framework is that it provides the same interface as our test vehicle Bertha [8]. Hence, we can develop, test and deploy the same behavior and planning pipeline in CoInCar-Sim and Bertha.

Figure 4: Test track running 5.7 km through Karlsruhe, Germany. Start and end position is a parking lot on the university campus. Tiles © 2020 Google, Map data © 2020 GeoBasis-DE/BKG.

Our basic example maneuvers for this track are: ParkNearGoal, FollowEgoLane, ChangeLane (one instantiation for left, another for right lane changes) and SafeStop. Lane following and both lane change behaviors are combined within a cost-based UrbanDriving arbitrator, whereas parking, urban driving and the safe stop fallback constitute the overall behavior using a priority-based AutomatedDriving arbitrator. Fig. 5 illustrates this arbitration graph.

Figure 5: Example arbitration graph, as used in our simulative experiments. Colors depict the state at point E: Grey: invocation condition false, dark green: active behavior branch, light green: utility (normalized inverse costs, see also Fig. 6).

This design has the following motivation. ParkNearGoal is only applicable in the vicinity of the goal and a nearby parking lot. Thus, as long as the ego vehicle is still on the route, FollowEgoLane is applicable and ChangeLaneLeft or ChangeLaneRight might be applicable. UrbanDriving will select the most promising one w.r.t. the expected average velocity, routing costs and lane change penalties. As soon as the vehicle approaches its goal, FollowEgoLane will bring it to a stop within the last lanelet. Then, ParkNearGoal will become applicable, be chosen by priority and lead the car into its parking lot. When the parking maneuver is finished, ParkNearGoal becomes inapplicable again. At that point, none of the UrbanDriving behaviors are applicable any more either, because the car has left the route. As a result, AutomatedDriving selects the lowest-priority behavior SafeStop. This is a good illustration of how the fallback behavior prevents undefined states and keeps the vehicle in a safe position.

B. Results

Fig. 3 shows the resulting behavior selection over time. The whole route takes 9:40 min and features the expected behavior characteristics. The vehicle starts leaving the campus area by following the lane. At intersection A, it changes to the right lane in order to take a turn into a north-east direction. At point B, it takes another right turn following the ego lane and has to change to the left lane. When approaching the next intersection C, the ego vehicle changes onto the exit lane in order to turn into south-east direction. At t = 339 s it approaches and passes the roundabout D.
Fig. 6 shows the two applicable behavior options at point E, where the route leads onto the “Adenauerring” again. The route continues with a right turn from the rightmost lane, while the ego vehicle is still in the leftmost lane. This is a suitable scenario to explain the cost-based arbitration in detail. The urban driving cost estimate incorporates the average expected travel velocity and routing costs, and penalizes lane changes:

$$J = \begin{cases} -\hat{v} + n_{\text{LCNeeded}} \cdot J_{\text{LCNeeded}}, & \text{without lane change} \\ -\hat{v} + n_{\text{LCNeeded}} \cdot J_{\text{LCNeeded}} + J_{\text{LCManeuver}}, & \text{otherwise} \end{cases}$$

Figure 6: FollowEgoLane and ChangeLaneRight maneuver corridors at point E. The route continues to the right at this point. As a result, the FollowEgoLane corridor ends in 74 m, while the ChangeLaneRight corridor has a length of 243 m.

As a simple, yet effective heuristic, we estimate $\hat{v}$, the expected average velocity of this maneuver, from the maneuver corridor length and speed limit as shown in Fig. 6. For routing, we charge each lane change needed to follow the route after this command with $J_{\text{LCNeeded}} = 10$ km/h. Lane change behaviors themselves are penalized with a lower $J_{\text{LCManeuver}} = 5$ km/h. Hence, the arbitrator generally prefers the follow lane behavior as long as it matches the route. As soon as one or multiple lane changes become necessary, the lane change maneuver becomes more favorable.

At point E, the behaviors have these costs:

$$J_{\text{FollowEgoLane}} = -25.0 + 1 \cdot 10.0 = -15.0$$
$$J_{\text{ChangeLaneRight}} = -33.4 + 0 \cdot 10.0 + 5.0 = -28.4$$

Consequently, the cost-based arbitrator chooses ChangeLaneRight, which has lower cost than FollowEgoLane, as also illustrated in Fig. 5.

An interesting part is directly after taking the right turn at point E from t = 422 s to t = 436 s. Here, the vehicle performs two consecutive lane changes in order to pass this two-lane road from the rightmost lane to the exit lane. This is especially noteworthy, as no double lane change or other hand-crafted behavior has been defined for such a scenario. The behavior emerges purely because the routing has been incorporated into the cost estimate.

The road leads back to the campus again, where the vehicle slows down and stops at the end of the route. Finally, the parking behavior becomes active and brings the car into its parking lot. After finishing the parking maneuver, the safe stop behavior is the last suitable option and keeps the car at a standstill.

Please also consider our video: youtu.be/qdIwchDGA_g

V. CONCLUSIONS AND FUTURE WORK

This publication presented the following contributions:

An extension to the hierarchical behavior-based arbitration concept proposed in [14]. We introduced a cost-based arbitration scheme that is helpful when multiple behavior options are applicable but have no clear and consistent priority among each other.

We have formulated a behavior generation stack for AVs based on the hierarchical behavior-based arbitration scheme. It consists of maneuvers for urban and highway environments, contains parking and emergency behaviors, and prevents undefined states with a fallback safe stop behavior.

We have shown the usefulness and applicability of our design in an explanatory evaluation on a simulated route.

The key advantages of the approach are:
• Scenario-specific solutions can be combined easily.
  In the experiments, five different behaviors have been employed to handle various scenarios, from four-way intersections, T-junctions and a roundabout to multi-lane bypass roads and parking.
• It supports different planning approaches.
  We utilized two different trajectory planners in our experiments. Urban corridor-based maneuvers used an optimization-based planner similar to [19], while the parking maneuver generated Hybrid Curvature trajectories with an RRT* motion planner [24]. Different approaches could also be used for the same behavior.
• The resulting behavior can be well explained.
  The strongly modular design significantly improves understandability compared to FSMs or classical behavior-based systems. Each invocation condition can be well understood; the selection logic of arbitrators is comprehensible. As a result, the hierarchical decision-making process can be well explained and traced over time.
• It can be iteratively extended by more behaviors.
  In order to add the parking behavior to our behavior generation, the definition of its invocation and commitment conditions was sufficient to add it to the AutomatedDriving arbitrator. Thanks to the strong decoupling, no changes to any other behavior block were necessary.
• The modularity supports robustness and efficiency.
  Each of the behavior blocks is self-contained, such that occurring failures are contained as well and do not affect the overall system stability. In case of a failure, the system will degrade seamlessly by ignoring this behavior option. Furthermore, the atomic structure allows behavior options to be evaluated in parallel to increase efficiency. Strong modularity has many more advantages, among others reusability and maintainability.
• Complex behavior emerges from simple components.
  Complex system behavior, such as multiple consecutive lane changes to approach an exit lane, emerges from the arbitration scheme without the need for hand-crafted decision or planning logic.

These benefits have led to a smooth development process with promising results, as outlined in section IV. Thus, we look forward to further enhancing the numerous existing behavior blocks, extending the behavior stack by e.g. our MIQP approach for cooperative zip merges [26] and, most excitingly, integrating this stack on our test vehicle Bertha.

REFERENCES

[1] S. Hoermann, F. Kunz, D. Nuss, S. Reuter, and K. Dietmayer, “Entering crossroads with blind corners. A safe strategy for autonomous vehicles,” in IEEE Intell. Veh. Symp., Jun. 2017.
[2] C. Hubmann, J. Schulz, M. Becker, D. Althoff, and C. Stiller, “Automated Driving in Uncertain Environments: Planning With Interaction and Uncertain Maneuver Prediction,” IEEE Trans. Intell. Veh., vol. 3, no. 1, Mar. 2018.
[3] M. Bouton, A. Nakhaei, K. Fujimura, and M. J. Kochenderfer, “Scalable Decision Making with Sensor Occlusions for Autonomous Driving,” in IEEE Int. Conf. on Robot. and Automation, May 2018.
[4] M. Naumann, H. Königshof, and C. Stiller, “Provably Safe and Smooth Lane Changes in Mixed Traffic,” in IEEE Intell. Transp. Syst. Conf., Oct. 2019.
[5] M. Buehler, K. Iagnemma, and S. Singh, Eds., The DARPA Urban Challenge: Autonomous Vehicles in City Traffic, red. by B. Siciliano, O. Khatib, and F. Groen, vol. 56, Springer Tracts in Advanced Robot. Berlin, Heidelberg: Springer, 2009.
[6] A. Bacha, C. Bauman, R. Faruque, M. Fleming, C. Terwelp, C. Reinholtz, et al., “Odin: Team VictorTango’s entry in the DARPA Urban Challenge,” J. of Field Robot., vol. 25, no. 8, 2008.
[7] M. Montemerlo, J. Becker, S. Bhat, H. Dahlkamp, D. Dolgov, S. Ettinger, et al., “Junior: The Stanford entry in the Urban Challenge,” J. of Field Robot., vol. 25, no. 9, 2008.
[8] J. Ziegler, P. Bender, M. Schreiber, H. Lategahn, T. Strauss, C. Stiller, et al., “Making Bertha Drive – An Autonomous Journey on a Historic Route,” IEEE Intell. Transp. Syst. Mag., vol. 6, no. 2, Sum. 2014.
[9] M. Aeberhard, S. Rauch, M. Bahram, G. Tanzmeister, J. Thomas, Y. Pilat, et al., “Experience, Results and Lessons Learned from Automated Driving on Germany’s Highways,” IEEE Intell. Transp. Syst. Mag., vol. 7, no. 1, Spr. 2015.
[10] B. Siciliano and O. Khatib, Eds., Springer Handbook of Robotics, Springer Handbooks, Springer International Publishing, 2016.
[11] R. Brooks, “A robust layered control system for a mobile robot,” IEEE J. on Robot. and Automation, vol. 2, no. 1, Mar. 1986.
[12] J. K. Rosenblatt, “DAMN: A distributed architecture for mobile navigation,” J. of Exp. & Theor. Artif. Intell., vol. 9, no. 2-3, 1997.
[13] P. Maes, “How to do the right thing,” Connection Science, vol. 1, no. 3, 1989.
[14] M. Lauer, R. Hafner, S. Lange, and M. Riedmiller, “Cognitive concepts in autonomous soccer playing robots,” Cogn. Syst. Res., vol. 11, no. 3, 2010.
[15] M. Naumann, F. Poggenhans, M. Lauer, and C. Stiller, “CoInCar-Sim: An Open-Source Simulation Framework for Cooperatively Interacting Automobiles,” in IEEE Intell. Veh. Symp., Jun. 2018.
[16] F. Poggenhans, J.-H. Pauls, J. Janosovits, S. Orf, M. Naumann, F. Kuhnt, et al., “Lanelet2: A high-definition map framework for the future of automated driving,” in Int. Conf. on Intell. Transp. Syst., 2018.
[17] A. Meyer, N. O. Salscheider, P. F. Orzechowski, and C. Stiller, “Deep Semantic Lane Segmentation for Mapless Driving,” in IEEE/RSJ Int. Conf. on Intell. Robots and Syst., Oct. 2018.
[18] P. Bender, Ö. Ş. Taş, J. Ziegler, and C. Stiller, “The combinatorial aspect of motion planning: Maneuver variants in structured environments,” in IEEE Intell. Veh. Symp., IEEE, 2015.
[19] J. Ziegler, P. Bender, T. Dang, and C. Stiller, “Trajectory planning for Bertha – A local, continuous method,” in IEEE Intell. Veh. Symp., Jun. 2014.
[20] B. Gutjahr, L. Gröll, and M. Werling, “Lateral Vehicle Trajectory Optimization Using Constrained Linear Time-Varying MPC,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 6, Jun. 2017.
[21] P. F. Orzechowski, A. Meyer, and M. Lauer, “Tackling Occlusions & Limited Sensor Range with Set-based Safety Verification,” in Int. Conf. on Intell. Transp. Syst., IEEE, Nov. 2018.
[22] J. Nilsson, M. Brännström, E. Coelingh, and J. Fredriksson, “Lane Change Maneuvers for Automated Vehicles,” IEEE Trans. Intell. Transp. Syst., vol. PP, no. 99, 2016.
[23] C. Hubmann, J. Schulz, G. Xu, D. Althoff, and C. Stiller, “A Belief State Planner for Interactive Merge Maneuvers in Congested Traffic,” in Int. Conf. on Intell. Transp. Syst., IEEE, Nov. 2018.
[24] H. Banzhaf, M. Dolgov, J. Stellet, and J. M. Zöllner, “From Footprints to Beliefprints: Motion Planning under Uncertainty for Maneuvering Automated Vehicles in Dense Scenarios,” in Int. Conf. on Intell. Transp. Syst., IEEE, Nov. 2018.
[25] M. Werling and D. Liccardo, “Automatic collision avoidance using model-predictive online optimization,” in IEEE Conf. on Decision and Control, 2012.
[26] C. Burger and M. Lauer, “Cooperative Multiple Vehicle Trajectory Planning using MIQP,” in Int. Conf. on Intell. Transp. Syst., IEEE, Nov. 2018.