
Eurographics / ACM SIGGRAPH Symposium on Computer Animation (2011)
A. Bargteil and M. van de Panne (Editors)

Scenario Space: Characterizing Coverage, Quality, and Failure of Steering Algorithms

Mubbasir Kapadia 1,2, Matt Wang 1, Shawn Singh 1,3, Glenn Reinman 1, Petros Faloutsos 1
1 University of California, Los Angeles   2 University of Pennsylvania   3 Google Inc.

Abstract

Navigation and steering in complex, dynamically changing environments is a challenging research problem and a fundamental aspect of immersive virtual worlds. While there exists a wide variety of approaches for navigation and steering, there is no definitive solution for evaluating and analyzing steering algorithms. Evaluating a steering algorithm involves two major challenges: (a) characterizing and generating the space of possible scenarios that the algorithm must solve, and (b) defining evaluation criteria (metrics) and applying them to the solution. In this paper, we address both of these challenges. First, we characterize and analyze the complete space of steering scenarios that an agent may encounter in dynamic situations. Then, we propose the representative scenario space and a sampling method that can generate subsets of the representative space with good statistical properties. We also propose a new set of metrics and a statistically robust approach to determining the coverage and the quality of a steering algorithm in this space. We demonstrate the effectiveness of our approach on three state-of-the-art techniques. Our results show that these methods can only solve 60% of the scenarios in the representative scenario space.

Categories and Subject Descriptors (according to ACM CCS): I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence—Multiagent Systems; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism—Animation; I.6.6 [Simulation and Modeling]: Simulation Output Analysis

1. Introduction

Immersive virtual worlds have quickly come to the forefront in both industry and academia, with applications in areas such as education, collaboration, urban design, and entertainment. A key aspect of immersion in virtual environments is the use of autonomous agents to inject life into these worlds. Autonomous agents require efficient, robust algorithms for navigation and steering in large, complex environments, where the space of all possible situations an agent is likely to encounter is intractable. The rich set of scenarios and corresponding steering choices has resulted in a large variety of techniques, each focused on tackling a subset of this problem. To our knowledge, there exists no definitive measure of the ability of a steering algorithm to successfully handle the space of all possible scenarios that it is likely to encounter in complex environments. This greatly limits future researchers and end-users in objectively evaluating and analyzing the current state of the art before choosing their own direction of exploration.

There are two key requirements for a comprehensive evaluation of a steering technique. First, we must be able to sufficiently sample the representative set of challenging situations that an agent is likely to encounter. Second, we need a way of scoring the success of an algorithm on a particular scenario that is meaningful on its own as well as in comparison with the scores of other approaches.
Previous approaches have addressed these issues with small sets of manually designed test cases and ad hoc, scenario-dependent criteria. In this paper, we address both of these challenges with rigorous, statistically-based approaches. We examine the complete space of possible scenarios that a steering algorithm may need to solve given a set of user-defined parameters, such as the size of the agents. After showing that an exhaustive sampling of this space is not practical, we propose the representative scenario space and an associated sampling method. Both the representative scenario space and the sampling method are constrained to produce test sets that favor complexity and avoid easy-to-solve cases. To evaluate a steering algorithm on a single scenario, we propose a set of metrics that can be normalized with respect to ideal values so as to become scenario independent. Based on these metrics, we then propose the concepts of coverage, average quality, and failure set, and show how they can be computed over the representative scenario space. Computing these concepts over an entire scenario space provides a rigorous, statistical view of an algorithm and can be used to evaluate a single approach or compare different approaches. In our opinion, our work is the first attempt to evaluate steering techniques in an automated and statistically sound fashion.

This paper makes the following contributions:

• We propose three concepts to statistically evaluate steering algorithms over a scenario space: coverage, average quality, and failure set.
• We define the space of all possible scenarios that an agent could encounter while steering and navigating in dynamically changing environments. In addition, we present a method of sufficiently sampling the representative scenarios in this space in order to effectively compute average quality and coverage for a particular steering algorithm.
• We provide a method of automatically determining a failure set for an algorithm – a subset of scenarios where the algorithm performs poorly based on some criteria. This provides an invaluable tool for users and AI developers in evaluating their own steering techniques.
• We demonstrate the effectiveness of our framework in analyzing four agent-based techniques: three state-of-the-art methods [KSHF09, SKHF11, vdBLM08], and one simple baseline algorithm that only reacts to the most immediate threat.

2. Related Work

There are three broad categories of analysis and evaluation of crowd simulations: (1) comparing simulations to real-world data, (2) performing user studies to determine whether the desired qualities of the simulation have been met and to manually detect the presence of anomalous behaviors, and (3) using statistical tools to analyze simulations. The real world and real human behavior are extremely complex, which makes it very difficult to compare a simulation to real events. Manual inspection of simulations is prone to human error and personal inclinations. Surveys [LS02, McF06] show that automated evaluation, especially for autonomous characters, is yet to be fully realized in the games industry. Hence, the focus of this work is the use of computational methods and statistical tools to analyze, evaluate, and test crowd simulations. Section 2.1 reviews the traditional methods adopted in sampling the space of scenarios. Section 2.2 describes the metrics used for evaluation.
Section 2.3 reviews some of the popular techniques used for steering. Section 2.4 describes our method in relation to prior work.

2.1. Benchmarks for Evaluation

Steering approaches, outlined in Section 2.3, are generally targeted at specific subsets of human steering behaviors and use their own custom test cases for evaluation and demonstration. The work in [SKN∗09] proposes a standard suite of test cases that represents a large variety of steering behaviors and is independent of the algorithm used. In addition, [SKFR09] provides a suite of tools and helper functions to allow AI developers to quickly get started with their own algorithms. However, even the 42 test cases described there cannot capture the large space of possible situations an agent will encounter in dynamic environments of realistic complexity.

2.2. Metrics for Evaluation

Prior work has proposed a rich set of application-specific metrics to evaluate and analyze crowd simulations. The work of [PSAB08] uses presence as a metric for crowd evaluation. The number of collisions and effort are often used as metrics to minimize when developing steering algorithms [ST05, GCC∗10]. The work in [HFV00] uses the "rate of people exiting a room" to analyze evacuation simulations. [LCSCO10] presents a data-driven approach for evaluating the behaviors of individuals within a simulated crowd. [RP07] describes a set of task-based metrics to evaluate the capability of a motion graph across a range of tasks and environments. The work in [SKN∗09, KSA∗09] proposes a rich set of derived metrics that provide an empirical measure of the performance of an algorithm. However, the values of these metrics (e.g., path length, total kinetic energy, total change in acceleration) are tightly coupled with the length and complexity of a scenario, which prevents users from interpreting them in a scenario-independent fashion.

2.3. Steering Approaches

Since the seminal work of [Rey87, Rey99], there has been growing interest in pedestrian simulation, with a wide array of techniques being tested and implemented. A comprehensive overview of related work in steering and navigation techniques can be found in [PAB08]. Centralized techniques [MRHA98, Lov94, Hen71] focus on the system as a whole, modeling the characteristics of the flow rather than individual pedestrians; they usually take a broader view of crowd behaviors as flows rather than focusing on individual, specialized agent behaviors. Decentralized approaches model the agent as an independent entity that performs collision avoidance with static obstacles, reacts to dynamic threats in the environment, and steers its way to its target. Particle-based approaches [Rey87, Rey99] model agents as particles and simulate crowds using basic particle dynamics. The social force model [HBJW05, BMOB03, BH97] solves Newton's equations of motion, simulating forces such as repulsion, attraction, friction, and dissipation for each agent. Rule-based approaches [LD04, LMM03, Rey99, RMH05, PAB07, SGA∗07, vdBPS∗08] use various conditions and heuristics to identify the exact situation of an agent. Data-driven methods use existing video data or motion capture to derive steering choices that are then used in virtual worlds (e.g., [LCHL07, LCL07]).
The works of [Feu00, PPD07] use predictions in the space-time domain to perform steering in environments populated with dynamic threats. Predicting potential threats ahead of time results in more realistic steering behaviors.

We use three state-of-the-art steering techniques as the basis for the analysis results shown in this paper. In addition, we also evaluate a purely reactive approach to steering to demonstrate the efficacy of our framework across a variety of steering approaches.

• Egocentric. The work in [KSHF09] proposes the use of egocentric affordance fields to model local, variable-resolution perception of agents in dynamic virtual environments. This method combines steering and local space-time planning to produce realistic steering behaviors in challenging local interactions as well as large-scale scenarios involving thousands of agents.
• PPR. The work in [SKHF11] presents a hybrid framework that combines reaction, prediction, and planning into one single framework.
• RVO. The work in [vdBLM08] proposes the use of reciprocal velocity obstacles as a linear model of prediction for collision avoidance in crowds.
• Reactive. This steering technique employs a simple finite state machine of rules to govern the behavior of an autonomous agent in a crowd. It is purely reactive in nature and does not employ any form of predictive collision avoidance. A description of the implementation of this technique can be found in [SKHF11].

2.4. Comparison to Related Work

Our work builds on and was inspired by SteerBench [SKN∗09] and [RP07]. The work in [RP07] presented a method of calculating the coverage of motion graphs for a set of animation and navigation benchmarks. SteerBench proposed an objective set of test cases and an ad hoc, automatic method of scoring the performance of steering algorithms. Its approximately 42 test cases provide a fixed and very sparse sampling of the scenario space. In this paper, we take a large step along this direction. First, we characterize the entire scenario space and propose a sampling-based approach to estimate, for the first time, the coverage of a steering algorithm. We also propose a new set of performance metrics and a robust statistical method for automatically analyzing the effectiveness of steering algorithms.

3. Scenario Space

Like real people, virtual agents make their steering decisions by considering their surrounding environment and their goals. The environment usually consists of static obstacles and other agents. In this section we describe how we represent all the elements of a steering problem, which we refer to as a scenario. We define a scenario as one possible configuration of obstacles and agents in the environment. The configuration of an obstacle is its position in the environment along with its bounding box (we assume rectangular obstacles). The configuration of an agent includes its initial position, target location, and desired speed. The configuration of agents and obstacles can be extended or modified to meet the needs of any application. The scenario space is defined as the space of all possible scenarios that an agent can encounter while steering in dynamic environments. The ratio of the subspace of scenarios that a steering algorithm can successfully handle is defined as the coverage of the algorithm.
An ideal steering algorithm would be able to successfully handle all the scenarios in this extremely high-dimensional space, thus having a coverage of 1. In order to determine the coverage of a steering algorithm, we need the ability to sample the scenario space in a representative fashion and to objectively determine the performance of an algorithm for a particular scenario. Section 3.1 describes a set of user-defined parameters used to define a space of scenarios. In Section 3.2, we describe the results of our experiment to determine the coverage of three steering algorithms in the complete space of scenarios. We observe that the value of coverage for each of these algorithms does not converge for even up to 10,000 sample points. Section 3.3 describes a set of constraints that are imposed on the complete scenario space to define the space of representative scenarios. We observe rapid convergence of the coverage of steering algorithms in the representative scenario space.

Figure 1: The success rate of the four algorithms in the complete (a) and representative (b) scenario space vs. the number of samples (size) of the test set.

Figure 2: Figures (a)-(d): scenarios randomly generated in the complete scenario space. A black line indicates an agent's optimal path to the goal. Figures (e)-(h): scenarios randomly generated in the representative scenario space. Our sampling process ensures that all agents interact with the reference agent (in blue), which is always placed in the center of the environment.

3.1. Parametrization of Scenario Space

The space of all scenarios is determined by the number of obstacles and agents, the size of the environment, and the size of obstacles. A user may wish to test a steering algorithm on local interactions between agents in small environments with 2 or 3 agents. Alternatively, a user may wish to stress test an algorithm on large environments with a large distribution of agents and obstacles. We expose these parameters to users to allow them to define a space of scenarios that meets the needs of their application. The set of parameters, P, is defined as follows:

• Environment size. The size of the environment is defined as the radial distance, r, from the egocentric agent that is positioned at the center of the environment.
• Obstacle discretization. Obstacles are represented by a grid of rectangular blocks that are either on or off. The size of these blocks is determined by two parameters: the resolution in X, dx, and the resolution in Y, dy. These values specify how many cells exist within the width and height of the environment as determined by the radial distance r defined above.
• Number of agents. The number of agents in a scenario is governed by two user-defined parameters: the minimum and maximum number of agents (nmin, nmax).
• Target speed of agents. Some steering algorithms can specify a target speed for an agent. The range of possible values is determined by a minimum and maximum speed parameter (smin, smax).

Given a specific set of parameter values P that define a space of scenarios, we can procedurally or randomly sample scenarios with initial configurations of obstacles and agents that lie in that scenario space. A sketch of such a sampler is given below.
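For concreteness, the following Python sketch shows one way the scenario parametrization and random sampling described above could be realized. The class and function names (ScenarioParams, sample_scenario, etc.) and the obstacle_density parameter are our own illustrative choices, not part of the authors' released code, and the geometric checks are reduced to simple grid tests. The validity constraints of Section 3.2 (collision-free, solvable) are omitted here.

```python
import math
import random
from dataclasses import dataclass, field

@dataclass
class ScenarioParams:
    """User-defined parameters P that define a scenario space."""
    r: float = 7.0      # environment radius
    dx: int = 10        # obstacle grid resolution in X
    dy: int = 10        # obstacle grid resolution in Y
    n_min: int = 3      # minimum number of agents
    n_max: int = 6      # maximum number of agents
    s_min: float = 1.0  # minimum desired speed (m/s)
    s_max: float = 2.7  # maximum desired speed (m/s)

@dataclass
class AgentConfig:
    position: tuple      # initial (x, y)
    orientation: float   # initial heading in [0, 2*pi)
    goal: tuple          # target (x, y)
    desired_speed: float

@dataclass
class Scenario:
    obstacle_grid: list  # dx x dy booleans: cell blocked or free
    agents: list = field(default_factory=list)

def cell_of(pos, p: ScenarioParams):
    """Map a world position in [-r, r]^2 to an obstacle-grid cell."""
    ix = min(p.dx - 1, int((pos[0] + p.r) / (2 * p.r) * p.dx))
    iy = min(p.dy - 1, int((pos[1] + p.r) / (2 * p.r) * p.dy))
    return ix, iy

def random_free_position(grid, p: ScenarioParams):
    """Rejection-sample a position inside the environment radius and off obstacles."""
    while True:
        x = random.uniform(-p.r, p.r)
        y = random.uniform(-p.r, p.r)
        ix, iy = cell_of((x, y), p)
        if math.hypot(x, y) <= p.r and not grid[ix][iy]:
            return (x, y)

def sample_scenario(p: ScenarioParams, obstacle_density=0.15) -> Scenario:
    """Randomly sample one scenario in the complete space S(P)."""
    grid = [[random.random() < obstacle_density for _ in range(p.dy)]
            for _ in range(p.dx)]
    scenario = Scenario(obstacle_grid=grid)
    for _ in range(random.randint(p.n_min, p.n_max)):
        scenario.agents.append(AgentConfig(
            position=random_free_position(grid, p),
            orientation=random.uniform(0.0, 2.0 * math.pi),
            goal=random_free_position(grid, p),
            desired_speed=random.uniform(p.s_min, p.s_max)))
    return scenario
```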
3.2. The Complete Scenario Space

The complete scenario space, S(P), represents all the possible scenarios that can be generated for a particular set of user-defined parameters P. In order to prevent sampling of invalid scenarios that have no solution, we place certain validity constraints on the scenario space:

• Collision-free. The initial configurations of obstacles and agents must not be in a state of collision.
• Solvable. There must exist a valid path taking each agent from its initial position to its target location.

The space S(P) is infinite and cannot be sampled exhaustively. Instead, we aim to find a representative set of samples that describes this space sufficiently. To determine whether we can generate such a set, we first perform a random sampling experiment in S(P) where P = {r = 7, dx = dy = 10, nmin = 3, nmax = 6, smin = 1, smax = 2.7}. A scenario is randomly generated as follows. First, we generate the obstacles by randomly turning cells of the obstacle grid on or off. Next, we select the number of agents to simulate by randomly sampling the range defined above. For each agent, we choose a random obstacle-free position and orientation. We also choose a random obstacle-free position for each agent's goal. All positions are chosen within the radius r and all orientations are sampled uniformly within [0, 2π).

The performance of an algorithm for a scenario is evaluated as a boolean measure of whether or not it could complete the scenario. A scenario is said to be successfully completed if all agents reach their goals within a time threshold without any collisions. The coverage of an algorithm is the ratio of all scenarios that it could successfully complete. In this experiment we iteratively increase the number of sample points from N = 100 to 10,000. The results are illustrated in Figure 1(a). We observe that the coverage of an algorithm fluctuates between 0.9 and 0.95 and does not converge within reasonable bounds. Also, the minimum coverage of the three reference algorithms is quite high (> 0.9). Similarly, even the baseline reactive algorithm seems to perform well, with a coverage of approximately 0.89. These observations suggest that the experiments contain many trivial or easy scenarios that greatly skew the computed measure of coverage and affect its convergence. To get a better picture of the areas of the scenario space in which algorithms may have trouble succeeding, we propose the Representative Scenario Space and an egocentric evaluation method, which are described below. A sketch of this Monte Carlo coverage estimate follows.
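The sketch below illustrates, under the same assumptions as the previous listing, how coverage can be estimated by Monte Carlo sampling and how its convergence can be monitored as the number of samples grows. The simulate_scenario hook stands in for running an actual steering algorithm and returning whether every agent reached its goal in time without collisions; it is a placeholder of this sketch, not an API of any of the evaluated frameworks.

```python
import random
from typing import Callable

def estimate_coverage(sample_scenario: Callable[[], object],
                      simulate_scenario: Callable[[object], bool],
                      n_samples: int) -> float:
    """Monte Carlo estimate of coverage: the fraction of sampled
    scenarios the algorithm completes successfully."""
    successes = sum(1 for _ in range(n_samples)
                    if simulate_scenario(sample_scenario()))
    return successes / n_samples

def coverage_convergence(sample_scenario, simulate_scenario,
                         sizes=(100, 500, 1000, 5000, 10000)):
    """Re-estimate coverage at increasing test-set sizes, mirroring
    the convergence experiment summarized in Figure 1."""
    return {n: estimate_coverage(sample_scenario, simulate_scenario, n)
            for n in sizes}

# Example usage with a dummy simulator that "solves" 90% of scenarios,
# purely to exercise the convergence bookkeeping.
if __name__ == "__main__":
    curve = coverage_convergence(lambda: None,
                                 lambda s: random.random() < 0.9)
    for n, c in curve.items():
        print(f"N = {n:6d}  estimated coverage = {c:.3f}")
```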
3.3. The Representative Scenario Space

We eliminate trivial scenarios by applying the following constraints on the complete scenario space and the associated sampling method:

• Reference agent. The first agent is always placed at the origin of the environment and is known as the reference agent. The scenario is evaluated with respect to the reference agent.
• Goals and orientations. The goal of an agent is restricted to one of 8 choices located at the boundary of the scenario. The agent's initial orientation always points towards its goal.
• Agent spatial positions. Instead of uniformly sampling the space for agent positions, we model the probability of a location ~x in the environment being sampled using a normal distribution N(~x; ~µ = ~O, σ² = 0.4) centered at the origin ~O. This implies that agents are more likely to be placed closer to the origin, i.e., closer to the reference agent, which increases the likelihood of interaction between agents.
• Agent interactions. We place a constraint on the configuration of an agent placed in the scenario to ensure that it interacts with the reference agent. We compute an optimal path (using A*) for the agent from its start position to its goal. If the planned path of the agent intersects the planned path of the reference agent in space and time (we assume constant speed of motion along the optimal path), then the agent is considered relevant and is placed in the scenario.
• Agent speeds. Instead of varying the desired speed of agents, we keep it constant (1.7 m/s), as we observe that desired speed variations do not have a large impact on the resulting behavior of most steering approaches.

The resulting space of scenarios that meet these constraints is the representative scenario space, denoted by R(P). We also change the evaluation of a scenario to be with respect to the reference agent alone. Hence, an algorithm is successful on a scenario if the reference agent reaches its goal and there are no collisions with other agents. We run the same sampling experiment described above in the representative scenario space (Figure 1(b)). We observe convergence of coverage between N = 5,000 and 10,000. We also observe that the coverage of the algorithms is much lower: the three reference algorithms can only complete approximately half of the scenarios sampled. We also see a much larger difference in the coverage of the baseline reactive algorithm in comparison to the three reference algorithms, as one would expect.

Algorithm     S(P)     R(P)     SteerBench
PPR           0.919    0.583    0.86 (36/42)
Egocentric    0.915    0.568    0.86 (36/42)
RVO           0.931    0.591    0.86 (36/42)
Reactive      0.887    0.459    0.83 (35/42)

Figure 3: The estimated coverage of the steering algorithms in the complete space S(P), the representative space R(P), and the 42 cases of SteerBench [SKN∗09].

Figure 3 compares the coverage of the algorithms in S(P), R(P), and on the test cases provided by SteerBench [SKN∗09]. The algorithms have very high coverage in both S(P) and SteerBench; the reactive algorithm fails in only one more scenario than the other three steering techniques across the 42 test cases that SteerBench provides. In contrast, the scenarios generated in R(P) are much more challenging, which is reflected in low coverage values and a much larger difference between the baseline reactive technique and the three more sophisticated ones.

In conclusion, we can make two important observations. First, the representative space sampled with our constrained sampling technique can produce test sets that expose the difficulties of steering algorithms. Second, approximately 10,000 samples appear to be enough for analyzing an algorithm, as indicated by the convergence of the coverage of the four algorithms. A sketch of the constrained sampling procedure follows.
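Below is a minimal sketch of the constrained sampling used to build R(P), assuming the same 2D setup as the earlier listings. Two simplifications are worth flagging: the interaction test steps agents along straight-line, constant-speed paths rather than the A* paths used in the paper, and obstacle placement is omitted entirely. All function names and the agent radius are illustrative.

```python
import math
import random

SIGMA = math.sqrt(0.4)   # standard deviation of the placement Gaussian (variance 0.4)
SPEED = 1.7              # constant desired speed (m/s)

def boundary_goals(r, k=8):
    """The k candidate goals evenly spaced on the environment boundary."""
    return [(r * math.cos(2 * math.pi * i / k),
             r * math.sin(2 * math.pi * i / k)) for i in range(k)]

def gaussian_position(r):
    """Sample a position from N(mu = origin, sigma^2 = 0.4), clipped to the environment."""
    while True:
        x, y = random.gauss(0.0, SIGMA), random.gauss(0.0, SIGMA)
        if math.hypot(x, y) <= r:
            return (x, y)

def paths_interact(start_a, goal_a, start_b, goal_b,
                   speed=SPEED, radius=0.5, dt=0.1):
    """Approximate space-time interaction test: step both agents along
    straight-line constant-speed paths and check for a close approach.
    (The paper plans A* paths through the obstacle grid; straight lines
    are a simplification of this sketch.)"""
    def pos(start, goal, t):
        dx, dy = goal[0] - start[0], goal[1] - start[1]
        dist = math.hypot(dx, dy) or 1e-9
        f = min(1.0, speed * t / dist)
        return (start[0] + f * dx, start[1] + f * dy)
    da = math.hypot(goal_a[0] - start_a[0], goal_a[1] - start_a[1])
    db = math.hypot(goal_b[0] - start_b[0], goal_b[1] - start_b[1])
    t, t_max = 0.0, max(da, db) / speed
    while t <= t_max:
        pa, pb = pos(start_a, goal_a, t), pos(start_b, goal_b, t)
        if math.hypot(pa[0] - pb[0], pa[1] - pb[1]) < 2 * radius:
            return True
        t += dt
    return False

def sample_representative_agents(r, n_agents, max_tries=10000):
    """Place the reference agent at the origin, then add agents whose planned
    paths interact with the reference agent's path (rejection sampling)."""
    goals = boundary_goals(r)
    ref_start, ref_goal = (0.0, 0.0), random.choice(goals)
    agents = [(ref_start, ref_goal)]
    tries = 0
    while len(agents) < n_agents + 1 and tries < max_tries:
        tries += 1
        start, goal = gaussian_position(r), random.choice(goals)
        if paths_interact(start, goal, ref_start, ref_goal):
            agents.append((start, goal))
    return agents
```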
4. Evaluation Criteria

We evaluate a scenario by computing three primary metrics that quantify the success of the egocentric agent in completing the scenario. These metrics characterize whether or not the egocentric agent successfully reached its goal, the total time it took to reach its goal, and the total distance traveled in reaching the goal. By defining each metric as a ratio to its optimal value, we can compare and evaluate these metrics on an absolute scale.

• Scenario completion. For an algorithm a and a scenario s, if the reference agent reaches its goal within the time limit without colliding with any agents or obstacles, the scenario is said to have been successfully completed. In this case m_c(s, a) = 1; otherwise m_c(s, a) = 0.
• Path length. The path length m_l(s, a) is the total distance traveled by the egocentric agent to reach its goal.
• Total time. The total time m_t(s, a) is the time taken by the egocentric agent to reach its goal.

In addition, we compute optimal values of path length and total time to serve as an absolute reference that can be used to normalize the values of m_l(s, a) and m_t(s, a). The optimal path length, m^opt_l(s, a), and optimal time, m^opt_t(s, a), are the path length and time taken to travel along an optimal path to the goal by an algorithm a for a particular scenario s, ignoring neighboring agents. Using the optimal values, we can compute the ratio for a particular metric m(s, a) as follows:

    m^r(s, a) = (m^opt(s, a) / m(s, a)) × m_c(s, a).    (1)

The value of m^r(s, a) is equal to 1 when the value of the metric is equal to its optimal value, and is close to 0 when the value is far away from its optimal value. Also, m^r(s, a) is only computed when the scenario has been successfully completed. Using Equation 1, we can compute m^r_l(s, a) and m^r_t(s, a) to effectively quantify the performance of a steering algorithm for a particular scenario in a way that can be compared across algorithms and scenarios.

5. Coverage, Average Quality and Failure Set

In this section, we show how we use our representative scenario space and evaluation criteria to derive a set of well-defined, statistical measures that characterize key aspects of a steering algorithm.

Scenario set. The scenario set S^a_m(T1, T2) for an algorithm a and a metric m is defined as the subset of all scenarios within the representative space for which the value of m(s, a) lies in the range [T1, T2):

    S^a_m(T1, T2) = {s | s ∈ R(P) ∧ T1 ≤ m(s, a) < T2}.    (2)

Using only T1, we can find the success set of an algorithm as the set of scenarios for which the metric is greater than a threshold. Similarly, using only T2 allows us to define a failure set of an algorithm. The common failure set S_m(0, Tmin) for a set of algorithms A is the intersection of the failure sets S^a_m(0, Tmin) of all evaluated steering algorithms:

    S_m(0, Tmin) = ∩_{a ∈ A} S^a_m(0, Tmin).    (3)

The common failure set can be used to identify particularly difficult scenarios.

Coverage. The coverage c^a_m of a steering algorithm a is the ratio of the subset of scenarios in the scenario space that the algorithm can successfully handle with respect to a particular metric m(s, a):

    c^a_m = |S^a_m(Tmax, 1)| / |R(P)|,    (4)

where |S| denotes the cardinality of the set S.

Average quality. The average quality of a steering algorithm for a particular metric can similarly be computed as the average value of m(s, a) over all sampled scenarios:

    q^a_m = ( Σ_{s ∈ S^a_m(Tmax, 1)} m(s, a) ) / |R(P)|.    (5)

Using Equations 4 and 5, we can compute coverage and average quality for m_c(s, a), m^r_l(s, a) and m^r_t(s, a). Note that coverage and average quality for m_c(s, a) are the same, since it is a boolean value. The three concepts defined in this section provide a rigorous and objective statistical view of a steering algorithm. They can be intuitively used to evaluate the effectiveness of a single algorithm or to compare different approaches. A small sketch of these computations is given below.
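The per-scenario normalization of Equation 1 and the aggregate statistics of Equations 4 and 5 reduce to a few lines of code. The sketch below assumes each simulation run is summarized in a small record (completion flag, path length, time, and their optimal counterparts); the record fields and thresholds are illustrative, not taken from the authors' implementation, and the success interval is treated as inclusive of 1 so that exactly optimal runs count.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RunRecord:
    """Outcome of running one algorithm on one scenario (illustrative fields)."""
    completed: bool         # m_c(s, a)
    path_length: float      # m_l(s, a)
    total_time: float       # m_t(s, a)
    opt_path_length: float  # m_l^opt(s, a)
    opt_total_time: float   # m_t^opt(s, a)

def ratio_metric(value: float, optimal: float, completed: bool) -> float:
    """Equation 1: m^r = (m^opt / m) * m_c; 1 means optimal, 0 means failure."""
    if not completed or value <= 0.0:
        return 0.0
    return optimal / value

def coverage(values: List[float], t_max: float) -> float:
    """Equation 4: fraction of scenarios whose metric lies in [t_max, 1]."""
    return sum(1 for v in values if v >= t_max) / len(values)

def average_quality(values: List[float], t_max: float) -> float:
    """Equation 5: sum of the metric over the success set, divided by |R(P)|."""
    return sum(v for v in values if v >= t_max) / len(values)

def evaluate(records: List[RunRecord], t_path: float, t_time: float):
    """Aggregate coverage and average quality for the three metrics over a
    sampled test set; t_path and t_time are user-chosen thresholds (the paper
    uses the mean average quality of the three reference algorithms)."""
    m_c = [1.0 if r.completed else 0.0 for r in records]
    m_l = [ratio_metric(r.path_length, r.opt_path_length, r.completed) for r in records]
    m_t = [ratio_metric(r.total_time, r.opt_total_time, r.completed) for r in records]
    return {
        "completion": coverage(m_c, 1.0),  # coverage == average quality for a boolean metric
        "path_ratio": (coverage(m_l, t_path), average_quality(m_l, t_path)),
        "time_ratio": (coverage(m_t, t_time), average_quality(m_t, t_time)),
    }
```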
They can be intuitively used to evaluate the effectiveness of a single algorithm or to compare different approaches. 6. Results Using the concepts and evaluation method proposed in previous sections, we can now analyze and compare our four steering algorithms. All algorithms are tested on the same set of 10,000 scenarios randomly selected from the representative scenario space, R(P), with user defined parameters P = {r = 7, dx = dy = 10, nmin = 3, nmax = 6, s = 2.7}. In Section 3.3 we showed that the success rate of the four algorithms converges for test sets with 5,000-10,000 samples in the representative scenario space. This is a good indication that a test set of size 10,000 should be sufficient for our analysis. It takes our system a few minutes to run 10,000 samples (depending on the performance of the steering algorithm). 6.1. Coverage and Average Quality The coverage and average quality for each algorithm for all three metrics are given in Table 5 and Table 6. Note that the values of mrl (s, a) and mrt (s, a) are only considered when the algorithm successfully completes the scenario, i.e. mc (s, a) = 1. To compute coverage for mrl (s, a) and mrt (s, a), we specify the thresholds equal to the mean of the average quality for each metric computed for the three algorithms (Reactive is not considered). Thus, the coverage gives us a measure of the ratio of the number of scenarios that are above the average quality measure for that metric. Algorithm PPR Egocentric RVO Reactive mrl (s, a) 0.789 0.723 0.743 0.617 mrt (s, a) 0.683 0.63 0.731 0.586 Figure 5: The average quality qam of the steering algorithms for ratio to optimal path length, mrl (s, a), and ratio to optimal total time, mrt (s, a). Observations. We observe that the average quality of the algorithms for path length, mrl (s, a), is approximately 0.75. c The Eurographics Association 2011. Algorithm PPR Egocentric RVO Reactive ms (s, a) 0.583 0.568 0.591 0.459 mrl (s, a) 0.748 0.681 0.762 0.212 mrt (s, a) 0.608 0.515 0.662 0.178 Figure 6: The coverage cam of the steering algorithms for the three metrics. This implies that the three algorithms generally produce solutions with path lengths that are 75% of the optimal values. In contrast, the average quality of algorithms for total time, mrt (s, a), is approximately 0.68 which is considerably lower. This is because steering algorithms generally prefer to slow down instead of deviating from their planned paths. When comparing PPR and RVO, we notice that PPR has a better quality measure for path length than time. This is because PPR has a greater proclivity for predictively avoiding dynamic threats by slowing down if it anticipates a collision. Due to the variable resolution nature of the perception fields modeled in Egocentric, the trajectories produced by this method are curved and produce less optimal results. The performance of Reactive is reflected in its measure of coverage. We observe that Reactive can only solve 45% of the scenarios (compared to nearly 60% for the other 3 algorithms), and that only 20% of its solutions are above the average quality measure. 6.2. Failure Set The coverage and average quality provide a good aggregate measure of the performance of an algorithm over a large sample of scenarios and serve as a good basis of comparison. However, it is particularly useful to be able to automatically generate scenarios of interest where an algorithm performs poorly. 
Our framework automatically computes a failure set for an algorithm as the set of all scenarios where a particular metric falls below a threshold. Figure 7(a) and (b) show the number of scenarios for which m^r_l(s, a) and m^r_t(s, a) fall within a specified range. The set S^a_m(0, 0) clusters all scenarios for which the algorithm failed to find a solution (m_c(s, a) = 0). The set S^a_m(1, 0) counts the scenarios for which the algorithm produced optimal solutions for m^r_l(s, a) or m^r_t(s, a); the small number of samples in this cluster indicates that the scenarios produced in the representative space are challenging and require complex interactions between agents. The sets S^a_m(0, 0.3) and S^a_m(0.3, 0.6) represent scenarios for which a steering algorithm generated highly sub-optimal solutions.

We also compute the common failure set S_m(0, 0) of all four steering algorithms: the set of scenarios for which no steering algorithm could find a solution. In these cases, the agents either reach a deadlock and time out, or reach their goals by colliding with other agents. Figure 4 highlights some particularly challenging scenarios that fall within the common failure set; note that the narrow passageways in the figure are traversable. For 10,000 sample points, the cardinality of the common failure set is |S_m(0, 0)| = 1,710. This means that 17% of the sampled scenarios could not be successfully handled by any steering approach.

Figure 4: Challenging scenarios sampled in the representative space that resulted in collisions or no solution.

Figure 7: Failure sets of each algorithm for the ratio to optimal path length m^r_l(s, a) and the ratio to optimal time m^r_t(s, a).

By visually inspecting these scenarios, we arrive at the following generalization of particularly challenging scenarios (a sketch of the failure-set bookkeeping follows the list):

• Series of sharp turns. Narrow passageways where agents had to make a sequence of sharp turns often resulted in soft collisions.
• Complex interactions. Scenarios where the reference agent was forced to interact with multiple crossing and oncoming threats in the presence of obstacles often resulted in failure.
• Deadlocks. In certain situations, agents need communication and space-time planning to effectively cooperate on resolving a situation, such as one agent backing all the way up in a very narrow passage to allow another agent to pass first.
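To make the failure-set bookkeeping concrete, the following sketch bins per-scenario metric values into ranges like those reported in Figure 7 and intersects the per-algorithm failure sets to obtain the common failure set of Equation 3. The data layout (algorithm name mapped to a list of metric values indexed by scenario) and the bin edges are assumptions of this sketch.

```python
from typing import Dict, List, Set

def scenario_set(values: List[float], t1: float, t2: float) -> Set[int]:
    """Equation 2: indices of scenarios whose metric value lies in [t1, t2)."""
    return {i for i, v in enumerate(values) if t1 <= v < t2}

def failure_set(values: List[float], t_min: float = 1e-9) -> Set[int]:
    """Scenarios the algorithm effectively failed on (metric below t_min)."""
    return scenario_set(values, 0.0, t_min)

def common_failure_set(results: Dict[str, List[float]]) -> Set[int]:
    """Equation 3: intersection of the failure sets of all evaluated algorithms."""
    sets = [failure_set(vals) for vals in results.values()]
    return set.intersection(*sets) if sets else set()

def bin_counts(values: List[float], edges=(0.0, 0.3, 0.6, 1.0)) -> Dict[str, int]:
    """Histogram of metric values over sub-optimality ranges (cf. Figure 7)."""
    counts = {f"[{a}, {b})": len(scenario_set(values, a, b))
              for a, b in zip(edges, edges[1:])}
    counts["optimal (= 1)"] = sum(1 for v in values if v >= 1.0)
    return counts

# Example: two hypothetical algorithms evaluated on five scenarios.
if __name__ == "__main__":
    results = {"A": [0.0, 0.8, 0.0, 1.0, 0.4],
               "B": [0.0, 0.5, 0.2, 1.0, 0.0]}
    print(common_failure_set(results))  # scenarios no algorithm solved
    print(bin_counts(results["A"]))
```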
Steering approaches usually time out in these cases or allow collisions so that agents can push through the deadlocks. The work in [TCP06, GCC∗ 10] optimizes metrics such as c The Eurographics Association 2011. Mubbasir Kapadia, Matt Wang, Shawn Singh, Glenn Reinman, Petros Faloutsos / Scenario Space path length, time, and effort in order to generate collisionfree trajectories in multi-agent simulations. It would be particularly interesting to see if steering methods that are based on optimality considerations have better coverage and quality using our method of evaluation. Another factor contributing to the low coverage of the evaluated methods is the nonholonomic control of the agents. Many nuanced locomotion capabilities of humans such as sidestepping and careful foot placement are not modeled by these approaches, which greatly limits their ability to handle challenging scenarios. Recent work in navigation [SKRF11] has addressed these limitations in an effort to better model the locomotion of virtual humans. However, modeling agents as discs is still common practice in interactive applications such as games. Our approach can be extended to handle different types of locomotion. This paper analyzes steering algorithms based on a particular parameterization of the scenario space that focuses on interactions between a small number of proximate agents. Further investigation is needed in order to determine the sensitivity of the evaluation based on these parameters. In addition, applications may require different scenario spaces, for example situations involving large crowds in urban environments. It would be particularly beneficial to design a specification language whereby users can specify and generate benchmarks that meet their requirements. Our current approach performs random sampling in this space in order to calculate the coverage of an algorithm. In the future, we would like to investigate adaptive sampling methods that use our evaluation criteria to identify and sample more densely areas of interest. Further analysis is also required to automatically cluster and generalize scenarios that are challenging for steering algorithms. Defining sub-spaces in this extremely high dimensional space that are of interest to the research community can prove valuable in the development of the next generation of steering techniques. 8. Acknowledgements [Feu00] F EURTEY F.: Simulating the Collision Avoidance Behavior of Pedestrians. Master’s thesis, The University of Tokyo, School of Engineering, 2000. [GCC∗ 10] G UY S. J., C HHUGANI J., C URTIS S., D UBEY P., L IN M., M ANOCHA D.: Pledestrians: a least-effort approach to crowd simulation. In Proceedings of the 2010 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (Aire-la-Ville, Switzerland, Switzerland, 2010), SCA ’10, Eurographics Association, pp. 119–128. [HBJW05] H ELBING D., B UZNA L., J OHANSSON A., W ERNER T.: Self-organized pedestrian crowd dynamics: Experiments, simulations, and design solutions. Transportation Science 39, 1 (2005), 1–24. [Hen71] H ENDERSON L. F.: The statistics of crowd fluids. Nature 229, 5284 (February 1971), 381–383. [HFV00] H ELBING D., FARKAS I., V ICSEK T.: Simulating dynamical features of escape panic. NATURE 407 (2000), 487. [KSA∗ 09] K APADIA M., S INGH S., A LLEN B., R EINMAN G., FALOUTSOS P.: Steerbug: an interactive framework for specifying and detecting steering behaviors. In SCA ’09: Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (2009), ACM, pp. 209–216. 
[KSHF09] K APADIA M., S INGH S., H EWLETT W., FALOUTSOS P.: Egocentric affordance fields in pedestrian steering. In I3D ’09: Proceedings of the 2009 symposium on Interactive 3D graphics and games (2009), ACM, pp. 215–223. [LCHL07] L EE K. H., C HOI M. G., H ONG Q., L EE J.: Group behavior from video: a data-driven approach to crowd simulation. In SCA ’07: Proceedings of the 2007 ACM SIGGRAPH/Eurographics symposium on Computer animation (Aire-la-Ville, Switzerland, Switzerland, 2007), Eurographics Association, pp. 109–118. [LCL07] L ERNER A., C HRYSANTHOU Y., L ISCHINSKI D.: Crowds by example. Computer Graphics Forum 26, 3 (September 2007), 655–664. [LCSCO10] L ERNER A., C HRYSANTHOU Y., S HAMIR A., C OHEN -O R D.: Context-dependent crowd evaluation. Comput. Graph. Forum 29, 7 (2010), 2197–2206. [LD04] L AMARCHE F., D ONIKIAN S.: Crowd of virtual humans: a new approach for real time navigation in complex and structured environments. In Computer Graphics Forum 23. (2004). [LMM03] L OSCOS C., M ARCHAL D., M EYER A.: Intuitive crowd behaviour in dense urban environments using local laws. In TPCG ’03: Proceedings of the Theory and Practice of Computer Graphics 2003 (Washington, DC, USA, 2003), IEEE Computer Society, p. 122. The work in this paper was partially supported by Intel through a Visual Computing grant, and the donation of a 32core Emerald Ridge system with Xeon processors X7560. In particular we would like to thank Randi Rost, and Scott Buck from Intel for their support. [Lov94] L OVAS G.: Modeling and simulation of pedestrian traffic flow. In Transportation Research Record (1994), pp. 429–443. References [McF06] M C FADDEN C.: Improving the QA Process, 2006. Games Developers Conference, Round Table. [BH97] B ROGAN D. C., H ODGINS J. K.: Group behaviors for systems with significant dynamics. Auton. Robots 4, 1 (1997), 137–153. [MRHA98] M ILAZZO J., ROUPHAIL N., H UMMER J., A LLEN D.: The effect of pedestrians on the capacity of signalized intersections. In Transportation Research Record (1998), pp. 37–46. [BMOB03] B RAUN A., M USSE S. R., O LIVEIRA L. P. L. D ., B ODMANN B. E. J.: Modeling individual behaviors in crowd simulation. In CASA ’03: Proceedings of the 16th International Conference on Computer Animation and Social Agents (CASA 2003) (Washington, DC, USA, 2003), IEEE Computer Society, p. 143. [PAB07] P ELECHANO N., A LLBECK J. M., BADLER N. I.: Controlling individual agents in high-density crowd simulation. In SCA ’07: Proceedings of the 2007 ACM SIGGRAPH/Eurographics symposium on Computer animation (Aire-la-Ville, Switzerland, Switzerland, 2007), Eurographics Association, pp. 99–108. c The Eurographics Association 2011. [LS02] L LOPIS N., S HARP B.: By the Books: Solid Software Engineering for Games, 2002. Games Developers Conference, Round Table. Mubbasir Kapadia, Matt Wang, Shawn Singh, Glenn Reinman, Petros Faloutsos / Scenario Space [PAB08] P ELECHANO N., A LLBECK J. M., BADLER N. I.: Virtual Crowds: Methods, Simulation, and Control. Synthesis Lectures on Computer Graphics and Animation. Morgan & Claypool Publishers, 2008. [PPD07] PARIS S., P ETTRÉ J., D ONIKIAN S.: Pedestrian reactive navigation for crowd simulation: a predictive approach. In EUROGRAPHICS 2007 (2007), vol. 26, pp. 665–674. [PSAB08] P ELECHANO N., S TOCKER C., A LLBECK J., BADLER N.: Being a part of the crowd: towards validating vr crowds using presence. In Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems Volume 1 (2008), AAMAS ’08, pp. 136–142. 
[Rey87] R EYNOLDS C. W.: Flocks, herds and schools: A distributed behavioral model. In SIGGRAPH ’87: Proceedings of the 14th annual conference on Computer graphics and interactive techniques (1987), ACM, pp. 25–34. [Rey99] R EYNOLDS C.: Steering behaviors for autonomous characters, 1999. [RMH05] RUDOMÍN I., M ILLÁN E., H ERNÁNDEZ B.: Fragment shaders for agent animation using finite state machines. Simulation Modelling Practice and Theory 13, 8 (2005), 741–751. [RP07] R EITSMA P. S. A., P OLLARD N. S.: Evaluating motion graphs for character animation. ACM Trans. Graph. 26 (October 2007). [SGA∗ 07] S UD A., G AYLE R., A NDERSEN E., G UY S., L IN M., M ANOCHA D.: Real-time navigation of independent agents using adaptive roadmaps. In VRST ’07: Proceedings of the 2007 ACM symposium on Virtual reality software and technology (2007), ACM, pp. 99–106. [SKFR09] S INGH S., K APADIA M., FALOUTSOS P., R EINMAN G.: An open framework for developing, evaluating, and sharing steering algorithms. In Proceedings of the 2nd International Workshop on Motion in Games (Berlin, Heidelberg, 2009), MIG ’09, Springer-Verlag, pp. 158–169. [SKHF11] S INGH S., K APADIA M., H EWLETT W., FALOUTSOS P.: A modular framework for adaptive agent-based steering. In Proceedings of the 2011 symposium on Interactive 3D graphics and games (2011), I3D ’11, ACM. [SKN∗ 09] S INGH S., K APADIA M., NAIK M., R EINMAN G., FALOUTSOS P.: SteerBench: A Steering Framework for Evaluating Steering Behaviors. Computer Animation and Virtual Worlds (2009). http://dx.doi.org/10.1002/cav.277. [SKRF11] S INGH S., K APADIA M., R EINMAN G., FALOUTSOS P.: Footstep navigation for dynamic crowds. In Symposium on Interactive 3D Graphics and Games (New York, NY, USA, 2011), I3D ’11, ACM, pp. 203–203. [ST05] S HAO W., T ERZOPOULOS D.: Autonomous pedestrians. In SCA ’05: Proceedings of the 2005 ACM SIGGRAPH/Eurographics symposium on Computer animation (2005), ACM, pp. 19–28. [TCP06] T REUILLE A., C OOPER S., P OPOVI Ć Z.: Continuum crowds. ACM Trans. Graph. 25, 3 (2006), 1160–1168. [vdBLM08] VAN DEN B ERG J., L IN M. C., M ANOCHA D.: Reciprocal velocity obstacles for real-time multi-agent navigation. In Proceedings of ICRA (2008), IEEE, pp. 1928–1935. [vdBPS∗ 08] VAN DEN B ERG J., PATIL S., S EWALL J., M ANOCHA D., L IN M.: Interactive navigation of multiple agents in crowded environments. In SI3D ’08: Proceedings of the 2008 symposium on Interactive 3D graphics and games (2008), ACM, pp. 139–147. c The Eurographics Association 2011.