
A GML shape grammar for semantically enriched 3D building models

2010

The creation of building and facility models is a tedious and complicated task. Existing CAD models are typically not well suited since they contain too much or not enough detail; the manual modeling approach does not scale; different views on the same model are needed, and different levels of detail and abstraction; and finally, conventional modeling tools are inappropriate for models with many internal parameter dependencies. As a solution to this problem we propose a combination of a procedural approach with shape grammars. The model is created in a top-down manner; high-level changeability and re-usability are much less of a problem; and it can be interactively evaluated to provide different views at runtime. We present some insights on the relation between imperative and declarative grammar descriptions, and show a detailed case study with facility surveillance as a practical application.

Computers & Graphics 34 (2010) 322–334
Journal homepage: www.elsevier.com/locate/cag
Technical Section

Bernhard Hohmann (a), Sven Havemann (a,*), Ulrich Krispel (a), Dieter Fellner (a,b)
(a) Institute of Computer Graphics and Knowledge Visualization (CGV), Graz University of Technology, Inffeldgasse 16c/II, 8010 Graz, Austria
(b) Fraunhofer IGD & Darmstadt University of Technology, Fraunhoferstraße 5, 64283 Darmstadt, Germany

Keywords: Shape grammars; Architectural buildings; Shape modeling; 3D-reconstruction; Shape semantics; Generative modeling

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

1.1. Existing architectural software is inappropriate

Three-dimensional building and facility models are becoming ever more important with the widespread availability of building information systems.
They are the basis for home automation, sensor networks, technical supervision and inspection, building surveillance systems, as well as all kinds of maintenance tasks in the context of building lifecycle management. A central problem is scalability: especially for complex facilities, the creation of appropriate 3D models is an extremely tedious task. Conventional interactive modeling software à la 3D House Designer, or general-purpose software such as Maya or 3D Studio Max, is suitable only for comparatively small facilities.

For a surveillance project we needed a 3D model of a sufficiently complex facility, four university buildings (cf. Fig. 1). Our first attempt was an interactive reconstruction using Google SketchUp. The biggest problem we encountered was that errors made early in the modeling process can hardly be corrected later on. If, e.g., the floor height must be changed, essentially the whole construction needs to be redone.

Another option we considered was architectural software such as ArchiCAD, Autodesk Revit or Architectural Desktop. However, this software is targeted at construction rather than reconstruction, i.e., reverse engineering of buildings. Moreover, even its so-called "associative models" are low-level since almost all of the geometry is constructed manually. Only certain parameter associations are kept consistent automatically, i.e., when dimensions of the model are changed this is (somehow) propagated to sub-parts.

Architectural software also cannot easily be integrated into, e.g., a surveillance application, and when a building model is exported, it loses much of its semantics. The CAD export contains (too) detailed geometry of walls, doors, etc., but there is no geometry for rooms and corridors, which are just empty space. Yet this space is exactly where people live, and it is therefore the main unit for surveillance.
One viable alternative might be exporting the CAD model to a format with rich semantics such as IFC from the International Alliance for Interoperability [2]. It is apparently the upcoming open exchange standard for building semantics, and it supports labels for rooms, corridors, etc.

* Corresponding author.
E-mail addresses: b.hohmann@cgv.tugraz.at (B. Hohmann), s.havemann@cgv.tugraz.at (S. Havemann), u.krispel@cgv.tugraz.at (U. Krispel), d.fellner@cgv.tugraz.at (D. Fellner).
URLs: http://www.cgv.tugraz.at/hohmann (B. Hohmann), http://www.cgv.tugraz.at/havemann (S. Havemann), http://www.cgv.tugraz.at/krispel (U. Krispel), http://www.cgv.tugraz.at/fellner, http://www.igd.fhg.de/igd-a0/staff/fellner (D. Fellner).
0097-8493/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.cag.2010.05.007

1.2. Needed: procedures and complex parameter dependencies

The focus of our work, however, was on creating a sustainable building model with a minimum of initial effort in terms of taking manual measurements. The model should allow guessed values to be replaced at any later time by accurate measurements taken with our laser range finder (Fig. 2). So the idea was to start with a model that may be geometrically inaccurate but is qualitatively correct, meaning that all of the major building features (walls, doors, windows, etc.) are present. It should be possible to generate them quickly by just instantiating parametric templates. High-level changeability was to be granted by the ability to express any sort of parameter dependencies explicitly.

Fig. 1. (top) University facility: four buildings, three floors, 405 rooms. Photo: Bing Maps, Bird's Eye view [1]. (bottom) Reconstruction attempt from photographs using Google SketchUp, with obvious scalability problems.

Fig. 2. Leica Disto laser range finder with 1–2 mm precision over 60 m.

For example, in our building a sequence of rooms along a corridor all have the same height and width, but the room length (along the corridor) can vary. So it needs to be possible to define the model in such a way that only one parameter per room is required, namely its length, and only a single width value per corridor, which each room then refers to as its width parameter. We also found more complex dependencies that require, e.g., changing the reference frame when a measurement does not correspond directly to a parameter. There are surprisingly many possible parametrizations of a shape as simple as a box: (p_min, p_max); (p_mid, (r_x, r_y, r_z)); or just the midpoint of a bottom edge, with orientation and extents inherited from higher levels. This example shows that no limited number of pre-defined parametrizations can ever be sufficient; instead it must be possible to re-parameterize the boxes whenever needed. A more involved example is given in Section 5.3.

1.3. Main contributions

- Simple but powerful split grammar formalism: Our GML shape grammar toolkit develops considerable expressiveness out of about two dozen functions.
- Unified view on grammars and imperative modeling: Grammars are typically perceived as declarative, but we use them within an imperative paradigm. This permits us to overcome some inherent limitations.
- Practical reconstruction of a complex facility with interiors: The original motivation was to rapidly create a complex facility model that should still be sustainable.
- A solution to the problem of propagating reference frames: In many cases measurements do not directly correspond to model parameters. We present a simple, general solution to this problem.

2. Related work

Since ancient times complex architectural buildings have been designed in a top-down, coarse-to-fine manner: The overall structure is divided into sections and floors, which are further refined into rooms, hallways, stairways, etc. A more formal view on this process suggests a grammar-based approach, which directly leads to the so-called shape grammars.
They were first introduced by Stiny and Gips as early as 1972 [3]. Their very general idea was to simply replace a shape (or part of a shape) that carries a label by one or more (usually smaller) shapes carrying other labels. This hierarchical replacement process is specified by a finite set of shape grammar rules. Many variations of this basic idea have been developed, from non-determinism and conditional rules to L-systems [4], e.g., for plants.

In architecture, shape grammars were for a long time used only as a theoretical research tool. New interest was triggered by Wonka et al. in 2003, who introduced split grammars [5]. They formed the basis for the CGA Shape grammar system for the procedural modeling of buildings by Müller et al. [6]. Split grammars are a specialization of shape grammars. The main shape building block is a box (a scope), which can be split into smaller boxes along any of the three principal axes. Our system borrows the scope concept from CGA Shape, but instead of a context-free grammar we use a formally equivalent but more flexible procedural, function-based description: Instead of replacing an object by a sequence of objects, we "replace" an operation by a sequence of operations (Section 3.2.4).

CGA Shape was further developed into a commercial tool, the CityEngine [7], which marks the state of the art in shape grammars. However, it focuses more on the randomized generation of large-scale city models than on detailed buildings; the CityEngine does not consider interiors at all. Another difference is that the CityEngine uses scopes mainly as non-terminal symbols; terminal rules replace scopes by pre-modeled parts loaded from a file. Our system leaves out this final replacement.

In the context of large-scale city models, an important research focus is the reconstruction of detailed facades.
The grammar approach was very inspiring for solving inverse problems, as witnessed by a number of recent contributions. CGA Shape was used for image-based reconstruction of very regular, repetitive facades, which can be automatically reconstructed even from a single orthophoto [8]. The goal of our evolving system, however, is to reconstruct 80% of a whole city from massive input data [9]. Koutsourakis et al. presented another single-view reconstruction approach for facades of a few different styles using orthophotos and Markov random fields (MRF) [10]. Xiao et al. proceed in a similar fashion and obtain impressive results using structure from motion [11]. Interestingly, however, they do not need any grammars. Instead they make the strong assumption that buildings are composed of block geometry, which they obtain from partitioning an orthographic depth map into rectangular regions.

One fundamental problem of grammars is to deal with exceptions in a regular structure. Lipp et al. have introduced semantic locators to allow selecting a column of windows even if floors were split first [12]. With exact locators they also tackle the persistency problem, i.e., how to remember local changes in a subtree if the tree above is interactively changed. They also present an interactive system to create and edit a shape grammar for building exteriors without manual coding, which is the drawback of all scripting-based systems such as CGA Shape. Our system is also scripting-based, but we offer tools for semi-interactive inspection of the grammar evaluation. Our goal is likewise to create grammar descriptions (a grammar script) not through scripting, as we did for this paper, but either interactively in 3D or completely automatically. Unlike Lipp et al., however, we do not want to define only replacement rules interactively, but also imperative procedures, because replacement rules alone are apparently not sufficient.
In fact there is a recent trend to combine grammar-based approaches with other modeling methodologies; grammars become one tool among others. Larive et al., for instance, create the basic building geometry by extruding an arbitrarily shaped ground polygon (as does the CityEngine). A simplified split grammar is used only for detailing the facade in a 2.5D fashion using a simplified set of five rules: two repetition rules (grid/list), two position rules (extrude/border) and one terminal rule for texturing [13].

But are context-free grammars really sufficient for formalizing architecture? Already in 1965, Christopher Alexander noted in his famous article that "A city is not a tree" [14], and his arguments equally apply to buildings. Most shape grammars are complemented by another language for scripting. Certain tasks (calculations, conditionals, function calls) are easier to formulate as procedures than as replacement operations. The relation between declarative grammars and imperative scripting is not completely clear, but combining them seems indispensable (see Section 4).

Complex classical facades exhibit structures that are difficult to generate by hierarchical replacement. Cornices and ledges "run around" a building, connecting many different facade elements. In that case, a procedural approach is more suitable, as demonstrated by Finkenzeller [15]. His floor plan modules are composed of convex polygons, from which an outline is computed that serves as a path for extruding a profile that is also generated procedurally using Turtle graphics.

One goal of any procedural shape description, be it imperative or declarative, is the search for the minimum description length (MDL). Aliaga et al., for instance, focus on the detection of repetitive structures and patterns to automatically generalize, e.g., an ABABABA pattern to (AB)*A in order to transfer the style of one building to another [16]. Ripperda et al.
[17] use the reversible jump Markov chain Monte Carlo method (rjMCMC) to find the best description in the MDL sense. Consequently, their grammar also has a mirror operator (not present, e.g., in the CityEngine) because symmetries are very efficient in reducing the description length. Interestingly, they use a functional infix notation for their grammar, showing that it is not necessary to formulate a (context-free) grammar using replacement rules, but that replacement rules can generally be understood as function calls. This bridges the gap between imperative and declarative descriptions, which we exploit for our (imperative) approach.

3. GML based shape grammar

This section introduces an efficient formalism for formulating shape grammars using the Generative Modeling Language (GML) from Havemann [18]. It is an imperative programming language, but as a stack-based language it is particularly well suited as a notation for context-free grammars.

3.1. GML briefly explained

GML follows the stream-of-tokens concept, i.e., a GML program consists of a sequence of tokens that are evaluated one after another. Tokens either contain data, which are put on the stack (the operand stack), or processing instructions (operators and function calls), which are executed. An operator pops its input parameters from the stack, processes them, and pushes the result back on the stack. The add operator, for instance, pops two numbers (or vectors), adds them, and pushes their sum. The def operator performs a variable assignment. It pops a literal name (preceded by a slash) and a token from the stack and enters the pair in the current dictionary; it pushes no results. When an executable name (i.e., one without a slash) is executed, the interpreter looks it up and executes the token found, which may be a function.
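The token-stream evaluation described above can be illustrated with a few lines of Python. This is only a minimal sketch and not the GML interpreter: the function `run`, the use of a single Python dictionary in place of a dictionary stack, the representation of a function as a nested list of tokens, and the `mul` operator are assumptions made for illustration.

```python
def run(tokens, stack=None, names=None):
    """Evaluate a token stream; returns the operand stack and the dictionary."""
    stack = [] if stack is None else stack
    names = {} if names is None else names    # one dictionary only (simplification)
    for tok in tokens:
        if isinstance(tok, (int, float, list)):
            stack.append(tok)                  # data token: push on the operand stack
        elif tok.startswith("/"):
            stack.append(tok)                  # literal name: pushed, not looked up
        elif tok == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == "mul":                     # assumed second arithmetic operator
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif tok == "def":
            value, name = stack.pop(), stack.pop()
            names[name[1:]] = value            # store under the name without slash
        else:                                  # executable name: look up and execute
            value = names[tok]
            if isinstance(value, list):
                run(value, stack, names)       # a function is an executable token list
            else:
                stack.append(value)
    return stack, names


# Hypothetical program: define a constant and a function, then use both.
program = ["/two", 2, "def", "/double", ["two", "mul"], "def", 3, 4, "add", "double"]
stack, names = run(program)
```

After the run, the operand stack holds the single value 14: `3 4 add` leaves 7, and executing `double` pushes 2 and multiplies.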
Name lookup uses a central dictionary stack; a dictionary is a list of (nameID, token) pairs, and the topmost dictionary on the dictionary stack is the current dictionary, into which the def operator also writes. This way the infix assignment x = (1 + 2) · (5 + 6) can be expressed in GML as

    /f { add } def   /x 1 2 f 5 6 f mul def

Arrays are created by the ] operator, which searches the stack top-down for the first [ and creates an array from the tokens in between. Functions are created similarly but using { and }; so functions are just executable arrays. GML is a functional language in the sense that functions are first-class citizens and are often created at runtime; however, it does not enforce referential transparency. Instead, many operators do have side effects, i.e., they manipulate internal data structures. GML inherits this property from Adobe PostScript [19], to which it is syntactically almost identical. The only language extensions are registers, and path expressions to navigate through dictionary hierarchies. And of course, GML is for 3D rather than 2D, so it has no typesetting operators but many operators for shape design. For more information refer to the GML homepage [20], which also links to the GML wiki with descriptions and examples for all operators.

3.2. The GML shape grammar

     1  5 scope
     2  -5 /X subdivide
     3    terminal-box
     4    terminal-void
     5    -3 /Y subdivide
     6      terminal-void
     7      terminal-box
     8      terminal-void
     9    terminal-void
    10    terminal-box

Fig. 3. GML shape grammar code example 1: illustrates the successive subdivision of a scope, using relative subdivide. The indentation is important to keep track of the hierarchy level, which is not enforced automatically.
When evaluating a grammar, the replacement process proceeds by successively applying replacement rules to a start symbol. For deterministic context-free grammars, the replacement process forms a tree, which can in principle be traversed in any order. When interpreting the rules as functions, the call graph corresponds to a depth-first evaluation. This is typically implemented using a stack that contains the unfinished calls ("open scopes"). Since GML is stack-based, it is natural to push the open scopes in-order on a scope stack (which is separate from the normal operand stack). The last pushed scope is the current scope, which is the next scope to be processed. This section introduces the main functions of the shape grammar toolkit. The next section will show how to apply them in a practical reconstruction.

3.2.1. The basic split operations

The shape grammar toolkit is a library of GML functions to create, modify, and terminate scopes (i.e., boxes). Fig. 3 shows a first example. The scope function pops a material ID, creates a new scope with default position (origin) and unit size, and pushes it on the scope stack to become the current scope. The subdivide function expects a number a and a split direction d (in positive or negative X, Y, or Z direction). It splits the current scope s into a number of smaller scopes: In case a is positive, it produces as many sub-scopes as possible that are not smaller than a in direction d, and have the same extent as s in the other directions. In case a is negative, s is split into |a| equal parts (a is rounded to the nearest integer). In any case the sub-scopes are a volumetric partition of s (containment property).

Instead of subdividing it, the current scope can be mapped to empty space using terminal-void, or to a solid box using terminal-box; a third alternative is just to leave it open. In grammar terminology, terminal symbols are those to which no further replacement rule applies. Consequently, the current scope is popped from the scope stack, and the next open scope becomes the current one.

The indentation in the code in Fig. 3 reflects the subdivision hierarchy for a better overview. The indentation is essential for users to keep track of the open scopes: Line 2 produces five open scopes, and after line 5, again five scopes are open (three were consumed, three are newly produced, another two remain). Note that we have deliberately chosen not to introduce a closing statement; deleting, e.g., line 6 will mess up the model. Although this seems brittle, it is in practice surprisingly easy to debug, as we can stop the grammar execution anywhere (see Section 3.4). So we decided not to use a closing statement, which would make the code safer but less elegant.

     1  /A { terminal-box } def
     2  /B { set-material terminal-box } def
     3  /Frame { [ 0.05 -1 0.05 ] /Y split
     4      A
     5      [ 0.05 -1 0.05 ] /X split
     6        A
     7        12 B
     8        A
     9      A
    10  } def
    11  5 scope
    12  0.23 /X subdivide
    13  { Frame } repeat

Fig. 4. Grammar rules naturally correspond to GML functions. This example uses an absolute subdivide; the split operator expects on the stack a scope, an array of absolute (> 0) and relative (< 0) distances, and a split direction.

3.2.2. Defining grammar rules in GML

The replacement rules are in fact just ordinary GML function definitions. From a formal point of view, this can be seen as a simple syntax conversion:

    T → P Q R            traditional notation for replacement rules
    /T { P Q R } def     GML notation for replacement rules

Whenever the GML interpreter encounters the executable name T during execution, it looks it up and executes the object found, in this case an executable array (i.e., the function). So executing T is exactly equivalent to executing P, then Q and R. In that sense, GML has a built-in replacement property.
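The scope stack, the two partitioning operations, and the correspondence between rules and functions can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the toolkit's implementation: the `Scope` dataclass, the helper names (`split_sizes`, `subdivide_sizes`, `partition`, `extent`), the omission of materials, and the choice that negative split entries share the remaining extent in proportion to their magnitude are all assumptions. The rules mirror `/A`, `/B` and `/Frame` from Fig. 4.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scope:
    x: tuple = (0.0, 1.0)    # interval occupied along each axis
    y: tuple = (0.0, 1.0)
    z: tuple = (0.0, 1.0)


scope_stack = []   # open scopes; the last element is the current scope
boxes = []         # (rule name, scope) pairs emitted by terminal_box


def split_sizes(extent, spec):
    """Positive entries are absolute sizes; negative ones share the rest."""
    rest = extent - sum(s for s in spec if s > 0)
    weight = sum(-s for s in spec if s < 0)
    return [s if s > 0 else rest * -s / weight for s in spec]


def subdivide_sizes(extent, a):
    """a > 0: as many equal parts as possible of size >= a; a < 0: |a| parts."""
    n = max(1, int(extent / a + 1e-9)) if a > 0 else round(-a)
    return [extent / n] * n


def partition(axis, sizes):
    scope = scope_stack.pop()
    lo = getattr(scope, axis)[0]
    for size in sizes:                       # pushed in order: last part is current
        scope_stack.append(replace(scope, **{axis: (lo, lo + size)}))
        lo += size
    return len(sizes)


def extent(axis):
    lo, hi = getattr(scope_stack[-1], axis)
    return hi - lo


def split(spec, axis):
    return partition(axis, split_sizes(extent(axis), spec))


def subdivide(a, axis):
    return partition(axis, subdivide_sizes(extent(axis), a))


def terminal_box(name):
    boxes.append((name, scope_stack.pop()))


# Rules are plain functions: calling a rule "replaces" it by its body.
def A():
    terminal_box("A")


def B():
    terminal_box("B")


def Frame():
    split([0.05, -1, 0.05], "y")
    A()
    split([0.05, -1, 0.05], "x")
    A(); B(); A()
    A()


scope_stack.append(Scope())        # start scope of unit size
count = subdivide(0.23, "x")       # returns the number of sub-scopes
for _ in range(count):             # corresponds to "{ Frame } repeat"
    Frame()
```

Under these assumptions the unit scope is cut into four parts of width 0.25, each Frame call consumes exactly one open scope and emits five boxes, and the scope stack is empty at the end.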
Note that grammar rules, as normal functions, may operate not only on the scope stack, but also on the operand stack. In the example in Fig. 4, rule B first calls the set-material function, which expects an integer as material ID on the operand stack to modify the current scope; other possible scope modifications include translation, scaling, and extrusion. B "inherits" the signature of set-material: As set-material expects an integer on the operand stack, so does B (line 7).

The split operator used in the Frame rule is the second way of partitioning a scope into sub-scopes. It expects an array of numbers specifying the extents of the sub-scopes in the split direction. Positive numbers mean absolute sizes, while negative numbers mean relative sizes. Thus, [ -1 -2 ] /X split and [ -2 -4 ] /X split both produce the identical two sub-scopes, the current scope being twice as large in X as the next-current scope.

Lines 1–10 only define rules; lines 11–13 create the model. As the scope from line 11 has unit size, the subdivide of 0.23 produces four equally sized scopes of size 0.25. subdivide has the additional property that it leaves the number of generated sub-scopes on the stack. This is handy for the GML operator repeat, which expects an integer n and an object x (typically a function) on the stack, and executes x n times. Note, however, that x should not leave any open scopes on the scope stack; therefore the Frame rule terminates the scope.

3.2.3. Grammar comparison: GML vs. CityEngine

The CGA Shape grammar of the CityEngine [7] uses a different syntax but provides the same functionality. Fig. 5 shows the example from Fig. 4 in CityEngine syntax. Relative distances are denoted here using a tilde (~), repetitions using the asterisk (*).

3.2.4. A shape grammar for box frames

This example is a bit more complex and provides more insight into the replacement process (Fig. 6).
Seven rules are defined, then the model is built in line 8 by applying rule C to the start scope. The image below the code suggests a decompositional view on the rules in a top-down order: C is composed of E, D, E stacked on top of each other, D in turn is made of F, void, F, and so on. All in all the replacement sequence in the example is as follows (discarding the split directions), finally resulting in filled boxes and empty space:

    C → E D E
      → GFG  FBF  GFG
      → AAA ABA AAA   ABA B ABA   AAA ABA AAA

The process can also be understood bottom-up: G splits the scope in Z-direction into two fixed-size outer parts and an inner part, which are then all replaced by a box (rule A). Rule F splits the same way as G, but it deletes the inner part. Rule E combines F and G very much like the Frame rule from Fig. 4. The only difference is that G explicitly splits off the corners of the frame, resulting in four more scopes than in Frame. Also note that rules D and E split in Y-direction and rule C in X-direction.

    1  /A { terminal-box } def
    2  /B { terminal-void } def
    3  /C { [ 0.1 -1 0.1 ] /X split E D E } def
    4  /D { [ 0.1 -1 0.1 ] /Y split F B F } def
    5  /E { [ 0.1 -1 0.1 ] /Y split G F G } def
    6  /F { [ 0.1 -1 0.1 ] /Z split A B A } def
    7  /G { [ 0.1 -1 0.1 ] /Z split A A A } def
    8  7 scope (2,1,1) scale C

Fig. 6. GML replacement rules: the modeling process can be understood both as replacement (top-down decomposition) and as assembly (bottom-up). Note the orientation of the coordinate axes.

Another important shape function, especially for architecture, is extrusion (extend tool in Table 1). One of the most obvious examples is window sills, as they stick out of the main plane of the facade. Without extrusion one would have to peel off a layer of the facade in every instance of the facade except the window sills (cf. Fig. 7). This is a fundamental problem of all approaches based on recursive refinement.

    A → color(Turquoise)
    B → color(Black)
    Frame → split(y){ 0.05 : A
                    | ~1   : split(x){ 0.05 : A
                                     | ~1   : B
                                     | 0.05 : A }
                    | 0.05 : A }
    StartScope → split(x){ { ~0.23 : Frame }* }

Fig. 5. CityEngine code example. Illustrates how the GML shape grammar example from Fig. 4 looks in CityEngine syntax.

Table 1. The shape grammar infrastructure functions, contained in Shape-GrammarTools.

    Input (operand stack)          Function         Output
    -                              init             -
    -                              finish           -
    materialID:Int                 scope            -
    s:Scope                        scope-copy       t:Scope
    -                              pop-scope        c:Scope
    s:Scope                        push-scope       -
    [ s_i:Scope ]                  append-scopes    -
    -                              current-scope    c:Scope
    offsetvec:P3                   move             -
    factor:Num                     scale            -
    transvec:P3                    translate        -
    d:Num /dir:Name                extend           -
    [ width_i:Num ] /dir:Name      split            -
    width:Num /dir:Name            subdivide        -
    -                              terminal-void    -
    -                              terminal-box     -
    materialID:Int                 set-material     -
    /key:Name value:(Any)          label            -
    s:Scope /dir:Name  d:Num       sd, reldist      -

The /dir name can be one of /X, /Y, /Z, /NX, /NY, /NZ to denote positive and negative scope axis directions. Note that the signature is only informal; for instance, the type Scope formally is a GML Dictstack object. Functions like terminal-box have no effect on the operand stack because they operate on the scope stack.

Fig. 7. Windows with extruded sills. Without extrusion, it would be necessary to split off a layer of the facade in every scope except for the window sills.

3.3. Some implementation details of the toolkit

The implementation of the scope and scope-copy functions is shown in Fig. 8. GML dictionaries can be used very much like classes in object-oriented programming. They can contain both data and functions, and it is even possible to replace data by functions at runtime; in that sense, dictionaries are dynamic classes. Each scope is represented by one dictionary and a dictstack object. The dictstack is a copy of its parent dictstack, and the new dictionary is appended to become its top (current). Like dictionaries, dictstack objects can be put on the central GML dictionary stack using the begin operator. This makes it particularly easy to propagate all sorts of semantic attributes through the grammar, as explained in detail in Section 5.4.

The implementation of the terminal-box function is shown in Fig. 9. The pop-scope function pops the current scope from the scope stack and pushes it on the operand stack. After the box is created, the scope is set to inactive, as it was consumed by a terminal symbol.

Since GML as a stack-based language is well suited for evaluating context-free grammars, the GML shape grammar toolkit is indeed quite lean. The whole grammar infrastructure consists of only around 20 functions (see Table 1) and fits in 11 kB of GML code. The functions that are not explained in detail here will hopefully be clear from the context.

    usereg
    pop-scope !scope
    :scope begin
      materials mat get setcurrentmaterial
      [ p000 p100 p110 p010 ] 3 poly2doubleface
      0 dz getY dz getX sub 5 vector3 extrude pop
      /active 0 def
    end

Fig. 9. Code of the terminal-box function that terminates the current scope. It creates a double-sided quad and extrudes it to obtain a box.

    /scope {
      usereg !mat
      dict dictstack !scope
      :scope begin
        /self :scope def
        /pt (0,0,0) def
        /ex (1,0,0) def
        /ey (0,1,0) def
        /ez (0,0,1) def
        /dx (0,1) def
        /dy (0,1) def
        /dz (0,1) def
        /mat :mat def
        /active 1 def
      end
      scope-array :scope append
      :scope push-scope
    } def

    /scope-copy {
      usereg !scope
      :scope dictstack-copy !newscope
      dict :newscope dictstack-begin
      :newscope /self :newscope put
      :newscope /active 1 put
      :newscope
    } def

Fig. 8. The scope and scope-copy functions are the basis of the GML shape grammar. A scope consists mainly of a coordinate frame and three intervals (represented as 2D points). The !x and :x are the syntax to set and get named registers (local variables).
The begin operator simply pushes a dictionary (or dictpath) on the dictionary stack, which is used for name lookup. The def operator writes into the topmost dict on the GML dictionary stack. In case this is itself a dictstack object, it writes into the topmost dictionary of the dictstack.

Fig. 10. Interactive code development. Code execution is halted by introducing a stop function. The open scopes are shown in color (current scope is red).

3.4. Code development in practice

A fundamental problem for all scripting-based procedural modeling approaches is that a text editor is not the ideal interface to 3D modeling. We deal with this problem by using an interactive IDE with a 3D window next to the code window. By inserting a stop command, the recursive replacement is halted, and the open scopes on the scope stack are shown in different colors according to their stack position (see Fig. 10).

4. A unified higher-level view

GML is an imperative language whereas grammars are often described as belonging to the declarative paradigm. This subtle difference often leads to confusion about what exactly the difference in expressiveness is.

Formally, the rules of a grammar specify how one symbol is replaced by other symbols (including the empty symbol). Shape grammars, as originally described by Stiny in [3], are context-free. The general concept is that a "larger" shape is replaced by a collection of sub-shapes. When the shapes carry a symbol, this process can be formally described as a symbol replacement process where the geometric operations are, strictly speaking, side effects. The grammar itself has no side effect, i.e., the sub-triangles of a Sierpinski triangle are completely independent and ignorant of their (geometric) neighbors.
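The purely symbolic view can be made concrete with the box-frame rules of Fig. 6 (Section 3.2.4). The following Python sketch is an illustration only: it ignores all geometry and split directions, treats A (box) and B (void) as terminal symbols, and uses the hypothetical helper names `expand_depth_first` and `expand_by_levels`. It shows that, because the rules are context-free and free of side effects, a depth-first expansion and a level-by-level rewriting arrive at the same terminal string.

```python
# Rules C..G of the box-frame grammar; A and B are terminals in this sketch.
RULES = {"C": "EDE", "D": "FBF", "E": "GFG", "F": "ABA", "G": "AAA"}


def expand_depth_first(symbol):
    """Expand one symbol completely before moving on to its right neighbor."""
    if symbol not in RULES:
        return symbol
    return "".join(expand_depth_first(s) for s in RULES[symbol])


def expand_by_levels(start):
    """Rewrite every non-terminal of the current string in each step."""
    steps = [start]
    while any(s in RULES for s in steps[-1]):
        steps.append("".join(RULES.get(s, s) for s in steps[-1]))
    return steps


steps = expand_by_levels("C")      # C, EDE, GFGFBFGFG, final terminal string
result = expand_depth_first("C")
```

The final string contains 20 occurrences of A and 5 of B, which matches a box frame built from 12 edge boxes and 8 corner boxes.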
Split grammars, most notably CGA Shape, are a special case of shape grammars, in that the geometry representation is an n-dimensional rectangular box, a scope, that carries the symbol. The main replacement rules are split, subdivide (repeat), and component split. CGA Shape actually breaks the grammar formalism in that it deliberately introduces side effects: Rule execution can be influenced by geometric reasoning, e.g., to determine whether the neighboring house is too close to insert windows into a wall of a building. The classical rule replacement syntax was extended to accommodate, e.g., geometric conditions and statistical rule selection.

The GML shape grammar approaches the problem from a different side, as the grammar is in fact just an ordinary GML program. The notation, e.g., in Fig. 6 and in Section 3.2.3, was chosen such that it resembles grammar rules. The truth, however, is that GML as a stack-based language is just "by coincidence" well suited to express shape grammars, which typically also use a stack for evaluation; in this case, the execution stack of the GML interpreter.

A GML shape grammar belongs to a specific class of GML programs, namely those that perform strict in-order processing of the symbols (scopes) on the scope stack. All introductory examples are of this type. The great advantage is now that we can gradually introduce elements from more general GML programs to realize geometric queries (Section 5.3) or re-ordering. Consider for example a wall rule that is supposed to leave a symbol for the window to be processed by a subsequent rule. Strict in-order processing would require defining a rule for the window before executing the wall rule. This is a problem (and a limitation of CGA Shape) when different window styles are to be used (see Fig. 11).
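The re-ordering idea can be sketched in Python with scopes reduced to path strings. This is a simplified illustration, not the code of Fig. 11: geometry, materials and split sizes are left out, and the part names, the order in which parts become current, and the rule names `window_plain` and `window_divided` are assumptions. The point is only that the frame rule sets its center scope aside, terminates the remaining parts, and pushes the center back so that whichever rule is called next decides what happens to it.

```python
scope_stack = []   # open scopes; the last element is the current scope
terminals = []     # (kind, scope) pairs in the order they were terminated


def push_scope(scope):
    scope_stack.append(scope)


def pop_scope():
    return scope_stack.pop()


def split(part_names):
    """Replace the current scope by named parts, pushed in order."""
    parent = pop_scope()
    for name in part_names:
        push_scope(f"{parent}/{name}")


def box():
    terminals.append(("box", pop_scope()))


def void():
    terminals.append(("void", pop_scope()))


def frame():
    """Build a frame but leave its center as an open scope."""
    split(["left", "middle", "right"])
    box()                                  # right
    split(["bottom", "center", "top"])     # refines the middle part
    box()                                  # top
    center = pop_scope()                   # set the center aside
    box()                                  # bottom
    box()                                  # left
    push_scope(center)                     # the center is current again


def window_plain():
    frame()
    void()                                 # the opening itself


def window_divided():
    frame()
    split(["lower", "upper"])              # two panes, each with its own frame
    frame(); void()
    frame(); void()


push_scope("wall1"); window_plain()
push_scope("wall2"); window_divided()
```

Both window styles reuse the same `frame` function unchanged; no rule for the center has to exist before `frame` runs.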
    /Void { terminal-void } def
    /Box { set-material terminal-box } def

    /Window-A {
      0.1 0.5 5 Frame
      [ -2 -1 ] /Z split
      0.05 0.35 9 Frame Void
      0.05 0.15 9 Frame Void
    } def

    /Window-B {
      0.1 0.5 5 Frame
      [ -1 -1 ] /X split
      0.05 0.15 9 Frame Void
      0.05 0.15 9 Frame Void
    } def

    /Frame { usereg !material !depth !width
      1 :depth 1 vector3 !dv
      [ :width -1 :width ] /X split
      :dv scale :material Box
      [ :width -1 :width ] /Z split
      :dv scale :material Box
      pop-scope !center
      :dv scale :material Box
      :dv scale :material Box
      :center push-scope
    } def

    3 scope (0.8,0.3,2.2) scale Wall Window-A
    3 scope (0.8,0.3,2.2) scale (1,0,0) move Wall Window-B

Fig. 11. Re-ordering of symbol processing. The pop-scope function transfers a scope from the scope stack to the normal GML operand stack, the push-scope transfers it back. So the Frame rule can leave the frame center as open scope for further processing, e.g., by other Frame rules. The Wall rule (not shown) proceeds similarly. This way the center can be processed in different ways without changing the Frame rule, or defining a Center rule beforehand. This is an important advantage in rule modularization over usual shape grammars that evaluate strictly in-order.

An inherent limitation of the GML shape grammar approach is, of course, that context-sensitive grammars are much simpler to describe with rules like ABA → XYZYX. This is not so easy to map to GML. However, it is currently an open question how “general” GML shape grammars can be without breaking the grammar idea completely; and whether the expressiveness of context-sensitive grammars can be reached conveniently. Interestingly, most shape grammar languages are context-free.

5. Sample reconstruction

To demonstrate that our grammar formalism is also useful in practice, a modern university building complex was chosen as a use case. The whole complex consists of four buildings which look very similar at first glance (cf. Fig. 1). Furthermore, most constructional elements are box-like shapes. Thus, it seemed to be the perfect example to try out and demonstrate the advantages of our shape grammar.

As we start the modeling process with one big volume, we first have to find the super structures which span over the whole complex. Then further regularities have to be identified. For that purpose we used the available inaccurate floorplans of the buildings, e.g., Fig. 13(a). The first question is where to put the major splits through the building. We chose to first split apart the four buildings, and then partitioned them into segments. Fig. 13(b) shows the model of the whole facility consisting of four connected buildings. As can easily be seen on the floorplan, in Fig. 13(a), the five bridges connecting the four buildings split the buildings into six segments. These segments are denoted as A, ..., F. The smaller parts between them are the bridge segments.

Fig. 12. The building 16c at different split levels L, i.e., the depth in the split hierarchy. (a) L = 2, (b) L = 4, (c) L = 6, (d) L = 9, (e) L = 11.

Fig. 13. (a) Floorplan of the second floor of the four-building university complex, (b) model of the building complex.

Fig. 14. (a) The building 16c, (b) segment E, one of the four segments of 16c, (c) the second floor of the segment.

To observe the splitting process more clearly, the evaluation of the shape grammar was stopped at levels 2, 4, 6, 9, 11 of the 16c building (Fig. 12). The building is first split into four segments, C, ..., F, which are separated by three bridge segments. The segments are split into north, open space, and south. These two consecutive splits yield the split level 2 in Fig. 12(a). Next the building is split into floors.
At level 4, bridges are extended, staircases are split off the office strips, some of the floors are inserted, and cavities are split off. At level 6, the cavities are set to empty, and office superstructures and most corridors are present. At level 9, all office scopes are replaced by Office. The boxes carry the semantic information that a room is an office. Furthermore, all corridors are split off the offices. At level 11 only windows and doors are still missing. As the structure of these elements is more complex it takes until level 16 until the building is completely evaluated (cf. Fig. 14(a)). Furthermore, segment E is depicted in Fig. 14(b) and its second floor is shown in Fig. 14(c).

5.1. Modeling the use case

At first glance the segments look very similar and we expected to use a lot of copy and paste. However, a closer look revealed that almost no segment, in or across any of the four buildings, is the same as another. The structure within the segments, on the other hand, is very regular. There is one strip of offices along the north and one strip along the south side of the segments (cf. Fig. 14(c)). In front of the offices there are corridors and between them there is open space which facilitates cross floor communication. On upper floors there are bridges spanning over this open space.

Fig. 15. The southern side of the floor shown in Fig. 14(c). In this exploded illustration it is easy to see how this part is split into office and wall slices.

A floor is split into north, open space and south. The open space is just floor on the ground floor and on upper floors it is split into open space and bridges. The north and south are split into office slices and walls (cf. Fig. 15). These slices are composed of a corridor, a wall and an office part. The wall part is the wall between the office and the corridor and is usually replaced by a wall or a wall with a door. The corridor part is replaced by a simple corridor, a corridor with balustrade or a passageway.
The office part is again decomposed into room and outer wall, where the latter is split into windows and walls. Except for the staircases, this corridor-wall-office-split, the CWO-split, is applied for every office-slice in the whole complex. This is possible because all the replacement rules are parametrizable.

The implementation of CWO-split (cf. Fig. 16) shows that inserting windows in the outer wall of the office with the rule E3DoubleWindow takes only four parameters. This is possible because the windows are all of the same type and at the same height; only the margins from the office walls left and right of the window vary. In the example, the lower and the upper windows both have the same margin of 1 m from the left and 0.5 m from the right. Another example in the code of CWO-split is the E2Door rule. Its three parameters are the margin left, the width of the door, and the margin right. In the example a door of width 1.0 m is placed in the middle of the wall, as the margins are defined to be the same relative distance. This shows the importance of parametrizable rules in shape grammars. By abstracting slightly varying similar forms into a parametrizable rule, a very compact description of a complex scenery is possible.

Looking at the office strip in Fig. 15, it does not seem to be such a good idea to combine the corridors with the offices, because the balustrade is chopped into many parts. None of these splits is really necessary for positioning the bridges. It might have been better to combine the corridors with the open space between the north and south parts. The reason why it was done that way is that some offices consume the whole space of the CWO-split, including the corridor part. Thus, there is no corridor or balustrade. This brings up two important aspects of modeling buildings with shape grammars.
First, in many situations a decision has to be made which way to split first. Relations between structures are often not obvious, and it takes some time to recognize them. This leads directly to the second aspect, namely refactoring. Although it might require much effort when realizing that a decision was wrong, it is worthwhile to refactor the code because the creation of further parts of the building becomes much easier and faster.

5.2. Dealing with details

In most buildings one has to face irregularities. This applies also to the chosen university complex. When taking a closer look at the floorplan in Fig. 13(a), one can see that the bridge segment between segments C and D is discontinuous. In building 16c, it fits the scheme, however, it only connects to building 16 and is continued with an offset on the inside. Thus, the modeling of this part had to be treated as an exception. Specifically, there is no bridge segment between segments C and D in building 16. The parts of the bridge segment are simply included in segment D. This is the bridge inside building 16, the access corridors to the staircase in the upper floors and the doors on the ground floor.

    CWO-split
    E3Office
    Void
    1.0 0.5 1.0 0.5 E3DoubleWindow
    Office
    -1 1.0 -1 E2Door
    E1Corridor
    Void
    Void

Fig. 16. GML shape grammar code of the corridor-wall-office-split (CWO-split) and the according model.

The staircases are another exceptional detail. First, one would like to define the size of a step and then fit exactly as many steps into the staircase as needed to reach the next floor. Dynamic staircases are necessary if staircases need to adapt themselves to different heights of floors. This is the key idea of procedural modeling. The problem is that the steps reach into the floor above, i.e., the containment property is violated. Thus, the scope of the staircase is extruded to connect seamlessly to the floor assembly of the floor above.
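The idea of fitting steps to a floor height can be written down as a short calculation. The Python sketch below is a hedged illustration rather than the rule used in the model: the function name fit_steps, the choice to round the step count up so that no riser exceeds the target, and the sample figures are all assumptions.

```python
import math
from typing import Tuple


def fit_steps(floor_height: float, target_rise: float) -> Tuple[int, float]:
    """Return (number of steps, actual rise per step).

    The count is rounded up so the actual rise never exceeds the target;
    the small epsilon guards against floating-point noise when the height
    is an exact multiple of the target rise.
    """
    if floor_height <= 0 or target_rise <= 0:
        raise ValueError("floor_height and target_rise must be positive")
    count = max(1, math.ceil(floor_height / target_rise - 1e-9))
    return count, floor_height / count


if __name__ == "__main__":
    # Hypothetical numbers: a 3.5 m floor-to-floor height, 0.18 m target rise.
    steps, rise = fit_steps(3.5, 0.18)
    print(steps, rise)
```

Changing the floor height changes the step count and rise together, which is the adaptation that a dynamic staircase rule has to provide.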
Furthermore, the steps were split with an absolute subdivide, as was the grating of the bridges outside. Besides that, only the split operator was used in the whole modeling process.

The case of the cavities in the buildings, which serve as bike parking, also had to be treated specifically. The problem was that in the segments with cavities the floor is interrupted. Thus, the floor assembly could not be split off before, but only after the cavity was split off.

Another interesting detail is the mirrored structure of the north and south parts in the buildings. As the described CWO-split performs a split in Y-direction, it can only be applied on the south side. The problem could be solved by implementing a second CWO-split in reversed order. However, this is not the only rule that depends on the Y-direction and is used on both sides. Thus, it is desirable to find a solution that allows these splits to be applied on both sides. The problem was solved by defining a special attribute yDir that is defined when splitting the segment into north, open space and south. For the north part it is defined as the negative /Y direction and for the south part simply as the /Y direction. Splits that are applied to north and south parts of the buildings now simply use yDir as split direction instead of /Y.

5.3. Solving a parameter dependency problem

There are cases where the top-down structure of the building leads to an inappropriate parametrization: parameters, e.g., distances, that are convenient to measure might not be parameters of the model, which requires re-parametrization. This is in fact a common fundamental problem of all procedural or even just structured modeling approaches: a dependency between different levels, predecessors or branches in the evaluation tree is hard to resolve with depth-first traversal.
Our solution to this problem is to propagate not a value, but a function along the evaluation tree: the function performs the re-parametrization lazily, i.e., only when it is needed. The function is evaluated in the local coordinate frame, but it may carry a reference to a different coordinate frame, so it can establish a link between both frames.

Fig. 17. Building segments, ambient occlusion rendering.

Fig. 18. The semantically enriched four-building university complex. The blue wireframe boxes are offices, the turquoise boxes are corridors, yellow parts are staircases, whereas red parts are entrance areas. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 19. Re-parametrization problem for the door positions. The room widths are given, the door position is needed with respect to a wall of the room, but it is measured with respect to the origin of the section. Solved using lazy evaluation.

A concrete instance of this problem is depicted in Fig. 19: the room rule requires a position for the door as the distance from the wall of the room (blue arrows in the middle). Since it was not possible to enter every room, the position of the door was measured instead from the beginning of the corridor segment (section boundary), as indicated by the black arrows on the bottom. The solution to the problem was to store the corridor scope in a variable when it is available, and to re-use it when it is needed. So assume a door was measured 18.208 m from the sector origin:

    /my-corridor current-scope def
    ... (Definition of many rooms in between ...)
    CWO-split
    E3Office
    Void
    1.0 0.5 1.0 0.5 E3DoubleWindow
    Office
    my-corridor /X 18.208 reldist 1.0 -1 E2Door
    E1Corridor
    Void
    Void

Fortunately, the solution is simple to use, but in fact it is a bit tricky.
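The idea of handing a function to a rule in place of a number can be sketched outside GML. The following Python model is an illustration under stated assumptions, not the authors' implementation: Scope is reduced to a single axis, split_x and the returned closure are hypothetical stand-ins, and the figures 18.25, 16.0 and 6.0 are made up so that the arithmetic is exact. Only the name reldist is taken from the listing above.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass(frozen=True)
class Scope:
    origin_x: float   # world position of the scope origin along X
    width: float      # extent along X


# A split size is either a plain number or a function of the current scope.
Size = Union[float, Callable[[Scope], float]]


def reldist(reference: Scope, dist: float) -> Callable[[Scope], float]:
    """Build a deferred measurement taken relative to `reference`."""
    def deferred(current: Scope) -> float:
        # Signed distance between the two origins plus the measurement.
        return (reference.origin_x - current.origin_x) + dist
    return deferred


def split_x(current: Scope, sizes: List[Size]) -> List[Scope]:
    """Split `current`; positive sizes are absolute, negative are relative."""
    # Deferred sizes are evaluated here, when the current scope is known.
    resolved = [s(current) if callable(s) else s for s in sizes]
    fixed = sum(s for s in resolved if s > 0)
    weight = sum(-s for s in resolved if s < 0)
    rest = current.width - fixed
    parts, x = [], current.origin_x
    for s in resolved:
        w = s if s > 0 else rest * (-s) / weight
        parts.append(Scope(x, w))
        x += w
    return parts


if __name__ == "__main__":
    corridor = Scope(origin_x=0.0, width=40.0)   # stored early
    door_margin = reldist(corridor, 18.25)       # measured from the corridor
    wall = Scope(origin_x=16.0, width=6.0)       # reached much later
    margin, door, remainder = split_x(wall, [door_margin, 1.0, -1])
    print(margin, door, remainder)
```

The measurement is stored in corridor coordinates and converted only inside split_x, at the moment the wall scope is the current one.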
The reldist function pops a scope, a direction and a measurement with respect to this scope, and it generates a function, which is pushed on the stack. The function is given instead of a number to E2Door. So the execution of reldist has the effect that E2Door is called as follows:

    { my-corridor /X 18.208 reldist-internal } 1.0 -1 E2Door

The E2Door rule passes all three inputs directly on to the split function, i.e., it executes the following split:

    [ { my-corridor /X 18.208 reldist-internal } 1.0 -1 ] /X split

The split function performs an exec on its input values instead of using them directly. Typically, these values are numbers that exec just pushes back on the stack. But if a value is a function, exec evaluates it. The effect of reldist-internal is to add the measurement to the signed distance of the scope origins, resulting in the desired transformed measurement:

    usereg !dist !dir !scope
    :dir /X eq {
      :scope begin p000 getX end
      p000 getX sub
      :dist add
    } if
    ... (similar if-clauses for the other directions ...)

This solves the problem, since the measurement is transformed only when the wall scope has become the current scope. The solution uses lazy evaluation, which is elegantly accomplished using a generated function.

5.4. Semantic enrichment

The fact that each scope contains a whole dictstack allows for interesting semantic applications. The dictstack of a typical scope is shown in Table 2. Recall that the scope-copy function from Fig. 8 creates an empty new dictionary and pushes to it a copy of the parent dictstack. Since scope-copy is used in any split or subdivide, the dictstack of a scope contains the whole refinement history. Name lookup in a dictstack works in such a way that new entries “shadow” old entries, but the old entries are still present (and accessible), again shown in Table 2. The green box is not a terminal but an intermediate symbol. In fact, all 35k scopes created during the refinement are contained in one large scope-array that can, e.g., be filtered (Fig. 20):

    scope-array
    { begin segmentID /E eq compute-volume 10.0 gt and end } filter
    { 0.1 2 highlight-scope } forall

The filter operator expects an array and a function to execute on each element that pushes 0 or 1 like for if. The compute-volume function simply multiplies the extents in all directions. So using these few lines of code, all scopes in segment E with a volume greater than (gt) 10 m³ are highlighted. Similarly, all scopes that are of a specific type can be filtered and highlighted:

    scope-array
    { begin elementType /E2Door eq end } filter
    { 0.05 3 highlight-scope } forall

This was accomplished by adding to each rule one code line that defines as /elementType the name of the rule. In this case, all sub-scopes of the actual door are highlighted as well, since they inherit that label. It is, of course, also possible to distinguish between a door scope and a descendant of a door. It should be noted that since all scopes are preserved, it is also possible to attach labels to scopes that are subsequently mapped to terminal-void, i.e., to empty space. Furthermore, this structure can be directly mapped to a hierarchical spatial data structure for fast point containment queries.

Table 2. Dictionary stack of the green scope.

    Name          Value
    /dy           (4.85,6.3)
    /self         (Dictstack:20078,20078)
    /level        7
    /elementType  /E1Wall

    /dx           (82.51,82.81)
    /self         (Dictstack:20002,20002)
    /level        6
    /elementType  /ELevel

    /dz           (0.35,3.5)
    /self         (Dictstack:19999,19999)
    /level        5

    /dx           (78.1,82.81)
    /self         (Dictstack:19996,19996)
    /level        4

    /dz           (0,3.5)
    /self         (Dictstack:19824,19824)
    /level        3

    /dy           (0,6.3)
    /self         (Dictstack:18848,18848)
    /level        2

    /dx           (78.1,93.23)
    /self         (Dictstack:9419,9419)
    /level        1
    /building     /Inffeldgasse-16c
    /segmentID    /C

    /ex           (1,0,0)
    /ey           (0,1,0)
    /ez           (0,0,1)
    /dx           (0,93.23)
    /dy           (0,15.6)
    /dz           (0,11.1)
    /mat          16
    /self         (Dictstack:9412,9412)
    /pt           (0,0,0)
    /level        0
    /elementType  /None
    /segmentID    /None

Since name lookup proceeds from top to bottom, /dy defined in level 7 overwrites the earlier values. Annotations added at any level are still available as the dictstack contains the whole refinement history. The resulting /segmentID is /C (overridden in level 1), /elementType is /E1Wall (set in level 7). The green box is split another two times, its white descendant is a terminal box on level 9.

Fig. 20. Semantic filtering to highlight all scopes in segment E with a volume larger than 10 m³ (cyan), and all scopes of door type /E2Door (yellow). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 21. The corridor skeleton of the building complex.

6. Conclusion and future work

The reconstruction of this particular university complex was of course a very specific use case, but it still teaches us fundamental techniques applicable in most cases. The first lesson is that it is absolutely vital to identify super-structures and to subsequently divide the building into manageable smaller parts. Then guiding split planes need to be chosen according to structures within these parts, such as subsection divisions, e.g., fire walls or bearing walls. These guiding structures also need to be chosen in such a way that measurements can be taken conveniently, otherwise too many re-parametrizations are necessary. Only after a good vertical division into sectors has been found should the building be split into its floors. The room subdivision is very individual. In general it appears to be necessary to recognize regularities, and to try to abstract them into parametrizable rules. However, over-generalization can be a problem as well: we chose to have several types of door-wall combinations instead of a single one with a huge number of parameters.

It should be noted that with increasing complexity, refactoring becomes ever more important when modeling with shape grammars to obtain a semantically meaningful structure. The modeling process exhibits great similarities to a software design process. Our experience was that creating the first sector of the first building took almost two weeks, until after several rounds of refactoring we had a convenient set of rules. Completing the first building took another week, but then the process got considerably faster. Creating the fourth building, 16a, took merely one day.

Fig. 22. Assigning a different height to the university complex creates a skyscraper by repeating the first floors as often as it fits in the building volume.

Fig. 23. A surveillance application. Video streams from surveillance cameras are analyzed using a person detector. Both the video streams and the detection results are projected into the 3D model, detected persons are highlighted using billboards and auxiliary geometry.

Generating views through semantic enrichment: We found that a great advantage of the procedural approach is the possibility of semantic enrichment of the geometry. We had several requests to generate specific views of the building, which we could in most cases realize by adding attributes to the scopes and filtering out the appropriate entities. The whole building complex is composed of 14,104 boxes, and all in all 35,215 scopes are produced during grammar evaluation. Since we keep all the scopes, we can highlight different parts of the model on the fly, as shown in Figs. 12, 14 and 17. Furthermore, by varying terminal rules we could distinguish offices, corridors, entrances, bridges, and staircases from walls. The different parts can easily be filtered out and highlighted. In particular, we also have full information about the empty spaces (cf. Figs. 18 and 21).

Generalizability can be difficult to obtain: We have done some experiments to generate variants of the building, to explore the design space.
It is quite obvious that the building follows a generalizable pattern; however, most of our randomly generated variants such as the skyscraper in Fig. 22 did exhibit certain flaws. This indicates that we have not yet identified rules that would guarantee that only valid buildings can be generated. Apparently there are some important functional relations that our rules do not cover.

Drawbacks of box geometry: One of the obvious drawbacks of our approach is that the building is composed only of unconnected boxes. This makes surface-based operations difficult, e.g., to collect all faces that make up a corridor wall, or to compute all surfaces visible in a particular room; this would be much easier with a connected mesh. However, we found that this was in fact not much of an issue so far. Another more serious limitation is that it is really a mess to realize anything non-rectangular with boxes, e.g., the staircase railings. Although our blocks are always rectangular, we can realize, to some degree, also non-rectangular floor layouts with angles that are not only right angles. Scopes have a local coordinate system which could be used for rotations. However, we have not used this feature so far.

Facility surveillance application: As stated in the Introduction, the initial motivation for creating a 3D model of the facility was a surveillance project. We have indeed successfully managed to project live images from surveillance cameras into the model, including cut-out billboards from a vision-based person tracker. We found that it is much easier to get a coherent picture of the surveillance situation when the information is not shown in individual camera images, but in a coherent 3D space (see Fig. 23).

6.1. Future work

The most important research question is how to generalize the grammar system to obtain a solution that is applicable also to less rectangularly structured buildings.
We see great potential in the overall approach since there is apparently a great need for sustainable detailed models of complex facilities. But we definitely have to abandon the pure box-based approach and must integrate more flexible geometric primitives, which we would still like to process in our procedural grammar-like fashion that has proven robust and efficient.

The second focus will be on research for more interactive ways of authoring than by scripting, which is too indirect. To abandon all scripting seems unrealistic, though, since code refactoring is apparently absolutely vital when the models become very complex. Suitable metaphors for “interactive refactoring” currently seem to be inaccessible.

And finally, since the Disto has a Bluetooth interface we would like to explore the possibility of porting the software to a mobile platform to take measurements on site, and to enter them directly into the right position in the code.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.cag.2010.05.007.

References

[1] Microsoft. Bing Maps. URL: http://www.bing.com/maps, 2010.
[2] International Alliance for Interoperability. Industry Foundation Classes. URL: http://www.ifcwiki.org/, 2010.
[3] Stiny G, Gips J. Shape grammars and the generative specification of painting and sculpture. In: The best computer papers of 1971. Auerbach; 1972. p. 125–35.
[4] Prusinkiewicz P, Lindenmayer A. The algorithmic beauty of plants. Springer-Verlag; 1990.
[5] Wonka P, Wimmer M, Sillion F, Ribarsky W. Instant architecture. In: Proceedings of the SIGGRAPH 2003, 2003. p. 669–77.
[6] Müller P, Wonka P, Haegler S, Ulmer A, Gool LV. Procedural modeling of buildings. In: ACM SIGGRAPH, vol. 25, 2006. p. 614–23.
[7] Procedural Inc. CityEngine. URL: http://www.procedural.com/, 2009.
[8] Müller P, Zeng G, Wonka P, Gool LV. Image-based procedural modeling of facades. In: ACM SIGGRAPH, vol. 26, 2007. p. 85.
[9] Hohmann B, Krispel U, Havemann S, Fellner D. Cityfit: high-quality urban reconstructions by fitting shape grammars to images and derived textured point clouds. In: 3D-ARCH09, 2009.
[10] Koutsourakis P, Simon L, Teboul O, Tziritas G, Paragios N. Single view reconstruction using shape grammars for urban environments. In: ICCV09, 2009.
[11] Xiao J, Fang T, Zhao P, Lhuillier M, Quan L. Image-based street-side city modeling. In: SIGGRAPH Asia 09. New York, NY, USA: ACM; 2009. p. 1–12.
[12] Lipp M, Wonka P, Wimmer M. Interactive visual editing of grammars for procedural architecture. In: Proceedings of the ACM SIGGRAPH 2008, 2008. p. 1–10.
[13] Larive M, Gaildrat V. Wall grammar for building generation. In: GRAPHITE 2006, 2006. p. 429–37.
[14] Alexander C. A city is not a tree. In: Architectural Forum 122 (1) (part 1), (2) (part 2), 1965. p. 58–61, 58–62.
[15] Finkenzeller D. Detailed building facades. IEEE Computer Graphics and Applications 2008;28(3):58–66.
[16] Aliaga D, Rosen A, Bekins D. Style grammars for interactive visualization of architecture. In: IEEE transactions on visualization and computer graphics, 2007. p. 786–97.
[17] Ripperda N, Brenner C. Application of a formal grammar to facade reconstruction in semiautomatic and automatic environments. In: AGILE09, 2009.
[18] Havemann S. Generative mesh modeling. PhD thesis, Technical University Braunschweig; 2005.
[19] Adobe Inc. PostScript language reference manual, 3rd ed. Addison-Wesley; 1999.
[20] CGV TU Graz. GML Homepage. http://www.generative-modeling.org, 2009.