Syntax Directed
Syntax Directed
Syntax Directed
Grammar symbols are associated with attributes to associate information with the programming language constructs that they represent. Values of these attributes are evaluated by the semantic rules associated with the production rules. Evaluation of these semantic rules:
may generate intermediate codes may put information into the symbol table may perform type checking may issue error messages may perform some other activities in fact, they may perform almost any activities.
Translation Schemes:
indicate the order of evaluation of semantic actions associated with a production rule. In other words, translation schemes give a little bit information about implementation details.
Syntax-Directed Definitions
A syntax-directed definition is a generalization of a context-free grammar in which:
Each grammar symbol is associated with a set of attributes. This set of attributes for a grammar symbol is partitioned into two subsets called synthesized and inherited attributes of that grammar symbol. Each production rule is associated with a set of semantic rules.
Semantic rules set up dependencies between attributes which can be represented by a dependency graph. This dependency graph determines the evaluation order of these semantic rules. Evaluation of a semantic rule defines the value of an attribute. But a semantic rule may also have some side effects such as printing a value.
CS416 Compiler Design 3
Syntax-Directed Definition
In a syntax-directed definition, each production A is associated with a set of semantic rules of the form: b=f(c1,c2,,cn) where f is a function, and b can be one of the followings: b is a synthesized attribute of A and c1,c2,,cn are attributes of the grammar symbols in the production ( A ). OR b is an inherited attribute one of the grammar symbols in (on the right side of the production), and c1,c2,,cn are attributes of the grammar symbols in the production ( A ).
Attribute Grammar
So, a semantic rule b=f(c1,c2,,cn) indicates that the attribute b depends on attributes c1,c2,,cn. In a syntax-directed definition, a semantic rule may just evaluate a value of an attribute or it may have some side effects such as printing values. An attribute grammar is a syntax-directed definition in which the functions in the semantic rules cannot have side effects (they can only evaluate values of attributes).
Semantic Rules
print(E.val) E.val = E1.val + T.val E.val = T.val T.val = T1.val * F.val T.val = F.val F.val = E.val F.val = digit.lexval
Symbols E, T, and F are associated with a synthesized attribute val. The token digit has a synthesized attribute lexval (it is assumed that it is evaluated by the lexical analyzer).
CS416 Compiler Design 7
E.val=17
E.val=5 T.val=5 F.val=5 digit.lexval=5 + T.val=3 F.val=3
return
T.val=12 * F.val=4 digit.lexval=4
digit.lexval=3
Dependency Graph
Input: 5+3*4 L
E.val=17
E.val=5 T.val=5 F.val=5 digit.lexval=5 T.val=3 F.val=3 digit.lexval=3 T.val=12 F.val=4 digit.lexval=4
Semantic Rules
E.loc=newtemp(), E.code = E1.code || T.code || add E1.loc,T.loc,E.loc E.loc = T.loc, E.code=T.code T.loc=newtemp(), T.code = T1.code || F.code || mult T1.loc,F.loc,T.loc T.loc = F.loc, T.code=F.code F.loc = E.loc, F.code=E.code F.loc = id.name, F.code=
Symbols E, T, and F are associated with synthesized attributes loc and code. The token id has a synthesized attribute name (it is assumed that it is evaluated by the lexical analyzer). It is assumed that || is the string concatenation operator.
CS416 Compiler Design 10
Semantic Rules
L.in = T.type T.type = integer T.type = real L1.in = L.in, addtype(id.entry,L.in) addtype(id.entry,L.in)
Symbol T is associated with a synthesized attribute type. Symbol L is associated with an inherited attribute in.
11
D
T real L id parse tree L id T.type=real
L.in=real
L1.in=real addtype(q,real) id.entry=q
12
S-Attributed Definitions
Syntax-directed definitions are used to specify syntax-directed translations. To create a translator for an arbitrary syntax-directed definition can be difficult. We would like to evaluate the semantic rules during parsing (i.e. in a single pass, we will parse and we will also evaluate semantic rules during the parsing). We will look at two sub-classes of the syntax-directed definitions:
S-Attributed Definitions: only synthesized attributes used in the syntax-directed definitions. L-Attributed Definitions: in addition to synthesized attributes, we may also use inherited attributes in a restricted fashion.
To implement S-Attributed Definitions and L-Attributed Definitions are easy (we can evaluate semantic rules in a single pass during the parsing). Implementations of S-attributed Definitions are a little bit easier than implementations of L-Attributed Definitions
13
A XYZ
stack parallel-stack
top Z
Y
X .
Z.z
Y.y
X.x .
CS416 Compiler Design
top A .
A.a .
14
Semantic Rules
print(val[top-1]) val[ntop] = val[top-2] + val[top] val[ntop] = val[top-2] * val[top] val[ntop] = val[top-1]
At each shift of digit, we also push digit.lexval into val-stack. At all other shifts, we do not put anything into val-stack because other terminals do not have attributes (but we increment the stack pointer for val-stack).
CS416 Compiler Design 15
. . . . . . . .
I1: LL
I2: L E r E E +T I3: E T T T *F
I4: T F I5: F E E T T F F
I6: F d
. . . . . . . . . . . . . .
r +
I7: L Er
. . . . . .
T F ( d
I11: E E+T T T *F
4
. . .
5
6
I9: T T* F F (E) F d
I10: F (E ) E E +T
. . . . .
F
( d ) + 8
I12: T T*F
5 6
I13: F (E)
.
16
val-stack
5 5 5 5 55-3 5-3 5-3 5-35-3-4 5-3-4 5-12 17 1717
input
5+3*4r +3*4r +3*4r +3*4r +3*4r 3*4r *4r *4r *4r 4r r r r r $ $
action
s6 Fd TF ET s8 s6 Fd TF s9 s6 Fd TT*F EE+T s7 LEr acc
semantic rule
d.lexval(5) into val-stack F.val=d.lexval do nothing T.val=F.val do nothing E.val=T.val do nothing push empty slot into val-stack d.lexval(3) into val-stack F.val=d.lexval do nothing T.val=F.val do nothing push empty slot into val-stack d.lexval(4) into val-stack F.val=d.lexval do nothing T.val=T1.val*F.val E.val=E1.val*T.val push empty slot into val-stack print(17), pop empty slot from val-stack
17
18
AB
B0B B1B B
19
20
L-Attributed Definitions
S-Attributed Definitions can be efficiently implemented. We are looking for a larger (larger than S-Attributed Definitions) subset of syntax-directed definitions which can be efficiently evaluated. L-Attributed Definitions L-Attributed Definitions can always be evaluated by the depth first visit of the parse tree. This means that they can also be evaluated during the parsing.
21
L-Attributed Definitions
A syntax-directed definition is L-attributed if each inherited attribute of Xj, where 1jn, on the right side of A X1X2...Xn depends only on: 1. The attributes of the symbols X1,...,Xj-1 to the left of Xj in the production and 2. the inherited attribute of A Every S-attributed definition is L-attributed, the restrictions only apply to the inherited attributes (not to synthesized attributes).
22
This syntax-directed definition is not L-attributed because the semantic rule Q.in=q(R.s) violates the restrictions of L-attributed definitions. When Q.in must be evaluated before we enter to Q because it is an inherited attribute. But the value of Q.in depends on R.s which will be available after we return from R. So, we are not be able to evaluate the value of Q.in before we enter to Q.
23
Translation Schemes
In a syntax-directed definition, we do not say anything about the evaluation times of the semantic rules (when the semantic rules associated with a production should be evaluated?). A translation scheme is a context-free grammar in which: attributes are associated with the grammar symbols and semantic actions enclosed between braces {} are inserted within the right sides of productions.
Ex:
Translation Schemes
When designing a translation scheme, some restrictions should be observed to ensure that an attribute value is available when a semantic action refers to that attribute. These restrictions (motivated by L-attributed definitions) ensure that a semantic action does not refer to an attribute that has not yet computed. In translation schemes, we use semantic action terminology instead of semantic rule terminology used in syntax-directed definitions. The position of the semantic action on the right side indicates when that semantic action will be evaluated.
25
Production E E1 + T
a production of
a syntax directed definition
T
id {print(a)} +
R
T {print(+)} R + T {print(+)} R
id {print(b)}
id {print(c)}
The depth first traversal of the parse tree (executing the semantic actions in that order) will produce the postfix representation of the infix expression.
CS416 Compiler Design 28
With a L-attributed syntax-directed definition, it is always possible to construct a corresponding translation scheme which satisfies these three conditions (This may not be possible for a general syntax-directed translation).
CS416 Compiler Design 29
Top-Down Translation
We will look at the implementation of L-attributed definitions during predictive parsing. Instead of the syntax-directed translations, we will work with translation schemes. We will see how to evaluate inherited attributes (in L-attributed definitions) during recursive predictive parsing. We will also look at what happens to attributes during the left-recursion elimination in the left-recursive grammars.
30
31
E E1 + T { E.val = E1.val + T.val } E E1 - T { E.val = E1.val - T.val } ET { E.val = T.val } T T1 * F { T.val = T1.val * F.val } TF { T.val = F.val } F ( E ) { F.val = E.val } F digit { F.val = digit.lexval }
When we eliminate the left recursion from the grammar (to get a suitable grammar for the top-down parsing) we also have to change semantic actions
CS416 Compiler Design 33
E T { A.in=T.val } A { E.val=A.syn } A + T { A1.in=A.in+T.val } A1 { A.syn = A1.syn} A - T { A1.in=A.in-T.val } A1 { A.syn = A1.syn} A { A.syn = A.in } T F { B.in=F.val } B { T.val=B.syn } B * F { B1.in=B.in*F.val } B1 { B.syn = B1.syn} B { B.syn = B.in } F ( E ) { F.val = E.val } F digit { F.val = digit.lexval }
CS416 Compiler Design 34
Evaluating attributes
A parse tree of left recursive grammar
Y A.a=g(f(X.x),Y.y)
parse tree of non-left-recursive grammar A
X X.x=f(X.x)
37
38
40
41
1.
2. 3. 4.
42
43
46
S M1 A M1 A M2 B M3 C M2 M3 Bb Cc
Evaluation of Attributes
S S.s=k(1,h(..)) A.i=1 A A.s=h(1,f(1),m(..),g(..),n(..)) B.i=f(1) B B.s=m(f(1),b.s) C.i=g(1,f(1),m(..)) C C.s=n(g(..),c.s)
48
Evaluation of Attributes
stack input s-attribute stack bc$ M1 bc$ 1 M1 M 2 bc$ 1 f(1) M1 M 2 b c$ 1 f(1) b.s M1 M 2 B c$ 1 f(1) m(f(1),b.s) M1 M2 B M3 c$ 1 f(1) m(f(1),b.s) g(1,f(1),m(f(1),b.s)) M1 M2 B M3 c $ 1 f(1) m(f(1),b.s) g(1,f(1),m(f(1),b.s)) c.s M1 M2 B M3 C $ 1 f(1) m(f(1),b.s) g(1,f(1),m(f(1),b.s)) n(g(..),c.s) M1 A $ 1 h(f(1),m(..),g(..),n(..)) S $ k(1,h(..))
CS416 Compiler Design 49
Problems
All L-attributed definitions based on LR grammars cannot be evaluated during bottom-up parsing. S { L.i=0 } L
L { L1.i=L.i+1 } L1 1 L { print(L.i) } this translations scheme cannot be implemented during the bottom-up parsing
S M1 L L M2 L1 1 But since L will be reduced first by the bottom-up L { print(s[top]) } parser, the translator cannot know the number of 1s. M1 { s[ntop]=0 } M2 { s[ntop]=s[top]+1 }
50
Problems
The modified grammar cannot be LR grammar anymore.
LLb La LMLb La M
NOT LR-grammar
. L . M L b, $ L . a, $ M .,a
S L, $
shift/reduce conflict
51