Ch2 Modified

1
A SIMPLE SYNTAX-
DIRECTED TRANSLATOR
Chapter 2
2
Building a Simple Compiler

• Building our compiler involves:
– Defining the syntax of a programming language
– Develop a source code parser: for our compiler
we will use predictive parsing
– Implementing syntax directed translation to
generate intermediate code
3
Syntax Definition
• Context-free grammar is a 4-tuple with
– A set of tokens (terminal symbols)
– A set of nonterminals
– A set of productions
– A designated start symbol
4
Example Grammar
Context-free grammar for simple expressions:
G = <{list,digit}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, list>
with productions P =
list  list + digit
list  list - digit
list  digit
digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
5
Derivation
• Given a CF grammar we can determine the
set of all strings (sequences of tokens)
generated by the grammar using derivation
– We begin with the start symbol
– In each step, we replace one nonterminal in the
current sentential form with one of the right-
hand sides of a production for that nonterminal
6
Derivation for the Example

Grammar
list
 list + digit
 list - digit + digit
 digit - digit + digit
 9 - digit + digit
 9 - 5 + digit
9-5+2
This is an example leftmost derivation, because we replaced

the leftmost nonterminal (underlined) in each step.
7
Derivation for the Example

Rightmost Grammar
Likewise, a rightmost derivation replaces the rightmost
nonterminal in each step
list
P=  digit - list
list  digit + list  digit - digit + list
 digit - digit + digit
list  digit - list
 digit - digit + 2
list  digit  digit - 5 + 2
digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 9-5+2
8
Parse Trees
• The root of the tree is labeled by the start symbol
• Each leaf of the tree is labeled by a terminal
(=token) or 
• Each interior node is labeled by a nonterminal
• If A  X1 X2 … Xn is a production, then node A has
immediate children X1, X2, …, Xn where Xi is a
(non)terminal or  ( denotes the empty string)
9
Parse Tree for the Example

Grammar
Parse tree of the string 9-5+2 using grammar G
list
list digit
list digit
digit
The sequence of
9 - 5 + 2 leafs is called the
yield of the parse tree
The Two Derivations for x – 2 * y
Rule Sentential Form Rule Sentential Form
— Expr — Expr
1 Expr Op Expr 1 Expr Op Expr
3 <id,x> Op Expr 3 Expr Op <id,y>
5 <id,x> – Expr 6 Expr * <id,y>
1 <id,x> – Expr Op Expr 1 Expr Op Expr * <id,y>
2 <id,x> – <num,2> Op Expr 2 Expr Op <num,2> * <id,y>
6 <id,x> – <num,2> * Expr 5 Expr – <num,2> * <id,y>
3 <id,x> – <num,2> * <id,y> 3 <id,x> – <num,2> * <id,y>
Leftmost derivation Rightmost derivation
In both cases, Expr * id – num * id

• The two derivations produce different parse trees
• The parse trees imply different evaluation orders!
Derivations and Parse Trees
G
Leftmost derivation
Rule Sentential Form
— Expr E
1 Expr Op Expr
3 <id,x> Op Expr
5 <id,x> – Expr E Op E
1 <id,x> – Expr Op Expr
2 <id,x> – <num,2> Op Expr
6 <id,x> – <num,2> * Expr x – Op E
E
3 <id,x> – <num,2> * <id,y>
This evaluates as x – ( 2 * y ) 2 y
*
Derivations and Parse Trees
G
Rightmost derivation
Rule Sentential Form
— Expr E
1 Expr Op Expr
3 Expr Op <id,y>
6 Expr * <id,y> E Op E
1 Expr Op Expr * <id,y>
2 Expr Op <num,2> * <id,y>
5 Expr – <num,2> * <id,y> E Op E * y
3 <id,x> – <num,2> * <id,y>
x – 2
This evaluates as ( x – 2 ) * y
13
Ambiguity
Consider the following context-free grammar:
G = <{string}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, string>
with production P =
string  string + string | string - string | 0 | 1 | … | 9

This grammar is ambiguous, because more than one parse tree
represents the string 9-5+2
14
Ambiguity (cont’d)
string string
string string string
string string string string string
9 - 5 + 2 9 - 5 + 2
15
Associativity of Operators
Left-associative operators have left-recursive productions
left  left + term | term
String a+b+c has the same meaning as (a+b)+c
Right-associative operators have right-recursive productions
right  term = right | term
String a=b=c has the same meaning as a=(b=c)
Operators on the same line have the same associativity and
precedence:
left-associative: + -
left-associative: */
16
Precedence of Operators
Operators with higher precedence “bind more tightly”
expr  expr + term | expr – term | term
term  term * factor | term / factor | factor
factor  number | ( expr )
number  0|1|2|…….|9
String 2+3*5 has the same meaning as 2+(3*5)
expr
expr term
term term factor
factor factor number
number number
2 + 3 * 5
17
Syntax of Statements
stmt  id := expr
| if expr then stmt
| if expr then stmt else stmt
| while expr do stmt
| begin opt_stmts end
opt_stmts  stmt ; opt_stmts
|
18
Syntax-Directed Translation
• Uses a CF grammar to specify the syntactic
structure of the language
• AND associates a set of attributes with the
terminals and nonterminals of the grammar
• AND associates with each production a set of
semantic rules to compute values of attributes
• A parse tree is traversed and semantic rules
applied: after the tree traversal(s) are completed,
the attribute values on the nonterminals contain
the translated form of the input
19
Synthesized and Inherited

Attributes
• An attribute is said to be …
– synthesized if its value at a parse-tree node is
determined from the attribute values at the children of
the node
– Suppose a node N in a parse tree is labeled by the
grammar symbol X . We write X.a to denote the value
of attribute a of X at that node.
– inherited if its value at a parse-tree node is determined
by the parent (by enforcing the parent’s semantic rules)
20
Example Attribute Grammar

Syntax-directed definition for infix to postfix translation
String concat operator

Production Semantic Rule
expr  expr1 + term expr.t := expr1.t // term.t // “+”
expr  expr1 - term expr.t := expr1.t // term.t // “-”
expr  term expr.t := term.t
term  0 term.t := “0”
… …
21
Example Annotated Parse Tree
expr.t = “95-2+”
expr.t = “95-” term.t = “2”
expr.t = “9” term.t = “5”
term.t = “9”
9 - 5 + 2
22
Depth-First Traversals
procedure visit(n : node);
begin
for each child m of n, from left to right do
visit(m);
evaluate semantic rules at node n
end
23
Depth-First Traversals (Example)
expr.t = “95-2+”
expr.t = “95-” term.t = “2”
expr.t = “9” term.t = “5”
term.t = “9”
9 - 5 + 2 Note: all attributes are

of the synthesized type
24
Translation Schemes
• A translation scheme is a CF grammar embedded
with semantic actions
• When drawing a parse tree for a translation scheme,
we indicate an action by constructing an extra child
for it, connected by a dashed line to the node that
corresponds to the head of the production.
rest  + term { print(“+”) } rest

rest
Embedded
semantic action
+ term { print(“+”) } rest
25
Example Translation Scheme
expr  expr + term { print(“+”) }

expr  expr - term { print(“-”) }
expr  term
term  0 { print(“0”) }
term  1 { print(“1”) }
… …
term  9 { print(“9”) }
26
Example Translation Scheme

(cont’d)
expr
{ print(“+”) }
expr + term
{ print(“2”) }
{ print(“-”) }
- term 2
expr
{ print(“5”) }
term 5
{ print(“9”) }
9
Translates 9-5+2 into postfix 95-2+
27
Parsing
• Parsing = process of determining if a string of
tokens can be generated by a grammar
• For any CF grammar there is a parser that takes at
most O(n3) time to parse a string of n tokens
• Top-down parsing “constructs” a parse tree from
root to leaves
• Bottom-up parsing “constructs” a parse tree from
leaves to root
28
Predictive Parsing
• Recursive descent parsing is a top-down parsing
method
– Each nonterminal has one (recursive) procedure that is
responsible for parsing the nonterminal’s syntactic
category of input tokens
– When a nonterminal has multiple productions, each
production is implemented in a branch of a selection
statement based on input look-ahead information
• Predictive parsing is a special form of recursive
descent parsing where we use one lookahead
token to unambiguously determine the parse
operations
29
Example Predictive Parser

(Grammar)
type  simple
| ^ id
| array [ simple ] of type
simple  integer
| char
| num dotdot num
30

(Execution Step 1)
Check lookahead
type()
and call match
match(‘array’)
Input: array [ num dotdot num ] of integer
lookahead
31

(Execution Step 2)
type()
match(‘array’) match(‘[’)
lookahead
32

(Execution Step 3)
type()
match(‘array’) match(‘[’) simple()
match(‘num’)
lookahead
33

(Execution Step 4)
type()
match(‘num’) match(‘dotdot’)
lookahead
34

(Execution Step 5)
type()
match(‘num’) match(‘dotdot’) match(‘num’)
lookahead
35

(Execution Step 6)
type()
match(‘array’) match(‘[’) simple() match(‘]’)
lookahead
36

(Execution Step 7)
type()
match(‘array’) match(‘[’) simple() match(‘]’) match(‘of’)
lookahead
37

(Execution Step 8)
type()
match(‘array’) match(‘[’) simple() match(‘]’) match(‘of’) type()
match(‘num’) match(‘dotdot’) match(‘num’) simple()
match(‘integer’)
lookahead
38
Adding a Lexical Analyzer

• Typical tasks of the lexical analyzer:
– Remove white space and comments
– Encode constants as tokens
– Recognize keywords
– Recognize identifiers and store identifier names
in a global symbol table
39
The Lexical Analyzer “lexer”
Lexical analyzer
y := 31 + 28*x
lexan()
<id, “y”> <assign, > <num, 31> <‘+’, > <num, 28> <‘*’, > <id, “x”>
token
(lookahead)
tokenval Parser
(token attribute) parse()

Ch2 Modified

Uploaded by

Copyright:

Available Formats

Ch2 Modified

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Ch2 Modified

Uploaded by

Copyright:

Available Formats

1

Building a Simple Compiler

Context-free grammar for simple expressions:

G = <{list,digit}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, list>

list  list + digit

list  list - digit

Derivation for the Example

This is an example leftmost derivation, because we replaced

Derivation for the Example

Parse Tree for the Example

Leftmost derivation Rightmost derivation

In both cases, Expr * id – num * id

Consider the following context-free grammar:

G = <{string}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, string>

string  string + string | string - string | 0 | 1 | … | 9

string string string

string string string string string

Synthesized and Inherited

Example Attribute Grammar

String concat operator

Example Annotated Parse Tree

expr.t = “95-” term.t = “2”

expr.t = “9” term.t = “5”

Depth-First Traversals (Example)

expr.t = “95-” term.t = “2”

expr.t = “9” term.t = “5”

9 - 5 + 2 Note: all attributes are

rest  + term { print(“+”) } rest

Example Translation Scheme

expr  expr + term { print(“+”) }

Example Translation Scheme

Example Predictive Parser

Example Predictive Parser

Input: array [ num dotdot num ] of integer

Example Predictive Parser

Input: array [ num dotdot num ] of integer

Example Predictive Parser

match(‘array’) match(‘[’) simple()

Input: array [ num dotdot num ] of integer

Example Predictive Parser

match(‘array’) match(‘[’) simple()

Input: array [ num dotdot num ] of integer

Example Predictive Parser

match(‘array’) match(‘[’) simple()

match(‘num’) match(‘dotdot’) match(‘num’)

Input: array [ num dotdot num ] of integer

Example Predictive Parser

match(‘array’) match(‘[’) simple() match(‘]’)

match(‘num’) match(‘dotdot’) match(‘num’)

Input: array [ num dotdot num ] of integer

Example Predictive Parser

match(‘array’) match(‘[’) simple() match(‘]’) match(‘of’)

match(‘num’) match(‘dotdot’) match(‘num’)

Input: array [ num dotdot num ] of integer

Example Predictive Parser

match(‘array’) match(‘[’) simple() match(‘]’) match(‘of’) type()

match(‘num’) match(‘dotdot’) match(‘num’) simple()

Adding a Lexical Analyzer

The Lexical Analyzer “lexer”