Chương 3. Phân Tích Cú Pháp
Chương 3. Phân Tích Cú Pháp
Chương 3. Phân Tích Cú Pháp
B1 B2 …….. Bn
Parse Trees and Derivations
• Note that leftmost derivations and rightmost derivations have the same
parse tree.
• Grammar
E E+E | E*E | -E | (E) | id
• String
id*id+id
• leftmost derivations: EE*Eid*Eid*E+Eid*id+Eid*id+id
• rightmost derivations: EE*EE*E+EE*E+idE*id+idid*id+id
Parse Trees and Derivations
• Grammar
E E+E | E*E | -E | (E) | id
• String E
-(id+id)
• Derivation - E
E
-E ( E )
- (E)
- (E+E) E + E
- (id+E)
- (id+id) id id
Parse Trees and Derivations
E
E
-E - E
- (E)
- (E+E) ( E )
- (id+E)
E + E
- (id+id)
id id
Ambiguity
• A grammar that produces more than one parse tree for some string is said to be
ambiguous.
• Equivalently, there is more than one leftmost derivations or rightmost
derivations for some string.
• Grammar
E E+E | E*E | -E | (E) | id
• String
id+id*id
• This string has two parse trees
Ambiguity
• This string (id+id*id) has two parse trees
E E
E + E E E+ E E E*E E * E
id+E E+E*E
id+E*E id+E*E
id E * E E + E id
id+id*E id+id*E
id+id*id id+id*id
id id id id
Exercises
Exercise 1: Consider the context-free grammar:
S SS+ | SS* | a
and the string aa+a*
a) Give a derivation for the string
b) Give a leftmost derivation for the string
c) Give a rightmost derivation for the string
d) Give a parse tree for the string
e) Is the grammar ambiguous or unambiguous? Justify your answer
f) Describe the language generated by this grammar
Exercises
Exercise 2: Repeat Exercise 1 for each of the following grammars and
strings:
a) S 0S1 | 01 with string 000111
b) S +SS | *SS | a with string +*aaa
c) S S(S)S | with string (()())
d) S S+S | SS | (S) | S* | a with string (a+a)*a
e) S (L) | a
L L,S | S with string ((a,a),a,(a))
f) S aSbS | bSaS | with string aabbab
Exercises
Exercise 3: Design grammars for the following languages:
a) The set of all strings of 0s and 1s that are palindromes; that is, the string
reads the same backward as forward.
• After lexical analysis: int7 ‘+’ ‘(‘ int4 ‘+’ int5 ‘)’
Example of parse tree
• During parsing we build a parse tree
E + E
Too much information
)
- Parentheses
int7 ( E
- Single-successor nodes
E + E
int4 int5
Example of abstract syntax tree
Parse Tree Abstract Syntax Tree AST is represented as
E + PLUS
E + E int7 + PLUS
E + E
int4 int5
Parse Tree vs Syntax Tree
Parse Tree Syntax Tree
A parse tree is a graphical representation of the A syntax tree is a condensed form of parse tree
replacement process in a derivation
Parse tree represent every detail from the real Syntax trees does not represent every detail from
syntax the real syntax (that’s why they are called
abstract).
For example: no rule nodes, no parentheses, …
Parse Tree vs Syntax Tree
• Consider the following grammar:
E E+T T+T F+T id+T
E E+T | T
id+T*F id+F*F id+id*F
T T*F | F id+id*id
F (E) | id E +
• For the string
id+id*id
E + T id *
• Genarate
T T F id
* id
Parse tree Syntax tree
F
F id
id id
Parsing
Parsing
Predictive parsing
LR(0) SLR(1) LALR CLR
LL parsing LL(1)
Top-Down Parsing
• Parsing is a process of deriving string from a given grammar.
• Top-down parsing is a process of constructing a parse tree for the input string:
• From the top
• From left to right
• Top-down parsing can be viewed as finding a leftmost derivation for an input
string
• Recursive Descent Parsing uses procedures for every terminal and nonterminal
entity.
• Recursive Descent Parsing to make a parse tree, which may or may not require
back-tracking
• A form of recursive descent parsing that does not require any back-tracking is
known as predictive parsing.
Top-Down Parsing
• Consider the grammar:
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id
and the input string: id+id*id
• Give a leftmost derivation for the input string
• Give a parse tree for the input string
Top-Down Parsing
• Consider the grammar:
• Give a leftmost derivation for the string:
E TE’
E TE’ FT’E’ idT’E’ idE’
E’ +TE’ | id+TE’ id+FT’E’ id+idT’E’
T FT’ id+id*FT’E’ id+id*idT’E’ E
id+id*idE’ id+id*id
T’ *FT’ |
F (E) | id T E’
and the string: id+id*id F +
T’ T E’
Give a parse tree for the string:
id F T’ id
id * F T’
id
Left-recursion
• A grammar is left-recursive if and only if there exists a nonterminal
symbol that can derive to a sentential form with itself as the leftmost
symbol.
+
where + indicates the operation of making one or more substitutions, and
is any sequence of terminal and nonterminal symbols.
• The nonterminal and its production are said to be left-recursive, because the
production has itself as the leftmost symbol on the right hand side.
exprexpr+termexpr+term+termexpr+term+term+termexpr+term+term+term+term
Removing left-recursion
• Left-recursion often poses problems for parsers
• Therefore, a grammar is often preprocessed to eliminate the left-recursion.
• The left-recursive grammar |
can rewrite using right-recursion
|
Removing left-recursion
• Example:
Consider the left-recursive grammar
S → a | ^(T)
T → T,S | S
Eliminate the left-recursion from the grammar.
Solution: We have immediate left recursion in T-productions.
• Comparing T → T,S | S with A → Aα | β where A = T, α =, S and β = S
• Complete grammar will be
S → a | ^(T)
T → ST′
T′→ ,ST′ | ε
Removing left-recursion
Solution
• Comparing E → E+T | T with A → Aα | β Example:
A = E, α = +T, β = T Consider the left-recursive grammar
1) A → Aα | β is changed to A → βA′ and A′ → αA′ | ε E → E+T | T
2) A → βA′ means E → TE′ T → T*F | F
3) A′ → αA′ | ε means E′ → +TE′ | ε F → (E) | id
• Comparing T → T∗F | F with A → Aα | β
4) A = T, α =∗F, β = F
5) A → βA′ means T → FT′
6) A → αA′ | ε means T′ →*FT′ | ε
• Production F → (E) | id does not have any left-recursion
• Combining productions 1, 2, 3, 4, 5, 6 we get
E → TE′
E′ → +TE′ | ε
T → FT′
T’ → *FT′ | ε
F → (E) | id
Removing left-recursion
Example:
• The left-recursive grammar |
can rewrite using right-recursion
|
or
|
Removing left-recursion
Example:
• Consider the left-recursive grammar:
S SabA | a
Aa|b
• Can rewrite as
S aS
S abAS |
Aa|b
Removing left-recursion
• In general 1 |…| n | 1 |…| m
• Can rewrite as
1 | … | m
1 | … | n |
Removing left-recursion
Example:
• Consider the left-recursive grammar:
S SbA | SaA | b | a
A a | b
• Can rewrite as
S bS | aS
S bAS | aAS |
Aa|b
General Left-Recursion
• The grammar
' |
Is it a left-recursive grammar?
FIRST(F)={(, id}
FIRST(T)={(, id}
FIRST(E)={(, id}
FIRST(E’)={+, ε}
FIRST(T’)={*, ε }
FIRST example
• Consider the grammar
E TF
F +E |
T intY | (E)
Y *T |
• Compute FIRST
FIRST example
• Consider the grammar
E TF Compute FIRST(X):
F +E | If X is a terminal: FIRST(X) = {X}
If X is a nonterminal:
T intY | (E) FIRST(X):
If X
Y *T | FIRST (+) = {+} If X Y1Y2…Yk and FIRST(Yi) for all 1≤ i ≤ k
b FIRST(X):
FIRST (*) ={*} If X Y1Y2…Yk and b FIRST(Yi) and FIRST(Yj) for all 1≤ j ≤ i-1
FIRST (() = {(}
FIRST ()) ={)}
FIRST (int) = {int}
A
id F T’ id
id * F T’
id
Example MATCHED STACK
E$
INPUT
(int)$
ACTION
E TF TF
F +E
T intY (E)
Y *T
Example
• Consider the grammar
E TF
F +E | E TF (E)F (TF)F (intYF)F
E
T intY | (E) (intF)F (int)F (int)
Y *T | T F
• Input: (int)
( )
• Give a parse tree for the string E
T F
Y
int
MATCHED STACK INPUT ACTION
Example E$ int+int$
TF$ int+int$ Output E TF
• Consider the grammar intYF$ int+int$ Output T intY
E TF int YF$ +int$ Match int
E TF TF
F +E
T intY (E)
Y *T
Example
• Consider the grammar
E TF E TF intYF intF
F +E | int+E int+TF E
T intY | (E) int+intYF int+intF
Y *T | int+int T F
• Input: int+int
int Y + E
• Give a parse tree for the string
T F
int
Y
LL(1) Parsing Table
• For each production in G do:
• Consider the grammar: S Sa | b • For each terminarl t FIRST() do M[A, t] =
• If FIRST(), for each tFOLLOW(A) do M[A, t] =
• The parsing table • If FIRST() and $FOLLOW(A) do M[A, $] =
a b $
S b
Sa
A EA’
If X is a terminal: FIRST(X) = {X}
If X is a nonterminal:
A’ aEA’ | FIRST(X):
If X
E bE’ If X Y1Y2…Yk and FIRST(Yi) for all 1≤ i ≤ k
b FIRST(X):
E’ *TE’ | If X Y1Y2…Yk and b FIRST(Yi) and FIRST(Yj) for all 1≤ j ≤ i-1
T aAb | c
Compute FOLLOW(X):
FIRST (a) = {a} $ FOLLOW(S), where S is the start symbol, $ is the end symbol of the input string
FIRST, FOLLOW FIRST (b) = {b} For each production : FIRST()\{} FOLLOW()
FIRST (c) ={c} For each production , where FIRST(): FOLLOW() FOLLOW()
FIRST (*) ={*} For each production : FOLLOW() FOLLOW()
FOLLOW(A)={$, b} FIRST(A)={b}
FOLLOW(A’)={$, b} FIRST(A’)={a, ε}
FOLLOW(E)={a, $, b} FIRST(E)={b}
FOLLOW(E’)={a, $, b} FIRST(E’)={*, ε}
FOLLOW(T)={*, a, $, b} FIRST(T)={a, c}
Exercise 1
• LL(1) grammar
A EA’
FIRST (a) = {a}
A’ aEA’ | FIRST (b) = {b} • For each production in G do:
• For each terminarl t FIRST() do M[A, t] =
E bE’ FIRST (c) ={c}
• If FIRST(), for each tFOLLOW(A) do M[A, t] =
E’ *TE’ | FIRST (*) ={*}
• If FIRST() and $FOLLOW(A) do M[A, $] =
T aAb | c FIRST(A)={b}
FIRST(A’)={a, ε}
Show the parsing table FIRST(E)={b} NON- INPUT SYMBOL
FIRST(E’)={*, ε}
TERMINAL a b c * $
FIRST(T)={a, c}
A EA’
FIRST(EA’)=FIRST(E)={b}
FOLLOW(A)={$, b} FIRST(aEA’)=FIRST(a) ={a} A’ aEA’ ε ε
FOLLOW(A’)={$, b} FIRST(bE’)=FIRST(b) ={b} E bE’
FOLLOW(E)={a, $, b} FIRST(*TE’)= FIRST(*) ={*}
FIRST(aAb)=FIRST(a)={a}
E’ ε ε *TE’ ε
FOLLOW(E’)={a, $, b}
FOLLOW(T)={*, a, $, b} FIRST(ε)= {ε} T aAb c
Exercise 1 MATCHED STACK
A$
INPUT
b*abb$
ACTION
FOLLOW(B)={$} FIRST(B)={a }
FOLLOW(A)={a, $} FIRST(A)={b, c, ε}
FOLLOW(F)={b, c, $} FIRST(F)={a}
FOLLOW(F’)={b, c, $} FIRST(F’)={-, b, c}
FOLLOW(E)={b, c, $} FIRST(E)={-}
Exercise 2
• LL(1) grammar
B → FA • For each production in G do:
F a-F’ • For each terminarl t FIRST() do M[A, t] =
F’ E | b | c • If FIRST(), for each tFOLLOW(A) do M[A, t] =
FIRST (a) = {a} • If FIRST() and $FOLLOW(A) do M[A, $] =
E → -Aa FIRST (b) = {b}
FIRST (c) ={c}
A→b|c| FIRST (-) ={-}
B$
INPUT
a--cab$
ACTION
Devise predictive parsers for the input string a--cab a-- AaA$ cab$ Match -
a-- caA$ cab$ Output A c
NON- INPUT SYMBOL a--c aA$ ab$ Match c
TERMINAL a b c - $ a--ca A$ b$ Match a
B FA a--ca b$ b$ Output A b
a--cab $ $ Match b
A ε b c ε
F a-F’
F’ b c E
E -Aa
Exercise 2
• LL(1) grammar B
B → FA
F
F a-F’ A
F’ E | b | c a F’
E → -Aa - b
A→b|c| E
Give a parse tree for the input string a--cab - a
A
FOLLOW(S)={$, )} FIRST(S)={a}
FOLLOW(S’)={$, )} FIRST(S’)={+, b, ε}
FOLLOW(E)={+, b, $, )} FIRST(E)={+}
FOLLOW(E’)={+, b, $, )} FIRST(E’)={(, c, b}
FOLLOW(T)={+, b, $, )} FIRST(T)={(, c}
Exercise 3
• LL(1) grammar FIRST (a) = {a}
S aS’ FIRST (b) = {b} • For each production in G do:
S’ ES’ | bS’ | FIRST (c) = {c} • For each terminarl t FIRST() do M[A, t] =
E +E’ FIRST (+) ={+} • If FIRST(), for each tFOLLOW(A) do M[A, t] =
FIRST (() ={(} • If FIRST() and $FOLLOW(A) do M[A, $] =
E’ T | b FIRST ()) ={)}
T (S) | c
FIRST(S)={a}
Show the parsing table FIRST(S’)={+, b, ε}
FIRST(E)={+} NON- INPUT SYMBOL
FIRST(E’)={(, c, b} TERMINAL a b c + ( ) $
FIRST(T)={(, c}
S aS’
FIRST(aS’)=FIRST(a)={a}
FOLLOW(S)={$, )} FIRST(ES’)=FIRST(E) ={+} S’ bS’ ES’ ε ε
FOLLOW(S’)={$, )} FIRST(bS’)=FIRST(b) ={b} E +E’
FOLLOW(E)={+, b, $, )} FIRST(+E’)= FIRST(+) ={+}
FOLLOW(E’)={+, b, $, )}
E’ b T T
FIRST((S))=FIRST(()={(}
FOLLOW(T)={+, b, $, )} FIRST(ε)= {ε} T c (S)
Exercise 3 MATCHED STACK
S$
INPUT
a+(ab)$
ACTION
b S’