0% found this document useful (0 votes)
70 views34 pages

Compiler Construction CS-4207: Lecture 8-9 Instructor Name: Atif Ishaq

1. The document discusses the role of the parser in compiler construction. The parser checks syntax and invokes semantic actions. 2. A key role of the parser is to produce an intermediate representation (IR) of the source program using syntax-directed translation, such as an abstract syntax tree or control flow graphs. 3. The document covers topics like context-free grammars, derivations, left recursion elimination, and error handling strategies in parsers.

Uploaded by

Faisal Shehzad
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
70 views34 pages

Compiler Construction CS-4207: Lecture 8-9 Instructor Name: Atif Ishaq

1. The document discusses the role of the parser in compiler construction. The parser checks syntax and invokes semantic actions. 2. A key role of the parser is to produce an intermediate representation (IR) of the source program using syntax-directed translation, such as an abstract syntax tree or control flow graphs. 3. The document covers topics like context-free grammars, derivations, left recursion elimination, and error handling strategies in parsers.

Uploaded by

Faisal Shehzad
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 34

Compiler Construction

CS-4207

Instructor Name: Atif Ishaq


Lecture 8-9
Today’s Lecture

 Role of Parser

2
Role of Parser

token
Source Lexical
Parse tree Rest of Front Intermediate
Parser
program Analyzer End representation
getNext
Token

Symbol
table

3
Role of Parser

 A grammar gives a precise, yet easy to understand , syntactic specification of a

programming language

 The role of the parsing process is to reconstruct a derivation by which a given

Context Free Grammar can generate a given input string

 Equivalently, construct the parsing tree which represents the derivation

4
The Role of Parser

 A Parser implements a Context Free Grammar as a recognizer of strings


 The role of parser in compiler is two fold
 To check syntax (string recognizer)
 Report any syntax errors accurately
 Recover from commonly occurring error to continue processing
 To invoke semantic actions
 For static semantics checking, e.g. type checking of expressions, functions,
etc.
 For syntax-directed translation of the source code to an intermediate
representation

5
Syntax Directed Translation

 One of the major role of parser is to produce an intermediate representation


(IR) of the source program using syntax directed translation
 Possible IR outputs
 Abstract Syntax Tree
 Control Flow Graphs (CFGs) with triples, three address code, or register
transfer list notation

6
Representative Grammar

 Construct that begins with keywords are easy to parse


 Expression presents more challenges, because of associativity and precedence
of operators
Recap of Grammar
 Context Free Grammar is 4 tuple – G (V,T,P,S)
1. T is finite set of tokens (terminal symbols)
2. V is a finite set of nonterminal
3. P is a finite set of productions of the form   
where   V and   (VT)*
4. S  V is a designated start symbol

7
Notation Conventions
 Terminals
a,b,c,…  T
specific terminals: 0, 1, id, +

 Nonterminals
A,B,C,…  N
specific nonterminals: expr, term, stmt

 Grammar symbols
X,Y,Z  (VT)

 Strings of terminals
u,v,w,x,y,z  T*

 Strings of grammar symbols


,,  (VT)*

8
Chomsky Hierarchy : Language Classification
 A grammar G is said to be
 Regular if it is right linear where each production is of the form

AwB or Aw

or left linear where each production is of the form


A  B wor Aw

 Context free if each production is of the form

A
where A  N and   (VT)*

 Context sensitive if each production is of the form

A
where A  V, ,,  (VT)*, || > 0

 Unrestricted 9
Chomsky Hierarchy : Language Classification

L(regular)  L(context free)  L(context sensitive)  L(unrestricted)

Where L(T) = { L(G) | G is of type T }


That is: the set of all languages
generated by grammars G of type T

Examples:
Every finite language is regular! (construct a FSA for strings in L(G))
L1 = { anbn | n  1 } is context free
L2 = { anbncn | n  1 } is context sensitive

10
Chomsky Hierarchy : Language Classification

Venn Diagram of Grammar Types:

Type 0 – Phrase-structure (Unrestricted)


Type 1 –
Context-Sensitive
Type 2 –
Context-Free

Type 3 –
Regular

11
Derivation

 The one-step derivation is defined by

A

where A   is a production in the grammar

 In addition, we define

  is leftmost lm if  does not contain a nonterminal

  is rightmost rm if  does not contain a nonterminal

 Transitive closure * (zero or more steps)

 Positive closure + (one or more steps)

 The language generated by G is defined by

L(G) = {w  T* | S + w}
12
Derivation - Example

Grammar G = ({E}, {+,*,(,),-,id}, P, E) with

productions P = E  E + E
EE*E
E(E)
E-E
E  id

Example derivations:

E  - E  - id

E rm E + E rm E + id rm id + id

E * E

E * id + id

E + id * id + id 13
Recursive Decent Parsing

 Top down parsing builds tree from top to bottom

 Each non terminal corresponds to one recursive procedure

 The procedure recognizes a prefix of the input that can be generated by the

corresponding non-terminal, it consumes the prefix, and returns a parse tree for

the non terminal

14
General Structure

 Each right hand side of a production provides part of the body of the function

 Each non terminal on the right hand side is translated into a call to the function

(procedure) that recognizes that non terminal

 Each terminal on the right hand side is translated into a call to the lexical

scanner. Failure If the resulting token is not the expected terminal

 Each recognizing function return a tree framgment

15
Complication : Left Recursion

 A grammar can not be left-recursive

 Original schemes leads to an infinite loop : grammar is inappropriate for

recursive decent

16
Eliminating Left Recursion

17
Left Recursion

 Production of the form


AA| |
are left recursive
When one of the productions in a grammar is left recursive then a
predictive parser loops forever on certain inputs

A grammar is left recursive if it has a non-terminal A such that there is a


derivation A ⇒ Aα

Top down parsing methods cant handle left-recursive grammars


A simple rule for direct left recursion elimination:
For a rule like:
A → A α|β
We may replace it with
A → β A'
A' → α A' | ɛ

18
Left Recursion

General form (with left recursion):


A → A 1 | A 2 | ... | A n | 1 | 2 | ... | m
After transformation:
==> A → 1 A' | 2 A' | ... | m A'
A' → 1 A' | 2 A' | ... | n A' | 

How about left recursion occurred for derivation with more than two steps?
e.g., S → Aa | b A → Ac | Sd | ε
Now, S → Aa | b
A →Ac | Aad | bd | ε
After transformation:

19
Left Recursion Elimination Method

Input: Grammar G with no cycles or -productions


Arrange the nonterminals in some order A1, A2, …, An
for i = 1, …, n do
for j = 1, …, i-1 do
replace each
Ai  Aj 
with
Ai  1  | 2  | … | k 
where
Aj  1 | 2 | … | k
enddo
eliminate the immediate left recursion in Ai
enddo

20
Left Recursion involving several Non-Terminal

21
Further Complications

22
Error Handling

 A good compiler should assist in identifying and locating errors


1. Lexical Error : misspelling of identifiers, keywords or operators – e.g. the
use of identifier spel instead of spell – and missing quotes around text
intended as string
2. Syntactic Error : misplaced semicolons or extra or missing braces
3. Static Semantic Error : type mismatch between operators and operands –
return a value to void return type also fall in this category
4. Dynamic Semantic Error : hard or impossible to detect at compile time,
runtime checks are required
5. Logical Error : can be anything from incorrect reasoning on the part of
programmer – use of assignment operator in place of comparison operator

23
Error Recovery Strategies

Panic-Mode Recovery
 Discards input symbols one at a time until one of the designated set of
synchronized token (such as semicolon or }) is found
 Skips a considerable amount of input without checking it for additional
errors
Phrase Level Recovery
 Perform local correction on the remaining input – may replace prefix of
remaining input with some string
 Typical local correction is to replace a comma by a semicolon, delete an
extraneous semicolon, or insert a missing semicolon
Error Production
 Augment (Expand) grammar with production for erroneous constructs
24
Error Recovery Strategies

Global Correction

 Choose a minimal sequence of changes to obtain a global least-cost correction

 Given an incorrect input symbol x for grammar G try to find a parse tree for a
related string y – transform x into y

 Unfortunately, these methods are in general too costly to implement in terms


of time and space, so these techniques are currently only of theoretical
interest.

25
Types of Parser

 Three general types of parser


1. Universal
• Not efficient now
2. Top-down
 Build parse tree from top (root) to bottom (leaves)
 Recursive decent (predictive parsing)
 LL (left to right, left most derivation method)
3. Bottom-up
• Starts from leaves and works their way up to root
• Operator precedence parsing
• LR (left to right, right most derivation) method
Importantly in all cases the input to the parser is scanned from left to right,
one symbol at a time 26
Top Down Parsing

 A top-down parser starts with the root of the parse tree. The root node is

labeled with the goal (start) symbol of the grammar. The top-down parsing

algorithm proceeds as follows:

1. Construct the root node of the parse tree

2. Repeat until the fringe of the parse tree matches input string

a. At a node labeled A, select a production with Aon its lhs

b. for each symbol on its rhs, construct the appropriate child

c. When a terminal symbol is added to the fringe and it does not match

the fringe, backtrack


27
Top Down Parsing

 LL Methods (Left to right . Left most derivation) and recursive decent parsing

Grammar: Leftmost derivation:


ET+T E lm T + T
T(E) lm id + T
T-E lm id + id
T  id

28
Top Down Parsing : Backtracking Problem

1. Left recursion

 can cause a top-down parser to go into infinite loop

 A grammar is said to be left recursive if it has a nonterminal A such that

there is a derivation A ⇒ A α for some  .

2. Backtracking

 It undo not only the movement but also the semantic entry in the symbol

table

3. The order the alternatives are tried

29
Ambiguity

 More than one parse tree (For some strings)

 More than one left most derivation


+
 More than one rightmost derivation

 Example : id + id * id
EE+E EE*E
 id + E E+E*E
 id + E * E  id + E * E
 id + id * E  id + id * E
 id + id * id  id + id * id

30
Eliminating Ambiguity

31
Eliminating Ambiguity

 The grammar is ambiguous as the below given string generates two different

parse trees

Two different parse trees

32
Preferred
Eliminating Ambiguity

 General idea to eliminate ambiguity

 A statement appearing between a then and an else must be matched

33
34

You might also like