Lec4 SyntaxAnalysis

This document provides an introduction to parsing and context-free grammars. It discusses how a parser uses the grammar of a language to verify that a string of tokens can be generated by the grammar, report any syntax errors, and construct a parse tree representation. It defines key parsing concepts like context-free grammars, productions, terminals, non-terminals, derivations, parse trees, left-most and right-most derivations, ambiguity, and dealing with ambiguity in grammars.

Introduction to Parsing

Ambiguity and Syntax Errors


Outline

• Parser overview

• Context-free grammars (CFG’s)

• Derivations

• Ambiguity

• Syntax errors

What is the job of Syntax Analysis?
• Syntax Analysis is also called Parsing or Hierarchical Analysis.
• A parser implements the grammar of a language, be it Java, C, C++, etc.
• The parser obtains a string of tokens from the lexical analyzer and:
  • verifies that the string can be generated by the grammar for the source language
  • reports any syntax errors in the program
  • constructs a parse tree representation of the program
  • usually calls the lexical analyzer to supply a token when necessary
• The grammar that a parser implements is called a Context-Free Grammar (CFG).
The Functionality of the Parser

• Input: sequence of tokens from the lexer

• Output: parse tree of the program

Comparison with Lexical Analysis:

  Phase    Input                     Output
  Lexer    Sequence of characters    Sequence of tokens
  Parser   Sequence of tokens        Parse tree
Example

• If-then-else statement:
  if (x == y) then z = 1; else z = 2;
• Parser input:
  IF (ID == ID) THEN ID = INT; ELSE ID = INT;
• Possible parser output (parse tree sketch):

  IF-THEN-ELSE
  ├── ==  (ID, ID)
  ├── =   (ID, INT)
  └── =   (ID, INT)
The Role of the Parser
• Not all sequences of tokens are programs ...
• The parser must distinguish between valid and invalid sequences of tokens
• The role is:
  1. To check syntax (= string recognizer)
     and to report syntax errors accurately
  2. To invoke semantic actions
     for static semantics checking, e.g. type checking of expressions, functions, etc.
• We need
  – a language for describing valid sequences of tokens
  – a method for distinguishing valid from invalid sequences of tokens
What is the difference between Syntax and Semantics?

• Syntax is the way in which we construct sentences by following principles and rules.

• Semantics is the interpretation of, and the meaning derived from, a sentence: whether the logically formed sentences make sense or not.
Context-Free Grammars
• Many programming language constructs have a recursive structure

• A STMT is of the form

  if COND then STMT else STMT , or
  while COND do STMT , or
  …

• Context-free grammars are a natural notation for this recursive structure
CFGs (Cont.)
• A CFG consists of
  – a set of terminals T
  – a set of non-terminals N
  – a start symbol S (a non-terminal)
  – a set of productions

  Assuming X ∈ N, the productions are of the form

  X → ε , or
  X → Y1 Y2 ... Yn   where each Yi ∈ N ∪ T
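To make the definition concrete, here is a minimal sketch of how such a CFG could be represented as plain data in Python. The grammar shown is the simplified STMT/COND fragment used later in these slides; the dictionary layout and the name GRAMMAR are illustrative choices, not part of any particular parsing tool.

# A CFG as plain Python data (illustrative sketch).
# Productions map a non-terminal to a list of right-hand sides,
# each right-hand side being a list of symbols (an empty list would mean epsilon).
GRAMMAR = {
    "terminals": {"if", "then", "else", "while", "do",
                  "id", "=", "int", "(", ")", "==", "!="},
    "nonterminals": {"STMT", "COND"},
    "start": "STMT",
    "productions": {
        "STMT": [
            ["if", "COND", "then", "STMT", "else", "STMT"],
            ["while", "COND", "do", "STMT"],
            ["id", "=", "int"],
        ],
        "COND": [
            ["(", "id", "==", "id", ")"],
            ["(", "id", "!=", "id", ")"],
        ],
    },
}

# Sanity check: every right-hand-side symbol is a terminal or a non-terminal.
for lhs, rhss in GRAMMAR["productions"].items():
    assert lhs in GRAMMAR["nonterminals"]
    for rhs in rhss:
        for sym in rhs:
            assert sym in GRAMMAR["terminals"] | GRAMMAR["nonterminals"], sym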
Notational Conventions
• Terminals: lower case letters, operator symbols, punctuation symbols, digits, and boldface strings are all terminals.

• Non-terminals: upper case letters and lower case italic names are usually non-terminals.

• Greek letters such as α, β, γ represent strings of grammar symbols. Thus a generic production can be written as A → α.
• The start symbol is the left-hand side of the first production.
Examples of CFGs

A fragment of our example language (simplified):

  STMT → if COND then STMT else STMT
       | while COND do STMT
       | id = int
Examples of CFGs (cont.)
Grammar for simple arithmetic expressions:

  E → E * E
    | E + E
    | ( E )
    | id
The Language of a CFG
Read productions as replacement rules:

  X → Y1 ... Yn
  means X can be replaced by Y1 ... Yn

  X → ε
  means X can be erased (replaced with the empty string)
Key Idea

(1) Begin with a string consisting of the start symbol "S"
(2) Replace any non-terminal X in the string by the right-hand side of some production X → Y1 ... Yn
(3) Repeat (2) until there are no non-terminals in the string
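As an illustration of this process, here is a minimal sketch that starts from the start symbol and repeatedly replaces the leftmost non-terminal with a randomly chosen right-hand side. It reuses the GRAMMAR dictionary sketched earlier; the function name and the max_steps cutoff are illustrative assumptions.

import random

def derive(grammar, max_steps=50):
    """Sketch of the key idea: keep replacing the leftmost non-terminal
    until only terminals remain (or give up after max_steps)."""
    string = [grammar["start"]]
    for _ in range(max_steps):
        # Find the leftmost non-terminal, if any.
        idx = next((i for i, s in enumerate(string)
                    if s in grammar["nonterminals"]), None)
        if idx is None:
            return string                      # only terminals: a sentence of L(G)
        rhs = random.choice(grammar["productions"][string[idx]])
        string = string[:idx] + rhs + string[idx + 1:]
    return string                              # derivation not finished within max_steps

# Example use:  print(" ".join(derive(GRAMMAR)))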
The Language of a CFG (Cont.)
More formally, we write

  X1 ... Xi ... Xn → X1 ... Xi-1 Y1 ... Ym Xi+1 ... Xn

if there is a production

  Xi → Y1 ... Ym

We write

  X1 ... Xn →* Y1 ... Ym

if X1 ... Xn → ... → ... → Y1 ... Ym in zero or more steps.
The Language of a CFG
Let G be a context-free grammar with start symbol S. Then the language of G is:

  { a1 ... an | S →* a1 ... an and every ai is a terminal }

Terminals

• Terminals are called so because there are no rules for replacing them
• Once generated, terminals are permanent
• Terminals ought to be tokens of the language
Examples

L(G) is the language of the CFG G.

Strings of balanced parentheses: (ⁱ )ⁱ , for i ≥ 0

Two grammars for this language:

  S → ( S )              S → ( S )
    | ε          or      S → ε
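As a quick illustration of this language, here is a minimal sketch (illustrative Python, not from the slides) that checks whether a string of parentheses has the form (ⁱ )ⁱ, i.e. whether it can be derived from S → ( S ) | ε.

def in_language(s: str) -> bool:
    """Mirrors the grammar S -> ( S ) | epsilon."""
    if s == "":
        return True                         # S -> epsilon
    if s.startswith("(") and s.endswith(")"):
        return in_language(s[1:-1])         # S -> ( S )
    return False

assert in_language("((()))") and not in_language("(()")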
Example

A fragment of our example language (simplified):

  STMT → if COND then STMT
       | if COND then STMT else STMT
       | while COND do STMT
       | id = int

  COND → ( id == id )
       | ( id != id )
Arithmetic Example

Simple arithmetic expressions:

  E → E + E | E * E | ( E ) | id

Some elements of the language:

  id            id + id
  ( id ) * id   id * ( id )
Derivations and Parse Trees

A derivation is a sequence of productions

  S → ... → ... → ...

A derivation can be drawn as a tree
  – the start symbol is the tree's root
  – for a production X → Y1 ... Yn, add children Y1 ... Yn to node X
Derivation Example
• Grammar
  E → E + E | E * E | ( E ) | id
• String: id * id + id

  E
  → E + E
  → E * E + E
  → id * E + E
  → id * id + E
  → id * id + id

• Corresponding parse tree (sketch):

        E
      / | \
     E  +  E
   / | \   |
  E  *  E  id
  |     |
  id    id
Notes on Derivations
• A parse tree has
  – terminals at the leaves
  – non-terminals at the interior nodes

• An in-order traversal of the leaves is the original input

• The parse tree shows the association of operations, which the input string does not (illustrated by the sketch below)
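The following is a minimal sketch (illustrative Python, not from the slides) of a parse-tree node whose in-order leaf traversal reproduces the input tokens, using the tree for id * id + id shown earlier. The Node class and field names are assumptions made for this example.

from dataclasses import dataclass, field

@dataclass
class Node:
    symbol: str                                   # terminal or non-terminal label
    children: list = field(default_factory=list)  # empty list means a leaf

def leaves(node):
    """In-order traversal of the leaves: yields the original input tokens."""
    if not node.children:
        yield node.symbol
    else:
        for child in node.children:
            yield from leaves(child)

# Parse tree for id * id + id, with * grouped under the left operand of +.
tree = Node("E", [
    Node("E", [Node("E", [Node("id")]), Node("*"), Node("E", [Node("id")])]),
    Node("+"),
    Node("E", [Node("id")]),
])
assert list(leaves(tree)) == ["id", "*", "id", "+", "id"]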
Left-most and Right-most Derivations
• What was shown before was a left-most derivation
  – at each step, replace the left-most non-terminal

• There is an equivalent notion of a right-most derivation
  – shown below

  E
  → E + E
  → E + id
  → E * E + id
  → E * id + id
  → id * id + id
Right-most Derivation in Detail

  E
  → E + E     (the tree now has root E with children E + E)
  → E + id    (the right-most non-terminal E becomes the leaf id)
Right-most Derivation in Detail (2)

  E
  → E + E
  → E + id
  → E * E + id
  → E * id + id
  → id * id + id

• Note that right-most and left-most derivations have the same parse tree
• The difference is just in the order in which branches are added
Ambiguity
• Grammar:
  E → E + E | E * E | ( E ) | int

• The string int * int + int has two parse trees:
  – one with + at the root, grouping the string as (int * int) + int
  – one with * at the root, grouping the string as int * (int + int)
Ambiguity (Cont.)
• A grammar is ambiguous if it has more than one parse tree for some string
  – equivalently, there is more than one right-most or left-most derivation for some string
• Ambiguity is bad
  – it leaves the meaning of some programs ill-defined
• Ambiguity is common in programming languages
  – arithmetic expressions
  – IF-THEN-ELSE
Dealing with Ambiguity

• There are several ways to handle ambiguity

• The most direct method is to rewrite the grammar unambiguously:

  E → T + E | T
  T → int * T | int | ( E )

• This grammar enforces precedence of * over + (a recursive-descent sketch of it follows)
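The following is a minimal recursive-descent sketch (illustrative Python, not part of the slides) for the unambiguous grammar E → T + E | T, T → int * T | int | ( E ). It only recognizes token sequences, returning True or False; the function names are assumptions made for this example.

def parse(tokens):
    """Recognizer for  E -> T + E | T ,  T -> int * T | int | ( E ).
    `tokens` is a list such as ["int", "*", "int", "+", "int"]."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {peek()!r}")
        pos += 1

    def parse_E():
        parse_T()
        if peek() == "+":          # E -> T + E
            eat("+")
            parse_E()

    def parse_T():
        if peek() == "(":          # T -> ( E )
            eat("(")
            parse_E()
            eat(")")
        else:                      # T -> int * T | int
            eat("int")
            if peek() == "*":
                eat("*")
                parse_T()

    try:
        parse_E()
        return pos == len(tokens)  # valid only if all input was consumed
    except SyntaxError:
        return False

assert parse(["int", "*", "int", "+", "int"])
assert not parse(["int", "+", "*", "int"])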
The Dangling Else: Example
• Consider the following grammar:

  S → if C then S
    | if C then S else S
    | OTHER

• The expression

  if C1 then if C2 then S3 else S4

  has two parse trees:
  – if C1 then (if C2 then S3) else S4    (else attached to the outer if)
  – if C1 then (if C2 then S3 else S4)    (else attached to the inner if)

• Typically we want the second form
The Dangling Else: A Fix
• else should match the closest unmatched then
• We can describe this in the grammar:

  S   → MIF                        /* all then are matched */
      | UIF                        /* some then are unmatched */
  MIF → if C then MIF else MIF
      | OTHER
  UIF → if C then S
      | if C then MIF else UIF

• Describes the same set of strings
The Dangling Else: Example Revisited
• The expression if C1 then if C2 then S3 else S4 now has only one parse tree:
  – attaching else S4 to the outer if is not valid, because the then expression (if C2 then S3) is not a MIF
  – the valid parse tree attaches else S4 to the inner if: if C1 then (if C2 then S3 else S4)
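In practice, a hand-written recursive-descent parser often gets this behavior without the MIF/UIF rewrite, simply by consuming an optional else greedily so that it binds to the innermost if. Here is a minimal sketch of that alternative mechanism (illustrative Python; the token names C and OTHER stand for any condition and any other statement).

def parse_S(tokens, pos=0):
    """S -> if C then S [ else S ] | OTHER.
    The optional 'else' is consumed greedily, so it binds to the closest 'then'."""
    if tokens[pos] == "if":
        assert tokens[pos + 1] == "C" and tokens[pos + 2] == "then"
        pos = parse_S(tokens, pos + 3)          # then-branch (may itself be an if)
        if pos < len(tokens) and tokens[pos] == "else":
            pos = parse_S(tokens, pos + 1)      # else attaches to this innermost if
        return pos
    assert tokens[pos] == "OTHER"
    return pos + 1

# if C1 then if C2 then S3 else S4 : the else attaches to the inner if
assert parse_S("if C then if C then OTHER else OTHER".split()) == 9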
Ambiguity
• There are no general techniques for handling ambiguity

• It is impossible to automatically convert an ambiguous grammar to an unambiguous one

• Used with care, ambiguity can simplify the grammar
  – it sometimes allows more natural definitions
  – but then we need disambiguation mechanisms
Precedence and Associativity Declarations
• Instead of rewriting the grammar:
  – use the more natural (ambiguous) grammar
  – along with disambiguating declarations

• Most tools allow precedence and associativity declarations to disambiguate grammars

• Examples follow …
Associativity Declarations
• Consider the grammar E → E + E | int
• Ambiguous: two parse trees of int + int + int
  – one grouping the string as (int + int) + int
  – one grouping it as int + (int + int)

• Left associativity declaration: %left +
  (this selects the left-leaning tree, (int + int) + int; a small sketch of left-associative parsing follows)
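To show what left associativity means operationally, here is a minimal sketch (illustrative Python, not a yacc/bison feature) that parses int (+ int)* with a loop and folds the operands to the left, producing the (int + int) + int grouping.

def parse_left_assoc(tokens):
    """Parse  int (+ int)*  and build a left-leaning nested tuple:
    int + int + int  becomes  (('int', '+', 'int'), '+', 'int')."""
    pos = 0

    def eat(tok):
        nonlocal pos
        assert pos < len(tokens) and tokens[pos] == tok, f"expected {tok!r}"
        pos += 1
        return tok

    tree = eat("int")
    while pos < len(tokens) and tokens[pos] == "+":
        eat("+")
        right = eat("int")
        tree = (tree, "+", right)   # fold left: the existing tree is the left operand
    assert pos == len(tokens), "trailing input"
    return tree

print(parse_left_assoc(["int", "+", "int", "+", "int"]))
# (('int', '+', 'int'), '+', 'int')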
Precedence Declarations
• Consider the grammar E → E + E | E * E | int
  and the string int + int * int

• Its two parse trees:
  – one grouping the string as (int + int) * int
  – one grouping it as int + (int * int)

• Precedence declarations: %left + followed by %left *
  (in yacc/bison-style tools, operators declared later have higher precedence, so this selects int + (int * int))
Error Handling
• The purpose of the compiler is
  – to detect non-valid programs
  – to translate the valid ones
• There are many kinds of possible errors (e.g. in C):

  Error kind    Example                  Detected by …
  Lexical       … $ …                    Lexer
  Syntax        … x *% …                 Parser
  Semantic      … int x; y = x(3); …     Type checker
  Correctness   your favorite program    Tester / User
Error Handling
• A good compiler should assist in identifying and locating errors:
  ◦ Lexical errors: important; the compiler can easily recover and continue (e.g. a misspelled identifier or keyword)
  ◦ Syntax errors: most important for the compiler; it can almost always recover (e.g. an arithmetic expression with unbalanced parentheses)
  ◦ Static semantic errors: important; the compiler can sometimes recover (e.g. an operator applied to incompatible operands)
  ◦ Dynamic semantic errors: hard or impossible to detect at compile time; runtime checks are required
  ◦ Logical errors: hard or impossible to detect (e.g. infinite recursive calls)
Syntax Error Handling
• The error handler should
  – report errors accurately and clearly
  – recover from an error quickly
  – not slow down compilation of valid code

• Good error handling is not easy to achieve
Approaches to Syntax Error Recovery
• Approaches, from simple to complex:
  – panic mode
  – error productions
  – automatic local or global correction
• Panic mode is the simplest and most popular method
• When an error is detected:
  – discard tokens until one with a clear role is found
  – continue from there
• Such tokens are called synchronizing tokens
  – typically the statement or expression terminators
  (a small sketch of panic-mode recovery follows)
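The following is a minimal sketch of panic-mode recovery (illustrative Python; the synchronizing token set and the caller-supplied parse_statement function are assumptions made for this example): on a syntax error, tokens are discarded until a synchronizing token such as ';' is found, and parsing resumes just past it.

SYNC_TOKENS = {";", "}"}   # assumed synchronizing tokens (statement terminators)

def parse_statements(tokens, parse_statement):
    """Parse a token list statement by statement; on a syntax error,
    skip to the next synchronizing token and continue (panic mode)."""
    errors, pos = [], 0
    while pos < len(tokens):
        try:
            pos = parse_statement(tokens, pos)   # returns position after the statement
        except SyntaxError as err:
            errors.append((pos, str(err)))
            # Panic mode: discard tokens until a synchronizing token ...
            while pos < len(tokens) and tokens[pos] not in SYNC_TOKENS:
                pos += 1
            pos += 1                             # ... then continue just past it
    return errors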
Questions???
