Lexical and Syntax Analysis: CSE 325/CSE 425: Concepts of Programming Language

This document provides an overview of lexical and syntax analysis for programming language concepts. It discusses how lexical analysis identifies substrings called lexemes and associates them with tokens. Syntax analysis consists of two parts - a lexical analyzer and parser. The parser uses a context-free grammar defined in Backus-Naur Form (BNF) to analyze the structure of a program and check for errors. Approaches to parsing include recursive descent and table-driven bottom-up parsing.

Uploaded by

Md. Amdadul Bari

LECTURE 04

LEXICAL AND SYNTAX ANALYSIS

CSE 325/CSE 425: CONCEPTS OF PROGRAMMING LANGUAGE

INSTRUCTOR: DR. F. A. FAISAL

OUTLINE

•  Introduction
•  Lexical Analysis
•  The Parsing Problem
•  Recursive-Descent Parsing
•  Bottom-up Parsing
INTRODUCTION

•  Language implementation systems must analyze source
code, regardless of the specific implementation approach.
•  Nearly all syntax analysis is based on a formal description
of the syntax of the source language (BNF).
SYNTAX ANALYSIS

•  The syntax analysis portion of a language processor
nearly always consists of two parts:
•  A low-level part called a lexical analyzer (mathematically, a
finite automaton based on a regular grammar)
•  A high-level part called a syntax analyzer, or parser
(mathematically, a push-down automaton based on a
context-free grammar or BNF)
ADVANTAGES OF USING
BNF TO DESCRIBE SYNTAX

•  BNF provides a clear and concise syntax description.
•  The parser can be based directly on the BNF.
•  Parsers based on BNF are easy to maintain.
REASONS TO SEPARATE LEXICAL
AND SYNTAX ANALYSIS

•  Simplicity: less complex approaches can be used for
lexical analysis; separating them simplifies the parser.
•  Efficiency: separation allows optimization of the lexical
analyzer.
•  Portability: parts of the lexical analyzer may not be
portable, but the parser is always portable.
LEXICAL ANALYSIS
•  A lexical analyzer is a pattern matcher for character
strings.
•  A lexical analyzer is a “front-end” for the parser.
•  It identifies substrings of the source program that belong
together, called lexemes.
•  Lexemes match a character pattern, which is associated
with a lexical category called a token.
•  For example, sum is a lexeme; its token may be IDENT.
(CONT.)

•  The lexical analyzer is usually a function that is called by
the parser when it needs the next token.
•  Three approaches to building a lexical analyzer:
•  Write a formal description of the tokens and use a software
tool that constructs a table-driven lexical analyzer from
such a description.
•  Design a state diagram that describes the tokens and write
a program that implements the state diagram.
•  Design a state diagram that describes the tokens and
hand-construct a table-driven implementation of the state
diagram.
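The first approach can be sketched in a few lines. This is only an illustration: Python's standard re module plays the role of the software tool here, and the token names and patterns are assumptions, not taken from the lecture.

```python
import re

# Formal description of the tokens: (token name, regular expression).
# The names and patterns are hypothetical, chosen for illustration.
TOKEN_SPEC = [
    ("INT_LIT", r"\d+"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("ADD_OP", r"\+"),
    ("MULT_OP", r"\*"),
    ("LEFT_PAREN", r"\("),
    ("RIGHT_PAREN", r"\)"),
    ("SKIP", r"\s+"),      # whitespace separates lexemes and is discarded
    ("MISMATCH", r"."),    # any other single character is an error
]

# One combined pattern with a named group per token.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token, lexeme) pairs for the source string."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise ValueError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens
```

For example, tokenize("sum + 47") yields the pairs ("IDENT", "sum"), ("ADD_OP", "+") and ("INT_LIT", "47").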
STATE DIAGRAM
DESIGN
•  A state diagram helps to describe the behavior of the
lexical analyzer.
•  A naïve state diagram would have a transition from every
state on every character in the source language; such a
diagram would be very large!
LEXICAL ANALYSIS
(CONT.)
•  In many cases, transitions can be combined to simplify
the state diagram.
•  When recognizing an identifier, all uppercase and
lowercase letters are equivalent.
•  Use a character class that includes all letters.
•  When recognizing an integer literal, all digits are
equivalent – use a digit class
•  Reserved words and identifiers can be recognized
together (rather than having a part of the diagram for each
reserved word).
•  Use a table lookup to determine whether a possible
identifier is in fact a reserved word.
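A minimal sketch of these simplifications, written as a small table-driven state diagram. The state names, token names and reserved-word table are assumptions made for the example.

```python
LETTER, DIGIT, OTHER = "LETTER", "DIGIT", "OTHER"


def char_class(ch):
    """Map a character to its class so the diagram needs one edge per class."""
    if ch.isascii() and ch.isalpha():
        return LETTER
    if ch.isascii() and ch.isdigit():
        return DIGIT
    return OTHER


# Transition table of the state diagram: (state, character class) -> next state.
TRANSITIONS = {
    ("start", LETTER): "in_ident",
    ("start", DIGIT): "in_int",
    ("in_ident", LETTER): "in_ident",
    ("in_ident", DIGIT): "in_ident",
    ("in_int", DIGIT): "in_int",
}
ACCEPTING = {"in_ident": "IDENT", "in_int": "INT_LIT"}

# Hypothetical reserved words, recognized by table lookup rather than by
# dedicated states in the diagram.
RESERVED = {"if": "IF_CODE", "while": "WHILE_CODE"}


def classify(lexeme):
    """Run the state diagram over one candidate lexeme and return its token."""
    state = "start"
    for ch in lexeme:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return "ERROR"
    token = ACCEPTING.get(state, "ERROR")
    if token == "IDENT":
        token = RESERVED.get(lexeme, token)
    return token
```

With character classes the table has five entries; a diagram with one transition per letter and digit would need several hundred for the same language.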
LEXICAL ANALYSIS
(CONT.)
•  Convenient utility subprograms:
•  getChar- gets the next character of input, puts it in
nextChar, determines its class and puts the class in
charClass
•  addChar- puts the character from nextChar into the place
the lexeme is being accumulated, lexeme
•  lookup- determines whether the string in lexeme is a
reserved word (returns a code)
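The three utility subprograms could fit together roughly as follows. This is a sketch in Python rather than the program shown in the lecture; the method names mirror getChar, addChar and lookup, while the token codes, the reserved words and the symbol table are assumptions.

```python
LETTER, DIGIT, UNKNOWN, EOF = "LETTER", "DIGIT", "UNKNOWN", "EOF"

RESERVED = {"for": "FOR_CODE", "if": "IF_CODE"}  # hypothetical reserved words
SYMBOLS = {  # hypothetical single-character tokens
    "+": "ADD_OP", "-": "SUB_OP", "*": "MULT_OP", "/": "DIV_OP",
    "(": "LEFT_PAREN", ")": "RIGHT_PAREN", "=": "ASSIGN_OP",
}


class Lexer:
    def __init__(self, source):
        self.source = source
        self.pos = 0
        self.lexeme = ""
        self.get_char()  # prime next_char and char_class

    def get_char(self):
        """Put the next input character in next_char and its class in char_class."""
        if self.pos < len(self.source):
            self.next_char = self.source[self.pos]
            self.pos += 1
            if self.next_char.isalpha():
                self.char_class = LETTER
            elif self.next_char.isdigit():
                self.char_class = DIGIT
            else:
                self.char_class = UNKNOWN
        else:
            self.next_char = ""
            self.char_class = EOF

    def add_char(self):
        """Append next_char to the lexeme being accumulated."""
        self.lexeme += self.next_char

    def lookup(self):
        """Return the reserved-word code for lexeme, or IDENT if it is not reserved."""
        return RESERVED.get(self.lexeme, "IDENT")

    def lex(self):
        """Return the next (token, lexeme) pair; called whenever a token is needed."""
        self.lexeme = ""
        while self.next_char.isspace():
            self.get_char()
        if self.char_class == LETTER:
            while self.char_class in (LETTER, DIGIT):
                self.add_char()
                self.get_char()
            return self.lookup(), self.lexeme
        if self.char_class == DIGIT:
            while self.char_class == DIGIT:
                self.add_char()
                self.get_char()
            return "INT_LIT", self.lexeme
        if self.char_class == EOF:
            return "EOF", "EOF"
        self.add_char()
        self.get_char()
        return SYMBOLS.get(self.lexeme, "UNKNOWN"), self.lexeme


def all_tokens(source):
    """Collect every (token, lexeme) pair up to and including EOF."""
    lexer = Lexer(source)
    result = [lexer.lex()]
    while result[-1][0] != "EOF":
        result.append(lexer.lex())
    return result
```

Calling all_tokens("sum = 47 + if") returns IDENT, ASSIGN_OP, INT_LIT, ADD_OP, IF_CODE and EOF, each paired with its lexeme.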
PROGRAM
PROGRAM (CONT.)
SAMPLE OUTPUT
STATE DIAGRAM
THE PARSING
PROBLEM
•  Goals of the Parser, given an input program:
•  Find all syntax errors; for each, produce an appropriate
diagnostic message and recover quickly
•  Produce the parse tree, or at least a trace of the parse tree,
for the program
•  Two categories of parsers
•  Top down- Produce the parse tree, beginning at the root
•  Order is that of a leftmost derivation
•  Traces or builds the parse tree in preorder
•  Bottom up- Produce the parse tree, beginning at the leaves
•  Order is that of the reverse of a rightmost derivation
•  Useful parsers look only one token ahead in the input
(CONT.)
•  Top-down Parsers
•  Given a sentential form, xAα, the parser must choose the
correct A-rule to get the next sentential form in the leftmost
derivation, using only the first token produced by A.
•  The most common top-down parsing algorithms:
•  Recursive descent- a coded implementation
•  LL parsers- table driven implementation
•  Bottom-up Parsers
•  Given a right sentential form, α, determine what substring
of α is the right-hand side of the rule in the grammar that
must be reduced to produce the previous sentential form in
the rightmost derivation.
•  The most common bottom-up parsing algorithms are in the
LR family.
COMPLEXITY OF
PARSING

•  Parsers that work for any unambiguous grammar are
complex and inefficient (O(n^3), where n is the length of the
input)
•  Commercial compilers use parsers that only work for a
subset of all unambiguous grammars, but do it in linear
time (O(n))
RECURSIVE-DESCENT
PARSING
•  There is a subprogram for each nonterminal in the
grammar, which can parse sentences that can be
generated by that nonterminal
•  EBNF is ideally suited for being the basis for a recursive-
descent parser, because EBNF minimizes the number of
nonterminals.
•  A grammar for simple expressions:
(CONT.)

•  Assume we have a lexical analyzer named lex, which puts
the next token code in nextToken
•  The coding process when there is only one RHS:
•  For each terminal symbol in the RHS, compare it with the
next input token; if they match, continue, else there is an
error.
•  For each nonterminal symbol in the RHS, call its
associated parsing subprogram
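The coding process for a single RHS might look like the sketch below. The grammar used in the lecture was not preserved in this text, so the two rules in the comments are assumed EBNF rules chosen for illustration, and the input is taken to be an already tokenized list.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0
        self.next_token = None
        self.trace = []
        self.lex()  # every routine expects the next token in next_token

    def lex(self):
        """Stand-in for the lexical analyzer: put the next token in next_token."""
        self.next_token = self.tokens[self.pos]
        if self.pos < len(self.tokens) - 1:
            self.pos += 1

    def expr(self):
        # Assumed rule: <expr> -> <term> {(+ | -) <term>}
        self.trace.append("Enter <expr>")
        self.term()                            # nonterminal: call its subprogram
        while self.next_token in ("+", "-"):   # terminal: compare with next token
            self.lex()
            self.term()
        self.trace.append("Exit <expr>")

    def term(self):
        # Assumed rule, simplified for the sketch: <term> -> id
        self.trace.append("Enter <term>")
        if self.next_token == "id":            # terminal matches: continue
            self.lex()
        else:                                  # terminal does not match: error
            raise SyntaxError(f"expected id, found {self.next_token!r}")
        self.trace.append("Exit <term>")


def parse(tokens):
    """Parse a token list as an <expr> and return the trace of the parse."""
    parser = Parser(tokens)
    parser.expr()
    if parser.next_token != "EOF":
        raise SyntaxError(f"unexpected token {parser.next_token!r}")
    return parser.trace
```

For the token list id + id the trace enters <expr> once and <term> twice, which is the preorder walk of the parse tree.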
PROGRAM
N.B.- This particular routine does not detect errors.
Every parsing routine leaves the next token in nextToken.
(CONT.)

•  A nonterminal that has more than one RHS requires an
initial process to determine which RHS it is to parse
•  The correct RHS is chosen on the basis of the next token
of input
•  The next token is compared with the first token that can be
generated by each RHS until a match is found
•  If no match is found, it is a syntax error
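The sketch below shows one way to choose among several RHSs. The rules in the comments are assumed for illustration (a common expression grammar, not necessarily the one on the slides), and for brevity the routines build a nested tuple, an abstract syntax tree rather than a full parse tree.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0
        self.next_token = None
        self.lex()

    def lex(self):
        """Stand-in for the lexical analyzer: put the next token in next_token."""
        self.next_token = self.tokens[self.pos]
        if self.pos < len(self.tokens) - 1:
            self.pos += 1

    def expr(self):
        # Assumed rule: <expr> -> <term> {(+ | -) <term>}
        node = self.term()
        while self.next_token in ("+", "-"):
            op = self.next_token
            self.lex()
            node = (op, node, self.term())
        return node

    def term(self):
        # Assumed rule: <term> -> <factor> {(* | /) <factor>}
        node = self.factor()
        while self.next_token in ("*", "/"):
            op = self.next_token
            self.lex()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # Assumed rule: <factor> -> id | int_constant | ( <expr> )
        # The next token decides which RHS to parse.
        if self.next_token in ("id", "int_constant"):
            node = self.next_token
            self.lex()
            return node
        if self.next_token == "(":
            self.lex()
            node = self.expr()
            if self.next_token != ")":
                raise SyntaxError(f"expected ')', found {self.next_token!r}")
            self.lex()
            return node
        # No RHS can start with this token: syntax error.
        raise SyntaxError(f"unexpected token {self.next_token!r}")


def parse(tokens):
    """Parse a token list as an <expr> and return the resulting tree."""
    parser = Parser(tokens)
    tree = parser.expr()
    if parser.next_token != "EOF":
        raise SyntaxError(f"unexpected token {parser.next_token!r}")
    return tree
```

In factor, each RHS begins with a different token, so a single comparison with the next token is enough to pick the RHS.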
PROGRAM
PROGRAM
OUTPUT
PARSE TREE
THE LL GRAMMAR
CLASS
A -> A + B (left recursion: the parsing subprogram for A immediately calls
itself again and again, without consuming any input)
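The problem can be reproduced directly. In this sketch the rule B -> b and the function names are assumptions added for the example; the second version of the A routine uses the equivalent rule A -> B {+ B}, which removes the left recursion.

```python
def parse_b(tokens, pos):
    """Assumed rule B -> b: match one b and return the new position."""
    if pos < len(tokens) and tokens[pos] == "b":
        return pos + 1
    raise SyntaxError(f"expected 'b' at position {pos}")


def parse_a_left_recursive(tokens, pos=0):
    """A -> A + B coded directly: the first action is a call to itself."""
    pos = parse_a_left_recursive(tokens, pos)  # no token consumed before this call
    if pos < len(tokens) and tokens[pos] == "+":
        return parse_b(tokens, pos + 1)
    raise SyntaxError(f"expected '+' at position {pos}")


def parse_a(tokens, pos=0):
    """A -> B {+ B}: the same language, with repetition instead of left recursion."""
    pos = parse_b(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        pos = parse_b(tokens, pos + 1)
    return pos


def recurses_forever(tokens):
    """Report whether the left-recursive routine exhausts Python's call stack."""
    try:
        parse_a_left_recursive(tokens)
    except RecursionError:
        return True
    return False
```

On the input b + b the left-recursive routine never reaches the first token, while parse_a consumes all three tokens.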

ε denotes the empty string

EXAMPLE
PAIRWISE
DISJOINTNESS TEST

Read by yourself
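As a starting point for that reading, here is a rough sketch of the test: for a nonterminal with more than one RHS, the sets of first terminals its RHSs can generate must not overlap. The grammar representation and the function names are assumptions, and the sketch only handles grammars without ε rules.

```python
def first_of_symbol(grammar, symbol, visiting=frozenset()):
    """Return the set of terminals that can begin a string derived from symbol."""
    if symbol not in grammar:
        return {symbol}  # a terminal begins with itself
    if symbol in visiting:
        return set()     # already being expanded (left recursion): nothing new
    result = set()
    for rhs in grammar[symbol]:
        result |= first_of_symbol(grammar, rhs[0], visiting | {symbol})
    return result


def pairwise_disjoint(grammar, nonterminal):
    """Check that no two RHSs of nonterminal can begin with the same terminal."""
    seen = set()
    for rhs in grammar[nonterminal]:
        first = first_of_symbol(grammar, rhs[0])
        if seen & first:
            return False
        seen |= first
    return True


# A grammar maps each nonterminal to a list of RHSs; each RHS is a list of symbols.
PASSES = {
    "A": [["a", "B"], ["b", "A", "b"], ["B", "b"]],
    "B": [["c", "B"], ["d"]],
}
FAILS = {
    "A": [["a", "B"], ["B", "A", "b"]],
    "B": [["a", "B"], ["b"]],
}
```

In PASSES the RHSs of A begin with {a}, {b} and {c, d}, so the test succeeds; in FAILS they begin with {a} and {a, b}, so a single token of lookahead cannot choose between them.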
BOTTOM-UP PARSING
•  The parsing problem is finding the correct RHS in a right-
sentential form to reduce to get the previous right-
sentential form in the derivation
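A toy shift-reduce loop makes this concrete. It is only a sketch for the assumed grammar E -> E + T | T, T -> id: the RHS to reduce is found with hand-written checks on the top of the stack, whereas a practical LR parser consults a parsing table instead.

```python
def shift_reduce(tokens):
    """Parse tokens for E -> E + T | T, T -> id and return the reductions used."""
    stack = []
    reductions = []
    remaining = list(tokens)
    while True:
        if stack and stack[-1] == "id":
            stack[-1] = "T"                 # reduce by T -> id
            reductions.append("T -> id")
        elif stack[-3:] == ["E", "+", "T"]:
            stack[-3:] = ["E"]              # reduce by E -> E + T
            reductions.append("E -> E + T")
        elif stack == ["T"]:
            stack[-1] = "E"                 # reduce by E -> T
            reductions.append("E -> T")
        elif remaining:
            stack.append(remaining.pop(0))  # shift the next input token
        elif stack == ["E"]:
            return reductions               # accept: only the start symbol is left
        else:
            raise SyntaxError(f"cannot reduce stack {stack}")
```

For id + id the reductions come out as T -> id, E -> T, T -> id, E -> E + T, which is the rightmost derivation E => E + T => E + id => T + id => id + id read in reverse.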
PARSE TREE
PHRASE, SIMPLE
PHRASE AND HANDLE
PHRASE, SIMPLE
PHRASE AND HANDLE
HANDLES
EXAMPLE
SHIFT REDUCE PARSING IN COMPILERS
(WWW.GEEKSFORGEEKS.ORG)
BASIC OPERATIONS
EXAMPLE
THANKS
