Compiler Front End

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 22

Compiler Construction

1
Two Pass Compiler

2
Two Pass Compiler

• The figure above shows the structure of a two-pass


compiler.
• The front end maps legal source code into an
intermediate representation (IR).
• The back end maps IR into target machine code.
• An immediate advantage of this scheme is that it allows
multiple front ends and multiple passes.

3
Front End

4
Front End

• The front end recognizes legal and illegal programs


presented to it.
• When it encounters errors, it attempts to report errors
in a useful way.
• For legal programs, front end produces IR and
preliminary storage map for the various data structures
declared in the program.

5
Front End

• The front end consists of two modules:


1. Scanner
2. Parser

6
Scanner

• The scanner takes a program as input and maps the


character stream into “words” that are the basic unit of
syntax.
• It produces pairs – a word and its part of speech.

7
Scanner

• For example,
the input
x = x + y
becomes
<id,x>
<assign,=>
<id,x>
<op,+>
<id,y> 8
Scanner

• We call the pair “<token type, word>” a token.


• Typical tokens are: number, identifier, +,
- , new, while, if.

9
Parser

• The parser takes in the stream of tokens, recognizes


context- free syntax and reports errors.
• It guides context-sensitive (“semantic”) analysis for
tasks like type checking.
• The parser builds IR for source program.

10
Context Free Grammars

• The syntax of most programming languages is


specified using Context-Free Grammars (CFG).
• Context- free syntax is specified with a grammar
G=(S,N,T,P) where
• S is the start symbol
• N is a set of non-terminal symbols
• T is set of terminal symbols or words
• P is a set of productions or rewrite rules

11
Context Free Grammars

• For example, the Context-Free Grammar for arithmetic


expressions is
1. goal → expr
2. expr → expr op term
3. | term
4. term → number
5. | id
6. op → +
7. | –
12
CFG

For This CFG


• S = goal
• T = { number, id, +, -}
• N = { goal, expr, term, op}
• P = { 1, 2, 3, 4, 5, 6, 7}

13
Context Free Grammar

• Given a CFG, we can derive sentences by repeated


substitution.
• Consider the sentence
x+2–y

14
Context Free Grammars

• For example, the Context-Free Grammar for arithmetic


expressions is
1. goal → expr
2. expr → expr op term
3. | term
4. term → number
5. | id
6. op → +
7. | –
15
Derivation
Production Result
goal

1. goal → expr expr

2. expr → expr op term expr op term

5. term → id expr op y

7. op → - expr - y

2. expr → expr op term expr op term - y

4. term → number expr op 2 - y

6. op → + expr + 2 - y

3. expr → term term + 2 - y

5. term → id x + 2 - y
16
Front End

• To recognize a valid sentence in some CFG, we reverse


this process and build up a parse, thus the name
“parser”.
• A parse can be represented by a tree: parse tree or
syntax tree.

17
Parse

18
Parse Tree

• A parse can be represented by a tree: parse tree or syntax


tree. For example, here is the parse tree for the expression
x+2–y

19
Parse Tree

• The parse tree captures all rewrite during the


derivation.
• The derivation can be extracted by starting at the root
of the tree and working towards the leaf nodes.

20
Abstract Syntax Tree

• The parse tree contains a lot of unneeded information.


Compilers often use an abstract syntax tree (AST).
• For example, the AST for the above parse tree is

21
Abstract Syntax Tree

• This is much more concise.


• AST summarizes grammatical structure without the
details of derivation.
• ASTs are one kind of intermediate representation
(IR).

22

You might also like