0% found this document useful (0 votes)
50 views48 pages

Intermediate Code Generator 1

This document provides an overview of intermediate code generation in a compiler. It discusses various intermediate representations like three-address code and static single assignment form. It covers topics like constructing syntax trees as DAGs, generating three-address code from expressions, handling types and declarations, type checking, and control flow. The key aspects covered are variants of syntax trees, three-address code forms and representations, static single assignment form, type expressions and checking, translating expressions and statements to intermediate code, and handling control flow.

Uploaded by

Lavanya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
50 views48 pages

Intermediate Code Generator 1

This document provides an overview of intermediate code generation in a compiler. It discusses various intermediate representations like three-address code and static single assignment form. It covers topics like constructing syntax trees as DAGs, generating three-address code from expressions, handling types and declarations, type checking, and control flow. The key aspects covered are variants of syntax trees, three-address code forms and representations, static single assignment form, type expressions and checking, translating expressions and statements to intermediate code, and handling control flow.

Uploaded by

Lavanya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 48

Compiler course

Chapter 6
Intermediate Code Generation
Outline
Variants of Syntax Trees
Three-address code
Types and declarations
Translation of expressions
Type checking
Control flow
Backpatching
Introduction
Intermediate code is the interface between front end
and back end in a compiler
Ideally the details of source language are confined to
the front end and the details of target machines to the
back end (a m*n model)
In this chapter we study intermediate representations,
static type checking and intermediate code generation

Static Intermediate Code


Parser
Checker Code Generator Generator
Front end Back end
Variants of syntax trees
It is sometimes beneficial to crate a DAG instead of
tree for Expressions.
This way we can easily show the common sub-
expressions and then use that knowledge during code
generation
Example: a+a*(b-c)+(b-c)*d

+ *

*
d
a -

b c
SDD for creating DAG’s
Production Semantic Rules
1) E -> E1+T E.node= new Node(‘+’, E1.node,T.node)
2) E -> E1-T E.node= new Node(‘-’, E1.node,T.node)
3) E -> T E.node = T.node
4) T -> (E) T.node = E.node
5) T -> id T.node = new Leaf(id, id.entry)
6) T -> num T.node = new Leaf(num, num.val)
Example:
1) p1=Leaf(id, entry-a) 8) p8=Leaf(id,entry-b)=p3
2) P2=Leaf(id, entry-a)=p1 9) p9=Leaf(id,entry-c)=p4
3) p3=Leaf(id, entry-b) 10) p10=Node(‘-’,p3,p4)=p5
4) p4=Leaf(id, entry-c) 11) p11=Leaf(id,entry-d)
5) p5=Node(‘-’,p3,p4) 12) p12=Node(‘*’,p5,p11)
6) p6=Node(‘*’,p1,p5) 13) p13=Node(‘+’,p7,p12)
7) p7=Node(‘+’,p1,p6)
Value-number method for
constructing DAG’s
= id To entry for i
num 10
+ + 1 2
3 1 3
i 10

Algorithm
Search the array for a node M with label op, left child l
and right child r
If there is such a node, return the value number M
If not create in the array a new node N with label op, left
child l, and right child r and return its value
We may use a hash table
Three address code
In a three address code there is at most one operator at
the right side of an instruction
Example:

+
t1 = b – c
+ * t2 = a * t1
t3 = a + t2
* t4 = t1 * d
d
t5 = t3 + t4
a -

b c
Forms of three address
instructions
 x = y op z
 x = op y
x = y
 goto L
 if x goto L and ifFalse x goto L
 if x relop y goto L
 Procedure calls using:
 param x
 call p,n
 y = call p,n
 x = y[i] and x[i] = y
 x = &y and x = *y and *x =y
Example
do i = i+1; while (a[i] < v);

L: t1 = i + 1 100: t1 = i + 1
i = t1 101: i = t1
t2 = i * 8 102: t2 = i * 8
t3 = a[t2] 103: t3 = a[t2]
if t3 < v goto L 104: if t3 < v goto 100

Symbolic labels Position numbers


Data structures for three
address codes
Quadruples
Has four fields: op, arg1, arg2 and result
Triples
Temporaries are not used and instead references to
instructions are made
Indirect triples
In addition to triples we use a list of pointers to triples
Three address code
Example t1 = minus c
t2 = b * t1
a = b * minus c + b * minus c t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

Quadruples Triples Indirect Triples


op arg1 arg2 result op arg1 arg2 op op arg1 arg2
minus c t1 0 minus c 35 (0) 0 minus c
* b t1 t2 1 * b (0) 36 (1) 1 * b (0)
minus c t3 2 minus c 37 (2) 2 minus c
* b t3 t4 3 * b (2) b (2)
38 (3) 3 *
+ t2 t4 t5 4 + (1) (3) 39 (4) 4 + (1) (3)
= t5 a 5 = a (4) 40 (5) 5 = a (4)
Static Single- Assignment Form
Static single-assignment form (SSA) is an
intermediate representation that facilitates certain
code optimizations.
Two distinctive aspects distinguish SSAfrom three-
address code.
The first is that all assignments in SSA are to variables
with distinct names; hence the term static single-
assignment.
• The same variable may be defined in two different
control-flow paths in a program.
For example, the source program
if ( f l a g ) x = -1; else x = 1;
y=x*a;
has two control-flow paths in which the variable x gets
defined.
• The second distinctive aspect of SSA. SSA uses a
notational convention called the ø-function to combine
the two definitions of x:
i f ( f l a g ) xl = -1; else x2 = 1;
x3 = ø (x1,x2);
Type Expressions
Example: int[2][3]
array(2,array(3,integer))

 A basic type is a type expression


 A type name is a type expression
 A type expression can be formed by applying the array type
constructor to a number and a type expression.
 A record is a data structure with named field
 A type expression can be formed by using the type constructor g for
function types
 If s and t are type expressions, then their Cartesian product s*t is a
type expression
 Type expressions may contain variables whose values are type
expressions
Type Equivalence
They are the same basic type.
They are formed by applying the same constructor to
structurally equivalent types.
One is a type name that denotes the other.
Declarations
Storage Layout for Local Names
Computing types and their widths
Storage Layout for Local Names
Syntax-directed translation of array types
Sequences of Declarations

Actions at the end:



Fields in Records and Classes


Translation of Expressions and
Statements
We discussed how to find the types and offset of
variables
We have therefore necessary preparations to discuss
about translation to intermediate code
We also discuss the type checking
Three-address code for expressions
Incremental Translation
Addressing Array Elements
Layouts for a two-dimensional array:
Semantic actions for array reference
Translation of Array References

Nonterminal L has three synthesized


attributes:
L.addr
L.array
L.type
Type Checking
• Type checking has the potential for catching errors
in programs.
• A sound type system eliminates the need for
dynamic checking for type errors.
• An implementation of a language is strongly typed
if a compiler guarantees that the programs it
accepts will run without type errors
Rules for Type Checking
Rules for Type Checking:
• Type Synthesis :builds up the type of an
expression from the types of its subexpressions.
• It requires names to be declared before they are
used.
• The type of El + E2 is defined in terms of the types
of El and E2.
• A typical rule for type synthesis has the form
if f has type s → t and x has type s, then expression f
(x) has type t
Type inference: determines the type of a language
construct from the way it is used.
• Eg: let null be a function that tests whether a list is
empty. Then, from the usage null(x), we can tell
that x must be a list. The type of the elements of x is
not known; all we know is that x must be a list of
elements of some type that is presently unknown.
• A typical rule for type inference has the form
if f (x) is an expression,then for some α and , β, f has
type α → β and x has type α.
Type inference is needed for languages like ML, which
check types, but do not require names to be declared.
Type Conversions
Type conversion rules vary from language to language.
• widening conversions, which are intended to
preserve information, and narrowing conversions, which
can lose information.
• Conversion from one type to another is said to be
implicit if it is done automatically by the compiler.
Implicit type conversions, also called coercions, are
limited in many languages to widening conversions
• Conversion is said to be explicit if the programmer
must write something to cause the
conversion.Explicit conversions are also called casts.
Conversions between primitive
types in Java
Introducing type conversions into
expression evaluation
Abstract syntax tree for the
function definition
fun length(x) =
if null(x) then 0 else length(tl(x)+1)

This is a polymorphic function


in ML language
Inferring a type for the function length
Algorithm for Unification
Unification algorithm
boolean unify (Node m, Node n) {
s = find(m); t = find(n);
if ( s = t ) return true;
else if ( nodes s and t represent the same basic type ) return true;
else if (s is an op-node with children s1 and s2 and
t is an op-node with children t1 and t2) {
union(s , t) ;
return unify(s1, t1) and unify(s2, t2);
}
else if s or t represents a variable {
union(s, t) ;
return true;
}
else return false;
}
Control Flow
boolean expressions are often used to:
Alter the flow of control.
Compute logical values.
Short-Circuit Code


Flow-of-Control Statements
Syntax-directed definition
Generating three-address code for booleans
translation of a simple if-statement


Backpatching
 Previous codes for Boolean expressions insert symbolic labels for
jumps
 It therefore needs a separate pass to set them to appropriate addresses
 We can use a technique named backpatching to avoid this
 We assume we save instructions into an array and labels will be indices
in the array
 For nonterminal B we use two attributes B.truelist and B.falselist
together with following functions:
 makelist(i): create a new list containing only I, an index into the array
of instructions
 Merge(p1,p2): concatenates the lists pointed by p1 and p2 and returns a
pointer to the concatenated list
 Backpatch(p,i): inserts i as the target label for each of the instruction
on the list pointed to by p
Backpatching for Boolean Expressions


Backpatching for Boolean Expressions
Annotated parse tree for x < 100 || x > 200 && x ! = y
Flow-of-Control Statements
Translation of a switch-statement
Readings
Chapter 6 of the book

You might also like