Code Generation (Autosaved)

Code Generation
Introduction
• Takes as input intermediate representation
(IR) produced by the font end of the compiler,
along with relevant symbol table information,
& produces as output a semantically
equivalent target program
Primary tasks
• Instruction selection
• Register allocation & assignment
• Instruction ordering
• Instruction selection involves choosing appropriate
target-machine instructions to implement the IR
statements.
• Register allocation & assignment involves deciding what
values to keep in which registers.
• Instruction ordering involves deciding in what order to
schedule the execution of instructions
Issues in the Design of a Code Generator
1. Input to the Code Generator
• The input to the code generator is the intermediate representation
of the source program produced by the front end, along with
information in the symbol table that is used to determine the run-
time addresses of the data objects denoted by the names in the IR .
• The many choices for the IR include
 three-address representations such as quadruples, triples, indirect
triples;
 virtual machine representations such as bytecodes and stack-
machine code;
 linear representations such as postfix notation; & graphical
representations such as syntax trees and DAG's.
2. The Target Program
• The instruction-set architecture of the target machine has a significant impact
on the difficulty of constructing a good code generator that produces high-
quality machine code.
• The most common target-machine architectures are RISC (reduced instruction
set computer) , CISC (complex instruction set computer) , and stack based.
A RISC machine typically has many registers, three-address instructions, simple
addressing modes, and a relatively simple instruction-set architecture.
In contrast, a ClSC machine typically has few registers, two-address instructions ,
a variety of addressing modes , several register classes, variable-length
instructions, and instructions with side effects .
In a stack-based machine, operations are done by pushing operands onto a stack
and then performing the operations on the operands at the top of the stack. To
achieve high performance the top of the stack is typically kept in registers. Stack-
based machines almost disappeared because it was felt that the stack
organization was too limiting and required too many swap and copy operations.
Different Types of Target Program
• Producing an absolute machine-language program as output has the advantage that
it can be placed in a fixed location in memory and immediately executed. Programs
can be compiled and executed quickly.
• Producing a relocatable machine-language program (often called an object module) as
output allows subprograms to be compiled separately. A set of relocatable object
modules can be linked together and loaded for execution by a linking loader. Although
we must pay the added expense of linking and loading if we produce relocatable
object modules, we gain a great deal of flexibility in being able to compile subroutines
separately & to call other previously compiled programs from an object module. If the
target machine does not handle relocation automatically, the compiler must provide
explicit relocation information to the loader to link the separately compiled program
modules.
• Producing an assembly-language program as output makes the process of code
generation somewhat easier. We can generate symbolic instructions & use the macro
facilities of the assembler to help generate code. The price paid is the assembly step
after code generation.
3. Instruction Selection
• The code generator must map the IR program into a code sequence that
can be executed by the target machine.
• The complexity of performing this mapping is determined by a factors
such as
 the level of the IR
 the nature of the instruction-set architecture
 the desired quality of the generated code
• If the IR is high level, the code generator may translate each IR statement
into a sequence of machine instructions using code templates.
• Such statement-by-statement code generation, however, often produces
poor code that needs further optimization.
• If the IR reflects some of the low-level details of the underlying machine,
then the code generator can use this information to generate more
efficient code sequences.
Affecting Factors
• The nature of the instruction set of the target machine has a strong
effect on the difficulty of instruction selection. For example, the
uniformity and completeness of the instruction set are important
factors. If the target machine does not support each data type in a
uniform manner, then each exception to the general rule requires
special handling. On some machines, for example, floating-point
operations are done using separate registers
• Instruction speeds & machine idioms are other important factors. If
we do not care about the efficiency of the target program,
instruction selection is straightforward. For each type of three-
address statement, we can design a code skeleton that defines the
target code to be generated for that construct.
• For example, every three-address statement of
the form x = y + z, where x, y, & z are statically
allocated, can be translated into the code
sequence:
This strategy often produces redundant loads and stores. For

example, the sequence of three-address statements:
a=b+c
d=a+e
Would be translated into
This strategy often produces redundant loads and stores. For
example, the sequence of three-address statements:
a=b+c
d=a+e
Would be translated into
Here, the fourth statement is redundant

since it loads a value that has just been
stored, and so is the third if a is not
subsequently used.
The quality of the generated code is usually determined by its
speed & size.
On most machines, a given IR program can be implemented by
many different code sequences, with significant cost differences
between the different implementations.
A naive translation of the intermediate code may therefore lead
to correct but unacceptably inefficient target code.
For example, if the target machine has an "increment"
instruction (INC) , then the three-address statement a = a + 1 may
be implemented more efficiently by the single instruction INC a,
rather than by a more obvious sequence that loads a into a
register, adds one to the register, and then stores the result back
into a:
4. Register Allocation
• A key problem in code generation is deciding what
values to hold in what registers.
• Registers are the fastest computational unit on the
target machine, but we usually do not have enough of
them to hold all values.
• Values not held in registers need to reside in memory.
• Instructions involving register operands are invariably
shorter & faster than those involving operands in
memory, so efficient utilization of registers is
particularly important.
The use of registers is often subdivided into
two subproblems:
• 1. Register allocation, during which we select
the set of variables that will reside in registers
at each point in the program.
• 2. Register assignment, during which we pick
the specific register that a variable will reside in.
• Finding an optimal assignment of registers to
variables is difficult, even with single-register
machines.
5. Evaluation Order
• The order in which computations are
performed can affect the efficiency of the
target code.
• Some computation orders require fewer
registers to hold intermediate results than
others.
• .
A Simple Target Machine Model
• Our target computer models a three-address machine
with load & store operations, computation operations,
jump operations, & conditional jumps.
• The underlying computer is a byte-addressable machine
with n general-purpose registers, R0, R1, . . . , Rn-1 .
• A full-fledged assembly language would have scores of
instructions.
• Most instructions consists of an operator, followed by a
target, followed by a list of source operands.
• A label may precede an instruction.
Basic Instructions
Addressing Modes
Program & Instructions Costs
Cost = 1 + costs associated with
addressing modes of the operands
• This cost corresponds to the length in words of
the instruction.
• Addressing modes involving:
 registers have zero additional cost,
 a memory location or constant have an
additional cost of one
• because such operands have to be stored in the words following the
instruction
Examples
More Examples
Instructions Cost/instruction Cost
MOV b, R0 1+1 =2
ADD c, R0 1 + 1 =2 Total=06
MOV R0, a 1 + 1 =2
MOV b, a 1 + 2 =3
ADD c, a 1+2=3 Total = 06
MOV *R1, *R0 1+0=1
ADD *R2, *R0 1+0=1 Total = 02
ADD R2, R1 1+0=1
MOV R1, a 1+1=2 Total = 03
Basic Blocks and Flow Graphs
• Construction Rules:
Basic Blocks
• First job is to partition a sequence of three-
address instructions into basic blocks.
• Begin a new basic block with the first
instruction & keep adding instructions until we
meet either a jump, a conditional jump, or a
label on the following instruction.
• In the absence of jumps and labels, control
proceeds sequentially from one instruction to
the next.
Intermediate code to set
a 10 x 10 matrix to an
identity matrix
Example: Consider the fragment of source code.
It computes the dot product of two vectors a & b of
length 20.
B1 (1) prod : = 0
begin (2) i := 1
prod := 0; (3) t1 := 8*i
i := 1; (4) t2 := a[t1]
do begin (5) t3 := 8*i B2
prod := prod + a [i] * b [i]; (6) t4 := b[t3]
i = i +1;
(7) t5 := t2*t4
end
(8) t6 := prod +t5
while i <= 20
end (9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
Flow Graphs
(1) prod : = 0
(2) i := 1
B1
(3) t1 := 8*i
(4) t2 := a[t1]
(5) t3 := 8*i
(6) t4 := b[t3]
(7) t5 := t2*t4 B2
(8) t6 := prod +t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
• First, instruction 1 is a leader by rule (1)
• To find the other leaders, we first need to find the
jumps.
 03 jumps: all conditional, at instructions 9, 11, 17
• By rule (2) , the targets of these jumps are leaders;
instructions 3, 2, 13
• Then, by rule (3) , each instruction following a jump
is a leader; those are instructions 10, 12
• Note that no instruction follows 17 in this code, but if
there were code following, the 18th instruction
would also be a leader.
• Leaders: instructions 1, 2, 3, 10, 12, 13.
• The basic block of each leader contains all the
instructions from itself until just before the next
leader.
basic block of 1 is just 1
Leader 2: block is just 2
Leader 3: instructions 3 through 9
Leader 10: blocks 10 & 11
Leader 12: block is just 12
Leader 13: blocks 13 through 17
Flow Graphs
• Once an intermediate-code program is partitioned into basic
blocks, we represent the flow of control between them by a flow
graph.
• The nodes of the flow graph are the basic blocks.
• There is an edge from block B to block C if and only if it is possible
for the first instruction in block C to immediately follow the last
instruction in block B .
• There are two ways that such an edge could be justified:
 There is a conditional or unconditional jump from the end of B to
the beginning of C .
 C immediately follows B in the original order of the 03-address
instructions, & B does not end in an unconditional jump
• We say that B is a predecessor of C , & C is a successor of B .
• Often we add two nodes, called the entry & exit, that do
not correspond to executable intermediate instructions.
• There is an edge from the entry to the first executable
node of the flow graph, that is, to the basic block that
comes from the first instruction of the intermediate code.
• There is an edge to the exit from any basic block that
contains an instruction that could be the last executed
instruction of the program.
• If the final instruction of the program is not an
unconditional jump, then the block containing the final
instruction of the program is one predecessor of the exit,
but so is any basic block that has a jump to code that is not
part of the program.
Example: The set of basic blocks
constructed in Example 8.6 yields
the flow graph of Fig. 8.9.
The entry points to basic block

B1, since B1 contains the first
instruction of the program
The only successor of B1 is B2 ,

because B1 does not end in an
unconditional jump, & the leader
of B2 immediately follows the end
of B1 .
Code Generator
• How to use registers to best advantage?
• There are four principal uses of registers:
In most machine architectures, some or all of the operands of an operation
must be in registers in order to perform the operation .
Registers make good temporaries - places to hold the result of a sub
expression while a larger expression is being evaluated, or more generally, a
place to hold a variable that is used only within a single basic block.
 Registers are used to hold (global) values that are computed in one basic
block and used in other blocks, for example, a loop index that is
incremented going around the loop and is used several times within the
loop .
Registers are often used to help with run-time storage management, for
example, to manage the run-time stack, including the maintenance of stack
pointers and possibly the top elements of the stack itself.
Assumptions
Machine instructions:
LD reg, mem
 ST mem, reg
OP reg, reg, reg
Descriptors
• Code generation algo uses descriptors to keep track of register
contents & addresses for names:
1. For each available register, a register descriptor keeps track of the
variable names whose current value is in that register. Since we shall
use only those registers that are available for local use within a basic
block, we assume that initially, all register descriptors are empty. As
the code generation progresses, each register will hold the value of
zero or more names.
2. For each program variable, an address descriptor keeps track of the
location or locations where the current value of that variable can be
found. The location might be a register, a memory address, a stack
location, or some set of more than one of these. The information
can be stored in the symbol-table entry for that variable name.
• The 1st call of getreg returns R0 as the location in which to
compute t.
• Since a is not in R0, we generate instructions MOV a, R0 &
SUB b, R0.
• Update the register descriptor to indicate that R0 contains t.
• Proceeds in this way until the last three-address statement
d:=v + u has been proceed.
• R1 becomes empty because u has no next use.
• Generate MOV R0, d to store the live variable d at the end
of the block
• COST= 2 + 2 + 2 + 2 + 1 + 1 + 2 =12
• Can be reduced to 11 (How??)

Code Generation (Autosaved)

Uploaded by

Copyright:

Available Formats

Code Generation (Autosaved)

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Code Generation (Autosaved)

Uploaded by

Copyright:

Available Formats

Code Generation

This strategy often produces redundant loads and stores. For

Here, the fourth statement is redundant

The entry points to basic block

The only successor of B1 is B2 ,

You might also like