CH1 3

Download as pdf or txt
Download as pdf or txt
You are on page 1of 32

TYPES OF COMPILERS

In order to convert the source code into machine language code, the compiler has the
types.

 NATIVE CODE: The compiler used to compile a source code for same type of
platform only.

 CROSS COMPILER: The compiler used to compile a source code for different kinds
of platforms. They produce an executable machine code for a platform.

 SOURCE TO SOURCE COMPILER: The compiler that takes high-level language


code as input and output source code of another high-level language only.

 BOOTSTRAP COMPILER: are written in a programming language that they have to


compile.

1
TYPES OF COMPILERS
 ONE PASS COMPILER: compiles whole process in a single pass.
 THREADED CODE COMPILER: It replace string by appropriate binary
code.
 INCREMENTAL COMPILER: It compiles only changed lines from source
code and update object code.
 SOURCE COMPILER: It converts source code high level language in
assembly language only.
 DECOMPILER: Basically, it is not a compiler. It is just the reverse of the
compiler. It converts the machine code into high-level language.

2
Uses/Application of Compilers
 Helps to make the code independent of the platform.

 Makes the code free of syntax and semantic errors.

 Generate executable files of code.

 Translates the code from one language to another.

 Used with other Software Productivity Tools.

 Widely used for Translating Programs.

 Support optimization for Computer Architecture Parallelism.

 Design of New Memory Hierarchies of Machines.

3
Features of a Compiler
The features are as follows:
 Compilation speed.
 The correctness of machine code.
 Preserve the correct the meaning of the code
 Speed of the target machine code.
 Good error detection/handling and reporting
 Recognize legal and illegal program constructs
 Checking the code correctly according to grammar.

4
Phases of Compiler Design

 Phase 1: Lexical Analysis

 Phase 2: Syntax Analysis

 Phase 3: Semantic Analysis

 Phase 4: Intermediate Code Generation

 Phase 5: Code Optimization

 Phase 6: Code Generation


5
Phases of Compiler Design
Source program

Lexical Analysis

Syntax Analysis

Semantic Analysis Error


Symbol handler
table
Intermediate code generation

Code optimization

Code generation

Assembly target code


6
Phases of Compiler Design

 Each phase:
 Transforms the source program from One representation
into another representation.
 They communicate with error handlers
 They communicate with the symbol table

7
Lexical Analysis
 Lexical Analysis is the first phase when compiler scans the source code.

 The tool used is called lexical analyzer (LA) or scanner (Scanner Generator).

 This process can be left to right, character by character, and group these characters into
tokens.

 LA is a program that groups sequences of characters into lexemes, and returns a


sequence of tokens

 Here, the character stream from the source program is grouped in meaningful sequences
by identifying the tokens.

 It makes the entry of the corresponding tickets into the symbol table and passes that
token to next phase.

8
Lexical Analysis

 The primary functions of this phase are:


 Identify the lexical units in a source code
 Classify lexical units into classes like constants, reserved
words, and enter them in different tables.
 Ignore comments in the source program
 Identify token which is not a part of the language

9
Lexical Analysis
Tokens can be classified as:

1. Identifier: - these are names chosen to represent data items, functions etc.
2. Keywords: - these are names chosen by computer language designer to
represent facts in a computer language. keywords cannot be used as identifiers.
3. Operators: - these are used to perform operations on operands.
4. Separators: - these are punctuation marks used to group together a sequence of
tokens that have a single meaning.
5. Comments: - increases understandability of a program.

10
Lexical Analysis
 A token describes a pattern of characters having
same meaning in the source program. (such as
identifiers, operators, keywords, numbers,
delimiters and so on)
 General Form: <token-name, attribute-value>
❖Token-name -> an abstract symbol that is used during syntax
analysis
❖Attribute-value-> entry in the symbol table for
this token.
11
Lexical Analysis
 Example:

x = y + 10

Tokens

X identifier

= Assignment operator

Y identifier

+ Addition operator

10 Number

12
Lexical Analysis
 position = initial + rate * 60

 <id,1><=> <id,2><+><id,3><*><60>

 position : mapped into a token (id, 1); Id - identifier and 1 points to the symbol- table

 = : mapped into the token <=>.

 initial : mapped into the token <id, 2>,

 + : mapped into the token <+>

 rate : mapped into the token <id, 3>

 * : mapped into the token <*>

 60 mapped into the token <60>

13
Lexical Analysis
 Generally Lexical Analysis…….
 Skips or ignores comments and white spaces out side of lexemes
 Identifies candidate tokens from the given lexemes ( from source
program)
 Put information about tokens in symbol table
 Detect lexical errors and report errors to the user

 Note:
 A (Deterministic) Finite State Automaton can be used in the
implementation of a lexical analyzer.
14
Syntax Analysis
 The tool used in syntax analysis is called Syntax Analyzer (SA) or parser

 The second phase of the compiler is syntax analysis or parsing

 A SA creates the syntactic structure (generally a parse tree) of the given


program.

 The parser or SA uses the first components of the tokens produced by the LA
to create a tree-like intermediate representation that depicts the grammatical
structure of the token stream.

 A typical representation is a syntax tree in which each interior node represents


an operation and the children of the node represent the arguments of the
operation.
15
Syntax Analysis
=
 Consider the example
+
 position = initial + rate * 60 <Id,1>
*
<id,2>
 The Syntax tree is in the form of
<id,3> 60

 The syntax of a language is specified by a context free grammar (CFG)

 The rules in a CFG are mostly recursive.

 a SA checks whether a given program satisfies the rules implied by a CFG


or not, If it satisfies the SA creates a parse tree for the given program
16
Syntax Analysis

 Here, is a list of tasks performed in this phase:


 Obtain tokens from the lexical analyzer
 Checks if the expression is syntactically correct or not
 Report all syntax errors
 Construct a hierarchical structure which is known as a parse
tree.

17
Syntax Analysis
 Depending on how the parse tree is created, there are different parsing
techniques

 These parsing techniques are categorized in to two groups:

Top-down parsing:

 construction of the parse tree starts at root , and proceeds towards the
leaves.

 Efficient top-down parsers can be easily constructed by hand

 Recursive descent parsing ,Recursive predictive parsing , non-


recursive predictive parsing are the techniques used
18
Syntax Analysis
Bottom-up Parsing

 Construction of the parse tree starts at the leaves, and


proceeds towards the root.

 Normally efficient bottom-up parsers are created with


the help of some software tools.

 Bottom-up parsing is also known as shift reduce


parsing.

19
Semantic Analysis
 Semantic analysis checks the semantic consistency of the code.
It uses the syntax tree of the previous phase along with the
symbol table to verify that the given source code is semantically
consistent.

 It also checks whether the code is conveying an appropriate


meaning.

 Semantic Analyzer will check for


 Type mismatches, incompatible operands, a function called with improper
arguments, an undeclared variable, etc.

20
Semantic Analysis
 Functions of Semantic analyses phase are
 Helps you to store type information gathered and save it in
symbol table.
 Allows you to perform type checking.
 In the case of type mismatch, where there are no exact type
correction rules which satisfy the desired operation a semantic
error is shown.
 Collects type information and checks for type compatibility
 Checks if the source language permits the operands or not

21
Semantic Analysis
 Example from the above expression:

position=initial + rate*60

 Suppose that: position, initial, and rate have been declared to


be floating-point numbers, and that the lexeme 60 by itself
forms an integer.
 The operator * is applied to a floating-point number rate and an integer 60
 The type checker in the semantic analyzer may convert the lexeme 60 into a
floating-point number
 The output has an extra node for the operator inttofloat, which explicitly
converts its integer argument into a floating-point number.

22
Semantic Analysis
 The output of the semantic analysis is the same with
syntax tree plus meaning associated with each expressions
appreciable.
=
+
<Id,1>
*
<id,2>
Inttofloat(60)
<id,3>

60.0
23
Semantic Analysis

 Example
 float x = 20.2;
 float y = x*30;

 In the above code, the semantic analyzer will


typecast the integer 30 to float 30.0 before
multiplication

24
Intermediate Code Generation
 Once the semantic analysis phase is over the compiler,
generates intermediate code for the target machine. It
represents a program for some abstract machine.

 Intermediate code is between the high-level and machine


level language.

 This intermediate code needs to be generated in such a


manner that makes it easy to translate it into the target
machine code.

25
Intermediate Code Generation

 Functions on Intermediate Code generation:


 It should be generated from the semantic representation of the
source program
 Holds the values computed during the process of translation.
 Helps you to translate the intermediate code into target language
 Allows you to maintain precedence ordering of the source
language
 It holds the correct number of operands of the instruction.

26
Intermediate Code Generation

 For example,
 Position = initial + rate * 60
Intermediate code with the help of address code method is:
 t1 := int_to_float(60)
 t2 := rate * t1
 t3 := initial + t2
 Position := t3

27
Code Optimization
 The next phase of is code optimization.

 This phase removes unnecessary code line and


arranges the sequence of statements to speed up the
execution of the program without wasting resources.

 The main goal of this phase is to improve on the


intermediate code to generate a code that runs faster
and occupies less space.
28
Code Optimization
 The primary functions of this phase are:
 It helps you to establish a trade-off between execution and
compilation speed
 Improves the running time of the target program
 Generates streamlined code still in intermediate
representation
 Removing unreachable code and getting rid of unused
variables
 Removing statements which are not altered from the loop

29
Code Optimization
 Example:

Consider the following code position= initial +rate*60

 a = intofloat(60)

 b=c*a

 d=e+b

 f=d

Can become

 b =c * 60.0

 f = e+b

30
Code Generation
 Code generation is the last and final phase of a compiler. It gets inputs from code
optimization phases and produces the page code or object code as a result.

 The objective of this phase is to allocate storage and generate relocatable machine
code.

 It also allocates memory locations for the variable.

 The instructions in the intermediate code are converted into machine instructions.

 This phase coverts the optimize or intermediate code into the target language.

 The target language is the machine code. Therefore, all the memory locations and
registers are also selected and allotted during this phase. The code generated by this
phase is executed to take inputs and generate expected outputs.

31
Code Generation
 Example:

 position = initial + rate*60.0


Id1=id2+id3*60.0
R0, R1, R2

 Would be possibly translated to registers.


 MOV id3, R2
 MUL #60.0, R2
 MOV id2, R1
 ADD R2, R1
 MOV id1, R0
 MOV R1, R0

32

You might also like