0% found this document useful (0 votes)
19 views15 pages

Document 1

Download as pdf or txt
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 15

What is Translators?

Different
type of translators
A program written in high-level language is called as source code. To convert the source code
into machine code, translators are needed.
A translator takes a program written in source language as input and converts it into a program
in target language as output.
It also detects and reports the error during translation.
Roles of translator are:
• Translating the high-level language program input into an equivalent machine language
program.
• Providing diagnostic messages wherever the programmer violates specification of the high-
level language program.

Different type of translators


The different types of translator are as follows:

Compiler
Compiler is a translator which is used to convert programs in high-level language to low-level
language. It translates the entire program and also reports the errors in source program
encountered during the translation.

Interpreter
Interpreter is a translator which is used to convert programs in high-level language to low-level
language. Interpreter translates line by line and reports the error once it encountered during
the translation process.
It directly executes the operations specified in the source program when the input is given by
the user.
It gives better error diagnostics than a compiler.

Differences between compiler and interpreter


SI. No Compiler Interpreter

1 Performs the translation of a program as a Performs statement by statement


whole. translation.

2 Execution is faster. Execution is slower.

3 Requires more memory as linking is needed Memory usage is efficient as no


for the generated intermediate object code. intermediate object code is generated.

4 Debugging is hard as the error messages are It stops translation when the first error is
generated after scanning the entire program met. Hence, debugging is easy.
only.

5 Programming languages like C, C++ uses Programming languages like Python,


compilers. BASIC, and Ruby uses interpreters.

Assembler

Assembler is a translator which is used to translate the assembly language code into machine
language code.

Introduction to Compiler

o A compiler is a translator that converts the high-level language into the machine language.
o High-level language is written by a developer and machine language can be understood
by the processor.
o Compiler is used to show errors to the programmer.
o The main purpose of compiler is to change the code written in one language without
changing the meaning of the program.
o When you execute a program which is written in HLL programming language then it
executes into two parts.
o In the first part, the source program compiled and translated into the object program (low
level language).
o In the second part, object program translated into the target program through the
assembler.
Fig: Execution process of source program in Compiler

Compiler Phases
The compilation process contains the sequence of various phases. Each phase takes source
program in one representation and produces output in another representation. Each phase takes
input from its previous stage.

There are the various phases of compiler:


Fig: phases of compiler

Lexical Analysis:

Lexical analyzer phase is the first phase of compilation process. It takes source code as input. It
reads the source program one character at a time and converts it into meaningful lexemes. Lexical
analyzer represents these lexemes in the form of tokens.

Syntax Analysis

Syntax analysis is the second phase of compilation process. It takes tokens as input and generates
a parse tree as output. In syntax analysis phase, the parser checks that the expression made by the
tokens is syntactically correct or not.
Semantic Analysis

Semantic analysis is the third phase of compilation process. It checks whether the parse tree
follows the rules of language. Semantic analyzer keeps track of identifiers, their types and
expressions. The output of semantic analysis phase is the annotated tree syntax.

Intermediate Code Generation

In the intermediate code generation, compiler generates the source code into the intermediate
code. Intermediate code is generated between the high-level language and the machine language.
The intermediate code should be generated in such a way that you can easily translate it into the
target machine code.

Code Optimization

Code optimization is an optional phase. It is used to improve the intermediate code so that the
output of the program could run faster and take less space. It removes the unnecessary lines of the
code and arranges the sequence of statements in order to speed up the program execution.

Code Generation

Code generation is the final stage of the compilation process. It takes the optimized intermediate
code as input and maps it to the target machine language. Code generator translates the
intermediate code into the machine code of the specified computer.

Example:
Formal grammar

o Formal grammar is a set of rules. It is used to identify correct or incorrect strings of tokens
in a language. The formal grammar is represented as G.
o Formal grammar is used to generate all possible strings over the alphabet that is
syntactically correct in the language.
o Formal grammar is used mostly in the syntactic analysis phase (parsing) particularly during
the compilation.

Formal grammar G is written as follows:

1. G = <V, N, P, S>

Where:

N describes a finite set of non-terminal symbols.


V describes a finite set of terminal symbols.
P describes a set of production rules
S is the start symbol.

Example:

1. L = {a, b}, N = {S, R, B}

Production rules:

1. S = bR
2. R = aR
3. R = aB
4. B=b

Through this production we can produce some strings like: bab, baab, baaab etc.

This production describes the string of shape banab.


Fig: Formal grammar

Finite state machine

o Finite state machine is used to recognize patterns.


o Finite automata machine takes the string of symbol as input and changes its state
accordingly. In the input, when a desired symbol is found then the transition occurs.
o While transition, the automata can either move to the next state or stay in the same state.
o FA has two states: accept state or reject state. When the input string is successfully
processed and the automata reached its final state then it will accept.

A finite automata consists of following:

Q: finite set of states


∑: finite set of input symbol
q0: initial state
F: final state
δ: Transition function

Transition function can be define as

1. δ: Q x ∑ →Q

FA is characterized into two ways:

1. DFA (finite automata)


2. NDFA (non deterministic finite automata)

DFA

DFA stands for Deterministic Finite Automata. Deterministic refers to the uniqueness of the
computation. In DFA, the input character goes to one state only. DFA doesn't accept the null move
that means the DFA cannot change state without any input character.

DFA has five tuples {Q, ∑, q0, F, δ}


Q: set of all states
∑: finite set of input symbol where δ: Q x ∑ →Q
q0: initial state
F: final state
δ: Transition function

Example

See an example of deterministic finite automata:

1. Q = {q0, q1, q2}


2. ∑ = {0, 1}
3. q0 = {q0}
4. F = {q3}

NDFA

NDFA refer to the Non Deterministic Finite Automata. It is used to transit the any number of states
for a particular input. NDFA accepts the NULL move that means it can change state without
reading the symbols.

NDFA also has five states same as DFA. But NDFA has different transition function.

Transition function of NDFA can be defined as:

δ: Q x ∑ →2Q

Example

See an example of non deterministic finite automata:

1. Q = {q0, q1, q2}


2. ∑ = {0, 1}
3. q0 = {q0}
4. F = {q3}
Bootstrapping

o Bootstrapping is widely used in the compilation development.


o Bootstrapping is used to produce a self-hosting compiler. Self-hosting compiler is a type
of compiler that can compile its own source code.
o Bootstrap compiler is used to compile the compiler and then you can use this compiled
compiler to compile everything else as well as future versions of itself.

A compiler can be characterized by three languages:

1. Source Language
2. Target Language
3. Implementation Language

The T- diagram shows a compiler SCIT for Source S, Target T, implemented in I.

Follow some steps to produce a new language L for machine A:

1. Create a compiler SCAA for subset, S of the desired language, L using language "A" and that
compiler runs on machine A.

2. Create a compiler LCSA for language L written in a subset of L.


3. Compile LCSA using the compiler SCAA to obtain LCAA. LCAA is a compiler for language L, which runs
on machine A and produces code for machine A.

The process described by the T-diagrams is called bootstrapping.

LEX

o Lex is a program that generates lexical analyzer. It is used with YACC parser generator.
o The lexical analyzer is a program that transforms an input stream into a sequence of
tokens.
o It reads the input stream and produces the source code as output through implementing
the lexical analyzer in the C program.

The function of Lex is as follows:

o Firstly lexical analyzer creates a program lex.1 in the Lex language. Then Lex compiler runs
the lex.1 program and produces a C program lex.yy.c.
o Finally C compiler runs the lex.yy.c program and produces an object program a.out.
o a.out is lexical analyzer that transforms an input stream into a sequence of tokens.
Lex file format

A Lex program is separated into three sections by %% delimiters. The formal of Lex source is as
follows:

1. { definitions }
2. %%
3. { rules }
4. %%
5. { user subroutines }

Definitions include declarations of constant, variable and regular definitions.

Rules define the statement of form p1 {action1} p2 {action2}....pn {action}.

Where pi describes the regular expression and action1 describes the actions what action the
lexical analyzer should take when pattern pi matches a lexeme.

User subroutines are auxiliary procedures needed by the actions. The subroutine can be loaded
with the lexical analyzer and compiled separately.

YACC

o YACC stands for Yet Another Compiler Compiler.


o YACC provides a tool to produce a parser for a given grammar.
o YACC is a program designed to compile a LALR (1) grammar.
o It is used to produce the source code of the syntactic analyzer of the language produced
by LALR (1) grammar.
o The input of YACC is the rule or grammar and the output is a C program.

These are some points about YACC:

Input: A CFG- file.y

Output: A parser y.tab.c (yacc)

o The output file "file.output" contains the parsing tables.


o The file "file.tab.h" contains declarations.
o The parser called the yyparse ().
o Parser expects to use a function called yylex () to get tokens.

The basic operational sequence is as follows:

This file contains the desired grammar in YACC format.

It shows the YACC program.


It is the c source program created by YACC.

C Compiler

Executable file that will parse grammar given in gram.Y

Ambiguity
A grammar is said to be ambiguous if there exists more than one leftmost derivation or more than
one rightmost derivative or more than one parse tree for the given input string. If the grammar is
not ambiguous then it is called unambiguous.
Example:

1. S = aSb | SS
2. S=∈

For the string aabb, the above grammar generates two parse trees:

If the grammar has ambiguity then it is not good for a compiler construction. No method can
automatically detect and remove the ambiguity but you can remove ambiguity by re-writing the
whole grammar without ambiguity.

You might also like