Lab Manual – Systems Programming & Operating Systems Lab, Dept. of Computer Engineering
ASSIGNMENT NO. 01
TITLE: IMPLEMENTATION OF PASS I OF A TWO-PASS ASSEMBLER
PROBLEM STATEMENT: Design suitable data structures and implement Pass I of a two-pass
assembler for a pseudo-machine in Java using object-oriented features. The implementation
should cover a few instructions from each category and a few assembler directives.
OBJECTIVES:
• To study the basic process of translating assembly language into machine language.
• To study the two-pass assembly process.
SOFTWARE & HARDWARE REQUIREMENTS:
1. 64-bit Open source Linux or its derivative
2. Eclipse
3. JDK
THEORY:
A language translator bridges the execution gap between a program written in a source
language and the machine language of a computer system. An assembler is a language
translator whose source language is assembly language.
Language processing consists of two phases: the analysis phase and the synthesis phase.
Analysis of the source program applies three kinds of rules: lexical rules, syntax rules and
semantic rules. Lexical rules govern the formation of valid lexical units (tokens) in the source
language, syntax rules govern the formation of valid statements, and semantic rules associate
meaning with valid statements. The synthesis phase constructs target language statements
that have the same meaning as the source language statements. It consists of memory
allocation and code generation.
Functions of the Analysis and Synthesis Phases:
Analysis Phase: -
• Isolate the label, mnemonic operation code and operand fields of a statement.
• Enter the symbol found in the label field (if any) and the address of the next available
machine word into the symbol table.
• Validate the mnemonic operation code by looking it up in the mnemonics table.
• Determine the machine storage requirements of the statement by considering the
mnemonic operation code and operand fields of the statement.
International Institute of Information Technology, Hinjawadi, Pune. Page 1
• Calculate the address of the first machine word following the target code generated for this
statement (location counter processing).
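The first two analysis steps, isolating the fields and validating the mnemonic, can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name StatementFields, the small mnemonic set, and the rule that a leading token which is not a known mnemonic is treated as a label are all hypothetical choices made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative field splitter for statements of the form: [label] mnemonic [operands]. */
class StatementFields {
    // Assumed mnemonic set; a real Pass I would consult its mnemonics table instead.
    static final Set<String> MNEMONICS = new HashSet<>(Arrays.asList(
            "START", "END", "MOVER", "MOVEM", "ADD", "STOP", "DS", "DC"));

    final String label;     // null when the statement has no label
    final String mnemonic;
    final String operands;  // raw operand text, empty when there are no operands

    StatementFields(String label, String mnemonic, String operands) {
        this.label = label;
        this.mnemonic = mnemonic;
        this.operands = operands;
    }

    static StatementFields parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        int next = 0;
        String label = null;
        // Assumption: a first token that is not a known mnemonic is a label.
        if (!MNEMONICS.contains(tokens[0])) {
            label = tokens[next++];
        }
        // Validate the mnemonic operation code.
        if (next >= tokens.length || !MNEMONICS.contains(tokens[next])) {
            throw new IllegalArgumentException("Invalid mnemonic in: " + line);
        }
        String mnemonic = tokens[next++];
        // Everything after the mnemonic is kept as the operand field.
        String operands = String.join(" ", Arrays.copyOfRange(tokens, next, tokens.length));
        return new StatementFields(label, mnemonic, operands);
    }

    public static void main(String[] args) {
        StatementFields f = parse("LOOP MOVER AREG, A");
        System.out.println(f.label + " | " + f.mnemonic + " | " + f.operands);
    }
}
```

Treating an unknown first token as a label keeps the sketch short, but it also means a misspelled mnemonic on an unlabelled line is reported as an invalid mnemonic for the second token, which may or may not be the diagnostic you want.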
Synthesis Phase:
• Obtain the machine operation code corresponding to the mnemonic operation code by
searching the mnemonic table.
• Obtain the address of the operand from the symbol table.
• Synthesize the machine instruction or the machine form of the constant as the case may be.
Language Processor Pass: -
A language processor pass is the processing of every statement in a source program, or in
its equivalent representation, to perform a language-processing function.
Assembly Language Statements: -
There are three types of statements:
1] Imperative - An imperative statement indicates an action to be performed during the
execution of the assembled program. Each imperative statement usually translates into one
machine instruction.
2] Declarative - A declarative statement declares storage or constants. For example, DS
reserves areas of memory and associates names with them, and DC constructs memory words
containing constants.
3] Assembler directives - Assembler directives instruct the assembler to perform certain
actions during the assembly of a program. For example, the START <constant> directive
indicates that the first word of the target program generated by the assembler should be placed
in the memory word with address <constant>.
Design of a Two Pass Assembler: -
Tasks performed by the passes of two-pass assembler are as follows:
Pass I: -
1. Separate the symbol, mnemonic opcode and operand fields.
2. Build the symbol table and the literal table.
3. Perform LC processing.
4. Construct the intermediate representation code for every assembly language statement.
Pass II: -
Synthesize the target code by processing the intermediate code generated during Pass I.
Data structures required for Pass I:
• OPTAB – a table of mnemonic opcodes
  • Each entry contains the mnemonic opcode, class and mnemonic info fields.
  • The class field indicates whether the opcode corresponds to an Imperative Statement (IS),
    a Declaration Statement (DL) or an Assembler Directive (AD).
  • For IS, the mnemonic info field contains the pair (machine opcode, instruction length).
• SYMTAB – the symbol table; each entry contains a symbol, its address and its length
• LOCCTR – the location counter (LC)
• LITTAB – a table of literals used in the program; each entry contains a literal and its address
  • Literals are allocated addresses starting with the current value in LC, and LC is
    incremented appropriately.
• POOLTAB – a table of literal pools; as used in the algorithm, each entry records the LITTAB
  index of the first literal of a pool and the number of literals in that pool
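One possible Java shape for OPTAB is sketched below. The class and field names are hypothetical, and the opcode numbers, directive numbers and instruction lengths are assumed values for an imaginary machine, since the instruction list is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative OPTAB: maps a mnemonic to its class and mnemonic info. */
class OpTable {
    enum StatementClass { IS, DL, AD }

    static final class Entry {
        final String mnemonic;
        final StatementClass statementClass;
        final int code;    // machine opcode for IS; an assumed id number for DL and AD
        final int length;  // instruction length in words for IS; 0 for DL and AD

        Entry(String mnemonic, StatementClass statementClass, int code, int length) {
            this.mnemonic = mnemonic;
            this.statementClass = statementClass;
            this.code = code;
            this.length = length;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    void add(String mnemonic, StatementClass statementClass, int code, int length) {
        entries.put(mnemonic, new Entry(mnemonic, statementClass, code, length));
    }

    /** Returns the entry for a mnemonic, or null when the mnemonic is not in the table. */
    Entry lookup(String mnemonic) {
        return entries.get(mnemonic);
    }

    /** Builds a small table; every number below is an assumption for illustration. */
    static OpTable sample() {
        OpTable t = new OpTable();
        t.add("STOP", StatementClass.IS, 0, 1);
        t.add("ADD", StatementClass.IS, 1, 1);
        t.add("MOVER", StatementClass.IS, 4, 1);
        t.add("MOVEM", StatementClass.IS, 5, 1);
        t.add("DC", StatementClass.DL, 1, 0);
        t.add("DS", StatementClass.DL, 2, 0);
        t.add("START", StatementClass.AD, 1, 0);
        t.add("END", StatementClass.AD, 2, 0);
        t.add("ORIGIN", StatementClass.AD, 3, 0);
        t.add("EQU", StatementClass.AD, 4, 0);
        t.add("LTORG", StatementClass.AD, 5, 0);
        return t;
    }
}
```

A hash map gives constant-time lookup by mnemonic, which is the only access pattern Pass I needs for this table; a null result from lookup signals an invalid mnemonic.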
List of hypothetical instructions:
Advanced Assembler Directives
• LTORG
• ORIGIN
• EQU
LTORG and Literal Pool
The LTORG directive, which stands for ‘origin for literals’, allows a programmer to specify
where literals should be placed. The assembler uses the following scheme for placement of
literals: When the use of a literal is seen in a statement, the assembler enters it into a literal
pool unless a matching literal already exists in the pool.
At every LTORG statement, as also at the END statement, the assembler allocates memory to
the literals of the literal pool and clears the literal pool. This way, a literal pool would contain
all literals used in the program since the start of the program or since the previous LTORG
statement. If a program does not use an LTORG statement, the assembler would enter all
literals used in the program into a single pool and allocate memory to them when it
encounters the END statement.
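The literal pool scheme described above can be sketched in Java as follows. The class and method names are hypothetical, each literal is assumed to occupy one memory word, and the tables use Java's 0-based indexing.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative LITTAB and POOLTAB handling for literal pools. */
class LiteralTable {
    static final class Literal {
        final String value;
        int address = -1;  // -1 until memory is allocated at LTORG or END

        Literal(String value) {
            this.value = value;
        }
    }

    final List<Literal> littab = new ArrayList<>();
    // Each POOLTAB entry is the LITTAB index of the first literal of a pool.
    final List<Integer> pooltab = new ArrayList<>();

    LiteralTable() {
        pooltab.add(0);  // the first pool starts at the first LITTAB slot
    }

    /** Records a use of a literal; returns its LITTAB index within the current pool. */
    int use(String value) {
        int first = pooltab.get(pooltab.size() - 1);
        for (int i = first; i < littab.size(); i++) {
            if (littab.get(i).value.equals(value)) {
                return i;  // a matching literal already exists in the current pool
            }
        }
        littab.add(new Literal(value));
        return littab.size() - 1;
    }

    /** Called at LTORG and at END: allocates the current pool and returns the updated LC. */
    int allocatePool(int lc) {
        int first = pooltab.get(pooltab.size() - 1);
        for (int i = first; i < littab.size(); i++) {
            littab.get(i).address = lc++;  // assumption: one word per literal
        }
        pooltab.add(littab.size());  // the next pool starts after the allocated literals
        return lc;
    }
}
```

For example, using ='5', ='1' and ='5' again before an LTORG creates only two LITTAB entries. A later use of ='5' after the LTORG creates a new entry, because the earlier pool has already been closed.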
Advantages of Literal Pool
• Automatic organization of the literal data into sections that are correctly aligned and
arranged so that minimal space is wasted in the literal pool.
• Assembling of duplicate data into the same area.
ORIGIN Directive
The syntax of this directive is
ORIGIN <address specification>
where <address specification> is an <operand specification> or a <constant>.
This directive instructs the assembler to put the address given by <address specification> in
the location counter. The ORIGIN statement is useful when the target program does not
consist of a single contiguous area of memory.
EQU Directive
The EQU directive has the syntax
<symbol> EQU <address specification>
where <address specification> is an <operand specification> or a <constant>.
The EQU statement simply associates the name <symbol> with the address specified by
<address specification>. However, the address in the location counter is not affected.
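The difference between ORIGIN and EQU can be illustrated with a short Java sketch. It assumes that an address specification is a constant, a symbol, or a symbol followed by + or - and a constant (for example LOOP+2). That grammar, like the class and method names, is an assumption made for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative handling of the ORIGIN and EQU directives. */
class AddressDirectives {
    final Map<String, Integer> symtab = new HashMap<>();
    int lc = 0;

    /** Evaluates a constant, a symbol, or symbol+constant / symbol-constant. */
    int evaluate(String spec) {
        String s = spec.replaceAll("\\s+", "");
        int op = Math.max(s.lastIndexOf('+'), s.lastIndexOf('-'));
        if (op <= 0) {
            return term(s);  // no operator: a plain constant or symbol
        }
        int base = term(s.substring(0, op));
        int displacement = Integer.parseInt(s.substring(op + 1));
        return s.charAt(op) == '+' ? base + displacement : base - displacement;
    }

    private int term(String t) {
        if (t.matches("\\d+")) {
            return Integer.parseInt(t);
        }
        Integer address = symtab.get(t);
        if (address == null) {
            throw new IllegalArgumentException("Undefined symbol: " + t);
        }
        return address;
    }

    /** ORIGIN: the evaluated address replaces the location counter. */
    void origin(String spec) {
        lc = evaluate(spec);
    }

    /** EQU: the symbol receives the evaluated address; the location counter is untouched. */
    void equ(String symbol, String spec) {
        symtab.put(symbol, evaluate(spec));
    }
}
```

With LOOP at address 202, origin("LOOP+2") sets LC to 204, while equ("BACK", "LOOP") enters BACK with address 202 and leaves LC unchanged.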
Algorithm (Pass I of a Two-Pass Assembler)
1. LC := 0; (This is a default value)
   littab_ptr := 1;
   pooltab_ptr := 1;
   POOLTAB[1].first := 1;
   POOLTAB[1].#literals := 0;
2. While the next statement is not an END statement
   (a) If a symbol is present in the label field then
          this_label := symbol in label field;
          Make an entry (this_label, <LC>, __) in SYMTAB.
   (b) If an LTORG statement then
       (i) If POOLTAB[pooltab_ptr].#literals > 0 then
              Process the entries LITTAB[POOLTAB[pooltab_ptr].first] ... LITTAB[littab_ptr - 1]
              to allocate memory to each literal, put the address of the allocated memory area
              in the address field of the LITTAB entry, and update the address contained
              in the location counter accordingly.
       (ii) pooltab_ptr := pooltab_ptr + 1;
       (iii) POOLTAB[pooltab_ptr].first := littab_ptr;
             POOLTAB[pooltab_ptr].#literals := 0;
   (c) If a START or ORIGIN statement then
          LC := value specified in operand field;
   (d) If an EQU statement then
       (i) this_addr := value of <address specification>;
       (ii) Correct the SYMTAB entry for this_label to (this_label, this_addr, 1).
   (e) If a declaration statement then
       (i) Invoke the routine whose id is mentioned in the mnemonic info field; it
           determines Size, the number of memory words the statement requires.
       (ii) If a symbol is present in the label field, correct the SYMTAB entry for
            this_label to (this_label, <LC>, Size).
       (iii) LC := LC + Size;
       (iv) Generate intermediate code for the declaration statement.
   (f) If an imperative statement then
       (i) code := machine code from mnemonic info field of OPTAB;
       (ii) LC := LC + instruction length from length field of OPTAB;
       (iii) If the operand is a literal then
                this_literal := literal in operand field;
                If this_literal does not match any literal in the current pool of LITTAB then
                   LITTAB[littab_ptr].value := this_literal;
                   POOLTAB[pooltab_ptr].#literals := POOLTAB[pooltab_ptr].#literals + 1;
                   littab_ptr := littab_ptr + 1;
             else (i.e. the operand is a symbol)
                this_entry := SYMTAB entry number of operand;
             Generate intermediate code for the imperative statement.
3. Processing of the END statement
   (a) Perform actions (i) – (iii) of Step 2(b).
   (b) Generate intermediate code for the END statement.
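The location counter and table handling in the algorithm can be condensed into the following Java sketch. It is deliberately partial and rests on several assumptions: all names are hypothetical, every imperative statement and every literal occupies one word, DC occupies one word, and DS occupies as many words as its operand states. ORIGIN, EQU, forward-referenced operand symbols and intermediate code generation are left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Condensed, illustrative Pass I: LC processing, SYMTAB and literal allocation only. */
class PassOneSketch {
    // Assumed instruction set; each imperative statement is one word long.
    static final List<String> IMPERATIVE =
            Arrays.asList("STOP", "ADD", "SUB", "MOVER", "MOVEM", "READ", "PRINT");
    static final List<String> OTHER = Arrays.asList("START", "END", "LTORG", "DS", "DC");

    final Map<String, Integer> symtab = new LinkedHashMap<>();
    final List<String> littab = new ArrayList<>();     // literal values
    final List<Integer> litAddr = new ArrayList<>();   // address of each literal, -1 if unallocated
    int poolStart = 0;                                 // LITTAB index where the current pool begins
    int lc = 0;                                        // default value of the location counter

    void run(List<String> source) {
        for (String line : source) {
            String[] t = line.trim().split("[\\s,]+");
            int i = 0;
            // Step 2(a): a first token that is not a mnemonic is taken as a label.
            if (!IMPERATIVE.contains(t[0]) && !OTHER.contains(t[0])) {
                symtab.put(t[i++], lc);
            }
            String op = t[i++];
            switch (op) {
                case "START": lc = Integer.parseInt(t[i]); break;   // step 2(c)
                case "LTORG": allocateLiterals(); break;            // step 2(b)
                case "END":   allocateLiterals(); return;           // step 3
                case "DS":    lc += Integer.parseInt(t[i]); break;  // step 2(e)
                case "DC":    lc += 1; break;                       // step 2(e)
                default:                                            // step 2(f)
                    if (!IMPERATIVE.contains(op)) {
                        throw new IllegalArgumentException("Invalid mnemonic: " + op);
                    }
                    for (; i < t.length; i++) {
                        if (t[i].startsWith("=")) {
                            enterLiteral(t[i]);
                        }
                    }
                    lc += 1;
            }
        }
    }

    private void enterLiteral(String literal) {
        // Enter the literal unless it already exists in the current pool.
        if (!littab.subList(poolStart, littab.size()).contains(literal)) {
            littab.add(literal);
            litAddr.add(-1);
        }
    }

    private void allocateLiterals() {
        for (int k = poolStart; k < littab.size(); k++) {
            litAddr.set(k, lc++);
        }
        poolStart = littab.size();
    }
}
```

A single poolStart index stands in for POOLTAB here because the sketch never needs to revisit earlier pools. A fuller implementation would keep the complete pool table so that Pass II can use it.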
CONCLUSION:
FAQs
1. Explain what is meant by a pass of an assembler.
2. Explain the need for a two-pass assembler.
3. Explain the terms forward reference and backward reference.
4. Explain the various types of errors that are handled in the two passes.
5. Explain the need for intermediate code generation and the variants used.
6. State the various tables used and their significance in the design of a two-pass assembler.
7. What are the three types of assembly language statements?
8. List the features of assemblers.
9. What is the use of the location counter in an assembler?
10. What is the use of the symbol table in an assembler?