Chapter Four - Assembly Programming
Editing, Assembling and Linking an Assembly Program
Introduction
You write an assembly program according to a strict set of rules, use an editor or word
processor for keying it into the computer as a file, and then use the assembler translator
program to read the file and to convert it into machine code.
The two main classes of programming languages are high-level and low-level.
Programmers writing in a high-level language, such as C or BASIC, use powerful
commands, each of which may generate many machine language instructions.
Programmers writing in a low-level assembly language, on the other hand, code symbolic
instructions, each of which generates one machine instruction.
Although coding in a high-level language is generally more productive, coding in assembly
language has some advantages. In general, it:
• Provides more control over handling particular hardware requirements.
• Generates smaller, more compact executable modules.
• Results in faster execution.
For both high-level and low-level languages, a linker program completes the process by
converting the object code into executable machine language.
Assembly Language Features
The features of this language provide the basic rules and framework for the language.
Program Comments
The use of comments throughout a program can improve its clarity, especially in
assembly language, where the purpose of a set of instructions is often unclear.
A comment begins with a semicolon (;), and wherever you code it, the assembler assumes
that all characters on the line to its right are comments. A comment may contain any
printable character, including a blank.
A comment may appear on a line by itself, like this:
; Calculate productivity ratio
Or on the same line following an instruction, like this:
ADD AX, BX ; Accumulate total quantity
Because a comment appears only on a listing of an assembled source program and
generates no machine code, you may include any number of comments without affecting
the assembled program’s size or execution.
Reserved Words
Certain names in assembly language are reserved for their own purposes, to be used only
under special conditions. Reserved words, by category, include:
• Instructions, such as MOV and ADD, which are operations that the computer can
execute;
• Directives, such as END or SEGMENT, which you use to provide information to
the assembler;
• Operators, such as FAR and SIZE, which you use in expressions; and
• Predefined Symbols, such as @Data and @Model, which return information to your
program during the assembly.
Using a reserved word for a wrong purpose causes the assembler to generate an error
message.
Identifiers
An identifier is a name that you apply to an item in your program that you expect to
reference. The two types of identifiers are name and label:
1. Name refers to the address of a data item, such as COUNTER in
COUNTER DB 0
2. Label refers to the address of an instruction, procedure, or segment, such as MAIN
and B30: in the following statements:
MAIN PROC FAR
B30: ADD BL, 25
The same rules apply to both names and labels. An identifier can use the following
characters:
Category              Allowable Characters
Alphabetic letters:   A through Z and a through z
Digits:               0 through 9 (not the first character)
Special characters:   Question mark (?), break or underscore (_), dollar ($), at (@),
                      dot (.) (not the first character)
The first character of an identifier must be an alphabetic letter or a special character,
except for the dot. Because the assembler uses some special words that begin with the @
symbol, you should avoid using it for your own definitions.
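The character rules above can be checked mechanically. The following is a minimal sketch in Python rather than assembly, so that it can be run on any machine. The names IDENTIFIER_PATTERN and is_valid_identifier are hypothetical and not part of any assembler. The sketch checks only the allowable characters; it does not check reserved words or any length limit.

```python
import re

# First character: a letter or one of ? _ $ @ (the dot is excluded here).
# Remaining characters: letters, digits, or any of ? _ $ @ .
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z?_$@][A-Za-z0-9?_$@.]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if name obeys the character rules for names and labels."""
    return IDENTIFIER_PATTERN.fullmatch(name) is not None


for candidate in ["COUNTER", "B30", "_total$", "9LIVES", ".start", "MY NAME"]:
    print(candidate, is_valid_identifier(candidate))
```

The first three candidates pass; the last three fail because they start with a digit, start with a dot, or contain a blank.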
The names of registers, such as AH, BX, and DS, are reserved for referencing those
registers. Consequently, in an instruction such as ADD CX, BX the assembler knows that
CX and BX refer to registers. However, in an instruction such as MOV REGSAVE, CX the
assembler can recognize the name REGSAVE only if you define it as a data item.
Statements
An assembly program consists of a set of statements. The two types of statements are:
COMPILED BY: SAMUEL G 2
Computer Organization and Assembly Language Programming
1. Instructions, such as MOV and ADD, which the assembler translates to object
code; and
2. Directives, which tell the assembler to perform a specific action, such as define a
data item.
The format for a statement, where square brackets indicate an optional entry:
[Identifier] Operation [operand (s)] [; comment]
An identifier (if any), operation, and operand (if any) are separated by at least one blank
or tab character.
Examples of statements are:
              Identifier   Operation   Operand   Comment
Directive:    COUNT        DB          1         ; Name, operation, operand
Instruction:  L30:         MOV         AX, 0     ; Label, operation, 2 operands
The identifier, operation, and operand may begin in any column. However, consistently
starting at the same column for these entries makes a more readable program. Also, many
editor programs provide tab stops every eight positions to facilitate spacing the fields.
The operation, which must be coded, is most commonly used for defining data areas and
coding instructions. For a data item, an operation such as DB or DW defines a field, work
area, or constant. For an instruction, an operation such as MOV or ADD indicates an
action to perform.
The operand (if any) provides information for the operation to act on. For a data item, the
operand defines its initial value. For example, in the following definition of a data item
named COUNTER, the operation DB means “define byte”, and the operand initializes its
contents with a zero value:
Name      Operation   Operand   Comment
COUNTER   DB          0         ; Define byte with initial 0 value
For an instruction, an operand indicates where to perform the action. An instruction’s
operand may contain one, two, or even no entries. Here are three examples:
Operation   Operand   Comment
RET                   ; Return from a procedure
INC         BX        ; Increment BX register by 1
ADD         CX, 25    ; Add 25 to CX register
Directives
Assembly language supports a number of statements that enable you to control the way
in which a source program assembles and lists. These statements, called directives, act
only during the assembly of a program and generate no machine-executable code.
The most common directives are discussed as follows:
The PAGE and TITLE Listing Directives
These directives help to control the format of listings of an assembled program. At the
start of a program, the PAGE directive designates the maximum number of lines to list
on a page and the maximum number of characters on a line. Its format is
PAGE [Length] [, Width]
For example, for the directive PAGE 60, 132, length is 60 lines per page and width is 132
characters per line.
Under a typical assembler, the number of lines per page may range from 10 through 255,
and the number of characters per line may range from 60 through 132. Omission of a
PAGE statement causes the assembler to default to PAGE 50, 80.
You may also want to force a page eject at a specific line in the program listing, such as
the end of a segment. At the required line, simply code PAGE with no operand. On
encountering PAGE, the assembler advances to the top of the next page, where it resumes
the listing.
You can use the TITLE directive to cause a title for a program to print on line 2 of each
page of the program listing. You may code TITLE once, at the start of the program. Its
format is
TITLE text [comment]
For text, a common practice is to use the name of the program as cataloged on disk. For
example, if you named the program ASMSORT, code that name plus an optional
descriptive comment (a leading ‘;’ is not required), all up to 60 characters in length, like
this:
TITLE ASMSORT Assembly Program to sort CD titles
SEGMENT Directive
The directives for defining a segment, SEGMENT and ENDS, have the following format:
Name           Operation   Operand
Segment-name   SEGMENT     [align] [combine] ['class']
               …
Segment-name   ENDS
The SEGMENT statement defines the start of a segment. The segment-name must be
present, must be unique, and must follow assembly language naming conventions. The
ENDS statement indicates the end of the segment and contains the same name as the
SEGMENT statement. The maximum size of a segment in real mode is 64K. The
SEGMENT statement may contain three types of options: alignment, combine, and class:
• The align option indicates the boundary on which the segment is to begin. The
typical requirement is PARA, which causes the segment to align on a paragraph
boundary so that the starting address is evenly divisible by 16, or 10H. Omission
of the align operand causes the assembler to default to PARA.
• The combine option indicates whether to combine the segment with other
segments when they are linked after assembly. Combine types are STACK,
COMMON, PUBLIC, and AT expression. For example, the stack segment is
commonly defined as
Segment-name SEGMENT PARA STACK
o You may use PUBLIC and COMMON where you intend to combine
separately assembled programs when linking them. Otherwise, where a
program is not to be combined with other programs, you may omit this
option or code NONE.
• The class option, enclosed in apostrophes, is used to group related segments when
linking. The conventional classes are 'code' for the code segment, 'data' for the data
segment, and 'stack' for the stack segment.
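To make the paragraph-boundary rule of the align option concrete, here is a small sketch in Python (not assembler output). The function name align_to_paragraph and the sample address are illustrative assumptions. It rounds an address up to the next multiple of 16 (10H), which is what PARA alignment requires of a segment's starting address.

```python
PARAGRAPH = 16             # one paragraph is 16 bytes (10H)
MAX_SEGMENT = 64 * 1024    # largest real-mode segment, 64K


def align_to_paragraph(address: int) -> int:
    """Round address up to the next paragraph boundary."""
    return (address + PARAGRAPH - 1) & ~(PARAGRAPH - 1)


start = align_to_paragraph(0x1234)   # sample address, chosen for illustration
print(hex(start))                    # 0x1240
print(start % PARAGRAPH == 0)        # True: evenly divisible by 16
```

An address that already lies on a paragraph boundary is returned unchanged.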
The following code illustrates SEGMENT statements with various options. Note that the
program defines a stack segment with alignment (PARA), combine (STACK), and class
(‘stack’) types.
          PAGE 60, 132
          TITLE A04ASM1 Segments for an .EXE program
; ---------------------------------------------------------------------------------------
STACK     SEGMENT PARA STACK 'Stack'
          …
STACK     ENDS
; ---------------------------------------------------------------------------------------
DATASEG   SEGMENT PARA 'Data'
          …
DATASEG   ENDS
; ---------------------------------------------------------------------------------------
CODESEG   SEGMENT PARA 'Code'
ASSUME Directive
You have to tell the assembler the purpose of each segment in the program. The required
directive is ASSUME, coded in the code segment as follows:
ASSUME SS:stackname, DS:datasegname, CS:codesegname, …
Defining Types of Data
The assembler provides a set of directives that permits definitions of items by various
types and lengths; for example, DB defines byte and DW defines word. A data item may
contain an undefined (that is, uninitialized) value, or it may contain an initialized
constant, defined either as a character string or as a numeric value.
The format for data definition is:
[name] Dn Expression
• Name: - A program that references a data item does so by means of a name. The
name is otherwise optional, as indicated by the square brackets.
• Directive (Dn): - The directives that define data items are DB (byte), DW (word),
DD (doubleword), DF (farword), DQ (Quadword), and DT (tenbytes), each of
which explicitly indicates the length of the defined item.
• Expression: - The expression in an operand may specify an uninitialized value or
constant value. To indicate an uninitialized item, define the operand with a
question mark, such as
DATAX DB ? ; Uninitialized item
In this case, when your program begins execution, the initial value of DATAX is
unknown to you.
You can use the operand to define a constant, such as
DATAY DB 25 ; Initialized item
You can freely use this initialized value 25 throughout your program and can even
change the value.
An expression may contain multiple constants separated by commas and limited only by
the length of the line, as follows:
DATAZ DB 21, 22, 23, 24, 25
The assembler defines these constants in adjacent bytes, from left to right. A reference to
DATAZ is to the first 1-byte constant, 21, and a reference to DATAZ+1 is to the second
constant, 22.
For example, the instruction
MOV AL, DATAZ+3
loads the value 24 (18H) into the AL register. The expression also permits duplication of
constants in a statement of the format
[name] Dn repeat-count DUP (Expression) …
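The layout of DATAZ and the effect of MOV AL, DATAZ+3 can be modelled outside the assembler. The sketch below uses Python byte strings as a stand-in for memory. The name WORKAREA and its DUP definition are hypothetical examples, not taken from the text.

```python
# DATAZ DB 21, 22, 23, 24, 25  -> five adjacent bytes, stored left to right
DATAZ = bytes([21, 22, 23, 24, 25])

# MOV AL, DATAZ+3  -> fetch the byte three positions past the start of DATAZ
al = DATAZ[3]
print(al, format(al, "02X") + "H")   # 24 18H

# A hypothetical definition such as  WORKAREA DB 10 DUP(0)
# repeats a single constant ten times in adjacent bytes.
WORKAREA = bytes([0]) * 10
print(len(WORKAREA))                 # 10
```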
• Hexadecimal: - uses the hex digits 0 through F, followed by the radix specifier H.
Because the assembler expects that a reference beginning with a letter is a symbolic
name, the first digit of a hex constant must be 0 to 9. Examples are 3DH and
0DE8H, which the assembler stores as 3D and (with bytes in reverse sequence)
E80D, respectively. Because the letters D and B act as both radix specifiers and hex
digits, they could conceivably cause some confusion.
• Real: - The assembler converts a given real value (a decimal or hex constant
followed by the radix specifier R) into floating-point format for use with a numeric
coprocessor.
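The reversed byte sequence mentioned for 0DE8H can be reproduced with a short Python sketch. The helper names parse_hex_constant and stored_bytes are illustrative assumptions; the sketch only models how a byte and a word are laid out in memory, low-order byte first.

```python
def parse_hex_constant(text: str) -> int:
    """Convert an assembler-style hex constant such as '0DE8H' to an integer."""
    if text[0] not in "0123456789" or text[-1].upper() != "H":
        raise ValueError("hex constant must start with 0-9 and end with H")
    return int(text[:-1], 16)


def stored_bytes(value: int, size: int) -> str:
    """Hex digits of value as laid out in memory, low-order byte first."""
    return value.to_bytes(size, byteorder="little").hex().upper()


print(stored_bytes(parse_hex_constant("3DH"), 1))     # 3D
print(stored_bytes(parse_hex_constant("0DE8H"), 2))   # E80D
```

A constant written as DE8H would be rejected by parse_hex_constant, mirroring the rule that the first digit must be 0 to 9.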
Equate Directives
The assembler provides the equal-sign (=) and EQU directives for giving alternative names to
symbolic names and for giving names to numeric values. These directives do not generate any
data storage; that is, a program cannot, say, add to an EQU item when it executes. Instead,
the assembler substitutes the defined value in other statements.
The advantage of equate directives is that many statements may use the assigned value.
If the value has to be changed, you need change only the equate statement. The result is
a program that is more readable and easier to maintain.
The Equal-sign Directive: - enables you to assign the value of an expression to a name,
and you may do so any number of times in a program. The following examples illustrate its
use:
VALUE_OF_PI = 3.1416
RIGHT_COL = 79
SCREEN_POSITIONS = 80 * 25
Examples of the use of the preceding directives are:
IMUL AX, VALUE_OF_PI ; Multiply AX by 3.1416
CMP BL, RIGHT_COL ; Compare BL to 79
MOV CX, SCREEN_POSITIONS ; Move 2000 to CX
When using this directive for defining a doubleword value, first use the .386 directive to
notify the assembler:
.386
DBLWORD1 = 42A3B05CH
The EQU Directive: - Consider the following EQU statement coded in the data segment:
FACTOR EQU 12
The name, in this case FACTOR, may be any name acceptable to the assembler. Now
whenever the word FACTOR appears in an instruction or another directive, the
assembler substitutes the value 12. For example, the assembler converts the directive
TABLEX DB FACTOR DUP(?)
to its equivalent value
TABLEX DB 12 DUP(?)
A name such as FACTOR may be defined by EQU only once, so it cannot be redefined by another
EQU. An instruction may also contain an equated operand, as in the following:
RIGHT_COL EQU 79
…
MOV CX, RIGHT_COL ; Move 79 to CX
You can also equate symbolic names, as in the following code:
ANNL_TEMP DW 0
…
AT EQU ANNL_TEMP
MPY EQU MUL
The first EQU equates the nickname AT to the defined item ANNL_TEMP. For any
instruction that contains the operand AT, the assembler replaces it with the address of
ANNL_TEMP. The second EQU enables a program to use the word MPY in place of the
regular symbolic instruction MUL.
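The substitution that the assembler performs for equated names can be pictured as a text replacement done before translation. The following Python sketch is only a model of that idea, not how any particular assembler is implemented. The function expand_equates is hypothetical, and unlike a typical assembler it treats names as case-sensitive.

```python
import re

# A whole identifier that is not the tail of a longer token (such as 0DE8H).
WORD = re.compile(r"(?<![A-Za-z0-9_@$?.])[A-Za-z_@$?][A-Za-z0-9_@$?.]*")


def expand_equates(statement: str, equates: dict) -> str:
    """Replace equated names in the code part of a statement; leave the comment alone."""
    code, separator, comment = statement.partition(";")
    expanded = WORD.sub(lambda m: equates.get(m.group(0), m.group(0)), code)
    return expanded + separator + comment


equates = {"FACTOR": "12", "RIGHT_COL": "79", "MPY": "MUL"}
print(expand_equates("TABLEX DB FACTOR DUP(?)", equates))   # TABLEX DB 12 DUP(?)
print(expand_equates("MPY BX", equates))                    # MUL BX
```

Because no storage is generated for FACTOR, RIGHT_COL, or MPY, nothing in the expanded statements refers to them any longer.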
Documentation
Stylistically, the most striking difference between program2 and program3 is the presence
of comments in program3. There are two ways to document a statement of source code.
A comment can either precede the statement on a separate line, or it can be appended to
the line on which the statement appears.
The first line of program2 could have been documented in either of the following
manners:
; list to screen
LIST SCR
Or
LIST SCR ; List to screen
The code at lines 0010 and 0011 and at lines 0036 and 0037 is called boilerplate code.
Boilerplate code is code that is present in more or less the same form in every
assembly language program. Lines 0010 and 0011 set the DS register so that the program
can access the data segment, and lines 0036 and 0037 return processing control to DOS
when the program concludes.
Reading the keyboard
DOS function $08 is invoked at lines 0014 and 0015. This function waits for an input from
the keyboard and returns the ASCII value of that input in the AL register. DOS function
$08 is invoked when an INT $21 instruction is executed with the value $08 stored in the
AH register.
The instruction at line 0018 copies the contents of the AL register into the BL register. This
operation is necessary because the program is going to use those contents in the call to
function $02 at lines 0026 through 0028. Before it can make that call, however, the contents
of the AL register will be contaminated (i.e., changed) by the call to function $09 at lines
0021 through 0023.
As a rule, the DOS function calls preserve the contents of all registers except for the AX
register and any other register or registers in which they explicitly return data.
Consequently, the contents of the AL register, which is part of the AX register, will be
undefined after the execution of the INT $21 instruction at line 0023, but the contents of
the BL register will be unaffected.
Composing output
Program3 produces its output in three separate steps. First, it outputs a message, “the
letter you typed was ”; second, it outputs the character that the user typed; third and
finally, it outputs two spaces and a period. The first and third parts of the output are
constant string images. They are generated with the same DOS function $09 that was used
in program2. The second part of the output uses DOS function $02 to output a single
character. To invoke DOS function $02, a program must execute an INT $21 instruction
with the character to be displayed contained in the DL register and the value $02
contained in the AH register. The sequence of instructions at lines 0026 through 0028 does
just that.
The texts of the messages in program3 are enclosed in single quotes, whereas the text of
the message in program2 was enclosed in double quotes. This was done largely to make
the point that you can use either single or double quotes to enclose a string of text in
an assembly language program. One advantage of this flexibility is that you can embed
single quotes within text defined by double quotes and embed double quotes within text
defined by single quotes. The definitions
Message_one db "the letter you typed was 'X' .$"
and
Message_two db 'the letter you typed was "X" .$'
are both permissible.
Analysis of program4
Program4 expands upon the task performed by program3. Program4 introduces a user
prompt that explicitly requests the user to type a letter at the keyboard. Then it displays
a two-line response that reads
The letter you typed was x
The letter after x is y
Program4 is designed to illustrate
• DOS function $01
• Multiline output
• The INC (INCrement) instruction.
Program4
0001 list scr
0002 ;**********program5.4************************************
0003 ;*Asks the user to input a letter from the keyboard *
0004 ;*and responds: *
0005 ;* "The letter you typed was x ." (CR/LF) *
0006 ;* "The letter after x is y ." *
0007 ;********************************************************
0008 hex $
0009 code segment
0010 ;set the DS register.
0011 0000 B8XXXX mov ax,data
0012 0003 8ED8 mov ds,ax
0013 ;display user promt
0014 0005 B409 mov ah, $09
0015 0007 BAXXXX mov dx, offset user_promt
0016 000A CD21 int $21
0017 ; read keyboard with echo
0018 000C B401 mov ah, $01
0019 000E CD21 int $21
0020 ;save input value
0021 0010 8AD8 mov bl, al
0022 ;generate CR/LF
0023 0012 B402 mov ah,$02
0024 0014 B20D mov dl,$0d
0025 0016 CD21 int $21
0026 0018 B402 mov ah,$02
0027 001A B20A mov dl,$0a
0028 001C CD21 int $21
0029 ;display first part of message
0030 001E B409 mov ah, $09
0031 0020 BAXXXX mov dx, offset message
0032 0023 CD21 int $21
0033 ;display contents of bl register
The code at lines 0014 through 0016 of program4 invokes DOS function $09 to display the
user prompt “Type a letter, please.” This line appears on screen immediately below the
command line that invokes the program:
C:\asm>program4
Type a letter, please. _
The combined action of the carriage return/line feed (CR/LF) sequence positions the
cursor at the start of the following line. The text of the user prompt appears at the far
left of the line immediately below the command line because that is where the command
line left the DOS cursor just before program4 took over.
DOS Function $01
Lines 0018 and 0019 of program4 invoke DOS function $01: Keyboard Input with Echo.
This function is invoked when the INT $21 instruction is executed with a value of $01 in
the AH register. DOS function $01 waits for a keystroke at the keyboard. When a key is
pressed, it returns with the ASCII code for that key stored in the AL register, and echoes
that keystroke to the video screen.
DOS functions $01 and $08 are identical except that function $01 displays the image of the
key that was pressed while function $08 does not. When the user presses a key, function
$01 echoes its image to the video screen at the current location of the cursor and advances
the cursor one position to its right:
Type a letter, please X
Before program4 displays the next line of its output, it must first generate a CR/LF
sequence. Otherwise, the next line would begin where the call to function $01 left the
cursor:
Type a letter, please x The letter you typed was x
Instead of beginning a new line:
Type a letter, please x
The letter you typed was x
There are several ways to generate a CR/LF. The most straightforward one involves using
DOS function $02 to output a carriage return character, and then using it again to output
a line feed character. The ASCII code for a carriage return is $0D. The ASCII code for a
line feed is $0A. The following code generates a CR/LF sequence:
MOV AH, $02
MOV DL, $0D
INT $21
MOV AH, $02
MOV DL, $0A
INT $21
The INC instruction
The INC (Increment) instruction, which appears at line 0064, has the general form
INC operand
This instruction increases the contents of the operand by a value of 1. In this case, INC DL
adds 1 to the contents of the DL register. The combined effect of the instructions at lines
0021 and 0063 is to set the contents of the DL register to the ASCII code for the key read
in by the call to DOS function $01 at lines 0018 and 0019.
The collective effect of lines 0062 through 0065 is to display the image of the letter that is
alphabetically one position after the letter whose ASCII code is in the BL register.
When you execute program4 from the DOS prompt, the screen appears something like
this:
C:\asm>program4
Type a letter, please. q
The letter you typed was q
The letter after q is r
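The arithmetic behind "the letter after" can be traced with a short model. This is a Python sketch of the register steps described above, assuming the key's ASCII code is saved in BL, copied to DL, and then incremented. The function name letter_after is hypothetical.

```python
def letter_after(key: str) -> str:
    """Model: save the key in BL, copy it to DL, then INC DL."""
    bl = ord(key) & 0xFF      # ASCII code of the key, as saved in BL
    dl = bl                   # copy BL to DL
    dl = (dl + 1) & 0xFF      # INC DL (an 8-bit register wraps past $FF)
    return chr(dl)


print(letter_after("q"))   # r
print(letter_after("z"))   # {
```

The second call shows that incrementing simply yields the next ASCII code: the character after z is the left brace, not a letter, so a program that needs a letter would have to test for that case.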
The DEC instruction
The DEC (Decrement) instruction is the negative counterpart of the INC instruction. The
general format for a DEC instruction is:
DEC operand
A DEC instruction subtracts a value of 1 from the contents of its operand.
0069
0070 ;display two spaces and a period.
0071 0048 B409 mov ah,$09
0072 004A BAXXXX mov dx, offset sp_sp_period
0073 004D CD21 int $21
0074
0075 ;exit to DOS
0076 004F B8004C mov ax,$4c00
0077 0052 CD21 int $21
0078 main endp
0079 ;********************************************************
0080 CRLF proc
0081 ;generate CR/LF
0082 0054 B402 mov ah,$02
0083 0056 B20D mov dl,$0d
0084 0058 CD21 int $21
0085 005A B402 mov ah,$02
0086 005C B20A mov dl,$0a
0087 005E CD21 int $21
0088 0060 C3 ret
0089 CRLF endp
0090 ;********************************************************
0091 code ends
0092 ;***************************************************
0093 stack segment stack $0400
0094 ;***************************************************
0095 data segment
0096 0000 user_promt db 'type a letter, please. $'
0097 0019 message db 'the letter you typed was $'
0098 0034 sp_sp_period db ' .$'
0099 0038 line_two db 'The letter after $'
0100 004B is_text db ' is $'
0101 data ends
0102 ;***************************************************
0103 end
The CPU does several things in the course of executing a CALL instruction. First, it
adjusts the contents of IP register as if it were about to execute the next instruction, but
then, instead of going on to do so, it records the contents of the IP register in the program
stack. Then it adjusts the contents of the IP register again, this time to point to the first
instruction in the named procedure.
The result is that immediately after processing a CALL procname instruction, the CPU
begins executing the first instruction in the named procedure. From there on it continues
executing instructions until it encounters a RET instruction.
The RET instruction
A RET (RETurn) instruction transfers program flow back from a subroutine to its parent.
When the CPU encounters a RET instruction, it recovers the note that the CALL
instruction directed it to leave for itself in the program stack. Then it puts the address it
reads there into the IP register and continues execution. This gets the CPU back to where
it was when the CALL instruction sidetracked it.
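The bookkeeping that CALL and RET perform with the IP register and the program stack can be sketched with a toy interpreter. This is a Python model under simplifying assumptions: instructions are list entries, IP is a list index, and the operation names "work" and "halt" are invented for the illustration.

```python
def run(program, entry=0):
    """Execute a toy program and return the order in which 'work' steps ran."""
    stack = []      # models the program stack
    trace = []
    ip = entry      # models the IP register
    while True:
        operation, argument = program[ip]
        ip += 1                      # IP now points at the next instruction
        if operation == "call":
            stack.append(ip)         # record the return address on the stack
            ip = argument            # point IP at the procedure's first instruction
        elif operation == "ret":
            ip = stack.pop()         # recover the address that CALL recorded
        elif operation == "work":
            trace.append(argument)
        elif operation == "halt":
            return trace


program = [
    ("work", "before call"),    # index 0
    ("call", 4),                # index 1: call the procedure at index 4
    ("work", "after call"),     # index 2
    ("halt", None),             # index 3
    ("work", "in procedure"),   # index 4: first instruction of the procedure
    ("ret", None),              # index 5
]
print(run(program))   # ['before call', 'in procedure', 'after call']
```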
Placement of procedures
The order in which procedures are placed in an assembly language program is altogether
irrelevant, with one exception: the main routine should normally come first. By default
the linker sets things up so that execution begins with the first line in the source code.
Normally, that should be the first instruction in the main routine.
JUMPS
A program jump transfers program flow to the instruction at some specified location in
memory. An assembly language jump is analogous to a GOTO command in a higher-
level language. The format for an unconditional jump to an address specified by an
assembly language label is:
JMP label
where label is a program address identifier. A label consists of any valid assembler
identifier followed by a colon. Any instruction statement can be prefixed by a label. The
assembler treats a reference to a label as a reference to the address of the instruction to
which that label is affixed. The JMP instruction in this sequence
JMP AX_ZERO
.
.
.
AX_ZERO: MOV AX, $0000
would cause the CPU to set the contents of the IP register to address the instruction
labeled AX_ZERO and to continue processing from there.
Branches
A program branch is a point in a program at which program flow can continue in either
of two paths. The path actually taken at a branch is selected under program control based
on the state of some condition. In higher level languages program branches are usually
represented as IF/THEN constructs.
In 8086/8088 assembly language, a program branch is implemented as a two-stage
process. First a condition is tested for, and then it is acted upon. High level languages
perform essentially the same operations when they implement an IF/THEN statement,
but they tend to obscure the fact that two steps are involved by combining both of them
into a single command.
In a higher-level language, a conditional statement contains two functionally separate
elements, a test clause and an operational clause. In a statement such as
IF A==B THEN GOTO 100
IF A==B is the test clause and THEN GOTO 100 is the operational clause. In the assembly
language analogue of an IF/THEN statement, the role of the test clause is performed by
one instruction and the role of the operational clause is performed by another. The
8086/8088 mediates the transfer of information concerning the result of the test from the
first instruction to the second via the flag register.
The FLAG register
The flag register is a 16-bit register. Six of its bits are devoted to status flags and three
are devoted to control flags. The remaining seven bits in the flag register are
undefined. Generally speaking, the CPU adjusts the status flags in the course of executing
arithmetic operations.
The status flags reflect the outcome of an operation and record information about that
outcome in a manner that renders the information accessible for use in the execution of
subsequent instructions. The control flags control the operation of the CPU in certain
circumstances.
In general, most of the instructions that perform arithmetic calculations such as addition
or subtraction will adjust some subset of the status flags to reflect the outcome of their
operation. Two flags of particular interest to programmers are the Zero flag and the Carry
flag. In general, the Zero flag will be set when an arithmetic operation produces a result
of zero and cleared when an arithmetic operation produces a nonzero result. The Carry
flag will be set by an operation that produces an unsatisfied carry (a carry out of, or a
borrow into, the most significant bit) and cleared by one that does not.
Comparison
The CMP (CoMPare) instruction is frequently used to compare two values and to adjust
the status flags accordingly. The general format for the CMP instruction is:
CMP destination, source
The CMP instruction accomplishes its task by subtracting the value represented by the
contents of the source operand from the value represented by the contents of the
destination operand, but it does not store the result of that subtraction or affect the
contents of either operand. It merely reflects the status of the result it obtains in the status
flags. If the contents of the destination operand are equal to the contents of the source
operand, the zero flag will be set; otherwise the zero flag will be cleared. If the contents
of the destination operand are below the contents of the source operand, the carry flag
will be set; if the contents of the destination operand are above or equal to the contents of
the source operand, the carry flag will be cleared.
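The effect of CMP on the Zero and Carry flags can be modelled in a few lines. The sketch below is Python, not 8086 code. The function name cmp_flags is hypothetical, and it treats both operands as unsigned 16-bit values.

```python
def cmp_flags(destination: int, source: int, bits: int = 16):
    """Return (zero_flag, carry_flag) as CMP destination, source would leave them."""
    mask = (1 << bits) - 1
    destination &= mask
    source &= mask
    result = (destination - source) & mask   # computed, then thrown away
    zero_flag = result == 0                  # operands were equal
    carry_flag = destination < source        # the subtraction needed a borrow
    return zero_flag, carry_flag


print(cmp_flags(5, 5))   # (True, False)   equal
print(cmp_flags(3, 7))   # (False, True)   destination below source
print(cmp_flags(7, 3))   # (False, False)  destination above source
```

Neither operand is modified; only the two flags carry the outcome forward to the next instruction.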
Conditional jumps
A conditional jump instruction tests the state of some specified status flag or flags and
directs program flow accordingly. In assembly language, a conditional jump resembles the
second half of an
IF … THEN GOTO …
statement in a higher-level language. In 8086/8088 assembly language, this pair of
instructions:
CMP AX, BX
JZ Label
would be roughly equivalent to
IF AX==BX THEN GOTO Label
in a higher-level language.
The conditional jump instructions are all of the general form
Jxxx label
where xxx describes the condition for which you are testing. The syntax of 8086/8088
assembly language supports 31 conditional jump instructions. Fifteen of those
instructions test a status flag or a combination of status flags and jump to the address
indicated by label if the prescribed condition is true. Another 15 conditional jumps are
the converse of the first 15; they force a jump if the condition proves false. For example,
the converse of JZ (Jump if Zero) is JNZ (Jump if Not Zero). Among the 15 pairs are a few
synonyms. For example, the assembler recognizes the mnemonics JE (Jump if equal) and
JNE (Jump if Not Equal) as logically indistinguishable from JZ and JNZ, respectively.
Six of the conditional jumps are specifically designed for branching based on the
outcome of arithmetic comparisons of unsigned numbers:
JB    Jump if Below
JBE   Jump if Below or Equal
JE    Jump if Equal
JNE   Jump if Not Equal
JAE   Jump if Above or Equal
JA    Jump if Above
These six conditional jumps test either the Zero flag or the carry flag or both the Zero and
Carry flags. For example, the test performed by the JB instruction will prove true if the
carry flag is set, and the test performed by the JBE instruction will prove true if either the
carry flag or the zero flag is set.
Four of the other conditional jumps are variations on those six instructions:
JNB Jump if Not Below
JNBE Jump if Not Below or Equal
JNAE Jump if Not Above or Equal
JNA Jump if Not Above
But each of these variations is really just a synonym for one of the original six. JNB, for
example, is equivalent to JAE.
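The flag tests behind the six unsigned jumps and their four synonyms can be tabulated and exercised with a small model. This is a Python sketch; the names JUMP_TESTS, SYNONYMS, and jump_taken are hypothetical. The tests for JB and JBE follow the text above, and the remaining four are their standard complements.

```python
def cmp_flags(destination, source, bits=16):
    """Return (zero_flag, carry_flag) after an unsigned CMP destination, source."""
    mask = (1 << bits) - 1
    destination &= mask
    source &= mask
    return destination == source, destination < source


# Each entry answers: given ZF and CF, is the jump taken?
JUMP_TESTS = {
    "JB":  lambda zf, cf: cf,
    "JBE": lambda zf, cf: cf or zf,
    "JE":  lambda zf, cf: zf,
    "JNE": lambda zf, cf: not zf,
    "JAE": lambda zf, cf: not cf,
    "JA":  lambda zf, cf: not cf and not zf,
}
SYNONYMS = {"JNB": "JAE", "JNBE": "JA", "JNAE": "JB", "JNA": "JBE"}


def jump_taken(mnemonic, destination, source):
    """Model CMP destination, source followed by the given conditional jump."""
    mnemonic = SYNONYMS.get(mnemonic, mnemonic)
    zf, cf = cmp_flags(destination, source)
    return JUMP_TESTS[mnemonic](zf, cf)


print(jump_taken("JB", 3, 7))    # True
print(jump_taken("JNB", 3, 7))   # False, same test as JAE
```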
The 31st conditional jump is a bit anomalous. Instead of testing the status flags, this
instruction tests the contents of the CX register for Zero. The mnemonic for this
instruction is JCXZ. It forces a jump to the address of an indicated label if the contents of
the CX register are zero.
The following sequence of instructions implements an IF/THEN/ELSE structure that executes
one block of code if the contents of the AX and BX registers are equal to one another and
another block of code if they are not:
            CMP AX, BX
            JZ  EQUAL
NOT_EQUAL:  .           ; If AX is NOT equal to BX
            .           ; then execute these
            .           ; statements
            JMP END_IF
EQUAL:      .           ; If AX is equal to BX
            .           ; then execute these
            .           ; statements
END_IF:
>>>>>>>>>>>>>>>>>>>>>>>PROGRAM6<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
list scr
;**********program5.6***********************************************
;*this program asks the user to input a letter from *
;*the keyboard. It tests the input for extended *
;*ASCII code or some other non-letter condition *
;*If the input was acceptable, the program *
;*responds *
;* "The letter you typed was x ." *
;* "The letter after x is y ." *
;*If the input was unacceptable, the program gives *
;* an error message *
;*******************************************************************
hex $
code segment
;*******************************************************************
main proc
;set the DS register.
0000 B8XXXX mov ax,data
0003 8ED8 mov ds,ax

; read keyboard
0005 E83500 call get_input

;test AL register for beginning of
; extended code
0008 3C00 cmp al,$00
jz extended_code

;test AL for character below 'a'
000A 3C61 cmp al,$61
Loops
Loop_Label:
.
.
.
DEC CX
JNZ Loop_Label
The DEC CX instruction subtracts 1 from the contents of the CX register and adjusts the
Zero flag accordingly. Program flow continues to cycle through the loop until the contents
of the CX register are reduced to zero.
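The behaviour of this loop can be traced with a short model. The Python sketch below is only an illustration under stated assumptions: it treats CX as a 16-bit register that wraps around when decremented and models only the Zero flag. The function name count_iterations is a hypothetical helper introduced for this example.

```python
# Illustrative model of:  Loop_Label: ... / DEC CX / JNZ Loop_Label
# Assumption: CX is a 16-bit register, so DEC wraps from 0 to $FFFF.

def count_iterations(cx):
    """Return how many times the loop body runs for a starting CX value."""
    cx &= 0xFFFF
    iterations = 0
    while True:
        iterations += 1             # the body runs before the counter is tested
        cx = (cx - 1) & 0xFFFF      # DEC CX, wrapping at 16 bits
        zero_flag = cx == 0         # DEC sets ZF when the result is zero
        if zero_flag:               # JNZ falls through once ZF is set
            return iterations

if __name__ == "__main__":
    for start in (4, 1, 0):
        print(start, count_iterations(start))
```

Because the counter is tested only after the body has run, a starting value of zero does not skip the loop: in this model CX wraps around and the body runs 65,536 times. Testing CX with JCXZ before entering the loop is one way to guard against that case.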
Example 7.1
Write a program that uses a loop to display the following output without using a data
allocation statement:
ABCD
ABCD
ABCD
ABCD
SOLUTION:
list scr
hex $
code segment
;set the ds register
mov ax,data
mov ds,ax
;load the outer-loop counter (number of lines) into CX
mov cx,$04
loop_outer:
mov bl,$04 ; inner-loop counter (letters per line)
mov bh,$41 ; $41 is the ASCII code in hexadecimal for the letter 'A'
loop_inner:
mov dl,bh
mov ah,$02
int $21
inc bh
dec bl
jnz loop_inner
call CRLF
dec cx
jnz loop_outer
;return to DOS (function $4C) so that execution does not fall into CRLF
mov ah,$4c
int $21
CRLF proc
;generate CR/LF
mov ah,$02
mov dl,$0d
int $21
mov ah,$02
mov dl,$0a
int $21
ret
CRLF endp
code ends
stack segment stack $0400
data segment
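The nested-loop logic of Example 7.1 can be traced outside the assembler. The Python sketch below is only a model, not a translation of the program: the outer counter stands in for CX, the inner counter for BL, and the character code for BH. The function name render and its parameters are hypothetical, and the default of four lines is taken from the required output.

```python
# Illustrative trace of the nested loops in Example 7.1 (a model only).

def render(lines=4, letters=4, first=0x41):
    """Build the text the nested loops would display."""
    out = []
    for _ in range(lines):           # outer loop: one pass per output line
        code = first                 # BH restarts at $41, the code for 'A'
        row = ""
        for _ in range(letters):     # inner loop: BL counts down from 4
            row += chr(code)         # DOS function $02 displays the character in DL
            code += 1                # INC BH moves on to the next letter
        out.append(row)              # CRLF ends the line
    return "\r\n".join(out) + "\r\n"

if __name__ == "__main__":
    print(render(), end="")
```

Resetting the character code at the top of every outer pass is what makes each line start again at 'A'.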
Summary
• Translators that take an entire program and translate it as a whole into machine
language are called compilers.
• Translators that process programs one line at a time are called interpreters.
• Special-purpose translators that are specifically designed to translate assembly
language programs into machine language are called assemblers.
• Assembling a program converts its source code into an OBJect file. An OBJect file
contains the machine language image of the source code of a program in skeletal
form.
• There are three kinds of statements in the source code of an 8086/8088 assembly
language program: instruction statements, data allocation statements, and
directives.
• A comment is a string of text that explains the program but is not part of the
program. A semicolon identifies all subsequent text in a statement as a comment.
• The HEX directive directs the assembler to treat tokens in the source file that begin
with a dollar sign as numeric constants in hexadecimal notation. A HEX directive
affects only the source code that follows it, so it is customary to place the HEX
directive at the beginning of the program. If the HEX directive had not been
included in program2, the assembler would have processed the tokens beginning
with the dollar signs as identifiers instead of as numeric values in hexadecimal
notation.
• A segment directive defines the logical segment to which subsequent instructions
and data allocation statements belong. It also gives a segment name to the base of
that segment.
• The first segment directive in the program introduces a logical segment named CODE.
By default the linker assumes that the first segment in a program is its code
segment.
• The second segment directive in the program defines the program's stack segment.
• The third and last segment in the program is a data segment. It contains a single
data allocation statement.
• The general format for a data allocation statement is:
[Varname] data-definition-type [ init [[,init]]]
• The MOV instruction copies the contents of the source operand into the
destination operand
MOV destination, source
• The offset of Message is the distance in bytes from the beginning of the data
segment to the first byte in Message.
MOV dx, Offset message
• DOS function $01: keyboard input with echo.
• DOS function $02: character output.
• DOS function $08: keyboard input without echo.
• DOS function $09: string output.
• The flag register is a 16-bit register. Six of its bits are status flags and three are
control flags; the remaining seven bits are undefined. Generally speaking, the CPU
adjusts the status flags in the course of executing arithmetic operations.