Basic Description of A Computer System
This section gives a brief outline of the main components of a computer system at a basic level, which will give
the reader a better understanding of the concepts dealt with throughout the tutorial.
Computer System
A computer system is the complete configuration of a computer, including the peripheral units and the system
software that make it a useful and functional machine for a given task.
Central Processor
This part is also known as the central processing unit or CPU, which in turn is made up of the control unit and the
arithmetic and logic unit. Its functions are to read and write the contents of memory cells, to move data between
memory cells and special registers, and to decode and execute the instructions of a program. The processor has a
series of memory cells that are used very often and are therefore part of the CPU itself. These cells are known as
registers. A processor may have one or two dozen of these registers. The arithmetic and logic unit of the CPU
carries out the operations related to numeric and symbolic calculations. Typically these units can only perform
very elementary operations, such as the addition and subtraction of two integers, integer multiplication and
division, manipulation of the bits of the registers, and the comparison of the contents of two registers. Personal
computers can be classified by what is known as word size, that is, the number of bits the processor can handle
at a time.
Central Memory
It is a group of cells, nowadays built from semiconductors, used for general processing, such as the execution of
programs and the storage of the information they operate on.
Each of these cells may contain a numeric value, and they have the property of being addressable, that is, each
cell can be distinguished from the others by means of a unique number, its address.
The generic name of this memory is Random Access Memory or RAM. The main disadvantage of this type of
memory is that the integrated circuits lose the information they have stored when the power is interrupted. This
was the reason for the creation of memories whose information is not lost when the system is turned off. These
memories are called Read Only Memory or ROM.
Input and Output Units
For a computer to be useful to us, the processor must communicate with the outside world through interfaces
that allow information to enter and leave the processor and the memory. Through these interfaces it is possible
to enter information to be processed and later to view the processed data.
Some of the most common input units are keyboards and mice. The most common output units are screens and
printers.
Auxiliary Memory Units
The central memory of a computer is costly and, considering today's applications, also very limited. Thus the
need arises for practical and economical information storage systems. Besides, the central memory loses its
content when the machine is turned off, which makes it unsuitable for the permanent storage of data. These and
other drawbacks led to the creation of peripheral memory units, which are called auxiliary or secondary memory.
Of these the most common are magnetic tapes and magnetic disks.
The information stored on these magnetic media is organized in files. A file is made up of a variable number of
records, generally of a fixed size; the records may contain information or programs.
Information Units
In order for the PC to process information, this information must be held in special cells called registers.
The registers are groups of 8 or 16 flip-flops.
A flip-flop is a device capable of storing one of two voltage levels: a low one, typically 0.5 volts, and a high
one, commonly 5 volts. The low level is interpreted as off or 0, and the high level as on or 1. Each of these
two-state values is known as a bit, which is the smallest unit of information in a computer.
A group of 16 bits is known as a word; a word can be divided into groups of 8 bits called bytes, and groups of 4
bits are called nibbles.
Numeric systems
The numeric system we use daily is the decimal system, but this system is not convenient for machines, since
they handle information encoded as bits that are on or off. This way of encoding makes it necessary to know
positional notation, which allows us to express a number in any base we need.
It is possible to represent a given number in any base through the following formula:
Number = ... + (D * B^n) + ... + (D * B^2) + (D * B^1) + (D * B^0)
where n is the position of the digit, counting from right to left and starting at zero, D is the digit in that
position, and B is the numeric base in use.
When working with assembly language we often need to convert numbers from the binary system, which is used by
computers, to the decimal system used by people.
The binary system is based on only two conditions or states, on (1) or off (0), thus its base is two.
To convert, each binary digit is multiplied by two raised to the power of its position, counting from the right:
Binary: 1 0 0 1 1
Decimal: 1*2^0 + 1*2^1 + 0*2^2 + 0*2^3 + 1*2^4
= 1 + 2 + 0 + 0 + 16 = 19 decimal.
The ^ character is used in computation as an exponent symbol and the * character is used to represent
multiplication.
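The positional formula can be sketched in Python (a language chosen here only for illustration; the function name positional_value is hypothetical and not part of the tutorial). The digits are given most significant first, and n counts positions from the right starting at zero:

```python
def positional_value(digits, base):
    """Return the decimal value of a list of digits written most significant first."""
    total = 0
    # n is the position of the digit counted from the right, starting at zero
    for n, d in enumerate(reversed(digits)):
        total += d * base ** n
    return total

print(positional_value([1, 0, 0, 1, 1], 2))     # 1 + 2 + 0 + 0 + 16 = 19
print(positional_value([1, 0, 1, 0, 1, 1], 2))  # 43
```

The same function works for any base, for example positional_value([2, 11], 16) for the hexadecimal digits 2 and B.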
There are several methods to convert decimal numbers to binary; only one will be analyzed here. Naturally a
conversion with a scientific calculator is much easier, but one cannot always count on having one, so it is
convenient to know at least one method for doing it by hand.
The method explained here uses successive division by two, keeping the remainder as a binary digit and the
quotient as the next number to divide. For example, for 43 decimal:
43/2 = 21 remainder 1
21/2 = 10 remainder 1
10/2 = 5 remainder 0
5/2 = 2 remainder 1
2/2 = 1 remainder 0
1/2 = 0 remainder 1
Building the number from the bottom, we get that the binary result is
101011
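The successive-division method can be sketched in Python as follows. This is only an illustration under the assumption of a non-negative integer input; the name to_binary is hypothetical.

```python
def to_binary(number):
    """Convert a non-negative integer to a binary string by repeated division by two."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, 2)  # quotient is the next number to divide
        remainders.append(str(remainder))
    # The first remainder is the least significant bit, so the list is read bottom-up.
    return "".join(reversed(remainders))

print(to_binary(43))  # 101011
```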
Hexadecimal system
In the hexadecimal base we have 16 digits, which go from 0 to 9 and from the letter A to the letter F; these letters
represent the numbers from 10 to 15. Thus we count 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E, and F.
The conversion between binary and hexadecimal numbers is easy. The first step in converting a binary number to
hexadecimal is to divide it into groups of 4 bits, beginning from the right. If the last group, the leftmost one, has
fewer than 4 bits, the missing places are filled with zeros.
Taking as an example the binary number 101011, we divide it into 4-bit groups and we are left with:
10;1011
Filling the last group (the one on the left) with zeros:
0010;1011
Afterwards we take each group as an independent number and consider its decimal value:
0010=2;1011=11
We cannot write this hexadecimal number as 211, because that would be an error; we have to substitute every
value greater than 9 with its hexadecimal representation, with which we obtain:
2BH (where the H indicates the hexadecimal base)
To convert a hexadecimal number to binary it is only necessary to reverse the steps: the first hexadecimal digit is
taken and converted to binary, then the second, and so on.
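The grouping procedure can be sketched in Python. The names binary_to_hex and hex_to_binary are hypothetical helpers written for this illustration, and the input is assumed to be a string of valid digits.

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits):
    """Convert a binary string to hexadecimal by grouping 4 bits from the right."""
    # Fill the leftmost group with zeros so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)

def hex_to_binary(hex_digits):
    """Reverse the steps: each hexadecimal digit becomes a group of 4 bits."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)

print(binary_to_hex("101011"))  # 2B
print(hex_to_binary("2B"))      # 00101011
```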
ASCII code
ASCII is an acronym for American Standard Code for Information Interchange. This code assigns a 7-bit binary
number to the letters of the alphabet, the decimal digits from 0 to 9 and some additional symbols, setting the 8th
bit to its off state or 0. This way each letter, digit or special character occupies one byte in the computer
memory.
We can observe that this method of data representation is very inefficient for numbers: in binary format one
byte is enough to represent the numbers from 0 to 255, whereas with the ASCII code one byte can represent only
one digit. Due to this inefficiency, the ASCII code is mainly used in memory to represent text.
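A small Python sketch makes the comparison concrete. The helper name ascii_bytes is hypothetical; it relies only on the built-in ord and format functions.

```python
def ascii_bytes(text):
    """Return the ASCII code of each character as an 8-bit binary string."""
    return [format(ord(ch), "08b") for ch in text]

print(ascii_bytes("A*"))          # ['01000001', '00101010']
print(len(ascii_bytes("255")))    # 3 bytes when 255 is stored as ASCII text
print(format(255, "08b"))         # 11111111, a single byte in binary format
```

Note that the 8th (leftmost) bit is 0 in every ASCII code.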
BCD Method
BCD is an acronym for Binary Coded Decimal. In this notation groups of 4 bits are used to represent each decimal
digit from 0 to 9. With this method we can represent two digits per byte of information.
Even though this method is much more practical than the ASCII code for representing numbers in memory, it is
still less practical than binary, since with the BCD method one byte can only represent the numbers from 0 to 99,
whereas in binary format one byte can represent all the numbers from 0 to 255.
This format is mainly used to represent very large numbers in commercial applications, since it makes operations
easier and avoids mistakes.
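As a sketch of the idea, the following Python functions pack two decimal digits into one byte and unpack them again. The names to_packed_bcd and from_packed_bcd are hypothetical and only illustrate the encoding.

```python
def to_packed_bcd(value):
    """Pack a number from 0 to 99 into one byte, one decimal digit per nibble."""
    if not 0 <= value <= 99:
        raise ValueError("one packed BCD byte holds only 0..99")
    tens, ones = divmod(value, 10)
    return (tens << 4) | ones

def from_packed_bcd(byte):
    """Recover the decimal number from a packed BCD byte."""
    return (byte >> 4) * 10 + (byte & 0x0F)

print(hex(to_packed_bcd(47)))  # 0x47
print(from_packed_bcd(0x99))   # 99
```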
Floating point representation
This representation is based on scientific notation, that is, a number is represented in two parts: its mantissa
and its exponent.
As an example, the number 1234000 can be represented as 1.234*10^6; in this notation the exponent indicates
the number of places the decimal point must be moved to the right to obtain the original number.
If the exponent is negative, it indicates the number of places the decimal point must be moved to the left to
obtain the original number.
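The split into two parts can be sketched with Python's decimal module. This only illustrates decimal scientific notation, not the binary floating point format a processor actually uses; the name scientific_parts is hypothetical and the input is assumed to be a nonzero number written as text.

```python
from decimal import Decimal

def scientific_parts(text):
    """Split a nonzero decimal number into (mantissa, exponent) with one digit before the point."""
    number = Decimal(text)
    exponent = number.adjusted()          # power of ten of the leading digit
    mantissa = number.scaleb(-exponent)   # move the decimal point by that many places
    return mantissa, exponent

mantissa, exponent = scientific_parts("1234000")
print(mantissa == Decimal("1.234"), exponent)  # True 6
```

A number smaller than one, such as 0.00123, yields a negative exponent.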
Creating a program involves five steps:
Design of the algorithm: the problem to be solved is stated and the best solution is proposed, creating schematic
diagrams to support the proposed solution.
Coding the algorithm: writing the program in some programming language, assembly language in this specific
case, taking the solution proposed in the prior step as a base.
Translation to machine language: the creation of the object program, in other words, the program written as a
sequence of zeros and ones that can be interpreted by the processor.
Testing the program: after the translation into machine language, the program is executed on the computer.
Debugging: the last stage is the elimination of the faults detected during the test stage. The correction of a fault
normally requires repeating all the steps starting from the first or the second.
CPU Registers
The CPU has 14 internal registers, each one of 16 bits. The first four, AX, BX, CX, and DX, are general-purpose
registers and can also be used as 8-bit registers; when used that way they are referred to, for example, as AH
and AL, which are the high and low bytes of the AX register. This nomenclature also applies to the BX, CX, and
DX registers.
AX Accumulator
BX Base register
CX Count register
DX Data register
DS Data segment register
ES Extra segment register
SS Stack segment register
CS Code segment register
BP Base pointer register
SI Source index register
DI Destination index register
SP Stack pointer register
IP Next instruction pointer register
F Flag register
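The relation between a 16-bit register such as AX and its two 8-bit halves AH and AL can be sketched in Python. The function names split_register and join_register are hypothetical; they only illustrate the high byte and low byte arithmetic.

```python
def split_register(value16):
    """Return the (high, low) bytes of a 16-bit value, like AH and AL for AX."""
    value16 &= 0xFFFF
    return value16 >> 8, value16 & 0xFF

def join_register(high, low):
    """Rebuild the 16-bit value from its high and low bytes."""
    return ((high & 0xFF) << 8) | (low & 0xFF)

ah, al = split_register(0x1234)
print(hex(ah), hex(al))              # 0x12 0x34
print(hex(join_register(ah, al)))    # 0x1234
```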
Debug program
To create a program in assembler two options exist: the first is to use TASM, the Turbo Assembler from Borland,
and the second is to use the debugger. In this first section we will use the latter, since it is found on any PC
with MS-DOS, which makes it available to any user who has access to such a machine.
Debug can only create files with a .COM extension, and because of the characteristics of this kind of program
they cannot be larger than 64 KB, and they must start at offset 0100H inside the segment.
Debug provides a set of commands that let you perform a number of useful operations. The ones used in this
section are r (register), a (assemble), t (trace), h (hexadecimal arithmetic), n (name), w (write), l (load) and
u (unassemble).
It is possible to view the values of the internal registers of the CPU using the Debug program. To begin working
with Debug, type the following at the prompt of your computer:
C:\>Debug [Enter]
On the next line a dash will appear; this is the Debug prompt, and at this point Debug commands can be entered.
To display the registers, use the following command:
-r[Enter]
AX=0000 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=0D62 ES=0D62 SS=0D62 CS=0D62 IP=0100 NV EI PL NZ NA PO NC
0D62:0100 2E CS:
0D62:0101 803ED3DF00 CMP BYTE PTR [DFD3],00 CS:DFD3=03
All the contents of the internal registers of the CPU are displayed. An
alternative way of viewing them is to use the "r" command with the name of
the register whose value you want to see as a parameter. For example:
-rbx
BX 0000
:
This command displays only the content of the BX register, and the Debug prompt changes from "-" to ":".
When the prompt is like this, it is possible to change the value of the register by typing the new value and
[Enter], or to keep the old value by pressing [Enter] without typing anything.
Assembler structure
In assembly language, code lines have two parts: the first is the name of the instruction to be executed, and the
second is the parameters of the instruction. For example:
add ah, bh
Here "add" is the instruction to be executed, in this case an addition, and "ah" as well as "bh" are the parameters.
For example:
mov al, 25
In the above example we are using the instruction mov; it means move the value 25 to the al register.
The names of the instructions in this language are made up of two, three or four letters. These names are also
called mnemonics or operation codes, since they represent a function the processor will perform.
Sometimes instructions are used as follows:
add al,[170]
The brackets in the second parameter indicate that we are going to work with the content of memory cell number
170 and not with the value 170; this is known as direct addressing.
The first step is to start Debug; this only consists of typing debug[Enter] at the operating system prompt.
To assemble a program in Debug, the "a" (assemble) command is used. The address where you want assembly to
begin can be given as a parameter; if the parameter is omitted, assembly starts at the location specified by
CS:IP, usually 0100h, which is where programs with the .COM extension must start. This is the address we will
use, since Debug can only create this specific type of program.
Even though at this moment it is not necessary to give the "a" command a parameter, it is advisable to do so to
avoid problems once the CS:IP registers have been used; therefore we type:
a 100[enter]
mov ax,0002[enter]
mov bx,0004[enter]
add ax,bx[enter]
nop[enter][enter]
What does the program do? It moves the value 0002 to the ax register, moves the value 0004 to the bx register,
adds the contents of the ax and bx registers (leaving the result in ax), and ends with nop, the no-operation
instruction.
After doing this in Debug, the screen shows something like the following lines:
C:\>debug
-a 100
0D62:0100 mov ax,0002
0D62:0103 mov bx,0004
0D62:0106 add ax,bx
0D62:0108 nop
0D62:0109
Type the "t" (trace) command to execute the program one instruction at a time, for example:
-t
You will see that the value 2 has been moved to the AX register. Type the "t" (trace) command again, and you
will see that the second instruction is executed.
-t
Type the "t" (trace) command once more to see the add instruction executed:
-t
The other registers may contain different values on your machine, but AX must contain 0006 and BX must contain
0004, since they are the ones we just modified.
It would not be practical to type an entire program each time it is needed. To avoid this it is possible to store a
program on disk, with the enormous advantage that, being already assembled, it can be executed without running
Debug again.
The steps to save a program that is already in memory are:
Obtain the length of the program by subtracting the initial address from the final address, naturally in
hexadecimal.
Give the program a name and extension.
Put the length of the program in the CX register.
Order Debug to write the program to disk.
Using the following program as an example, we will have a clearer idea of how to take these steps:
-a 100
0C3D:0100 mov ax,0002
0C3D:0103 mov bx,0004
0C3D:0106 add ax,bx
0C3D:0108 int 20
0C3D:010A
To obtain the length of a program the "h" command is used, since it shows the sum and the difference of two
hexadecimal numbers. To obtain the length of ours, we give it as parameters the program's final address (10A)
and its initial address (100). The first result the command shows is the sum of the parameters and the second is
the difference.
-h 10a 100
020a 000a
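The arithmetic performed by the "h" command can be reproduced with a short Python sketch. The function debug_h is a hypothetical stand-in, written under the assumption that both results are kept to 16 bits and shown as four hexadecimal digits.

```python
def debug_h(a, b):
    """Mimic Debug's h command: return the sum and the difference as 4-digit hex strings."""
    total = (a + b) & 0xFFFF       # results are kept to 16 bits
    difference = (a - b) & 0xFFFF
    return format(total, "04X"), format(difference, "04X")

print(debug_h(0x10A, 0x100))  # ('020A', '000A')
```

The second value, 000A, is the length of the program in bytes.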
The "n" command gives the program a name.
-n test.com
The "rcx" command allows us to change the content of the CX register to the value obtained with "h" as the size
of the file, in this case 000a, the result of subtracting the initial address from the final address.
-rcx
CX 0000
:000a
Lastly, the "w" command writes our program on the disk, indicating how many bytes it wrote.
-w
Writing 000A bytes
To load a saved program, give Debug its name with the "n" command and load it with the "l" (load) command. For
the following steps to give the correct result, the program above must already have been created.
-n test.com
-l
-u 100 109
0C3D:0100 B80200 MOV AX,0002
0C3D:0103 BB0400 MOV BX,0004
0C3D:0106 01D8 ADD AX,BX
0C3D:0108 CD20 INT 20
The last command, "u", is used to verify that the program was loaded into memory: it disassembles the code and
displays it. The parameters tell Debug where to start and where to stop disassembling.
Debug always loads programs into memory at address 100H, unless otherwise indicated.
To create assembler programs outside Debug, several tools are needed. First, an editor to create the source
program. Second, a compiler (here, an assembler), which is nothing more than a program that "translates" the
source program into an object program. And third, a linker that generates the executable program from the object
program.
The editor can be any text editor at hand; as a compiler we will use the TASM macro assembler from Borland, and
as a linker we will use the Tlink program.
The extension that TASM recognizes for assembler source programs is .ASM. Once the source program has been
translated, TASM creates a file with the .OBJ extension. This file contains an "intermediate format" of the
program, so called because it is not executable yet but is no longer a program in source language either. The
linker generates, from one .OBJ file or a combination of several, an executable program, whose extension is
usually .EXE though it can also be .COM, depending on the way it was assembled.
Assembler Programming
Assembler programs built with TASM have a different structure from those created with Debug. The following
assembler directives are used:
.MODEL SMALL
Assembler directive that defines the memory model to use in the program
.CODE
Assembler directive that marks the start of the program instructions
.STACK
Assembler directive that reserves memory space for the stack
END
Assembler directive that finishes the assembler program
Let's program
First step
Use any editor program to create the source file. Type the following lines:
first example
Second step
Save the file with the following name: exam1.asm. Don't forget to save it in ASCII format.
Third step
Use the TASM program to build the object program.
Example:
C:\>tasm exam1.asm
Turbo Assembler Version 2.0 Copyright (c) 1988, 1990 Borland International
TASM can only create programs in .OBJ format, which are not executable by themselves; a linker is needed to
generate the executable code.
Fourth step
Use the TLINK program to build the executable program.
C:\>tlink exam1.obj
Turbo Link Version 3.0 Copyright (c) 1987, 1990 Borland International
C:\>
Here exam1.obj is the name of the intermediate program. TLINK generates a file with the same name as the
intermediate program and the .EXE extension.
Fifth step
Execute the program:
C:\>exam1[enter]
Assembly process.
SEGMENTS
The architecture of the x86 processors forces the use of memory segments to manage information; the size of each
segment is 64 KB.
The reason for these segments is that the largest number the processor can handle is given by a 16-bit word or
register, so a single register cannot address more than 65536 memory locations. If instead the PC's memory is
divided into groups or segments of 65536 locations each, one register is dedicated to locating the segment, and
every address is formed from two registers, a segment and an offset, much more memory becomes reachable. In
real mode, which DOS uses, the processor computes the physical address as segment*16 + offset, which gives a
20-bit address and a total of 1048576 bytes (1 MB) of addressable memory.
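As a sketch of how a segment and an offset combine, the following Python function assumes the real-mode rule of the 8086, where the segment value is shifted left by 4 bits (multiplied by 16) and added to the offset, and the result is kept to 20 bits. The name physical_address is hypothetical.

```python
def physical_address(segment, offset):
    """Real-mode address: segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

# CS:IP as displayed by Debug's r command earlier: 0D62:0100
print(hex(physical_address(0x0D62, 0x0100)))  # 0xd720
```

Under this rule the highest reachable address is 0xFFFFF, that is, 1 MB of memory.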
In order for the assembler to be able to manage the data, each piece of information or instruction must be
located in the area that corresponds to its segment. The assembler accesses this information using the location
of the segment, given by the DS, ES, SS and CS registers, together with the address of the specific piece of
information inside the segment. This is why, when we create a program using Debug, something like this appears
on each line that we assemble:
Where the first number, 1CB0, corresponds to the memory segment being used, the second one refers to the
address inside this segment, and the instructions that will be stored from that address follow. The way to tell
the assembler which of the segments we will work with is through the .CODE, .DATA and .STACK directives.
The assembler adjusts the size of the segments according to the number of bytes each assembled instruction
needs, since it would be a waste of memory to use whole segments. For example, if a program only needs 10 KB to
store data, the data segment will only be 10 KB and not the 64 KB it can handle.
SYMBOLS CHART
Each of the parts of a line of assembler code is known as a token; for example, in the code line:
MOV AX,Var
we have three tokens: the MOV instruction, the AX operand, and the Var operand. To generate the OBJ code, the
assembler reads each of the tokens and looks it up in an internal "equivalence" chart known as the reserved words
chart, which is where all the mnemonics we use as instructions are found.
Following this process, the assembler reads MOV, looks it up in its chart and identifies it as a processor
instruction. Likewise it reads AX and recognizes it as a register of the processor, but when it looks for the Var
token in the reserved words chart it does not find it, so it then looks for it in the symbols chart, a table that
holds the names of the variables, constants and labels used in the program, together with their addresses in
memory and the type of data they contain.
Sometimes the assembler comes upon a token that has not yet been defined in the program; in these cases it makes
a second pass over the source program to resolve all references to that symbol and place it in the symbols chart.
There are symbols the assembler will not find at all, since they do not belong to that segment and the program
does not know in what part of memory it will find that segment. At this point the linker comes into action: it
creates the structure the loader needs so that the segment and the token are defined when the program is loaded,
before it is executed.
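The lookup order described above can be sketched in Python. This is a toy illustration only: the tables RESERVED_WORDS and symbols are hypothetical and heavily reduced, and a real assembler such as TASM does far more than this.

```python
# Hypothetical, heavily reduced reserved words chart.
RESERVED_WORDS = {"MOV": "instruction", "ADD": "instruction",
                  "AX": "register", "BX": "register"}

def classify(line, symbols):
    """Label each token of a code line as instruction, register, symbol or undefined."""
    result = []
    for token in line.replace(",", " ").split():
        key = token.upper()
        if key in RESERVED_WORDS:      # first look in the reserved words chart
            result.append((token, RESERVED_WORDS[key]))
        elif key in symbols:           # then look in the symbols chart
            result.append((token, "symbol"))
        else:                          # left for a later pass or for the linker
            result.append((token, "undefined"))
    return result

symbols = {"VAR": {"address": 0x0000, "type": "word"}}
print(classify("MOV AX,Var", symbols))
# [('MOV', 'instruction'), ('AX', 'register'), ('Var', 'symbol')]
```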
Another example
first step
use any editor program to create the source file. Type the following lines:
;example11
.model small
.stack
.code
mov ah,2h ;moves the value 2h to register ah
mov dl,2ah ;moves the value 2ah to register dl
;(it is the ASCII code of the asterisk)
int 21h ;21h interruption
mov ah,4ch ;4ch function, goes to operating system
int 21h ;21h interruption
end ;finishes the program code
second step
Save the file with the following name: exam2.asm
third step
Use the TASM program to build the object program.
C:\>tasm exam2.asm
Turbo Assembler Version 2.0 Copyright (c) 1988, 1990 Borland International
Assembling file: exam2.asm
Error messages: None
Warning messages: None
Passes: 1
Remaining memory: 471k
fourth step
Use the TLINK program to build the executable program.
C:\>tlink exam2.obj
C:\>
fifth step
Execute the program:
C:\>exam2[enter]
*
C:\>
This assembler program shows the asterisk character on the computer screen.
Types of instructions.
Data movement
In any program it is necessary to move data in memory and in the CPU registers; there are several ways to do
this: data can be copied from memory to a register, from register to register, from a register to the stack, from
the stack to a register, and transmitted to external devices and back.
This movement of data is subject to rules and restrictions. The following are some of them:
*It is not possible to move data from one memory location to another directly; it is necessary to first move the
data from the source location to a register and then from the register to the destination location.
*It is not possible to move a constant directly to a segment register; it must first be moved to another register
in the CPU.
It is possible to move blocks of data by means of the movs instructions, which copy a string of bytes or words:
movsb copies a byte and movsw copies a word from one location to another each time it executes, and with the rep
prefix and a count n in CX they copy n bytes or n words. These instructions take the data to move from the
address defined by DS:SI and use ES:DI as the new location of the data.
To move data there are also structures called stacks, where data is stored with the push instruction and
retrieved with the pop instruction.
In a stack the first item stored is the last one we can retrieve; that is, if in our program we use these
instructions:
PUSH AX
PUSH BX
PUSH CX
To return the correct values to each register at the moment of taking them from the stack it is necessary to do it
in the following order:
POP CX
POP BX
POP AX
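The last-in, first-out behaviour can be sketched in Python with a list standing in for the stack. The function push_pop and the register values are hypothetical and serve only to show why the pop order must be the reverse of the push order.

```python
def push_pop(registers, push_order, pop_order):
    """Push registers in one order, pop them in another, and return the resulting values."""
    stack = []
    for name in push_order:
        stack.append(registers[name])   # PUSH name
    result = dict(registers)
    for name in pop_order:
        result[name] = stack.pop()      # POP name takes the last value pushed
    return result

regs = {"AX": 1, "BX": 2, "CX": 3}
print(push_pop(regs, ["AX", "BX", "CX"], ["CX", "BX", "AX"]))  # {'AX': 1, 'BX': 2, 'CX': 3}
print(push_pop(regs, ["AX", "BX", "CX"], ["AX", "BX", "CX"]))  # {'AX': 3, 'BX': 2, 'CX': 1}
```

Popping in the same order as pushing leaves AX and CX exchanged.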
For the communication with external devices the out command is used to send information to a port and the in
command to read the information received from a port.
OUT DX,AX
Where DX contains the value of the port which will be used for the communication and AX contains the information
which will be sent.
IN AX,DX
Where AX is the register where the incoming information will be kept and DX contains the address of the port by
which the information will arrive.
The instructions for the logic operations are: and, not, or and xor. These work on the bits of their operands.
To check the result of operations we turn to the cmp and test instructions.
The instructions used for the arithmetic operations are: add to add, sub to subtract, mul to multiply and div to
divide.
Almost all the comparison instructions are based on the information contained in the flag register. Normally the
flags of this register that can be directly handled by the programmer are the direction flag DF, used to set the
direction of string operations, and the IF flag, handled by means of the sti and cli instructions, to enable and
disable interrupts.
Unconditional jumps in a program written in assembly language are made with the jmp instruction; a jump changes
the flow of execution of a program by sending control to the indicated address.
A loop, also known as iteration, is the repetition of a process a certain number of times or until a condition is
fulfilled.
Transfer instructions
They are used to move the contents of the operands. Each instruction can be used with different
addressing modes.
MOV
MOVS (MOVSB) (MOVSW)
MOV INSTRUCTION
Purpose: Data transfer between memory cells, registers and the accumulator.
Syntax:
MOV Destination, Source
Where Destination is the place where the data will be moved and Source is the place where the data
is.
Example:
MOV AX,0006h
MOV BX,AX
MOV AX,4C00h
INT 21H
This small program moves the value 0006H to the AX register, then it moves the content of
AX (0006h) to the BX register, and lastly it moves the value 4C00h to the AX register to end the
execution with function 4Ch of interrupt 21h.
MOVS (MOVSB) (MOVSW) INSTRUCTION
Purpose: To move byte or word strings from the source, addressed by DS:SI, to the destination,
addressed by ES:DI.
Syntax:
MOVS
This command does not need parameters since it takes as source address the content of the SI
register and as destination the content of DI. The following sequence of instructions illustrates
this, using MOVSB, the byte form of MOVS:
MOV SI, OFFSET VAR1
MOV DI, OFFSET VAR2
MOVSB
First we initialize SI and DI with the addresses of the VAR1 and VAR2 variables
respectively; then, after executing MOVSB, the first byte of VAR1 is copied onto VAR2.
MOVSB and MOVSW are used in the same way as MOVS; the first one moves one byte
and the second one moves a word.
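The effect of MOVSB, alone and repeated, can be sketched in Python. This is a simplification under stated assumptions: memory is a single flat bytearray, so the DS and ES segments are ignored, and the function names movsb and rep_movsb are hypothetical.

```python
def movsb(memory, si, di, df=0):
    """Copy one byte from memory[si] to memory[di]; return the updated (si, di)."""
    memory[di] = memory[si]
    step = -1 if df else 1          # DF = 0 moves forward, DF = 1 moves backward
    return si + step, di + step

def rep_movsb(memory, si, di, cx):
    """Repeat movsb cx times, as the rep prefix does with the count in CX."""
    while cx > 0:
        si, di = movsb(memory, si, di)
        cx -= 1
    return si, di

memory = bytearray(b"HELLO.....")
print(rep_movsb(memory, 0, 5, 5))   # (5, 10)
print(memory)                       # bytearray(b'HELLOHELLO')
```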
Loading instructions
They are register-specific instructions. They are used to load bytes or strings of bytes into a
register.
LODS (LODSB) (LODSW)
LAHF
LDS
LEA
LES
LODS (LODSB) (LODSW) INSTRUCTION
Syntax:
LODS
This instruction takes the string element found at the address specified by SI, loads it into the AL (or AX)
register, and adds to or subtracts from SI, depending on the state of DF, 1 if it is a byte transfer or 2 if it
is a word transfer. For example, using LODSB, the byte form of LODS:
MOV SI, OFFSET VAR1
LODSB
The first line loads the VAR1 address into SI and the second line takes the content of that location
into the AL register.
The LODSB and LODSW commands are used in the same way; the first one loads a byte and the
second one a word (it uses the complete AX register).
LAHF INSTRUCTION
Purpose: It transfers the low byte of the flag register to the AH register.
Syntax:
LAHF
This instruction is useful to verify the state of the flags during the execution of our program.
The flags are left in the following order inside the register:
SF ZF ?? AF ?? PF ?? CF
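Reading the individual flags out of AH after LAHF amounts to testing single bits. The following Python sketch assumes the layout listed above, with SF in bit 7 and CF in bit 0; the names FLAG_BITS and decode_ah are hypothetical.

```python
# Bit positions inside AH, taken from the order SF ZF ?? AF ?? PF ?? CF
FLAG_BITS = {"SF": 7, "ZF": 6, "AF": 4, "PF": 2, "CF": 0}

def decode_ah(ah):
    """Return the value (0 or 1) of each flag stored in AH by LAHF."""
    return {name: (ah >> bit) & 1 for name, bit in FLAG_BITS.items()}

print(decode_ah(0b01000100))  # {'SF': 0, 'ZF': 1, 'AF': 0, 'PF': 1, 'CF': 0}
```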
LDS INSTRUCTION
Syntax:
LDS destination, source
The source operand must be a double word in memory. The word at the higher
address is transferred to DS, in other words it is taken as the segment address. The word
at the lower address is the offset and is placed in the register
indicated as destination.
LEA INSTRUCTION
Syntax:
LEA destination, source
The source operand must be located in memory, and its offset is placed in the index
or pointer register specified as destination.
To illustrate one of the conveniences of this command, consider the following equivalence:
MOV SI, OFFSET VAR1
is equivalent to:
LEA SI,VAR1
It is very likely that the programmer will find it much easier to write extensive programs
using this last format.
LES INSTRUCTION
Syntax:
LES destination, source
The source operand must be a double word in memory. The content of the word at
the higher address is interpreted as the segment address and is placed in ES. The word at the
lower address is the offset and is placed in the register specified by the
destination parameter.
Stack instructions
These instructions allow the use of the stack to store or retrieve data.
POP
POPF
PUSH
PUSHF
POP INSTRUCTION
Syntax:
POP destination
This instruction transfers the last value stored on the stack to the destination operand, and then
increases the SP register by 2. This increase is due to the fact that the stack grows from the
highest address of the stack segment towards the lowest, and the stack only works with words (2 bytes), so
increasing SP by two in reality subtracts two from the real size of
the stack.
POPF INSTRUCTION
Syntax:
POPF
This command transfers the bits of the word stored at the top of the stack to the flag register.
BIT FLAG
0 CF
2 PF
4 AF
6 ZF
7 SF
8 TF
9 IF
10 DF
11 OF
PUSH INSTRUCTION
Syntax:
PUSH source
The PUSH instruction decreases the value of SP by two and then transfers the content of the
source operand to the new address held in the recently modified register.
The decrease in the address is due to the fact that the stack grows from the higher to the lower
addresses of the segment; therefore, by subtracting 2 from the SP
register we increase the size of the stack by two bytes, which is the only quantity of
information the stack can handle on each push and pop.
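How SP moves during PUSH and POP can be sketched in Python. The class Stack16 is hypothetical; it models the stack segment as a dictionary of words and starts SP at FFEE, the value shown by Debug's register dump earlier, purely as an assumption for the example.

```python
class Stack16:
    """Toy model of a 16-bit stack that grows toward lower addresses."""

    def __init__(self, sp=0xFFEE):
        self.sp = sp
        self.memory = {}

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF      # SP is decreased by 2 first
        self.memory[self.sp] = value & 0xFFFF  # then the word is stored at SP

    def pop(self):
        value = self.memory[self.sp]           # the word at SP is read first
        self.sp = (self.sp + 2) & 0xFFFF       # then SP is increased by 2
        return value

stack = Stack16()
stack.push(0x1234)
print(hex(stack.sp))      # 0xffec
print(hex(stack.pop()))   # 0x1234
print(hex(stack.sp))      # 0xffee
```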
PUSHF INSTRUCTION
Syntax:
PUSHF
This command decreases by 2 the value of the SP register and then the content of the flag
register is transferred to the stack, on the address indicated by SP.
The flags are left stored in memory on the same bits indicated on the POPF command.
Logic instructions
AND
NEG
NOT
OR
TEST
XOR
AND INSTRUCTION
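Syntax:
AND destination, source
With this instruction the logical "and" operation (conjunction) is carried out, bit by bit, on both operands:
Source  Destination | Destination
  1          1      |      1
  1          0      |      0
  0          1      |      0
  0          0      |      0
The result of the operation is stored in the destination operand.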
NEG INSTRUCTION
Syntax:
NEG destination
This instruction generates the two's complement of the destination operand and stores it in the same
operand. For example:
NEG AX
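The two's complement that NEG produces can be checked with a short Python sketch for 16-bit values. The function name neg16 and the sample value 1234H are only illustrative.

```python
def neg16(value):
    """Two's complement of a 16-bit value, the result NEG leaves in a 16-bit register."""
    return (-value) & 0xFFFF

print(hex(neg16(0x1234)))                     # 0xedcc
print(neg16(0x1234) == (~0x1234 + 1) & 0xFFFF)  # True: invert every bit, then add one
```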
NOT INSTRUCTION
Purpose: It carries out the negation of the destination operand bit by bit.
Syntax:
NOT destination
OR INSTRUCTION
Syntax:
OR destination, source
The OR instruction carries out, bit by bit, the logical inclusive disjunction
of the two operands:
Source  Destination | Destination
  1          1      |      1
  1          0      |      1
  0          1      |      1
  0          0      |      0
TEST INSTRUCTION
Syntax:
TEST destination, source
It performs a bit-by-bit conjunction of the operands but, unlike AND, this instruction
does not place the result in the destination operand; it only affects the state of the flags.
XOR INSTRUCTION
Purpose: OR exclusive
Syntax:
XOR destiny, source Its function is to perform the logic exclusive disjunction of the two
operators bit by bit.
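The logic instructions are mostly used to clear, set, invert or examine individual bits. The following fragment is a sketch in MASM/TASM syntax with arbitrary example masks; the comments show the value of AL after each step.

```asm
        MOV  AL, 10110110b
        AND  AL, 00001111b   ; AL = 00000110b, the high four bits are cleared
        OR   AL, 10000000b   ; AL = 10000110b, bit 7 is set
        XOR  AL, 11111111b   ; AL = 01111001b, every bit is inverted (same as NOT AL)
        TEST AL, 00000001b   ; AL is unchanged; ZF = 0 because bit 0 of AL is 1
```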
Arithmetic instructions
ADC
ADD
DIV
IDIV
MUL
IMUL
SBB
SUB
ADC INSTRUCTION
Syntax:
ADC destination, source
It adds the two operands and adds one more to the result if the CF flag is set, that is, if there
was a carry from a previous operation. The result is stored in the destination operand.
ADD INSTRUCTION
Syntax:
ADD destination, source
It adds the two operands and stores the result in the destination operand.
DIV INSTRUCTION
Syntax:
DIV source
The divisor can be a byte or a word and it is the operand given to the instruction.
If the divisor is 8 bits, the 16-bit AX register is taken as the dividend; if the divisor is 16 bits,
the DX:AX register pair is taken as the dividend, with DX as the high word and AX as the low word.
If the divisor was a byte, the quotient is stored in the AL register and the remainder in AH; if it
was a word, the quotient is stored in AX and the remainder in DX.
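A short fragment may make the register usage clearer. This is a sketch in MASM/TASM syntax with arbitrary example values, dividing a 16-bit dividend by an 8-bit divisor.

```asm
        MOV  AX, 100       ; 16-bit dividend in AX
        MOV  BL, 7         ; 8-bit divisor
        DIV  BL            ; AL = 14 (quotient), AH = 2 (remainder)
```

If the quotient does not fit in AL (or in AX for a word divisor), the processor raises a divide error, so the dividend must be kept small enough for the chosen divisor size.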
IDIV INSTRUCTION
Syntax:
IDIV source
It is basically the same as the DIV instruction; the only difference is that this one performs a
signed division. It uses the same registers as the DIV instruction for its results.
MUL INSTRUCTION
Syntax:
MUL source
The assembler assumes that the multiplicand is the same size as the multiplier. The value of the
operand given to the instruction is therefore multiplied by the content of AL if the multiplier is
8 bits, or by AX if the multiplier is 16 bits. When a multiplication is done with 8-bit values, the
result is stored in the AX register, and when the multiplication is done with 16-bit values, the
result is stored in the DX:AX register pair.
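The two operand sizes can be compared in the following fragment, a sketch in MASM/TASM syntax with arbitrary example values.

```asm
        MOV  AL, 12
        MOV  BL, 10
        MUL  BL            ; 8-bit multiply: AX = AL * BL = 120

        MOV  AX, 1000
        MOV  CX, 100
        MUL  CX            ; 16-bit multiply: DX:AX = 100000 = 000186A0h
                           ; so DX = 0001h and AX = 86A0h
```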
IMUL INSTRUCTION
Syntax:
IMUL source
This command does the same as MUL, except that it takes into account the signs of the numbers
being multiplied.
The results are kept in the same registers that the MUL instruction uses.
SBB INSTRUCTION
Syntax:
SBB destination, source
This instruction subtracts the source operand from the destination operand and then subtracts one
more from the result if CF is set. The result is stored in the destination operand.
This kind of subtraction is used when working with quantities larger than one word, such as 32-bit
values, to carry the borrow from the low word into the high word.
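As an illustration, a 32-bit subtraction can be built from two 16-bit steps. The fragment below is a sketch in MASM/TASM syntax; the choice of DX:AX and CX:BX to hold the two 32-bit values is an assumption made only for the example.

```asm
        ; DX:AX = DX:AX - CX:BX
        SUB  AX, BX        ; low words first; CF = 1 if a borrow was needed
        SBB  DX, CX        ; high words, minus the borrow left in CF
```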
SUB INSTRUCTION
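Purpose: Subtraction.
Syntax:
SUB destination, source
It subtracts the source operand from the destination operand and stores the result in the
destination operand.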
Jump instructions
They are used to transfer the flow of the program to the address given as operand.
JMP
JA (JNBE)
JAE (JNB)
JB (JNAE)
JBE (JNA)
JE (JZ)
JNE (JNZ)
JG (JNLE)
JGE (JNL)
JL (JNGE)
JLE (JNG)
JC
JNC
JNO
JNP (JPO)
JNS
JO
JP (JPE)
JS
JMP INSTRUCTION
Purpose: Unconditional jump.
Syntax:
JMP destination
This instruction is used to divert the flow of a program without taking into account the current
state of the flags or of the data.
JA (JNBE) INSTRUCTION
Syntax:
JA Label
After a comparison of unsigned values this command jumps if the first operand is above the second,
or, which is the same, if it is not below or equal.
This means that the jump is only done if the CF flag and the ZF flag are both deactivated, that is,
if both are equal to zero.
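JAE (JNB) INSTRUCTION
Syntax:
JAE label
It jumps if above or equal, or if not below, after an unsigned comparison. The jump is done if CF = 0.
JB (JNAE) INSTRUCTION
Syntax:
JB label
It jumps if below, or if not above or equal, after an unsigned comparison. The jump is done if CF = 1.
JBE (JNA) INSTRUCTION
Syntax:
JBE label
It jumps if below or equal, or if not above, after an unsigned comparison. The jump is done if
CF = 1 or ZF = 1.
JE (JZ) INSTRUCTION
Syntax:
JE label
It jumps if equal, or if zero. The jump is done if ZF = 1.
JNE (JNZ) INSTRUCTION
Syntax:
JNE label
It jumps if not equal, or if not zero. The jump is done if ZF = 0.
JG (JNLE) INSTRUCTION
Syntax:
JG label
It jumps if greater, or if not less or equal, after a signed comparison. The jump is done if
ZF = 0 and SF = OF.
JGE (JNL) INSTRUCTION
Syntax:
JGE label
It jumps if greater or equal, or if not less, after a signed comparison. The jump is done if SF = OF.
JL (JNGE) INSTRUCTION
Syntax:
JL label
It jumps if less, or if not greater or equal, after a signed comparison. The jump is done if SF
is different from OF.
JLE (JNG) INSTRUCTION
Syntax:
JLE label
It jumps if less or equal, or if not greater, after a signed comparison. The jump is done if
ZF = 1 or SF is different from OF.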
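JC INSTRUCTION
Purpose: Conditional jump, and the state of the flags is taken into account.
Syntax:
JC label
It jumps if there is a carry. The jump is done if CF = 1.
JNC INSTRUCTION
Purpose: Conditional jump, and the state of the flags is taken into account.
Syntax:
JNC label
It jumps if there is no carry. The jump is done if CF = 0.
JNO INSTRUCTION
Purpose: Conditional jump, and the state of the flags is taken into account.
Syntax:
JNO label
It jumps if there is no overflow. The jump is done if OF = 0.
JNP (JPO) INSTRUCTION
Purpose: Conditional jump, and the state of the flags is taken into account.
Syntax:
JNP label
It jumps if there is no parity or if the parity is odd. The jump is done if PF = 0.
JNS INSTRUCTION
Purpose: Conditional jump, and the state of the flags is taken into account.
Syntax:
JNS label
It jumps if the sign flag is deactivated. The jump is done if SF = 0.
JO INSTRUCTION
Purpose: Conditional jump, and the state of the flags is taken into account.
Syntax:
JO label
It jumps if there is overflow. The jump is done if OF = 1.
JP (JPE) INSTRUCTION
Purpose: Conditional jump, and the state of the flags is taken into account.
Syntax:
JP label
It jumps if there is parity or if the parity is even. The jump is done if PF = 1.
JS INSTRUCTION
Purpose: Conditional jump, and the state of the flags is taken into account.
Syntax:
JS label
It jumps if the sign flag is activated. The jump is done if SF = 1.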
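The conditional jumps are normally placed right after a CMP. The fragment below is a sketch in MASM/TASM syntax showing why there are separate jumps for unsigned comparisons (JA, JB and their relatives) and signed comparisons (JG, JL and their relatives). The labels Above and Done and the value in AL are hypothetical, chosen only for the example.

```asm
        MOV  AL, 0FFh      ; 255 if read as unsigned, -1 if read as signed
        CMP  AL, 1         ; sets the flags from AL - 1 (CF = 0, ZF = 0, SF = 1, OF = 0)
        JA   Above         ; taken: 255 is above 1, since CF = 0 and ZF = 0
        JMP  Done
Above:  CMP  AL, 1
        JG   Done          ; not taken: -1 is not greater than 1, since SF <> OF
        MOV  BL, 1         ; reached, because the signed test failed
Done:
```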
Loop instructions
They transfer the flow of the program, conditionally or unconditionally, to a destination,
repeating this action until the counter is zero.
LOOP
LOOPE
LOOPNE
LOOP INSTRUCTION
Syntax:
LOOP label
The LOOP instruction decreases CX by 1 and transfers the flow of the program to the label given
as operand if CX is different from 0.
LOOPE INSTRUCTION
Syntax:
LOOPE label
This instruction decreases CX by 1. If CX is different from zero and ZF is equal to 1, the flow
of the program is transferred to the label given as operand.
LOOPNE INSTRUCTION
Syntax:
LOOPNE label
This instruction decreases CX by 1 and transfers the flow of the program only if CX is different
from zero and ZF is equal to 0.
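A typical use of LOOP is to repeat a block of instructions a fixed number of times, with CX as the counter. The fragment below is a sketch in MASM/TASM syntax; the label Next and the count of 5 are arbitrary.

```asm
        MOV  CX, 5         ; number of repetitions
        MOV  AX, 0
Next:   ADD  AX, CX        ; adds 5, 4, 3, 2 and 1 in successive passes
        LOOP Next          ; CX = CX - 1, jump back while CX is not 0
        ; at this point AX = 15 and CX = 0
```

Note that if CX is 0 when the loop is entered, the first LOOP decreases it to 0FFFFh and the block runs 65536 times.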
Counting instructions
DEC
INC
DEC INSTRUCTION
Syntax:
DEC destination
This operation subtracts 1 from the destination operand and stores the new value in the same
operand.
INC INSTRUCTION
Syntax:
INC destination
The instruction adds 1 to the destination operand and keeps the result in the same operand.
Comparison instructions
They are used to compare operands, and they affect the content of the flags.
CMP
CMPS (CMPSB) (CMPSW)
CMP INSTRUCTION
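Purpose: To compare the operands.
Syntax:
CMP destination, source
This instruction subtracts the source operand from the destination operand without storing the
result of the operation; it only affects the state of the flags.
CMPS (CMPSB) (CMPSW) INSTRUCTION
Purpose: To compare strings a byte or a word at a time.
Syntax:
CMPSB
CMPSW
This instruction compares one element of the source string with one element of the destination
string by subtracting the destination element from the source element, without storing the result.
SI is used as the index of the source string in the data segment (DS:SI), and DI as the index of
the destination string in the extra segment (ES:DI).
It only affects the content of the flags. Afterwards both SI and DI are incremented by the size of
the element if DF = 0, or decremented if DF = 1.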
Flag instructions
CLC
CLD
CLI
CMC
STC
STD
STI
CLC INSTRUCTION
Syntax:
CLC
This instruction turns off the bit corresponding to the carry flag (CF); in other words, it sets
it to zero.
CLD INSTRUCTION
Syntax:
CLD
This instruction turns off the bit corresponding to the direction flag (DF), so that the string
instructions increment SI and DI.
CLI INSTRUCTION
Syntax:
CLI
This instruction turns off the interrupt flag (IF), thereby disabling the maskable interrupts.
A maskable interrupt is one that is ignored while IF = 0.
CMC INSTRUCTION
Syntax:
CMC
This instruction complements the state of the CF flag: if CF = 0 the instruction sets it to 1,
and if CF = 1 it sets it to 0.
STC INSTRUCTION
Syntax:
STC
This instruction sets the CF flag to 1.
STD INSTRUCTION
Syntax:
STD
This instruction sets the DF flag to 1, so that the string instructions decrement SI and DI.
STI INSTRUCTION
Syntax:
STI
This instruction sets the IF flag to 1, which enables the maskable external interrupts (the
ones which are only serviced when IF = 1).
Internal hardware interrupts
Internal interrupts are generated by certain events which occur during the execution of a program.
This type of interrupt is managed entirely by the hardware and it is not possible to modify it.
A clear example of this type of interrupt is the one which updates the counter of the computer's
internal clock; the hardware calls this interrupt several times a second in order to keep the time
up to date.
Even though we cannot directly manage this interrupt, since we cannot control the updating of the
time by means of software, it is possible to use its effects on the computer to our benefit, for
example to create a "virtual clock" updated continuously thanks to the clock's internal counter. We
only have to write a program which reads the current value of the counter and translates it into a
format the user can understand.
External hardware interrupts
External interrupts are generated by peripheral devices, such as keyboards, printers, communication
cards, etc. They are also generated by coprocessors. The maskable ones can be disabled by clearing
the IF flag with CLI; the non-maskable ones cannot be deactivated.
These interrupts are not sent directly to the CPU; they are sent to an integrated circuit whose
only function is to handle this type of interrupt. The circuit, called the 8259A PIC, is controlled
by the CPU through a series of communication channels called ports.
Software interrupts
Software interrupts can be activated directly from an assembly program by invoking the number of
the desired interrupt with the INT instruction.
The use of interrupts helps us in the creation of programs: by using them our programs are shorter
and easier to understand, and they usually perform better, mostly due to their smaller size.
This type of interrupt can be separated into two categories: the DOS operating system interrupts
and the BIOS interrupts.
The difference between the two is that the operating system interrupts are easier to use but also
slower, since they make use of the BIOS to achieve their goal. The BIOS interrupts, on the other
hand, are much faster, but because they are tied to the hardware they are very specific and can
vary depending even on the manufacturer of the circuit.
The choice of the type of interrupt to use depends solely on the characteristics you want to give
your program: speed, using the BIOS ones, or portability, using the DOS ones.
21H Interruption
Purpose: To call on diverse DOS functions.
Syntax:
Int 21H
Note: When working in TASM it is necessary to specify that the value being used is hexadecimal.
This interrupt has several functions. To access each one of them, the number of the required
function must be in the AH register at the moment the interrupt is called.
In this section only the specific task of each function is described; for a reference about the
concepts used, refer to unit 7, titled "Introduction to file handling".
FCB Method
Handles
02H FUNCTION
Use:
Displays one character on the screen.
Call registers:
AH = 02H
DL = Value of the character to display.
Return registers:
None.
This function displays the character whose hexadecimal code corresponds to the value stored in the
DL register, and no register is modified by using this command.
09H FUNCTION
Use:
Displays a string of characters on the screen.
Call registers:
AH = 09H
DS:DX = Address of the beginning of a string of characters.
Return registers:
None.
This function displays the characters, one by one, starting at the address indicated by DS:DX until
it finds a $ character, which is interpreted as the end of the string.
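A complete small program using the 09H function might look like the following sketch, written for MASM/TASM with the simplified segment directives. The names Msg and Start are hypothetical, and the program ends with the 4CH function of the 21H interrupt, which is not described in this section and is included here only so that the example returns cleanly to DOS.

```asm
.MODEL SMALL
.STACK 100h
.DATA
Msg     DB 'Hello, world!', 13, 10, '$'   ; '$' marks the end of the string
.CODE
Start:  MOV  AX, @DATA
        MOV  DS, AX          ; DS must point to the segment that holds Msg
        MOV  DX, OFFSET Msg  ; DS:DX = address of the string
        MOV  AH, 09h         ; function 09H: display a string
        INT  21h
        MOV  AX, 4C00h       ; function 4CH: end the program with return code 0
        INT  21h
END Start
```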
40H FUNCTION
Use:
Writes to a device or a file.
Call registers:
AH = 40H
BX = Handle (communication channel) of the device or file
CX = Quantity of bytes to write
DS:DX = Address of the beginning of the data to write
Return registers:
CF = 0 if there was no error, and AX = number of bytes written
CF = 1 if there was an error, and AX = error code
To use this function to display information on the screen, give the BX register the value 1, which
is the handle that the MS-DOS operating system preassigns to the video.
01H FUNCTION
Use:
Reads a character from the keyboard and displays it on the screen.
Call registers:
AH = 01H
Return registers:
AL = Read character
It is very easy to read a character from the keyboard with this function: the hexadecimal code of
the character read is stored in the AL register. If the key is an extended key, the AL register
will contain the value 0 and it will be necessary to call the function again to obtain the code of
that key.
0AH FUNCTION
Use:
Reads a line of characters from the keyboard and stores it in a buffer.
Call registers:
AH = 0AH
DS:DX = Address of the storage area
BYTE 0 = Quantity of bytes in the area
BYTE 1 = Quantity of bytes read
from BYTE 2 to BYTE 0 + 2 = read characters
Return registers:
None.
The characters are read and stored in a predefined space in memory. The structure of this space is
as follows: the first byte indicates how many characters can be read at most, the second byte
receives the number of characters actually read, and the characters read are written from the third
byte on.
When the area is full the speaker sounds and any additional character is ignored. To end the
capture of the string it is necessary to press [ENTER].
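The layout of the storage area can be declared directly in the data segment. The fragment below is a sketch in MASM/TASM syntax; the name InBuf and the capacity of 20 bytes are hypothetical, and the code assumes it is placed in a program where DS already points to the data segment.

```asm
.DATA
InBuf   DB 20             ; byte 0: capacity of the area, counting the final [ENTER]
        DB ?              ; byte 1: filled in by DOS with the number of characters read
        DB 20 DUP (?)     ; from byte 2 on: the characters themselves
.CODE
        MOV  DX, OFFSET InBuf   ; DS:DX = address of the storage area
        MOV  AH, 0Ah            ; function 0AH: buffered keyboard input
        INT  21h
        MOV  CL, InBuf+1        ; CL = number of characters typed, [ENTER] not counted
```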
3FH FUNCTION
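Use:
Reads information from a device or a file.
Call registers:
AH = 3FH
BX = Number assigned to the device (handle)
CX = Number of bytes to process
DS:DX = Address of the storage area
Return registers:
CF = 0 if there is no error, and AX = number of bytes read.
CF = 1 if there is an error, and AX = error code.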
0FH FUNCTION
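Use:
Opens a file using an FCB.
Call registers:
AH = 0FH
DS:DX = Pointer to an FCB
Return registers:
AL = 00H if the file was opened, otherwise it will contain the 0FFH value.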
14H FUNCTION
Use:
Reads the next record sequentially from a file opened with an FCB.
Call registers:
AH = 14H
DS:DX = Pointer to an FCB already opened.
Return registers:
AL = 0 if there were no errors, otherwise the corresponding error code will be returned: 1 error at the end of the
file, 2 error on the FCB structure and 3 pa
This function reads the next record of the file described by the FCB at DS:DX and updates the
current block and record fields of that FCB.
15H FUNCTION
Use:
Writes the next record sequentially to a file opened with an FCB.
Call registers:
AH = 15H
DS:DX = Pointer to an FCB already opened.
Return registers:
AL = 00H if there were no errors, otherwise it will contain the error code: 1 full disk or read-only file, 2 error on
the formation or on the specification of
The 15H function updates the FCB after writing the record to the current block.
16H FUNCTION
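Use:
Creates a file using an FCB.
Call registers:
AH = 16H
DS:DX = Pointer to an FCB in which the drive, the name and the extension of the file are defined.
Return registers:
AL = 00H if there were no errors, otherwise it will contain the 0FFH value.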
21H FUNCTION
Use:
Reads a record at random from a file opened with an FCB.
Call registers:
AH = 21H
DS:DX = Pointer to an opened FCB.
Return registers:
AL = 00H if there was no error, otherwise AL will contain the code of the error: 1 if it is the end of file, 2 if there is
an FCB specification error and 3 if
This function reads the record specified by the current block and record fields of an opened FCB
and places the information in the DTA, the Disk Transfer Area.
22H FUNCTION
Use:
Writes a record at random to a file opened with an FCB.
Call registers:
AH = 22H
DS:DX = Pointer to an opened FCB.
Return registers:
AL = 00H if there was no error, otherwise it will contain the error code: 1 if the disk is full or the file is
read-only and 2 if there is an error on the
It writes the record specified by the current block and record fields of an opened FCB. The
information written is taken from the content of the DTA.
3CH FUNCTION
Use:
Creates a file if it does not exist, or leaves it with a length of 0 if it exists.
Call registers:
AH = 3CH
CX = File attribute
DS:DX = Pointer to an ASCII specification.
Return registers:
CF = 0 and AX = the handle assigned to the file if there is no error; if there is one, CF will be 1 and AX will contain
the error code: 3 path not found, 4 there
This function replaces the 16H function. The name of the file is specified in an ASCII string,
which is a conventional string of bytes ended with a 0 byte.
The file created will have the attributes defined in the CX register in the following manner:
Value Attributes
00H Normal
02H Hidden
04H System
06H Hidden and system
The file is created with read and write permissions. It is not possible to create directories
using this function.
3DH FUNCTION
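Use:
Opens a file and returns a handle.
Call registers:
AH = 3DH
AL = Access mode
DS:DX = Pointer to an ASCII specification
Return registers:
CF = 0 and AX = handle number if there was no error, otherwise CF = 1 and AX = error code.
The access mode in AL is coded in its bits as follows:
BITS
7 6 5 4 3 2 1 0
. . . . . 0 0 0 Read only
. . . . . 0 0 1 Write only
. . . . . 0 1 0 Read/Write
. . . . x . . . RESERVED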
3EH FUNCTION
Use:
Closes a file (handle).
Call registers:
AH = 3EH
BX = Assigned handle
Return registers:
CF = 0 if there were no errors, otherwise CF will be 1 and AX will contain the error code: 06H if
the handle is invalid.
This function updates the file and frees the handle it was using.
3FH FUNCTION
Use:
Reads a specific quantity of bytes from an open file and stores them in a specified buffer.
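The handle functions are usually combined in the order open, read or write, close. The following program is a sketch in MASM/TASM syntax that opens a file for reading with function 3DH, reads up to 128 bytes with 3FH and closes it with 3EH. The file name TEST.TXT and the names FileName, Buffer, Handle, Start and Quit are hypothetical; the 4CH function used to end the program is not described in this section.

```asm
.MODEL SMALL
.STACK 100h
.DATA
FileName DB 'TEST.TXT', 0       ; ASCII string ended with a 0 byte
Buffer   DB 128 DUP (?)
Handle   DW ?
.CODE
Start:  MOV  AX, @DATA
        MOV  DS, AX
        MOV  AH, 3Dh            ; function 3DH: open the file
        MOV  AL, 0              ; access mode 0: read only
        MOV  DX, OFFSET FileName
        INT  21h
        JC   Quit               ; CF = 1 means error, the code is in AX
        MOV  Handle, AX         ; keep the handle for the next calls
        MOV  AH, 3Fh            ; function 3FH: read from the file
        MOV  BX, Handle
        MOV  CX, 128            ; number of bytes to read
        MOV  DX, OFFSET Buffer
        INT  21h                ; if CF = 0, AX = bytes actually read
        MOV  AH, 3Eh            ; function 3EH: close the file
        MOV  BX, Handle
        INT  21h
Quit:   MOV  AX, 4C00h          ; end the program
        INT  21h
END Start
```

Checking CF after every call, as is done here after the open, is what gives the handle method its better error handling.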
10H Interruption
Purpose: To call on diverse BIOS video functions.
Syntax:
Int 10H
This interrupt has several functions, all of which control the video input/output. To access each
one of them, the number of the required function must be in the AH register at the moment the
interrupt is called.
02h Function
Use:
Positions the cursor on the screen in text mode.
Call registers:
AH = 02H
BH = Video page where the cursor is positioned.
DH = Row
DL = Column
Return registers:
None.
09h Function
Use:
Displays a character a given number of times with a given attribute, starting at the current
cursor position.
Call registers:
AH = 09H
AL = Character to display
BH = Video page where the character will be displayed
BL = Attribute to use
CX = Number of repetitions
Return registers:
None.
0Ah Function
Use:
Displays a character at the current cursor position.
Call registers:
AH = 0AH
AL = Character to display
BH = Video page where the character will be displayed
BL = Color to use (graphic mode only).
CX = Number of repetitions
Return registers:
None.
The main difference between this function and the previous one is that this one does not modify
the attributes. Neither function changes the cursor position.
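As an illustration of how these video functions are combined, the fragment below positions the cursor with function 02H and then writes a character with function 0AH. It is a sketch in MASM/TASM syntax; the row, column, character and count are arbitrary example values.

```asm
        MOV  AH, 02h       ; function 02H: position the cursor
        MOV  BH, 0         ; video page 0
        MOV  DH, 10        ; row
        MOV  DL, 20        ; column
        INT  10h

        MOV  AH, 0Ah       ; function 0AH: display a character at the cursor
        MOV  AL, '*'       ; character to display
        MOV  BH, 0         ; video page 0
        MOV  CX, 5         ; number of repetitions
        INT  10h
```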
0EH Function
Use:
Displays a character on the screen and advances the cursor.
Call registers:
AH = 0EH
AL = Character to display
BH = Video page where the character will be displayed
BL = Color to use (graphic mode only).
Return registers:
None.
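16H Interruption
Purpose: To handle the keyboard input.
Syntax:
Int 16H
00H Function
Use:
Reads a character from the keyboard.
Call registers:
AH = 00H
Return registers:
AH = Scan code of the key
AL = ASCII value of the character
The purpose of the scan code is to identify the keys that have no ASCII
representation, such as [ALT], [CONTROL], the function keys and so on.
01h function
Use:
Reads the state of the keyboard.
Call registers:
AH = 01H
Return registers:
ZF = 1 if no key is waiting in the keyboard buffer.
ZF = 0 if a key is waiting; AH then holds its scan code and AL its ASCII value, and the key
stays in the buffer.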
17H Interruption
Purpose: To handle the printer input/output.
Syntax:
Int 17H
00H Function
Use:
Writes a character to the printer.
Call registers:
AH = 00H
AL = Character to print.
DX = Port to use.
Return registers:
AH = Printer status.
The port to use is given in the DX register; the values are: LPT1 = 0,
LPT2 = 1, LPT3 = 2 ...
Most BIOSes support 3 parallel ports, although some support 4.
01h Function
Use:
Initializes a printer port.
Call registers:
AH = 01H
DX = Port to use
Return registers:
AH = Printer status
The port to use is given in the DX register, for example: LPT1 = 0, LPT2 = 1, and
so on.
02h Function
Use:
Obtains the printer status.
Call registers:
AH = 02H
DX = Port to use
Return registers:
AH = Printer status.
The port to use is given in the DX register, for example: LPT1 = 0, LPT2 = 1, and
so on.
The state of the printer is coded bit by bit as follows:
BIT MEANING WHEN SET TO 1
0 The waiting time is over (time-out)
1 Not used
2 Not used
3 Input/output error
4 Printer selected
5 Out of paper
6 Communication acknowledged
7 Printer not busy
There are two ways to work with files. The first one is by means of file
control blocks, or "FCBs", and the second one is by means of communication
channels, also known as "handles".
The first way of file handling has been used since the CP/M operating
system, predecessor of DOS, so it assures a certain compatibility with very
old files from CP/M as well as from version 1.0 of DOS. Besides,
this method allows us to have an unlimited number of files open at the same
time. If you want to create a volume label for the disk, the only way to
achieve this is by using this method.
Even considering these advantages of the FCB, the use of
communication channels is much simpler and allows better
handling of errors. Besides, since it is much newer, it is very probable
that files created this way will remain compatible with
later versions of the operating system.
FCB method
Introduction
There are two types of FCB: the normal one, whose length is 37 bytes, and the
extended one, of 44 bytes.
In this tutorial we will only deal with the first type, so from now on when
I refer to an FCB, I am really talking about a 37-byte FCB.
To select the work drive the following format is used: drive A = 1; drive B
= 2; etc. If 0 is used, the drive in use at that moment is taken.
The name of the file must be left-justified and, if necessary, the
remaining bytes must be filled with spaces; the extension of the file is
placed the same way.
The current block and the current record tell the computer which record
will be accessed on reading or writing operations. A block is a group of
128 records. The first block of the file is block 0. The first
record is record 0; therefore the last record of the first block
is 127, since the numbering starts at 0 and a block can
contain 128 records in total.
Opening files
To open an FCB file the 0FH function of the 21H interrupt is used. The drive,
the name and the extension of the file must be initialized before opening it.
The DX register must point to the block. If the value FFH is returned in
the AL register when calling the interrupt, the file was not
found; if everything went well, a value of 0 is returned.
If the file is opened, DOS initializes the current block to 0 and the
record size to 128 bytes, and fills in the size of the file and its date
with the information found in the directory.
Creating a new file
For the creation of files the 16H function of the 21H interrupt is used.
DX must point to a control structure in which at least
the drive, the name and the extension of the file are defined.
If there is a problem, the value FFH will be returned in AL; otherwise
this register will contain a value of 0.
Sequential writing
Before writing to the disk, the data transfer area (DTA) must be defined
with the 1AH function of the 21H interrupt.
The 1AH function does not return any state of the disk or of the
operation, but the 15H function, which is the one we will use to write to
the disk, does so in the AL register: if it is equal to zero there
was no error, and the current record and block fields are updated.
Sequential reading
Sequential reading is done with the 14H function of the 21H interrupt, which
reads the record indicated by the current block and current record fields
and then updates them.
Random reading and writing
The 21H function and the 22H function of the 21H interrupt are the ones
in charge of carrying out random reads and writes respectively.
The random record number and the current block are used to calculate
the relative position of the record to read or write.
The AL register returns the same information as for sequential reading or
writing. The information read is returned in the disk transfer area (DTA);
likewise, the information to be written resides in the DTA.
Closing a file
To close a file the 10H function of the 21H interrupt is used.
If, after invoking this function, the AL register contains the value FFH,
this means that the file has changed position, the disk was changed or
there was a disk access error.
Channels of communication
When we use this method to work with files, there is no distinction between
sequential and random access; the file is simply treated as a string of
bytes.
The functions used for the handling of files through handles are described
in unit 6: Interruptions, in the section dedicated to the 21H interruption.
Definition of procedure
A procedure is a collection of instructions to which we can direct the flow of our program; once
the execution of these instructions is over, control is given back to the line following the one
which called the procedure.
At the time of invoking a procedure, the address of the next instruction of the program is kept on
the stack so that, once the procedure is done, execution can return to the next line of the
original program, the one which called the procedure.
Syntax of a Procedure
There are two types of procedures: the intrasegment ones, which are found in the same segment of
instructions, and the intersegment ones, which can be stored in different memory segments.
When intrasegment procedures are used, the value of IP is stored on the stack; when intersegment
ones are used, the value of CS:IP is stored.
To divert the flow to a procedure (calling it), the following instruction is used:
CALL NameOfTheProcedure
For example, if we want a routine which adds two bytes stored in AH and AL
respectively, and keeps the sum in the BX register:
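The listing for this example does not appear in the text at this point. A minimal sketch consistent with the description, a procedure named Adding declared as Near and closed with Ret and Endp, could look like the following in MASM/TASM syntax; the exact instructions in the body are an assumption.

```asm
Adding  PROC NEAR          ; declaration of the procedure
        MOV  BX, 0         ; clear the register that will hold the result
        MOV  BL, AH        ; BX = first byte
        MOV  AH, 0         ; AX = second byte (AL) extended to 16 bits
        ADD  BX, AX        ; BX = AH + AL
        RET                ; return to the instruction after the CALL
Adding  ENDP               ; end of the procedure
```

Using the 16-bit BX for the sum means the result is correct even when the two bytes add up to more than 255.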
In the declaration the first word, Adding, corresponds to the name of our
procedure, Proc declares it as such, and the word Near indicates to MASM
that the procedure is intrasegment.
The Ret directive loads the IP address stored on the stack to return to the original program;
lastly, the Adding Endp directive indicates the end of the procedure.
To declare an intersegment procedure we substitute the word FAR for the
word Near.
The procedure is called in the following way:
Call Adding
Macros
The main difference between a macro and a procedure is that a macro allows
the passing of parameters and a procedure does not. This
is only true for TASM; there are other programming languages
which do allow it. At the moment the macro is expanded, each parameter is
substituted by the name or value specified at the time of the call.
Syntax of a Macro
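To use a macro it is only necessary to call it by its name, followed by its
arguments, as if it were another assembler instruction:
Position 8, 6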
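The definition of the Position macro is not shown in this section. As a sketch of what a macro declaration looks like in MASM/TASM syntax, the following hypothetical definition takes a row and a column and places the cursor there with the 02H function of the 10H interrupt; the parameter names Row and Column are assumptions made for the example.

```asm
Position MACRO Row, Column   ; name of the macro and its two parameters
        PUSH AX              ; preserve the registers the macro uses
        PUSH BX
        PUSH DX
        MOV  AH, 02h         ; INT 10H function 02H: position the cursor
        MOV  DH, Row
        MOV  DL, Column
        MOV  BH, 0           ; video page 0
        INT  10h
        POP  DX              ; restore the registers in reverse order
        POP  BX
        POP  AX
        ENDM                 ; end of the macro

        Position 8, 6        ; expands to the code above with Row = 8, Column = 6
```

Because the assembler copies the body into the program at every call, a macro used many times makes the program larger than a procedure would.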
Macro Libraries
One of the facilities that macros offer is the creation of
libraries, which are groups of macros that can be included in a program
from a different file.
For example, if the macros file was saved with the name MACROS.TXT, the
Include instruction would be used in the following way: