COA Chap 5 for Evening
The term instruction set architecture refers to the actual portion of the computer visible to the
programmer or compiler writer. The instruction set architecture serves as the boundary between
the software and hardware.
The implementation of a machine has two components: organization and hardware. The term
organization includes the high-level aspects of a computer’s design, such as the memory system,
the bus structure, and the design of the internal CPU (central processing unit where arithmetic,
logic, branching, and data transfer are implemented). For example, two processors with nearly
identical instruction set architectures but very different organizations are the Pentium III and
Pentium 4. Although the Pentium 4 has new instructions, these are all in the floating point
instruction set.
The term hardware is used to refer to the details of a machine, including the logic design and
the packaging technology of the machine.
Often a line of machines contains machines with identical instruction set architectures and
nearly identical organizations, but they differ in the detailed hardware implementation. For
example, the Pentium II and Celeron are nearly identical, but offer different clock rates and
different memory systems, making the Celeron more effective for low-end computers. The word
architecture covers all three aspects of computer design: instruction set architecture,
organization, and hardware.
5.1. Timing and Control
Timing refers to the way in which events are coordinated on the bus. A bus in computer
terminology represents a physical connection used to carry a signal from one point to another.
Obviously, depending on the signal carried, there exist at least four types of buses: address, data,
control, and power buses.
Data buses carry data, control buses carry control signals, and power buses carry the power-
supply/ground voltage. In addition to carrying control signals, a control bus can carry timing
signals. These are signals used to determine the exact timing for data transfer to and from a bus;
that is, they determine when a given computer system component, such as the processor,
memory, or I/O devices, can place data on the bus and when they can receive data from the bus.
Buses use either synchronous timing or asynchronous timing:
- A bus is synchronous if data transfer over the bus is controlled by a bus clock. The clock acts as the timing reference for all bus signals.
- A bus is asynchronous if data transfer over the bus is based on the availability of the data and not on a clock signal. Data is transferred over an asynchronous bus using a technique called handshaking.
The operations of synchronous and asynchronous buses are explained below. To understand the
difference between synchronous and asynchronous, consider the case when a master such as a
CPU or DMA is the source of data to be transferred to a slave such as an I/O device. The
following is a sequence of events involving the master and slave:
1. Master: send request to use the bus
2. Master: request is granted and bus is allocated to master
3. Master: place address/data on bus
4. Slave: slave is selected
5. Master: signal data transfer
6. Slave: take data
7. Master: free the bus
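The seven steps above can be sketched in Python. The bus is modelled here as a dictionary of named signal lines; all the names (request, grant, ready) are illustrative assumptions, not a real bus protocol.

```python
# Illustrative sketch of the master/slave transfer sequence.
# Signal-line names are hypothetical.
bus = {"request": 0, "grant": 0, "address": None, "data": None, "ready": 0}

def master_transfer(address, data):
    events = []
    bus["request"] = 1                            # 1. master requests the bus
    bus["grant"] = 1                              # 2. arbiter grants the bus (modelled inline)
    events.append("bus granted")
    bus["address"], bus["data"] = address, data   # 3. place address/data on the bus
    events.append("slave selected")               # 4. slave decodes the address
    bus["ready"] = 1                              # 5. master signals that data is valid
    events.append(("slave took", bus["data"]))    # 6. slave takes the data
    bus["request"] = bus["ready"] = 0             # 7. master frees the bus
    events.append("bus freed")
    return events

print(master_transfer(0x20, 42))
```

On a synchronous bus each of these steps would be tied to a clock edge; on an asynchronous bus the "ready"-style signals themselves sequence the transfer.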
Clocks are needed in sequential logic to decide when an element that contains state should be
updated. A clock is simply a free-running signal with a fixed cycle time; the clock frequency is
simply the inverse of the cycle time. The clock cycle time or clock period is divided into two
portions: when the clock is high and when the clock is low.
Figure 5.1 shows a clock signal oscillating between high and low values.
The clock period is the time for one full cycle. In an edge-triggered design, either the rising or
falling edge of the clock is active and causes state to be changed. The bus includes a clock line
upon which a clock transmits a regular sequence of alternating 1s and 0s of equal duration. A
single 1–0 transmission is referred to as a clock cycle or bus cycle and defines a time slot. All
other devices on the bus can read the clock line, and all events start at the beginning of a clock
cycle.
Figure 5.2 shows a typical, but simplified, timing diagram for synchronous read and write
operations. Other bus signals may change at the leading edge of the clock signal (with a slight
reaction delay). Most events occupy a single clock cycle. In this example, the processor places a
memory address on the address lines during the first clock cycle and may assert various status
lines.
After pausing for these signals to stabilize, it issues a read command, indicating the presence of
valid address and control signals. The appropriate memory decodes the address and responds by
placing the data on the data lines. Once the data lines have stabilized, the memory module asserts
the acknowledge line to signal the processor that the data are available. Once the master has
read the data from the data lines, it deasserts the read signal. This causes the memory module to
drop the data and acknowledge lines. Finally, once the acknowledge line is dropped, the master
removes the address information.
The addressing modes discussed below are illustrated in Figure 5.4. Notations used:
A = contents of an address field in the instruction
R = contents of an address field in the instruction that refers to a register
EA = actual (effective) address of the location containing the referenced operand
(X) = contents of memory location X or register X
One issue concerns how the control unit can determine which addressing mode is being used in a
particular instruction. Several approaches are taken. Often, different opcodes will use different
addressing modes. Also, one or more bits in the instruction format can be used as a mode field,
whose value determines which addressing mode is to be used.
The second issue concerns the interpretation of the effective address (EA). In a system without
virtual memory, the effective address will be either a main memory address or a register. In a
virtual memory system, the effective address is a virtual address or a register. The actual
mapping to a physical address is a function of the memory management unit (MMU) and is
invisible to the programmer.
The following Table 5.1 indicates the address calculation performed for each addressing mode.
Immediate Addressing
The simplest form of addressing is immediate addressing, in which the operand value is present
in the instruction itself: Operand = A.
This mode can be used to define and use constants or set initial values of variables. Typically,
the number will be stored in twos complement form; the leftmost bit of the operand field is used
as a sign bit. When the operand is loaded into a data register, the sign bit is extended to the left to
the full data word size. In some cases, the immediate binary value is interpreted as an unsigned
nonnegative integer. The advantage of immediate addressing is that no memory reference other
than the instruction fetch is required to obtain the operand, thus saving one memory or cache
cycle in the instruction cycle. The disadvantage is that the size of the number is restricted to the
size of the address field, which, in most instruction sets, is small compared with the word length.
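The sign extension described above can be sketched as follows; the 8-bit immediate width and the example value are assumptions for illustration.

```python
def sign_extend(value, bits):
    """Interpret a `bits`-wide twos complement field as a signed integer."""
    sign_bit = 1 << (bits - 1)
    # Subtracting the sign bit's weight replicates it to the left.
    return (value & (sign_bit - 1)) - (value & sign_bit)

# An 8-bit immediate 0xF6 has its leftmost (sign) bit set, so it extends to -10.
print(sign_extend(0xF6, 8))
```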
Direct Addressing
A very simple form of addressing is direct addressing, in which the address field contains the
effective address of the operand: EA = A.
The technique was common in earlier generations of computers but is not common on
contemporary architectures. It requires only one memory reference and no special calculation.
The obvious limitation is that it provides only a limited address space.
Indirect Addressing
With direct addressing, the length of the address field is usually less than the word length, thus
limiting the address range. One solution is to have the address field refer to the address of a word
in memory, which in turn contains a full-length address of the operand. This is known as indirect
addressing: EA = (A).
The parentheses are to be interpreted as meaning contents of. The obvious advantage of this
approach is that for a word length of N, an address space of 2^N is now available. The
disadvantage is that instruction execution requires two memory references to fetch the operand:
one to get its address and a second to get its value. Although the number of words that can be
addressed is now equal to 2^N, the number of different effective addresses that may be referenced
at any one time is limited to 2^K, where K is the length of the address field.
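A minimal sketch of the two memory references required by indirect addressing, with memory modelled as a Python list; the addresses and values are invented for illustration.

```python
memory = [0] * 32
memory[5] = 20        # location 5 holds the full-length address of the operand
memory[20] = 99       # the operand itself

A = 5                 # address field of the instruction
EA = memory[A]        # first memory reference: fetch the operand's address
operand = memory[EA]  # second memory reference: fetch the operand
print(EA, operand)    # 20 99
```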
Typically, this is not a burdensome restriction, and it can be an asset. In a virtual memory
environment, all the effective address locations can be confined to page 0 of any process.
Because the address field of an instruction is small, it will naturally produce low-numbered
direct addresses, which would appear in page 0. (The only restriction is that the page size must
be greater than or equal to 2^K.) When a process is active, there will be repeated references to
page 0, causing it to remain in real memory. Thus, an indirect memory reference will involve, at
most, one page fault rather than two. A rarely used variant of indirect addressing is multilevel or
cascaded indirect addressing: EA = (…(A)…).
In this case, one bit of a full-word address is an indirect flag (I). If the I bit is 0, then the word
contains the EA. If the I bit is 1, then another level of indirection is invoked. There does not
appear to be any particular advantage to this approach, and its disadvantage is that three or more
memory references could be required to fetch an operand.
Register Addressing
Register addressing is similar to direct addressing. The only difference is that the address field
refers to a register rather than a main memory address: EA = R.
To clarify, if the contents of a register address field in an instruction is 5, then register R5 is the
intended address, and the operand value is contained in R5.
Typically, an address field that references registers will have from 3 to 5 bits, so that a total of
from 8 to 32 general-purpose registers can be referenced. The advantages of register addressing
are that (1) only a small address field is needed in the instruction, and (2) no time-consuming
memory references are required.
The memory access time for a register internal to the processor is much less than that for a main
memory address. The disadvantage of register addressing is that the address space is very
limited. If register addressing is heavily used in an instruction set, this implies that the processor
registers will be heavily used. Because of the severely limited number of registers (compared
with main memory locations), their use in this fashion makes sense only if they are employed
efficiently.
If every operand is brought into a register from main memory, operated on once, and then
returned to main memory, then a wasteful intermediate step has been added. If, instead, the
operand in a register remains in use for multiple operations, then a real savings is achieved. An
example is the intermediate result in a calculation. In particular, if the algorithm for twos
complement multiplication were implemented in software, the running partial product could be
kept in a register throughout the computation rather than written back to memory at each step.
Most modern processors employ multiple general-purpose registers, placing a burden for
efficient execution on the assembly-language programmer (e.g., compiler writer).
Register Indirect Addressing
Just as register addressing is analogous to direct addressing, register indirect addressing is
analogous to indirect addressing. In both cases, the only difference is whether the address field
refers to a memory location or a register. Thus, for register indirect addressing, EA = (R).
The advantages and limitations of register indirect addressing are basically the same as for
indirect addressing. In both cases, the address space limitation (limited range of addresses) of the
address field is overcome by having that field refer to a word-length location containing an
address. In addition, register indirect addressing uses one less memory reference than indirect
addressing.
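Register indirect addressing can be sketched the same way; only one memory reference remains, since the address itself comes from a register. The register name and values below are hypothetical.

```python
memory = {0x40: 55}          # operand stored at memory address 0x40
registers = {"R2": 0x40}     # register R2 holds the operand's address

EA = registers["R2"]         # no memory reference needed to form the EA
operand = memory[EA]         # a single memory reference fetches the operand
print(operand)               # 55
```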
Displacement Addressing
A very powerful mode of addressing combines the capabilities of direct addressing and register
indirect addressing. It is known by a variety of names depending on the context of its use, but the
basic mechanism is the same. We will refer to this as displacement addressing: EA = A + (R).
Displacement addressing requires that the instruction have two address fields, at least one of
which is explicit. The value contained in one address field (value = A) is used directly. The other
address field, or an implicit reference based on opcode, refers to a register whose contents are
added to A to produce the effective address.
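A sketch of the effective-address calculation EA = A + (R); the register name and values are assumed for illustration.

```python
registers = {"R1": 0x100}    # base value held in the referenced register
A = 0x20                     # displacement taken directly from the address field

EA = A + registers["R1"]     # displacement addressing: EA = A + (R)
print(hex(EA))               # 0x120
```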
Relative addressing: For relative addressing, also called PC-relative addressing, the implicitly
referenced register is the program counter (PC). That is, the next instruction address is added to
the address field to produce the EA. Typically, the address field is treated as a twos complement
number for this operation. Thus, the effective address is a displacement relative to the address of
the instruction. Relative addressing exploits the concept of locality. If most memory references
are relatively near to the instruction being executed, then the use of relative addressing saves
address bits in the instruction.
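The PC-relative calculation, with the address field treated as a twos complement number, can be sketched as below; the 8-bit field width and the addresses are assumptions.

```python
def to_signed(value, bits):
    """Interpret a `bits`-wide twos complement field as a signed integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

next_pc = 0x1004             # address of the next instruction
field = 0xF8                 # 8-bit displacement field: -8 in twos complement

EA = next_pc + to_signed(field, 8)   # displacement relative to the instruction
print(hex(EA))               # 0xffc
```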
Base-register addressing: For base-register addressing, the interpretation is the following: The
referenced register contains a main memory address, and the address field contains a
displacement (usually an unsigned integer representation) from that address. The register
reference may be explicit or implicit. Base-register addressing also exploits the locality of
memory references. It is a convenient means of implementing segmentation. In some
implementations, a single segment-base register is employed and is used implicitly.
In others, the programmer may choose a register to hold the base address of a segment, and the
instruction must reference it explicitly. In this latter case, if the length of the address field is K
and the number of possible registers is N, then one instruction can reference any one of N areas
of 2^K words.
Indexing: For indexing, the interpretation is typically the following: The address field references
a main memory address, and the referenced register contains a positive displacement from that
address. Note that this usage is just the opposite of the interpretation for base-register
addressing. Of course, it is more than just a matter of user interpretation. Because the address
field is considered to be a memory address in indexing, it generally contains more bits than an
address field in a comparable base-register instruction. Also, we will see that there are some
refinements to indexing that would not be as useful in the base-register context. Nevertheless,
the method of calculating the EA is the same for both base-register addressing and indexing, and
in both cases the register reference is sometimes explicit and sometimes implicit (for different
processor types).
In some machines, indexing can be combined with indirect addressing. There are then two
possibilities: the indexing is performed either before or after the indirection. If indexing is
performed after the indirection, it is termed post-indexing: EA = (A) + (R).
First, the contents of the address field are used to access a memory location containing a direct
address. This address is then indexed by the register value. This technique is useful for accessing
one of a number of blocks of data of a fixed format. For example, the operating system employs a
process control block for each process. The operations performed are the same regardless of
which block is being manipulated. Thus, the addresses in the instructions that reference the block
could point to a location (value = A) containing a variable pointer to the start of a process control
block. The index register contains the displacement within the block. With pre-indexing, the
indexing is performed before the indirection: EA = (A + (R)). An address is calculated as with
simple indexing. In this case, however, the calculated address contains not the operand, but the
address of the operand. An example of the use of this technique is to construct a multiway branch
table. At a particular point in a program, there may be a branch to one of a number of locations
depending on conditions. A table of addresses can be set up starting at location A. By indexing
into this table, the required location can be found. Typically, an instruction set will not include
both pre-indexing and post-indexing.
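The difference between post-indexing, EA = (A) + (R), and pre-indexing, EA = (A + (R)), can be seen side by side; the memory contents and register name below are invented for illustration.

```python
memory = {10: 100, 18: 200}  # invented memory contents
registers = {"X": 8}         # index register
A = 10                       # address field of the instruction

post_ea = memory[A] + registers["X"]   # indirection first, then index: 100 + 8
pre_ea = memory[A + registers["X"]]    # index first, then indirection: memory[18]
print(post_ea, pre_ea)                 # 108 200
```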
Stack Addressing
A stack is a linear array of locations. It is sometimes referred to as a pushdown list or
last-in-first-out queue. The stack is a reserved block of locations. Items are appended to the top of the
stack so that, at any given time, the block is partially filled. Associated with the stack is a pointer
whose value is the address of the top of the stack. Alternatively, the top two elements of the stack
may be in processor registers, in which case the stack pointer references the third element of the
stack. The stack pointer is maintained in a register. Thus, references to stack locations in memory
are in fact register indirect addresses. The stack mode of addressing is a form of implied
addressing. The machine instructions need not include a memory reference but implicitly operate
on the top of the stack.
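The implied addressing of a stack can be sketched with a stack pointer held as a "register" and a reserved block of memory; the downward growth direction and sizes here are assumptions.

```python
memory = [0] * 32
sp = len(memory)             # stack pointer register; stack grows downward

def push(value):
    global sp
    sp -= 1                  # register indirect reference through SP
    memory[sp] = value

def pop():
    global sp
    value = memory[sp]       # implicit operand: the top of the stack
    sp += 1
    return value

push(7)
push(9)
print(pop(), pop())          # 9 7
```

Note that the push and pop instructions name no memory address at all; the stack pointer supplies it implicitly, which is why stack addressing is a form of implied addressing.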