COA Chap 5 for Evening


Chapter 5

Basic Computer Organization and Design


Introduction
Computer technology has made incredible progress since the first general-purpose electronic
computer was created. Today, less than a thousand dollars will purchase a personal computer that
has more performance, more main memory, and more disk storage than a computer bought in
1980 for $1 million. This rapid rate of improvement has come both from advances in the
technology used to build computers and from innovation in computer design.
Computer designers have largely depended on integrated circuit technology to improve the
performance of computers. After the integrated circuit came the microprocessor, whose ability
to ride the improvements in integrated circuit technology has radically improved the
performance of computer systems.
Two significant changes in the computer marketplace made it easier than ever before to be
commercially successful with a new architecture.
 First, the virtual elimination of assembly language programming reduced the need for
object-code compatibility.
 Second, the creation of standardized, vendor-independent operating systems, such as
UNIX and its clone, Linux, lowered the cost and risk of bringing out a new architecture.
These changes made it possible to successfully develop a new set of architectures, called RISC
(Reduced Instruction Set Computer) architectures, in the early 1980s. The RISC-based machines
focused the attention of designers on two critical performance techniques:
 The use of instruction-level parallelism (initially through pipelining and later through
multiple instruction issue) and
 The use of caches (initially in simple forms and later using more sophisticated
organizations and optimizations).
The effect of this dramatic growth rate has been twofold.
 First, it has significantly enhanced the capability available to computer users. For many
applications, the highest-performance microprocessors of today outperform the
supercomputers of less than 10 years ago.
 Second, microprocessor-based computers have come to dominate the entire range of
computer design. Workstations and PCs have emerged as major products in the computer
industry. Minicomputers, which were traditionally made from off-the-shelf logic or from
gate arrays, have been replaced by servers made using microprocessors.
Mainframes have been almost completely replaced with multiprocessors consisting of
small numbers of off-the-shelf microprocessors. Even high-end supercomputers are being
built with collections of microprocessors.

The Task of a Computer Designer


The task the computer designer faces is a complex one:
 Determine what attributes are important for a new machine,
 Design a machine to maximize performance while staying within cost and power
constraints. This task has many aspects, including instruction set design, functional
organization, logic design, and implementation. The implementation may encompass
integrated circuit design, packaging, power, and cooling.
 Optimizing the design requires familiarity with a very wide range of technologies, from
compilers and operating systems to logic design and packaging.

The term instruction set architecture refers to the actual portion of the computer visible to the
programmer or compiler writer. The instruction set architecture serves as the boundary between
the software and hardware.

The implementation of a machine has two components: organization and hardware. The term
organization includes the high-level aspects of a computer’s design, such as the memory system,
the bus structure, and the design of the internal CPU (central processing unit where arithmetic,
logic, branching, and data transfer are implemented). For example, two processors with nearly
identical instruction set architectures but very different organizations are the Pentium III and
Pentium 4. Although the Pentium 4 has new instructions, these are all in the floating point
instruction set.
The term hardware is used to refer to the details of a machine, including the logic design and
the packaging technology of the machine.
Often a line of machines contains machines with identical instruction set architectures and
nearly identical organizations, but they differ in the detailed hardware implementation. For
example, the Pentium II and Celeron are nearly identical, but offer different clock rates and
different memory systems, making the Celeron more effective for low-end computers. The word
architecture covers all three aspects of computer design: instruction set architecture,
organization, and hardware.

5.1. Timing and Control
Timing refers to the way in which events are coordinated on the bus. A bus in computer
terminology represents a physical connection used to carry a signal from one point to another.
Obviously, depending on the signal carried, there exist at least four types of buses: address, data,
control, and power buses.
Data buses carry data, control buses carry control signals, and power buses carry the power-
supply/ground voltage. In addition to carrying control signals, a control bus can carry timing
signals. These are signals used to determine the exact timing for data transfer to and from a bus;
that is, they determine when a given computer system component, such as the processor,
memory, or I/O devices, can place data on the bus and when they can receive data from the bus.
Buses use either synchronous timing or asynchronous timing:
 A bus can be synchronous if data transfer over the bus is controlled by a bus clock. The
clock acts as the timing reference for all bus signals.
 A bus is asynchronous if data transfer over the bus is based on the availability of the
data and not on a clock signal. Data is transferred over an asynchronous bus using a
technique called handshaking.
The operations of synchronous and asynchronous buses are explained below. To understand the
difference between synchronous and asynchronous, consider the case when a master such as a
CPU or DMA is the source of data to be transferred to a slave such as an I/O device. The
following is a sequence of events involving the master and slave:
1. Master: send request to use the bus
2. Master: request is granted and bus is allocated to master
3. Master: place address/data on bus
4. Slave: slave is selected
5. Master: signal data transfer
6. Slave: take data
7. Master: free the bus
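The seven steps above can be sketched as a toy simulation (the class and method names below are illustrative, not a real bus API):

```python
# A toy model of the master/slave transfer sequence above.
# Class and method names are illustrative, not a real bus API.
class Bus:
    def __init__(self):
        self.owner = None   # which master currently holds the bus
        self.data = None    # value currently driven on the data lines

class Slave:
    def __init__(self):
        self.received = None
    def take(self, data):          # step 6: slave takes the data
        self.received = data

def transfer(bus, master_name, slave, data):
    assert bus.owner is None       # step 1: master requests the bus
    bus.owner = master_name        # step 2: request granted, bus allocated
    bus.data = data                # step 3: master places address/data on bus
    slave.take(bus.data)           # steps 4-6: slave selected, data transferred
    bus.owner = bus.data = None    # step 7: master frees the bus

bus, dev = Bus(), Slave()
transfer(bus, "cpu", dev, 0xAB)
print(hex(dev.received))  # 0xab
```

Note that steps 1 and 2 (arbitration) are collapsed into a single check here; a real bus arbiter would queue competing masters.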
Clocks are needed in sequential logic to decide when an element that contains state should be
updated. A clock is simply a free-running signal with a fixed cycle time; the clock frequency is
simply the inverse of the cycle time. The clock cycle time or clock period is divided into two
portions: when the clock is high and when the clock is low.
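The inverse relationship between period and frequency can be checked directly; the 2 ns period below is an assumed example value, not taken from the text:

```python
# Clock frequency is the inverse of the clock cycle time (period).
period_s = 2e-9             # assumed example: a 2 ns clock period
frequency_hz = 1 / period_s
print(frequency_hz)         # about 500 MHz
```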

Figure 5.1 A clock signal oscillates between high and low values.
The clock period is the time for one full cycle. In an edge-triggered design, either the rising or
falling edge of the clock is active and causes state to be changed. The bus includes a clock line
upon which a clock transmits a regular sequence of alternating 1s and 0s of equal duration. A
single 1–0 transmission is referred to as a clock cycle or bus cycle and defines a time slot. All
other devices on the bus can read the clock line, and all events start at the beginning of a clock
cycle.
Figure 5.2 shows a typical, but simplified, timing diagram for synchronous read and write
operations. Other bus signals may change at the leading edge of the clock signal (with a slight
reaction delay). Most events occupy a single clock cycle. In this example, the processor places a
memory address on the address lines during the first clock cycle and may assert various status
lines.

Figure 5.2 Timing of Synchronous Bus Operations


Once the address lines have stabilized, the processor issues an address enable signal. For a read
operation, the processor issues a read command at the start of the second cycle. A memory
module recognizes the address and, after a delay of one cycle, places the data on the data lines.
The processor reads the data from the data lines and drops the read signal. For a write operation,
the processor puts the data on the data lines at the start of the second cycle, and issues a write
command after the data lines have stabilized. The memory module copies the information from
the data lines during the third clock cycle.
On the other hand, with asynchronous timing, the occurrence of one event on a bus follows and
depends on the occurrence of a previous event. In the simple read example of Figure 5.3a, the
processor places address and status signals on the bus.

After pausing for these signals to stabilize, it issues a read command, indicating the presence of
valid address and control signals. The appropriate memory decodes the address and responds by
placing the data on the data line. Once the data lines have stabilized, the memory module asserts
the acknowledged line to signal the processor that the data are available. Once the master has
read the data from the data lines, it deasserts the read signal. This causes the memory module to
drop the data and acknowledge lines. Finally, once the acknowledge line is dropped; the master
removes the address information.

Figure 5.3 (a) System bus read cycle

Figure 5.3 (b) System bus write cycle


Figure 5.3 Timing of Asynchronous Bus Operations
5.5. Memory reference instructions
The main concern of memory reference instructions is how to handle the addressing modes. The
address field or fields in a typical instruction format are relatively small. However, instructions
may need to be able to reference a large range of locations in main memory or, for some
systems, virtual memory. To achieve this objective, a variety of addressing techniques has been
employed.
The most common addressing techniques or modes are:

 Immediate
 Direct
 Indirect
 Register
 Register indirect
 Displacement
 Stack

These modes are illustrated in Figure 5.4. Notations used:
A = contents of an address field in the instruction
R = contents of an address field in the instruction that refers to a register
EA = actual (effective) address of the location containing the referenced operand
(X) = contents of memory location X or register X

Figure 5.4a Immediate, direct and indirect address modes

Figure 5.4b Register, register indirect, displacement and stack addressing modes.
Note two issues here. First, virtually all computer architectures provide more than one of these
addressing modes. The question arises as to how the processor can determine which address
mode is being used in a particular instruction.

Several approaches are taken. Often, different opcodes will use different addressing modes. Also,
one or more bits in the instruction format can be used as a mode field. The value of the mode
field determines which addressing mode is to be used.

The second issue concerns the interpretation of the effective address (EA). In a system without
virtual memory, the effective address will be either a main memory address or a register. In a
virtual memory system, the effective address is a virtual address or a register. The actual
mapping to a physical address is a function of the memory management unit (MMU) and is
invisible to the programmer.

Table 5.1 indicates the address calculation performed for each addressing mode.

Mode               Algorithm          Advantage             Disadvantage
Immediate          Operand = A        No memory reference   Limited operand magnitude
Direct             EA = A             Simple                Limited address space
Indirect           EA = (A)           Large address space   Multiple memory references
Register           EA = R             No memory reference   Limited address space
Register Indirect  EA = (R)           Large address space   Extra memory reference
Displacement       EA = A + (R)       Flexibility           Complexity
Stack              EA = top of stack  No memory reference   Limited applicability

Table 5.1 Basic Addressing Modes
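The algorithms in Table 5.1 can be sketched in a few lines. The memory contents, register values, and function name below are made-up illustrations; immediate and register modes are omitted because their "address" is the operand itself or a register number, not a memory location:

```python
# Sketch of the EA algorithms in Table 5.1 (illustrative values throughout).
memory = [0] * 32   # toy main memory
regs = [0] * 8      # toy register file

def effective_address(mode, A=None, R=None, stack_ptr=None):
    if mode == "direct":            return A            # EA = A
    if mode == "indirect":          return memory[A]    # EA = (A)
    if mode == "register_indirect": return regs[R]      # EA = (R)
    if mode == "displacement":      return A + regs[R]  # EA = A + (R)
    if mode == "stack":             return stack_ptr    # EA = top of stack
    raise ValueError(mode)

memory[5] = 20   # location 5 holds a full-length operand address
regs[2] = 7
print(effective_address("direct", A=5))              # 5
print(effective_address("indirect", A=5))            # 20
print(effective_address("displacement", A=10, R=2))  # 17
```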

Immediate Addressing
The simplest form of addressing is immediate addressing, in which the operand value is present
in the instruction, Operand = A.
This mode can be used to define and use constants or set initial values of variables. Typically,
the number will be stored in twos complement form; the leftmost bit of the operand field is used
as a sign bit. When the operand is loaded into a data register, the sign bit is extended to the left to
the full data word size. In some cases, the immediate binary value is interpreted as an unsigned
nonnegative integer. The advantage of immediate addressing is that no memory reference other
than the instruction fetch is required to obtain the operand, thus saving one memory or cache
cycle in the instruction cycle. The disadvantage is that the size of the number is restricted to the
size of the address field, which, in most instruction sets, is small compared with the word length.
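The sign extension described above can be sketched as follows; the 4-bit field and 16-bit word are assumed example sizes:

```python
# Sign-extend a twos-complement immediate field to the full data word size.
# Field width and word width are assumed example sizes.
def sign_extend(value, field_bits, word_bits=16):
    if value & (1 << (field_bits - 1)):      # leftmost bit of the field is the sign bit
        value -= 1 << field_bits             # interpret the field as negative
    return value & ((1 << word_bits) - 1)    # wrap into an unsigned word pattern

print(hex(sign_extend(0b1111, 4)))  # 4-bit -1 extends to 0xffff
print(hex(sign_extend(0b0111, 4)))  # 4-bit +7 stays 0x7
```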

Direct Addressing
A very simple form of addressing is direct addressing, in which the address field contains the
effective address of the operand: EA = A.

The technique was common in earlier generations of computers but is not common on
contemporary architectures. It requires only one memory reference and no special calculation.
The obvious limitation is that it provides only a limited address space.

Indirect Addressing
With direct addressing, the length of the address field is usually less than the word length, thus
limiting the address range. One solution is to have the address field refer to the address of a word
in memory, which in turn contains a full-length address of the operand. This is known as indirect
addressing: EA = (A).
The parentheses are to be interpreted as meaning contents of. The obvious advantage of this
approach is that for a word length of N, an address space of 2^N is now available. The
disadvantage is that instruction execution requires two memory references to fetch the operand:
one to get its address and a second to get its value. Although the number of words that can be
addressed is now equal to 2^N, the number of different effective addresses that may be referenced
at any one time is limited to 2^K, where K is the length of the address field.
Typically, this is not a burdensome restriction, and it can be an asset. In a virtual memory
environment, all the effective address locations can be confined to page 0 of any process.
Because the address field of an instruction is small, it will naturally produce low-numbered
direct addresses, which would appear in page 0. (The only restriction is that the page size must
be greater than or equal to 2^K.) When a process is active, there will be repeated references to
page 0, causing it to remain in real memory. Thus, an indirect memory reference will involve, at
most, one page fault rather than two. A rarely used variant of indirect addressing is multilevel or
cascaded indirect addressing: EA = (…(A)…).

In this case, one bit of a full-word address is an indirect flag (I). If the I bit is 0, then the word
contains the EA. If the I bit is 1, then another level of indirection is invoked. There does not
appear to be any particular advantage to this approach, and its disadvantage is that three or more
memory references could be required to fetch an operand.

Register Addressing
Register addressing is similar to direct addressing. The only difference is that the address field
refers to a register rather than a main memory address: EA = R.
To clarify, if the contents of a register address field in an instruction is 5, then register R5 is the
intended address, and the operand value is contained in R5.
Typically, an address field that references registers will have from 3 to 5 bits, so that a total of
from 8 to 32 general-purpose registers can be referenced. The advantages of register addressing
are that (1) only a small address field is needed in the instruction, and (2) no time-consuming
memory references are required.

The memory access time for a register internal to the processor is much less than that for a main
memory address. The disadvantage of register addressing is that the address space is very
limited. If register addressing is heavily used in an instruction set, this implies that the processor
registers will be heavily used. Because of the severely limited number of registers (compared
with main memory locations), their use in this fashion makes sense only if they are employed
efficiently.

If every operand is brought into a register from main memory, operated on once, and then
returned to main memory, then a wasteful intermediate step has been added. If, instead, the
operand in a register remains in use for multiple operations, then a real savings is achieved. An
example is the intermediate result in a calculation. In particular, suppose that the algorithm for
twos complement multiplication were to be implemented in software.

Most modern processors employ multiple general-purpose registers, placing a burden for
efficient execution on the assembly-language programmer (e.g., compiler writer).

Register Indirect Addressing
Just as register addressing is analogous to direct addressing, register indirect addressing is
analogous to indirect addressing. In both cases, the only difference is whether the address field
refers to a memory location or a register. Thus, for register indirect addressing, EA = (R).

The advantages and limitations of register indirect addressing are basically the same as for
indirect addressing. In both cases, the address space limitation (limited range of addresses) of the
address field is overcome by having that field refer to a word-length location containing an
address. In addition, register indirect addressing uses one less memory reference than indirect
addressing.

Displacement Addressing
A very powerful mode of addressing combines the capabilities of direct addressing and register
indirect addressing. It is known by a variety of names depending on the context of its use, but the
basic mechanism is the same. We will refer to this as displacement addressing: EA = A + (R).
Displacement addressing requires that the instruction have two address fields, at least one of
which is explicit. The value contained in one address field (value = A) is used directly. The other
address field, or an implicit reference based on opcode, refers to a register whose contents are
added to A to produce the effective address.

We will describe three of the most common uses of displacement addressing:


 Relative addressing
 Base-register addressing
 Indexing

Relative addressing: For relative addressing, also called PC-relative addressing, the implicitly
referenced register is the program counter (PC). That is, the next instruction address is added to
the address field to produce the EA. Typically, the address field is treated as a twos complement
number for this operation. Thus, the effective address is a displacement relative to the address of
the instruction. Relative addressing exploits the concept of locality. If most memory references
are relatively near to the instruction being executed, then the use of relative addressing saves
address bits in the instruction.
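A minimal sketch of the PC-relative calculation, assuming an 8-bit displacement field (an illustrative size):

```python
# PC-relative EA: the next instruction address plus a twos-complement displacement.
# The 8-bit field width is an assumed example size.
def relative_ea(pc, disp_field, field_bits=8):
    disp = disp_field
    if disp & (1 << (field_bits - 1)):   # field is a twos-complement number
        disp -= 1 << field_bits          # negative displacement: branch backward
    return pc + disp

print(relative_ea(100, 0b11111100))  # 0xFC is -4, so EA = 96
print(relative_ea(100, 10))          # +10, so EA = 110
```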

Base-register addressing: For base-register addressing, the interpretation is the following: The
referenced register contains a main memory address, and the address field contains a
displacement (usually an unsigned integer representation) from that address. The register
reference may be explicit or implicit. Base-register addressing also exploits the locality of
memory references. It is a convenient means of implementing segmentation. In some
implementations, a single segment-base register is employed and is used implicitly.
In others, the programmer may choose a register to hold the base address of a segment, and the
instruction must reference it explicitly. In this latter case, if the length of the address field is K
and the number of possible registers is N, then one instruction can reference any one of N areas
of 2^K words.
Indexing: For indexing, the interpretation is typically the following: The address field references
a main memory address, and the referenced register contains a positive displacement from that
address. Note that this usage is just the opposite of the interpretation for base-register
addressing. Of course, it is more than just a matter of user interpretation. Because the address
field is considered to be a memory address in indexing, it generally contains more bits than an
address field in a comparable base-register instruction. Also, we will see that there are some
refinements to indexing that would not be as useful in the base-register context. Nevertheless,
the method of calculating the EA is the same for both base-register addressing and indexing, and
in both cases the register reference is sometimes explicit and sometimes implicit (for different
processor types).

An important use of indexing is to provide an efficient mechanism for performing iterative
operations. Consider, for example, a list of numbers stored starting at location A. Suppose that
we would like to add 1 to each element on the list. We need to fetch each value, add 1 to it, and
store it back. The sequence of effective addresses that we need is A, A + 1, A + 2 … up to the
last location on the list. With indexing, this is easily done. The value A is stored in the
instruction’s address field, and the chosen register, called an index register, is initialized to 0.
After each operation, the index register is incremented by 1.
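The iteration described above can be sketched as follows; the memory layout and list length are made-up illustrations:

```python
# Indexed addressing used for iteration: EA = A + (index register).
# Add 1 to each element of a 3-item list starting at location A = 4.
memory = [0] * 4 + [10, 20, 30]   # toy memory; list occupies locations 4..6
A = 4                             # address field of the instruction
index = 0                         # index register, initialized to 0
for _ in range(3):
    ea = A + index                # EA = A + (R)
    memory[ea] += 1               # fetch the value, add 1, store it back
    index += 1                    # auto-indexing: (R) <- (R) + 1
print(memory[4:7])  # [11, 21, 31]
```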
Because index registers are commonly used for such iterative tasks, it is typical that there is a
need to increment or decrement the index register after each reference to it. Because this is such
a common operation, some systems will automatically do this as part of the same instruction
cycle. This is known as auto-indexing. If certain registers are devoted exclusively to indexing,
then auto-indexing can be invoked implicitly and automatically. If general-purpose registers are
used, the auto-index operation may need to be signaled by a bit in the instruction. Auto-indexing
using increment can be depicted as follows: EA = A + (R); (R) ← (R) + 1. In some machines,
both indirect addressing and indexing are provided, and it is possible to employ both in the same
instruction.

There are two possibilities: the indexing is performed either before or after the indirection. If
indexing is performed after the indirection, it is termed post-indexing: EA = (A) + (R).
First, the contents of the address field are used to access a memory location containing a direct
address. This address is then indexed by the register value. This technique is useful for accessing
one of a number of blocks of data of a fixed format. The operating system needs to employ a
process control block for each process. The operations performed are the same regardless of
which block is being manipulated. Thus, the addresses in the instructions that reference the block
could point to a location (value = A) containing a variable pointer to the start of a process control
block. The index register contains the displacement within the block. With pre-indexing, the
indexing is performed before the indirection: EA = (A + (R)). An address is calculated as with
simple indexing. In this case, however, the calculated address contains not the operand, but the
address of the operand. An example of the use of this technique is to construct a multiway branch
table. At a particular point in a program, there may be a branch to one of a number of locations
depending on conditions. A table of addresses can be set up starting at location A. By indexing
into this table, the required location can be found. Typically, an instruction set will not include
both pre-indexing and post-indexing.
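The two orderings can be contrasted in a short sketch; the addresses and memory contents below are made-up illustrations:

```python
# Post-indexing EA = (A) + (R) versus pre-indexing EA = (A + (R)).
# All addresses and contents are illustrative.
memory = [0] * 64
R = 3                          # index register value

# Post-indexing: A points at a pointer to a block; R indexes within the block.
memory[10] = 40                # location 10 holds the block's base address
post_ea = memory[10] + R       # EA = (A) + (R) = 40 + 3

# Pre-indexing: index into a table of addresses starting at A (multiway branch).
memory[20 + R] = 55            # table entry at A + (R) holds the target address
pre_ea = memory[20 + R]        # EA = (A + (R)) = (23)

print(post_ea, pre_ea)  # 43 55
```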

Stack Addressing
A stack is a linear array of locations. It is sometimes referred to as a pushdown list or last-in-
first-out queue. The stack is a reserved block of locations. Items are appended to the top of the
stack so that, at any given time, the block is partially filled. Associated with the stack is a pointer
whose value is the address of the top of the stack. Alternatively, the top two elements of the stack
may be in processor registers, in which case the stack pointer references the third element of the
stack. The stack pointer is maintained in a register. Thus, references to stack locations in memory
are in fact register indirect addresses. The stack mode of addressing is a form of implied
addressing. The machine instructions need not include a memory reference but implicitly operate
on the top of the stack.
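A stack with a memory-resident body and a stack-pointer register can be sketched as follows; the block size and downward growth direction are assumed for illustration:

```python
# Stack addressing sketch: the stack pointer (a register) holds the address of
# the top of stack, so stack references are effectively register indirect.
memory = [0] * 16
sp = 16                    # empty stack: pointer just past the reserved block

def push(value):
    global sp
    sp -= 1                # stack grows downward in this sketch
    memory[sp] = value     # store at the new top of stack, EA = (SP)

def pop():
    global sp
    value = memory[sp]     # read the top of stack, EA = (SP)
    sp += 1
    return value

push(7); push(9)
print(pop(), pop())  # 9 7  (last in, first out)
```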
