Cpu Design

CHAPTER III
REGISTER LEVEL DESIGN –

A (RELATIVELY) SIMPLE PROCESSOR DESIGN
Section 1. Introduction
In Chapter 1 we studied circuit design at what is commonly referred to as the digital logic (or gate) level. Here the
fundamental unit for processing is a binary, digital, electronic signal, and the basic components of circuits are gates.
Connections among gates are made with individual conducting lines.
As part of our study of digital logic circuits, however, we designed circuits capable of accepting multi-bit inputs,
producing multi-bit output, and undergoing multi-bit changes of state. Among these circuits are multi-bit
gates, multiplexers, decoders, bit-sliced ALUs, various forms of registers (parallel in/out, counters, shift registers),
etc. These circuits are among the fundamental components for a level of circuit design immediately above the digital
logic level in a digital system design hierarchy. In recognition of the omnipresence of registers at this level, it is
commonly referred to as the register level (or register-transfer level).
In Chapter 2 we introduced one register-level structure when we introduced main memory as an integrated
collection of registers (which we call words or bytes when they are used in reference to memory). In this chapter we
take on another register-level structure when we introduce important concepts behind the design of central
processing units (CPU's), and incorporate these concepts into the design of a simple processor.
Recall that the primary function of a CPU is to execute programs expressed in the processor's own machine
language. During their execution programs and their accompanying data are stored wholly or in part in a main
memory M which lies outside the CPU. To actually execute the program the CPU must perform the following
actions:
1. Determine the address in M of the first (or next as is appropriate) machine language instruction I.
2. Fetch I from M by performing one or more memory read operations.
3. Decode I to determine the operation(s) to be performed.
4. If necessary fetch any operands required by I that are stored in M; again this will require one or more
memory read operations.
5. Perform the operations specified by I.
6. If required by I, store any results in M. This will require one or more memory write operations.
These steps comprise what is known as the instruction cycle or fetch-execute cycle. Steps 4, 5, and 6 together
constitute the execute phase of the instruction cycle.
During normal execution of a program the CPU repeatedly goes through the instruction cycle. The circuitry within
the CPU to implement this process consists of:
1. An appropriately sized ALU.
2. A variety of registers for the temporary storage of addresses, instructions and data.
3. Control circuitry to properly sequence the transfers of data among the CPU's alu and internal registers that
are needed to implement the steps of the instruction cycle for each machine language instruction.
The organization used to connect the registers and the ALU is often referred to as the data path of the CPU. On the
following page we show the data paths for some simple, hypothetical, CPU's. In section 2 we shall describe the data
path for our own hypothetical CPU (which we shall call the Relatively Simple CPU, or simply RSCPU) and use this
CPU as a vehicle for introducing processor organization principles.
CSCI 350 - Fall, 2004 Chapter 3 Page - 1

Section 2. Data Path for a Hypothetical CPU
Shown below is a variation of the diagram on page 246 of your textbook that represents the data path for the
Relatively Simple CPU that we shall design for this course. For simplicity here, we have omitted the control signals
for transferring data between the bus and the registers and for controlling the alu. We will describe these in detail
later, however.
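[Figure: data path of the RSCPU. Register labels: DR, AR, PC, IR, TR, AC. The registers and the alu (inputs X and Y,
output W) are organized around a 16-bit data bus, shown as two sets of 8-bit lines. DR connects through 8 bits to the
data lines of memory, and AR connects through 16 bits to the address lines of memory.]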
We note that, similar to some of the organizations on the previous page, the data path for RSCPU uses a single 16-bit
data bus, which we show represented as two sets of 8-bit data lines. The low order bits of the bus are
shown on the right and the high order bits are shown on the left. The data path also includes the following
components:
1. a 16-bit address register (AR) to address words in main memory. The outputs of the address register connect it
to the address lines of the system bus connecting the CPU and memory. We also assume that AR has a control
line that, when activated (high) increments its current value by 1. We denote this operation by AR++.
2. a 16-bit program counter (PC) that contains the address of the next instruction to be executed (not the current
instruction), or the address of the next required operand of the current instruction. We also assume that PC has
a control line that, when activated (high) increments the current value of the PC by 1. We denote this operation
by PC++.
3. an 8-bit data register (DR) that serves as a data interface between the CPU and memory. It has separate data
lines to connect it to those of memory and separate control lines for interacting with memory. Note, in our data
path the output of DR is connected to the low order lines of the bus and is also connected directly to the input
lines of the IR and TR registers described below. Finally, we include circuitry (not shown, but essentially tri-
state buffers) to allow the output of DR to also be placed on the high-order lines of the bus.

4. An 8-bit ALU capable of performing various operations as shown in the table below. Here we assume that X and
Y are the operand inputs to the ALU and W represents the output. The operation of the ALU will be governed by
the following seven control signals (ALU1,…,ALU7) according to the following table (here D = “don’t care”, and
a prime, as in Y', denotes the bitwise complement):

ALU7  ALU6  ALU5  ALU4  ALU3  ALU2  ALU1   Function
 0     D     D     0     0     0     0     W = 0
 0     D     D     0     0     0     1     W = X
 0     D     D     0     0     1     0     W = Y
 0     D     D     0     0     1     1     W = X + Y
 0     D     D     0     1     0     0     W = Y'
 0     D     D     0     1     0     1     W = X + Y'
 0     D     D     1     0     0     0     W = 1
 0     D     D     1     0     0     1     W = X + 1
 0     D     D     1     0     1     0     W = Y + 1
 0     D     D     1     0     1     1     W = X + Y + 1
 0     D     D     1     1     0     0     W = Y' + 1
 0     D     D     1     1     0     1     W = X − Y (= X + Y' + 1)
 1     0     0     D     D     D     D     W = X ∧ Y
 1     0     1     D     D     D     D     W = X ∨ Y
 1     1     0     D     D     D     D     W = X ⊕ Y
 1     1     1     D     D     D     D     W = X'
On page 248 there is a diagram for implementing this ALU, where we represent the output of the bottommost
multiplexer by W and where X ↔ AC and Y ↔ BUS.
5. an 8-bit accumulator register (AC) that receives the results of any arithmetic or logical operation and provides
the X operand for appropriate binary arithmetic or logical operations of the ALU. It is also the source
(destination) of any programmer-initiated data transfers to (from) memory. Note that while the output of AC is
routed to the data bus and the X input of the alu, it only receives its input from the alu.
6. an 8-bit general-purpose register (R) that supplies the Y operand for appropriate binary arithmetic or logical
operations that the ALU performs. It can also be used by a programmer to store data. It is capable of
bidirectional data transfers with the (low order lines of the) data bus.
7. an 8-bit instruction register (IR) which contains a copy of the op-code of the current instruction. Note, IR is not
connected to the data bus. It receives its input from the output lines of DR, and as we shall see later, its output
is directed elsewhere.
8. an 8-bit temporary register (TR) which temporarily stores data during instruction execution. Note, the output
lines of TR go to the data path’s bus, but TR receives its input from the output lines of DR.
9. a 1-bit flag register Z that is set to 1 if the last arithmetic or logical operation produced a result equal to 0, and to 0 otherwise.
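The ALU control table in item 4 can be read as an adder with selectable inputs plus a separate logic unit. The following is a minimal Python sketch of that reading, not the circuit from the textbook. It assumes that ALU1 selects X or 0 as the first addend, that ALU2 and ALU3 select Y or its complement as the second addend, that ALU4 is the carry-in, and that subtraction is therefore X + Y' + 1 with ALU4 = ALU3 = ALU1 = 1. The function names alu and z_flag are hypothetical.

```python
MASK8 = 0xFF  # the ALU is 8 bits wide


def alu(x, y, alu7=0, alu6=0, alu5=0, alu4=0, alu3=0, alu2=0, alu1=0):
    """Return the 8-bit output W for operand inputs X and Y."""
    x &= MASK8
    y &= MASK8
    if alu7:
        # Logic unit selected; ALU6 and ALU5 choose the operation.
        logic = {
            (0, 0): x & y,   # W = X AND Y
            (0, 1): x | y,   # W = X OR Y
            (1, 0): x ^ y,   # W = X XOR Y
            (1, 1): ~x,      # W = complement of X
        }
        return logic[(alu6, alu5)] & MASK8
    # Arithmetic unit (assumed structure): W = first + second + carry-in.
    first = x if alu1 else 0
    if alu3:
        # Assumption: ALU3 wins if ALU3 = ALU2 = 1, a case the table omits.
        second = ~y & MASK8
    elif alu2:
        second = y
    else:
        second = 0
    return (first + second + alu4) & MASK8


def z_flag(w):
    """Value loaded into Z: 1 if the ALU output is all 0s, else 0."""
    return 1 if w == 0 else 0
```

For example, alu(5, 3, alu4=1, alu3=1, alu1=1) computes 5 − 3 as 5 + 3' + 1, and the result wraps modulo 256 as an 8-bit adder would.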

We are going to assume that RSCPU will implement the following machine language instruction set:

Instruction Set for Relatively Simple CPU (RSCPU)

Op-code (binary)  Mnemonic  Meaning                                               Notes
0000 0000         NOP       No operation
0000 0001         LDAC Γ    AC ← M[Γ]                                             Γ is 16 bits (see note)
0000 0010         STAC Γ    M[Γ] ← AC                                             Γ is 16 bits
0000 0011         MVAC      R ← AC
0000 0100         MOVR      AC ← R
0000 0101         JUMP Γ    Go To Γ (or PC ← Γ)                                   Γ is 16 bits
0000 0110         JMPZ Γ    IF (Z = 1) THEN Go To Γ                               Γ is 16 bits
0000 0111         JPNZ Γ    IF (Z = 0) THEN Go To Γ                               Γ is 16 bits
0000 1000         ADD       AC ← AC + R, IF (AC + R = 0) THEN Z = 1 ELSE Z = 0
0000 1001         SUB       AC ← AC − R, IF (AC − R = 0) THEN Z = 1 ELSE Z = 0
0000 1010         INAC      AC ← AC + 1, IF (AC + 1 = 0) THEN Z = 1 ELSE Z = 0
0000 1011         CLAC      AC ← 0 and Z ← 1
0000 1100         AND       AC ← AC ∧ R, IF (AC ∧ R = 0) THEN Z = 1 ELSE Z = 0
0000 1101         OR        AC ← AC ∨ R, IF (AC ∨ R = 0) THEN Z = 1 ELSE Z = 0
0000 1110         XOR       AC ← AC ⊕ R, IF (AC ⊕ R = 0) THEN Z = 1 ELSE Z = 0
0000 1111         NOT       AC ← AC', IF (AC' = 0) THEN Z = 1 ELSE Z = 0

Note for LDAC: assume the 8 low-order bits of Γ are in the word immediately after the op-code, and the 8 high-order
bits are in the word after the low-order bits. In the NOT row, AC' denotes the bitwise complement of AC.
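To make the meaning column concrete, the following is a minimal instruction-level interpreter sketch in Python. It models only the programmer-visible state (AC, R, Z, PC and memory), not the data path. It assumes that every 16-bit Γ is stored low byte first, as the note for LDAC describes. Since the instruction set has no halt instruction, the hypothetical function run simply executes a caller-supplied number of instructions.

```python
OPCODES = {
    0x00: "NOP", 0x01: "LDAC", 0x02: "STAC", 0x03: "MVAC",
    0x04: "MOVR", 0x05: "JUMP", 0x06: "JMPZ", 0x07: "JPNZ",
    0x08: "ADD", 0x09: "SUB", 0x0A: "INAC", 0x0B: "CLAC",
    0x0C: "AND", 0x0D: "OR", 0x0E: "XOR", 0x0F: "NOT",
}
HAS_ADDRESS = {"LDAC", "STAC", "JUMP", "JMPZ", "JPNZ"}


def run(memory, steps, pc=0):
    """Execute `steps` instructions from `memory`; return (AC, R, Z, PC)."""
    ac = r = z = 0
    gamma = 0
    for _ in range(steps):
        op = OPCODES[memory[pc]]
        pc = (pc + 1) & 0xFFFF
        if op in HAS_ADDRESS:
            # Assumed layout: low-order byte first, then high-order byte.
            gamma = memory[pc] | (memory[(pc + 1) & 0xFFFF] << 8)
            pc = (pc + 2) & 0xFFFF
        if op == "LDAC":
            ac = memory[gamma]
        elif op == "STAC":
            memory[gamma] = ac
        elif op == "MVAC":
            r = ac
        elif op == "MOVR":
            ac = r
        elif op == "JUMP" or (op == "JMPZ" and z == 1) or (op == "JPNZ" and z == 0):
            pc = gamma
        elif op in ("JMPZ", "JPNZ", "NOP"):
            pass  # branch not taken, or no operation
        else:
            # Arithmetic and logical instructions update AC and then Z.
            results = {
                "ADD": ac + r, "SUB": ac - r, "INAC": ac + 1, "CLAC": 0,
                "AND": ac & r, "OR": ac | r, "XOR": ac ^ r, "NOT": ~ac,
            }
            ac = results[op] & 0xFF
            z = 1 if ac == 0 else 0
    return ac, r, z, pc


# Example program: M[0x0022] <- M[0x0020] + M[0x0021]
memory = bytearray(65536)
memory[0:11] = bytes([
    0x01, 0x20, 0x00,  # LDAC 0x0020
    0x03,              # MVAC
    0x01, 0x21, 0x00,  # LDAC 0x0021
    0x08,              # ADD
    0x02, 0x22, 0x00,  # STAC 0x0022
])
memory[0x20], memory[0x21] = 3, 4
result = run(memory, steps=5)
```

After the five instructions, the sum 7 is in AC and in M[0x0022], and PC points at address 11, the word after the STAC operand.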
Data Path Timing and Register Transfers for RSCPU
In this subsection we develop the data transfers that will be needed to implement each of the above instructions using
the given data path. We assume that these data transfers will be coordinated by a clock pulse being transmitted
throughout the data path. It is important to know, however, just how much can be done in our data path in a single
clock cycle. This is shown below. What is most significant for us at this time about this clock cycle is that it is
sufficiently long to allow us to transfer data between registers via the data bus in one clock cycle. We also assume
the clock cycle is sufficiently long to allow a memory operation to complete.
[Figure: timing within one data path clock cycle. Labels, left to right: activate appropriate control signals for data
path or memory; data goes on data bus; data stabilizes on the data bus, alu output, and memory lines (ALU ops
complete); registers loaded from bus or memory.]

We now complete this section by showing the sequences of register transfers needed to implement each of the
RSCPU instructions. In showing these register transfers we adopt the following notational conventions:
1. D ← S
Transfer the content of source register S to destination register D. If either D or S is a word in memory at
address Γ, we use the notation M[Γ] to denote this.
2. D1 ← S1, D2 ← S2
Perform the indicated transfers simultaneously.
3. D ← constant
Transfer the indicated value to destination register D.
4. D ← f(S1,...,Sn)
The function f is performed using the values of sources S1 ,..., Sn as operands and its value is transferred to
register D. If f is a binary operator (i.e. n = 2) we use infix notation rather than prefix notation.
5. IF (α) THEN register-transfer(s) ELSE register-transfer(s)
Perform the transfer(s) after THEN if control value α is high (=1); otherwise do the transfer(s) after ELSE.
6. α: D1 ← S1 (, D2 ← S2, …)
Perform the given transfer(s) if control value α is high (=1).
Examples of register transfers implementing the fetch phase and some instructions for RSCPU: We now give
examples of the register transfers needed to implement the instruction cycle for our CPU. In doing so we note the
following:
• If a memory read is indicated (by control signal read) the read operation starts at the end of the data path
cycle using the value in AR. Our discussion of the data path timing cycle showed that it is possible for AR
to attain a new value before the end of the cycle, however, meaning it is possible to change the value of AR
and initiate a read operation with this new value at the beginning of the next cycle.
• Continuing our discussion of memory reads, we shall assume that memory completes its operation within
one cycle. This means that following the initiation of a memory read, we can assume the data being read is
available in the DR at the end of the current timing cycle, but not at the beginning of this cycle.
Consequently we must wait until the next cycle before it can be used.
Examples:
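As an illustration, one plausible register-transfer sequence for the fetch phase and for LDAC can be simulated in Python. The sequences below are assumptions of this sketch rather than the author's worked examples. They were chosen to fit the data path described above (TR and IR load from DR, and DR can drive the high-order bus lines) and the three-step fetch and at most five-step execute budget mentioned later. The names apply, FETCH and LDAC are hypothetical, and each memory read is modeled as completing within the cycle that starts it.

```python
def apply(state, transfers):
    """Perform the given register transfers simultaneously (one clock cycle)."""
    old = dict(state)  # every source is read before any destination changes
    for dest, source in transfers.items():
        state[dest] = source(old)
    return state


def inc16(value):
    return (value + 1) & 0xFFFF


def mem_read(s):
    return s["M"][s["AR"]]


# Assumed fetch phase (3 steps).
FETCH = [
    {"AR": lambda s: s["PC"]},                           # AR <- PC
    {"DR": mem_read, "PC": lambda s: inc16(s["PC"])},    # DR <- M, PC <- PC + 1
    {"IR": lambda s: s["DR"], "AR": lambda s: s["PC"]},  # IR <- DR, AR <- PC
]

# Assumed execute phase for LDAC (5 steps).
LDAC = [
    {"DR": mem_read, "PC": lambda s: inc16(s["PC"]),
     "AR": lambda s: inc16(s["AR"])},                    # low byte of the address
    {"TR": lambda s: s["DR"], "DR": mem_read,
     "PC": lambda s: inc16(s["PC"])},                    # high byte of the address
    {"AR": lambda s: (s["DR"] << 8) | s["TR"]},          # AR <- DR,TR
    {"DR": mem_read},                                    # DR <- M[address]
    {"AC": lambda s: s["DR"]},                           # AC <- DR
]

memory = bytearray(65536)
memory[0:3] = bytes([0x01, 0x34, 0x12])  # LDAC 0x1234, low byte first
memory[0x1234] = 0x5A
cpu = {"AR": 0, "PC": 0, "DR": 0, "TR": 0, "IR": 0, "AC": 0, "M": memory}
for transfers in FETCH + LDAC:
    apply(cpu, transfers)
```

Taking a snapshot of the old state before each step is what makes transfers such as TR ← DR, DR ← M behave as simultaneous: TR receives the value DR held at the start of the cycle.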

Section 3. Register Gating and Sequencing of Register Transfers
From our discussion of registers and busses, we know that control signals are necessary to perform data transfers
between the various registers connected to a common bus. Similar signals must be incorporated into our data path.
We must also supply input signals to the ALU to have it perform whatever operations it must. Consequently we
shall require the following control signals (the understanding here is that the associated activity takes place when the
control signal has the value 1):
arload, pcload, drload, rload
    To control transfers from the data bus to the register with the same name. In the cases of AR and PC the
    values on all 16 bus lines are used; in the case of DR and R only the low-order lines are used.
pcbus, drhbus, drlbus, trbus, rbus, acbus
    To control transfers to the data bus from the register with the same name. Note, drhbus transfers the value in
    DR to the high-order lines of the data bus, while drlbus transfers the value in DR to the low-order lines. All
    other transfers are to the low-order lines.
drmem, memdr
    To control the transfers between DR and memory (drmem is for writing to memory, memdr is for loading DR
    with a memory value).
arinc, pcinc
    To increment the values of the AR and PC registers.
trload, irload
    To transfer the value of DR to the TR or IR registers.
acload, zload
    In the case of acload, to load AC with the output of the ALU; in the case of zload, to give the flip-flop Z the
    value 1 if the output of the ALU is all 0s, and to give it the value 0 otherwise (this is the same as loading Z
    with the NOR of the output of the ALU).
ALU7,…,ALU1
    To control the activity of the ALU as given in a previous table.
read, write
    To control transfers between the CPU and memory.
We now rework the register transfers of the previous section to incorporate these control signals into the
implementations of the instruction cycle.
Examples:
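As an illustration of how register transfers map onto these control signals, the sketch below drives a small Python model of the data path through a fetch phase. The three-step fetch sequence (AR ← PC; DR ← M, PC ← PC + 1; IR ← DR, AR ← PC) is an assumption of this sketch, as are the names FETCH_SIGNALS and clock. Only the signals that this fetch sequence needs are modeled.

```python
# Assumed fetch sequence and the signals each step would assert:
#   step 0: AR <- PC              pcbus, arload
#   step 1: DR <- M, PC <- PC+1   read, memdr, pcinc
#   step 2: IR <- DR, AR <- PC    irload, pcbus, arload
FETCH_SIGNALS = [
    {"pcbus", "arload"},
    {"read", "memdr", "pcinc"},
    {"irload", "pcbus", "arload"},
]


def clock(state, signals):
    """Advance the data path by one cycle under the given control signals."""
    old = dict(state)  # all sources are sampled before any register changes
    if "pcbus" in signals and "arload" in signals:
        state["AR"] = old["PC"]            # PC drives the bus, AR loads from it
    if "read" in signals and "memdr" in signals:
        state["DR"] = old["M"][old["AR"]]  # memory word addressed by AR
    if "pcinc" in signals:
        state["PC"] = (old["PC"] + 1) & 0xFFFF
    if "irload" in signals:
        state["IR"] = old["DR"]            # IR loads directly from DR
    return state


memory = bytearray(65536)
memory[0] = 0x08  # op-code of ADD
cpu = {"AR": 0, "PC": 0, "DR": 0, "IR": 0, "M": memory}
for signals in FETCH_SIGNALS:
    clock(cpu, signals)
```

After the three cycles IR holds the op-code, and both PC and AR hold the address of the word that follows it. IR ← DR and AR ← PC can share a cycle because IR loads from DR directly and does not use the bus.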

Control Units
As the previous examples show, the implementation of the instruction cycle of a CPU reduces to a sequence of
register transfers, which are in turn governed by the activation of the proper control signals. In order for any phase
of the instruction cycle to be properly implemented however, these control signals must be activated in the correct
sequence. It is the responsibility of the control unit of the CPU to see that the proper control signals are generated in
the correct sequence.
A. Hardwired Control Units

Here the correct control signals for each register transfer step of each phase of the instruction cycle are generated at
the proper time by a clock-driven counter operating with a decoder as shown below.
Each output line of this sequencer corresponds to a step in a register-transfer sequence, where n is the number of
counter bits and 2^n is equal to or greater than the maximum number of steps that will ever be required. The required control signals will then be
generated by additional combinational circuitry which uses the following input signals:
a. the output of the step sequencer;
b. the content of the instruction register;
c. the content of any condition codes or status registers which may be incorporated into the cpu's design
(in our cpu these would be signals which tell us that the result of an alu operation was zero).
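The counter-and-decoder sequencer can be sketched in a few lines of Python. This is a behavioral model only. The class name StepSequencer and its reset method are assumptions of the sketch; a reset of this kind would let the control unit return to step 0 as soon as an instruction finishes.

```python
class StepSequencer:
    """An n-bit counter feeding an n-to-2^n decoder."""

    def __init__(self, n_bits):
        self.n_bits = n_bits
        self.count = 0

    def outputs(self):
        """Decoder outputs T0..T(2^n - 1); exactly one line is 1."""
        return [1 if i == self.count else 0 for i in range(2 ** self.n_bits)]

    def tick(self):
        """Clock pulse: advance to the next step, wrapping after the last."""
        self.count = (self.count + 1) % (2 ** self.n_bits)

    def reset(self):
        """Assumed extra input: return to step T0."""
        self.count = 0


seq = StepSequencer(3)  # 2^3 = 8 steps, T0 through T7
first = seq.outputs()
seq.tick()
second = seq.outputs()
```

Because exactly one output line is high in each step, the combinational circuitry can treat each Ti as a plain Boolean input.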
B. Microprogrammed Control Units

Here control signals are arranged into a unit known as a control vector, where each control vector specifies one or
more register transfers. Control vectors for a desired sequence of register transfers are then stored in a special high-speed
memory known as the control memory, and a simpler, hardwired control unit known as the microcontrol unit
sees that the operations specified by these control vectors are performed in the proper sequence. Each phase of the
instruction cycle can be reduced to sequences of register transfers encoded in control vectors (or microinstructions)
and executed by the microcontrol unit in a way that parallels the way the control unit is supposed to execute
machine language instructions.
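The idea of a control vector can be sketched as a bit vector with one bit per control signal. The bit positions, the three-step fetch microroutine and the function names below are all assumptions made for illustration; they do not describe the actual microinstruction format of this CPU. The ALU signals are left out for brevity.

```python
# Hypothetical bit assignment: signal i occupies bit i of the control vector.
SIGNALS = [
    "arload", "pcload", "drload", "rload", "pcbus", "drhbus", "drlbus",
    "trbus", "rbus", "acbus", "drmem", "memdr", "arinc", "pcinc",
    "trload", "irload", "acload", "zload", "read", "write",
]


def encode(active):
    """Pack a set of signal names into one control vector (an integer)."""
    vector = 0
    for name in active:
        vector |= 1 << SIGNALS.index(name)
    return vector


def decode(vector):
    """Recover the set of signals that a control vector activates."""
    return {name for i, name in enumerate(SIGNALS) if (vector >> i) & 1}


# Control memory holding an assumed fetch microroutine, one vector per step.
CONTROL_MEMORY = [encode(step) for step in (
    {"pcbus", "arload"},            # AR <- PC
    {"read", "memdr", "pcinc"},     # DR <- M, PC <- PC + 1
    {"irload", "pcbus", "arload"},  # IR <- DR, AR <- PC
)]


def microcontrol_unit(control_memory):
    """Step through the control memory in address order, yielding signal sets."""
    for address in range(len(control_memory)):
        yield decode(control_memory[address])


steps = list(microcontrol_unit(CONTROL_MEMORY))
```

A real microcontrol unit also needs a way to choose the next control memory address, for example to branch to the routine for the fetched op-code. This sketch only steps through addresses in order.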
For the rest of this section we shall give examples to indicate how hardwired control signals can be generated for our
hypothetical CPU, and leave it to the reader to pursue the topic further. In the next section we shall explore in more
detail the structure of a microprogrammed control unit for our CPU.
Hardwired Control Units
In our examples in the previous section we derived the sequences of control signals needed to implement the fetch
phase of the instruction cycle as well as each machine language instruction. We note that at most 8 steps are needed
to carry out any instruction (3 for fetching it and at most 5 steps to implement it). If we let T0 ,...,T7 denote the
(first) eight output lines of the counter-decoder circuit given on the previous page, representing lines which are
active (i.e. = 1) on the successive clock ticks, then we can specify the values that the control signals must have to
correctly implement the instruction cycle for our machine language instruction set by control equations such as
the following (here • denotes AND, + denotes OR, and Z' denotes the complement of Z):
acload = T3(MOVR+ADD+SUB+INAC+CLAC+AND+OR+XOR+NOT)+T7•LDAC
trbus = T5(LDAC+STAC+JUMP+JMPZ•Z+JPNZ•Z')
read = T1+T3(LDAC+STAC+JUMP+JMPZ•Z+JPNZ•Z')+T4(LDAC+STAC+JUMP+JMPZ•Z+JPNZ•Z')+T6•LDAC
write = T7•STAC
ALU1 = T3(ADD+SUB+INAC)
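Control equations of this kind translate directly into Boolean expressions over the step number, the decoded instruction and the Z flag. The Python sketch below evaluates five of them. It makes two assumptions: the operand read for LDAC is asserted at T6, so that the value is in DR before acload at T7, and ALU1 is asserted at T3 in the same step as acload for ADD, SUB and INAC. The function name control_signals is hypothetical.

```python
ALU_INSTRUCTIONS = {"MOVR", "ADD", "SUB", "INAC", "CLAC", "AND", "OR", "XOR", "NOT"}


def control_signals(t, instr, z):
    """Return which of acload, trbus, read, write, ALU1 are 1 at step t.

    t is the index of the active sequencer line (T0..T7), instr is the
    mnemonic decoded from IR, and z is the current value of the Z flag.
    """
    # Instructions that fetch a 16-bit address in steps T3..T5. A conditional
    # jump counts only when its condition holds (JMPZ.Z or JPNZ.Z').
    uses_address = (
        instr in {"LDAC", "STAC", "JUMP"}
        or (instr == "JMPZ" and z == 1)
        or (instr == "JPNZ" and z == 0)
    )
    active = set()
    if (t == 3 and instr in ALU_INSTRUCTIONS) or (t == 7 and instr == "LDAC"):
        active.add("acload")
    if t == 5 and uses_address:
        active.add("trbus")
    if t == 1 or (t in (3, 4) and uses_address) or (t == 6 and instr == "LDAC"):
        active.add("read")  # T6 for LDAC is an assumption of this sketch
    if t == 7 and instr == "STAC":
        active.add("write")
    if t == 3 and instr in {"ADD", "SUB", "INAC"}:
        active.add("ALU1")  # T3 is an assumption of this sketch
    return active
```

For instance, control_signals(3, "ADD", 0) returns acload and ALU1, while control_signals(5, "JMPZ", 0) returns the empty set because the jump is not taken.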
