CSA Complete


CSA

2/13/2012

Er. KAPIL PRASHAR

Instruction Code
The organization of a computer is defined by its internal registers, its timing and control structure, and the set of instructions that it uses. The internal organization of a digital system is defined by the sequence of micro-operations it performs on data stored in its registers. A general-purpose digital computer is capable of executing various micro-operations, and it can be instructed as to which specific sequence of operations it must perform.

Continued
The user of a computer can control the process by means of a program.
A program is a set of instructions that specify the operations, the operands, and the sequence by which processing has to occur.
An instruction is a binary code that specifies a sequence of micro-operations.
Instructions and data are stored in memory.
The ability to store and execute instructions, the stored program concept (von Neumann architecture), is the most important property of a general-purpose computer.

Continued
An instruction code is a group of bits that instruct the computer to perform a specific operation (a set of micro-operations). The operation code is a basic part of the instruction code: a group of bits that define operations such as add, subtract, multiply, shift, and complement. The operation code must consist of at least n bits for 2^n (or fewer) distinct operations. The control unit receives the instruction from memory and interprets the operation code bits. It then issues a sequence of control signals to initiate the micro-operations in the internal registers.

Continued
For every operation code, the control issues the sequence of micro-operations needed for the hardware implementation of that operation. An operation code is therefore called a macro-operation, because it specifies a set of micro-operations. An instruction code also specifies the registers or memory words for operands and results.
Memory words can be specified by their address. Registers can be specified by a binary code of k bits that selects one of 2^k possible registers.

Each computer (CPU) has its own instruction code format.



Continued
A simple computer organization: one processor register and an instruction code format with two parts.
The operation code specifies the operation.
The address tells the control where to find the operand in memory. The operand read from memory is processed together with the data in the register.

Fig. next: the control reads the 16-bit instruction from program memory. It uses the 12-bit address part of the instruction to read the 16-bit operand from data memory. It then executes the operation specified by the operation code. If an operation does not need an operand from memory, the address bits can be used for other purposes, e.g. clear AC or complement AC (no address needed).

Stored Program Organization


Continued
When the second part of an instruction code specifies an operand (not an address), the instruction is said to have an immediate operand. When the second part specifies the address of an operand, the instruction is said to have a direct address. With an indirect address, the second part specifies a memory location where the address of the operand is found. Indirect addressing increases the addressable memory size, since a full memory word (rather than the shorter address field) is available for specifying operand addresses.

Figure: direct and indirect addressing (instruction with mode bit I; the effective address is the address of the operand).


Computer Registers
Instructions are stored in consecutive memory locations and are executed sequentially, one at a time. The control reads an instruction from a specific address in memory and executes it; after that, the next instruction is read and executed, and so on. Registers are needed for storing the fetched instruction, and counters are needed for computing the address of the next instruction.

Continued
The computer needs processor registers for data manipulation and for holding addresses (see next Fig. and Table). The program counter (PC) goes through a counting sequence and causes the computer to read sequential instructions from memory. Instructions are read and executed in sequence unless a branch instruction is encountered:
a branch calls for a transfer to a nonconsecutive instruction in the program;
the address part of the branch instruction becomes the address of the next instruction in PC;
the next instruction is then read from the location indicated by PC.


Continued
The basic computer has (see Fig.):
8 registers
1 memory unit
1 control unit
a common bus

The outputs of 7 registers and memory are connected to the common bus. Connections to bus lines are specified by selection lines S0, S1, and S2. A register load during the next clock pulse transition is selected with a LD (load) input. Memory write/read is enabled with write/read signals.

Figure: Basic Computer registers connected to a common bus



Continued
INPR receives a character from an input device.
OUTR receives a character from AC and delivers it to an output device.
The bus receives data from 6 registers and the memory unit.
5 registers have three control lines: LD (load), INR (increment), and CLR (clear); this is equivalent to a binary counter with parallel load and synchronous clear.
2 registers have only a LD input.
AR is used to specify the memory address, so no separate address bus is needed.
The 16 inputs to AC come from an adder and logic circuit, which has three sets of inputs: the AC output, DR, and INPR.


Continued
The content of any register can be applied onto the bus, and an operation can be performed in the adder and logic circuit during the same clock cycle. The clock transition at the end of the cycle transfers the content of the bus into the designated register and the output of the adder and logic circuit into AC. E.g., DR ← AC and AC ← DR: place AC on the bus (S2S1S0 = 100), enable the LD input of DR, route DR into AC through the adder and logic circuit, and enable the LD input of AC, all during the same clock cycle. The two transfers occur upon the arrival of the clock pulse transition at the end of the clock cycle.


Computer Instructions
The basic computer has three 16-bit instruction code formats (see next Fig.).
The opcode contains 3 bits, and the meaning of the remaining 13 bits depends on the operation code encountered.
A memory-reference instruction uses 12 bits to specify an address and one bit to specify the addressing mode I.
The register-reference instructions are recognized by opcode 111 with a 0 in bit 15. A register-reference instruction specifies an operation on, or a test of, the AC register. An operand from memory is not needed, so the 12 bits are used to specify the operation or test to be executed.


Continued
An input-output instruction is recognized by opcode 111 with a 1 in bit 15. The remaining 12 bits are used to specify the type of input-output operation or test performed.
Bits 12-15 are used to recognize the type of instruction.
If bits 12-14 are not 111, the instruction is a memory-reference type and I (bit 15) is taken as the addressing mode.
If bits 12-14 are 111, bit 15 is inspected for the type of instruction: 0 for register-reference and 1 for input-output instruction.

In all, the basic computer has 25 instructions (see next Table).
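The decoding rule just described can be illustrated with a short Python sketch (not part of the original slides; the function name and the example word are made up for illustration):

def decode(word):
    # Split a 16-bit basic-computer instruction word into its fields.
    i_bit   = (word >> 15) & 0x1      # bit 15: I (mode) bit
    opcode  = (word >> 12) & 0x7      # bits 12-14: operation code
    address = word & 0xFFF            # bits 0-11: address / operation bits

    if opcode != 0b111:
        kind = "memory-reference, " + ("indirect" if i_bit else "direct")
    elif i_bit == 0:
        kind = "register-reference"
    else:
        kind = "input-output"
    return kind, opcode, address

# Opcode 010 with I = 0 and address 0x123:
print(decode(0x2123))   # ('memory-reference, direct', 2, 291)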


Instruction Set completeness


A sufficient set of instructions is one that can compute any function known to be computable. The instructions fall into four categories:
1. Arithmetic, logical, and shift instructions
2. Instructions for moving information to and from memory and processor registers
3. Program control instructions together with instructions that check status conditions
4. Input and output instructions

Continued
Arithmetic, logical, and shift instructions provide computational capabilities for processing data. All computations are done in processor registers, so instructions for moving information between memory and registers are needed. Status-checking instructions (e.g. comparing the magnitudes of two numbers) and program control instructions (e.g. branch) alter the program flow. Input and output instructions are needed for human-computer interaction: programs must be transferred into memory and the results of computations must be transferred to the user. The instructions in Table 5-2 constitute a minimum set.


Continued
Addition and subtraction: ADD, CMA, INC.
Shifts: CIR, CIL.
Multiplication and division: programmed from addition, subtraction, and shifts.
Logic: AND, CMA, CLA => NAND => all logic operations with two variables.
Moving information: LDA, STA.
Branching and status checking: BUN, BSA, ISZ, and the skip operations.
Input-output: INP, OUT.


Continued
The instruction set of the basic computer is complete, but not efficient. An efficient instruction set includes separate instructions for frequently used operations so that they can be performed fast. Examples: OR, exclusive-OR, subtract, multiply, divide. These operations must be programmed in the basic computer.


Timing and Control


Timing for all registers is controlled by a master clock generator. Clock is applied to all flip-flops and registers in the system. Clock pulses do not change the state of a register unless it is enabled by a control signal generated in the control unit.


Continued
There are two major types of control organization: hardwired control and microprogrammed control.
Hardwired organization (see next Fig.): the control logic is implemented with gates, flip-flops, decoders, and other digital circuits.
It can be optimized to produce a fast mode of operation.
It requires changes in the wiring if the design has to be modified.
Microprogrammed organization: the control information is stored in a control memory (control store).
The control memory is programmed to initiate the required sequence of micro-operations.
Any required modifications can be done by updating the microprogram in control memory.

A microprogram is a program consisting of microcode that controls the different parts of a computer's central processing unit (CPU). The memory in which it resides is called a control store.


Continued
A block diagram of the (hardwired) control unit is shown in the next Figure (the control logic is derived later).
IR contains the instruction read from memory; it has three parts: the I bit, the opcode, and bits 0-11.
The opcode is decoded with a 3 x 8 decoder (outputs D0-D7).
I is transferred to a flip-flop.
A 4-bit sequence counter (SC) provides the sequence of 16 timing signals; it has synchronous clear and increment inputs.
When required, SC can be cleared (CLR signal enabled) by suitable control logic, e.g. (see Fig.): D3T4: SC ← 0.
Control outputs are a function of all incoming signals to the control logic gates. SC enables the sequential control outputs.


Continued
Memory read and write are initiated by a rising clock edge. It is assumed that a memory access is completed in one clock cycle.
This assumption is often not valid in real computers, because the memory cycle is usually longer than the clock cycle; wait cycles (states) must then be provided until the memory word is available. No wait cycles are needed in the basic computer introduced here.
The next rising edge will load the memory word into a register.


Continued
It is important to understand the timing relationship between the clock transition and the timing signals. For example, the register transfer statement T0: AR ← PC specifies a transfer of the content of PC into AR if the timing signal T0 is active. T0 is active for an entire clock cycle. During this time interval the content of PC is placed onto the bus and the LD input of AR is enabled. The actual transfer occurs at the end of the clock cycle, when the clock goes through a positive transition (latching the inputs to the flip-flops). This same transition increments SC, so the next clock cycle has T1 active and T0 inactive.


Instruction Cycle
A program consists of a sequence of instructions, and it resides in memory.

Each instruction cycle in basic computer has following phases:


1. Fetch an instruction from memory
2. Decode the instruction
3. Read the effective address from memory if the instruction has an indirect address
4. Execute the instruction

After phase 4, the control jumps back to phase 1. This process continues until HALT instruction is encountered.

Fetch and Decode

Initially the program counter PC is loaded with the address of the first instruction in the program, and SC is cleared (i.e. timing signal T0 is active). SC is incremented after each clock pulse. The fetch and decode phases can be specified by the following register transfer statements:


Continued
During T0:
1. Place the content of PC onto the bus (S2S1S0 = 010, i.e. select 2)
2. Transfer the content of the bus to AR (enable the LD input of AR)
The next clock transition initiates the transfer from PC to AR.
During T1:
1. Enable the read input of memory
2. Place the content of memory onto the bus (S2S1S0 = 111, i.e. select 7)
3. Transfer the content of the bus to IR (enable the LD input of IR)
4. Increment PC (enable the INR input of PC)
The next clock transition initiates the read and increment operations.


Continued
During T2:
1. The opcode is decoded by the 3 x 8 decoder
2. IR(0-11) is transferred to AR (address register)
3. IR(15) is latched into flip-flop I
Steps 2 and 3 occur at the end of the clock cycle. A sketch of the whole fetch-and-decode sequence follows.
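Taken together, T0-T2 can be modelled with a small Python sketch (an illustration only; the dictionary-based machine state and the example contents are assumptions, not material from the slides):

def fetch_and_decode(cpu, mem):
    # T0: AR <- PC
    cpu["AR"] = cpu["PC"]
    # T1: IR <- M[AR], PC <- PC + 1
    cpu["IR"] = mem[cpu["AR"]]
    cpu["PC"] = (cpu["PC"] + 1) & 0xFFF
    # T2: decode IR(12-14) into D0-D7, AR <- IR(0-11), I <- IR(15)
    cpu["D"]  = (cpu["IR"] >> 12) & 0x7
    cpu["AR"] = cpu["IR"] & 0xFFF
    cpu["I"]  = (cpu["IR"] >> 15) & 0x1
    return cpu

cpu = {"PC": 0x010, "AR": 0, "IR": 0, "I": 0, "D": 0}
mem = {0x010: 0x2123}                 # a word with opcode 010, direct mode, address 0x123
print(fetch_and_decode(cpu, mem))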


Determine the Type of Instruction


The timing signal after decoding is T3. During T3, the control unit determines the type of instruction that was just read from memory (see Fig.). After the instruction has been executed, SC is cleared and control returns to the fetch phase with T0 = 1. It is assumed (not explicitly shown in the transfer statements) that SC is incremented with every positive clock transition. When SC is cleared, an SC ← 0 statement is included.


Register-Reference Instruction
A register-reference instruction is recognized by the control when D7 = 1 and I = 0. It uses bits 0-11 of the instruction code to specify one of 12 instructions. The 12 bits are available in IR(0-11) and were transferred to AR during time T2. See the Table for the control functions and microoperations of the register-reference instructions.
Each control function shares the Boolean relation D7I'T3 (denoted by r).
The particular control function is indicated by one of the bits in IR(0-11). The execution of a register-reference instruction is completed at time T3: the sequence counter is cleared to 0 and control goes back to fetch the next instruction with timing signal T0.


Memory-Reference Instructions
Table lists the seven memory-reference instructions: the execution of each instruction requires a sequence of microoperations because data is stored in memory and cannot be processed directly. The effective address resides in AR and was placed there during timing signal T2 when I = 0, and T3 when I = 1 (see Fig.).


Continued
AND to AC: pairwise AND of the bits in AC and the memory word specified by the effective address.
D0T4: DR ← M[AR]
D0T5: AC ← AC AND DR, SC ← 0
(D0 is the operation-decoder output for operation code 0.)


Continued
ADD to AC: adds the content of the memory word specified by the effective address to the value of AC.
D1T4: DR ← M[AR]
D1T5: AC ← AC + DR, E ← Cout, SC ← 0
(D1 is the operation-decoder output for operation code 1; E is the extended accumulator bit, which receives the carry out.)


Continued
LDA: load a memory word from the specified effective address into AC.
D2T4: DR ← M[AR]
D2T5: AC ← DR, SC ← 0
See the common-bus figure: there is no direct path from the bus to AC, so the memory word is first read into DR, whose content is then transferred into AC.


Continued
STA: store the content of AC into the memory word specified by the effective address
D3T4: M[AR] ← AC, SC ← 0


Continued
BUN: branch unconditionally transfers the program to the instruction specified by the effective address. The next instruction is fetched and executed from the memory address given by the new value in PC.
D4T4: PC ← AR, SC ← 0

BSA: branch and save return address. This instruction is useful for branching to a portion of a program called a subroutine or procedure.
M[AR] ← PC (the return address, i.e. the address of the next instruction in sequence, is stored at the effective address)
PC ← AR + 1 (AR + 1 is the address of the first instruction of the subroutine)



The BSA instruction performs the function usually referred to as a subroutine call. The indirect BUN instruction at the end of the subroutine performs the function referred to as a subroutine return. In most commercial computers, the return address associated with a subroutine is stored in either a processor register or in a portion of memory called a stack.
A stack is a data structure that works on the principle of Last In First Out (LIFO). This means that the last item put on the stack is the first item that can be taken off, like a physical stack of plates. A stack-based computer system is one that is based on the use of stacks, rather than being register based.

BSA instruction

Continued
The BSA instruction must be executed with a sequence of two microoperations:
D5T4: M[AR] ← PC, AR ← AR + 1
D5T5: PC ← AR, SC ← 0

Timing signal T4 initiates a memory write operation, places the content of PC onto the bus, and enables the INR input of AR.
The memory write operation is completed and AR is incremented by the time the next clock transition occurs. The bus is used at T5 to transfer the content of AR to PC.


ISZ instruction
ISZ: increment the word specified by the effective address, and if the incremented value equals 0, increment PC by 1 (skip the next instruction). A sketch of this behaviour follows below.
D6T4: DR ← M[AR]
D6T5: DR ← DR + 1
D6T6: M[AR] ← DR, if (DR = 0) then (PC ← PC + 1), SC ← 0
The programmer usually stores a negative number (in 2's complement) in the memory word. Repeated increments eventually clear the memory word to 0; at that time PC is incremented by one in order to skip the next instruction in the program. This can be used to create loops.
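The sketch below (Python, illustrative only; the memory word 0xFFFD, i.e. -3, and the addresses are assumptions) models the three ISZ microoperations and the loop idiom just described:

def isz(mem, ar, pc):
    # D6T4: DR <- M[AR]; D6T5: DR <- DR + 1; D6T6: M[AR] <- DR, skip if DR = 0
    dr = (mem[ar] + 1) & 0xFFFF
    mem[ar] = dr
    if dr == 0:
        pc = (pc + 1) & 0xFFF          # PC <- PC + 1: skip the next instruction
    return pc

mem = {0x200: 0xFFFD}                  # -3 in 2's complement
pc = 0x100
for _ in range(3):                     # three passes through the "loop body"
    pc = isz(mem, 0x200, pc)
print(hex(mem[0x200]), hex(pc))        # 0x0 0x101: counter reached 0, so PC skipped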


Continued
Flow chart showing microoperations for the seven memory-reference instructions is shown in Fig in the next slide.


I/O Instructions
Figure: input-output configuration. A keyboard sends characters through a transmitter interface into the input register INPR (with flag FGI); the output register OUTR (with flag FGO) sends characters through a receiver interface to the monitor. Both INPR and OUTR communicate with AC.


Input Output Instructions


p = D7IT3 (common to all input-output instructions), Bi = IR(i) for i = 6-11
      p:    SC ← 0                           Clear SC
INP   pB11: AC(0-7) ← INPR, FGI ← 0          Input character
OUT   pB10: OUTR ← AC(0-7), FGO ← 0          Output character
SKI   pB9:  if (FGI = 1) then PC ← PC + 1    Skip on input flag
SKO   pB8:  if (FGO = 1) then PC ← PC + 1    Skip on output flag
ION   pB7:  IEN ← 1                          Interrupt enable on
IOF   pB6:  IEN ← 0                          Interrupt enable off


Interrupt Cycle
Figure: flowchart for the interrupt cycle. During each instruction cycle (with R = 0) the instruction is fetched, decoded, and executed; if IEN = 1 and FGI = 1 or FGO = 1, the flip-flop R is set to 1. When R = 1 the interrupt cycle is taken instead of the next instruction cycle: the return address is saved (M[0] ← PC), the program branches to location 1 (PC ← 1), and IEN and R are cleared to 0.


Central Processing Unit (CPU)


The CPU is the part of the computer that performs the bulk of the data-processing operations. It is made up of three major parts:
the register set,
the control unit, and
the arithmetic logic unit (ALU).

Components of the CPU


The Register set stores intermediate data used during the execution of the instructions. The Arithmetic Logic Unit (ALU) performs the required micro-operations for executing the instructions. The Control Unit supervises the transfer of information among the registers and instructs the ALU as to which operation to perform.

Stack Organization
A stack is a storage device that stores information in such a manner that the item stored last is the first item retrieved (LIFO: last-in, first-out). The stack is a memory unit with an address register called a stack pointer (SP), which always points at the top item in the stack. The two operations on a stack are insertion (push) and deletion (pop) of items. A push increments SP and a pop decrements SP.


Stack Organization (contd)


A stack can reside in a portion of a large memory unit, or it can be organized as a collection of a finite number of (fast) registers. The figure shows the organization of a 64-word register stack.

Operations on Stack
Push (performed if stack is not full i.e. if FULL = 0):


Operations on Stack
Pop (performed if stack is not empty i.e. if EMTY = 0):
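The push and pop microoperations themselves are shown only in figures that are not reproduced in this text. The Python sketch below is a hedged model of the 64-word register stack with FULL and EMTY flags behaving as described; the wrap-around of SP modulo 64 is an assumption taken from the usual treatment of this example:

class RegisterStack:
    def __init__(self, size=64):
        self.mem  = [0] * size
        self.size = size
        self.sp   = 0
        self.full = False            # FULL flag
        self.emty = True             # EMTY flag

    def push(self, word):            # performed only if FULL = 0
        assert not self.full, "stack overflow"
        self.sp = (self.sp + 1) % self.size    # SP <- SP + 1
        self.mem[self.sp] = word               # M[SP] <- DR
        self.full = (self.sp == 0)             # if (SP = 0) then FULL <- 1
        self.emty = False                      # EMTY <- 0

    def pop(self):                   # performed only if EMTY = 0
        assert not self.emty, "stack underflow"
        word = self.mem[self.sp]               # DR <- M[SP]
        self.sp = (self.sp - 1) % self.size    # SP <- SP - 1
        self.emty = (self.sp == 0)             # if (SP = 0) then EMTY <- 1
        self.full = False                      # FULL <- 0
        return word

s = RegisterStack()
s.push(10); s.push(20)
print(s.pop(), s.pop())              # 20 10: last in, first out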


Stack implemented in the Computer Memory


A stack can also be implemented with the RAM attached to a CPU: a portion of memory is assigned to stack operations and a processor register is used as the stack pointer. The next Fig. shows a portion of memory partitioned into three segments: program, data, and stack. Most computers do not provide hardware for checking stack overflow or underflow; if two registers are used to hold the upper limit (e.g. 3000) and the lower limit (e.g. 4000), then after a push SP can be compared against the upper-limit register, and after a pop against the lower-limit register. The advantage of a memory stack is that the CPU can refer to it without having to specify an address: the address is always in SP and is updated automatically during a push or pop instruction.

Utility of using Stack (Evaluating Arithmetic Expressions)


A stack is effective for evaluating arithmetic expressions. Arithmetic operations are usually written in infix notation: each operator resides between its operands, e.g. (A * B) + (C * D), where * denotes multiplication.
A * B and C * D have to be computed and stored; after the two products, the sum (A * B) + (C * D) is computed.
There is no straightforward way to determine the next operation to be performed.

Continued
Arithmetic expressions can be represented in prefix notation (also referred to as Polish notation, after the Polish mathematician Lukasiewicz): operators are placed before the operands. The postfix notation, or reverse Polish notation (RPN), places the operator after the operands. E.g.:
A + B    infix notation
+AB      prefix (Polish) notation
AB+      postfix notation (RPN)

Continued
The reverse Polish notation is well suited to stack manipulation. E.g. the expression A * B + C * D is written in RPN as AB* CD*+ and is evaluated by scanning from left to right: when an operator is found, the operation is performed using the operands to the left of the operator. The operator and its operands are replaced by the result of the operation. The scan continues and the procedure is repeated for every operator:


Continued
1. * is found
2. Take the two operands to its left: A and B
3. Compute P = A * B
4. Replace the operands and operator with the result => PCD*+
5. Continue the scan
6. * is found
7. Take the two operands to its left: C and D
8. Compute Q = C * D
9. Replace the operands and operator with the result => PQ+
10. Continue the scan
11. + is found
12. Take the two operands to its left: P and Q
13. Compute R = P + Q
14. Replace the operands and operator with the result: R
15. Continue the scan: no more operators => stop; R is the result of the evaluation.


Conversion of Expressions from Infix to RPN


The conversion from infix to RPN must take into consideration the operational hierarchy of infix notation:
1. first perform the arithmetic inside inner parentheses,
2. then the arithmetic inside outer parentheses,
3. and perform multiplication and division before addition and subtraction.
E.g. (A + B)*[C*(D + E) + F] becomes AB+ DE+ C* F+ *, which is computed as:
1. P = A + B  => P DE+ C* F+ *
2. Q = D + E  => P Q C* F+ *
3. R = Q * C  => P R F+ *
4. S = R + F  => P S *
5. T = P * S
T is the value of the expression AB+ DE+ C* F+ *.
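The same conversion can be done mechanically. The Python sketch below is a minimal shunting-yard style converter for single-letter operands (this algorithm is an illustration added here, not the hand method described in the slides); it treats square brackets like parentheses:

def infix_to_rpn(expr):
    prec = {"+": 1, "-": 1, "*": 2, "/": 2}
    out, ops = [], []
    for tok in expr.replace(" ", ""):
        if tok.isalnum():                        # operand: copy to the output
            out.append(tok)
        elif tok in "([":
            ops.append(tok)
        elif tok in ")]":
            while ops and ops[-1] not in "([":   # unwind until the opening bracket
                out.append(ops.pop())
            ops.pop()                            # discard '(' or '['
        else:                                    # operator: pop higher/equal precedence first
            while ops and ops[-1] not in "([" and prec[ops[-1]] >= prec[tok]:
                out.append(ops.pop())
            ops.append(tok)
    while ops:
        out.append(ops.pop())
    return "".join(out)

print(infix_to_rpn("(A+B)*[C*(D+E)+F]"))   # AB+CDE+*F+* (equivalent to the AB+ DE+ C* F+ * form above)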


Continued
RPN can be evaluated very efficiently with a single left-to-right scan and is used, e.g., in some electronic calculators. A stack is well suited to evaluating arithmetic expressions in RPN:
operands are pushed onto the stack in the order in which they appear (in RPN);
when an operator is encountered, the topmost operands are popped from the stack and used for the operation;
the result is pushed back to replace the popped operands.
Many compilers convert arithmetic expressions into reverse Polish notation, which allows efficient translation of arithmetic expressions into machine language instructions.

Continued
E.g.: (3*4) + (5*6) => 34* 56*+
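A stack-based evaluator for such a postfix expression takes only a few lines of Python (illustrative code, not part of the original slides):

def eval_rpn(tokens):
    stack = []
    apply_op = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
                "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    for tok in tokens:
        if tok in apply_op:
            b = stack.pop()                  # topmost operand is the right-hand operand
            a = stack.pop()
            stack.append(apply_op[tok](a, b))
        else:
            stack.append(float(tok))         # operand: push onto the stack
    return stack.pop()

print(eval_rpn("3 4 * 5 6 * +".split()))     # 42.0, i.e. (3*4) + (5*6)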


Instruction Formats
The typical fields found in instruction formats are:
1. An operation code field specifying the operation: add, subtract, complement, etc.
2. An address field designating a memory address or a register.
3. A mode field specifying the way the effective address of an operand is determined.

The number of address fields in the instruction format of a computer depends on the internal organization of its registers.


Continued
E.g. MIPS (a RISC microprocessor architecture developed by MIPS Computer Systems Inc.) has three basic instruction formats: register-register, register-immediate, and jump/call.


Three-Address Instruction
Computers may have instructions of several different lengths containing varying numbers of addresses. E.g. three-address instructions: this instruction format can use each address field to specify either a processor register or a memory word. Evaluating X = (A+B)*(C+D):
ADD R1, A, B     R1 ← M[A] + M[B]
ADD R2, C, D     R2 ← M[C] + M[D]
MUL X, R1, R2    M[X] ← R1 * R2
Advantage of this format is that it results in short programs when evaluating arithmetic expressions. Disadvantage is that the binary-coded instructions require too many bits to specify three addresses.


Two-Address Instruction
E.g. two-address instructions: here also each address field can specify either a processor register or a memory word. The program for the previous example:
MOV R1, A     R1 ← M[A]
ADD R1, B     R1 ← R1 + M[B]
MOV R2, C     R2 ← M[C]
ADD R2, D     R2 ← R2 + M[D]
MUL R1, R2    R1 ← R1 * R2
MOV X, R1     M[X] ← R1
The first symbol listed in an instruction is assumed to be both a source and the destination where the result of the operation is transferred.


One-Address Instruction
One-address instructions use an implied accumulator (AC). Here we assume that AC contains the result of the last operation.
LOAD A      AC ← M[A]
ADD B       AC ← AC + M[B]
STORE T     M[T] ← AC
LOAD C      AC ← M[C]
ADD D       AC ← AC + M[D]
MUL T       AC ← AC * M[T]
STORE X     M[X] ← AC


Zero-Address Instruction
A stack-organized computer does not use an address field for instructions such as ADD and MUL. The PUSH and POP instructions, however, need an address field to specify the operand that communicates with the stack. The name zero-address is given to this type of instruction because of the absence of an address field in the computational instructions.
PUSH A     TOS ← A
PUSH B     TOS ← B
ADD        TOS ← (A + B)
PUSH C     TOS ← C
PUSH D     TOS ← D
ADD        TOS ← (C + D)
MUL        TOS ← (C + D) * (A + B)
POP X      M[X] ← TOS


Addressing Modes
The addressing mode specifies a rule for interpreting or modifying the address field of the instruction before the operand is actually referenced. Addressing modes are used:
1. To provide programming versatility for the user: pointers to memory, counters for loop control, indexing of data, etc.
2. To reduce the number of bits in the addressing field of the instruction.

The decoding phase of an instruction cycle determines the addressing mode(s) and the locations (registers and/or memory locations) of operands. Depending on the CPU, an instruction can have more than one address field, and each address field may be associated with its own particular addressing mode.


Different types of Addressing Mode


Implied mode: the operands are specified implicitly by the instruction, e.g. complement accumulator.
Immediate mode: the operand is specified in the instruction itself (in the operand field). Can be used, e.g., to initialize a register to a constant value (an immediate operand).
Register mode: the operands are in registers that reside within the CPU. The particular register is selected with the register field of the instruction.
Register indirect mode: the content of a register specifies the address of the operand in memory.
Autoincrement or autodecrement mode: similar to register indirect mode, but the content of the register is automatically incremented or decremented after the data access.


Continued
Direct (absolute) mode: the operand is in a register (register file) or in a main-memory (MM) location whose address is explicitly given in the instruction, i.e. the effective address (EA) of the operand is given in the instruction.


Continued
Indirect mode: The EA of the operand is in the register, or MM location, whose address is given in the instruction.


Continued
Index mode: the EA of the operand is generated by adding a constant value (given in the instruction) to the content of a register (specified in the instruction). This is used to address the elements of an array: the starting address of the array is the constant, and the index is contained in the register. Different elements can be addressed by this mode simply by changing the index.


Continued
Relative mode: Similar to index mode except the register is the PC. This is used to address an operand in an MM location whose address is specified relative to the current instruction. The EA is obtained by adding a constant (offset, or the displacement from current position to the location of the operand, can be negative) and the content of PC. The constant is either explicitly given in the instruction by the assembly programmer, or calculated by the assembler based on the knowledge of the MM locations of the program and the desired operand.


Continued

Base register addressing mode: The content of a base register is added to the address part of the instruction to obtain the effective address. The address part of the instruction gives the displacement relative to the base address. EA = address part of instruction + content of CPU register
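The way each of these modes forms the effective address can be summarised in a short Python sketch (the register names R1, XR, BR, the tiny memory, and the numeric values are all invented for the example):

def effective_address(mode, addr_field, regs, mem, pc):
    # Return the effective address (EA) produced by some of the modes described above.
    if mode == "direct":
        return addr_field                     # EA given directly in the instruction
    if mode == "indirect":
        return mem[addr_field]                # EA stored at the address given in the instruction
    if mode == "register_indirect":
        return regs["R1"]                     # EA held in a CPU register
    if mode == "index":
        return addr_field + regs["XR"]        # constant (array start) + index register
    if mode == "relative":
        return pc + addr_field                # displacement + program counter
    if mode == "base":
        return addr_field + regs["BR"]        # displacement + base register
    raise ValueError(mode)

regs = {"R1": 0x400, "XR": 0x004, "BR": 0x2000}
mem  = {0x300: 0x0750}
for m in ("direct", "indirect", "register_indirect", "index", "relative", "base"):
    print(m, hex(effective_address(m, 0x300, regs, mem, pc=0x100)))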


Program Control
Program flow can be altered by instructions that modify the value of the program counter. This is an important feature of a digital computer: it provides control over the program flow and the capability to branch to different program segments. Typical program control instructions:
Name                       Mnemonic
Branch                     BR
Jump                       JMP
Skip                       SKP
Call                       CALL
Return                     RET
Compare (by subtracting)   CMP
Test (by ANDing)           TST

Continued
Branch and jump instructions may be conditional or unconditional.
An unconditional branch instruction causes a branch to the specified address without any condition, e.g. JMP DisplayGreeting.
A conditional branch specifies a condition, e.g. branch if zero: only when the condition is met is the program counter loaded with the branch address.


Continued
Compare and test instructions can be used in setting conditions for subsequent conditional branch instructions
A compare instruction performs an arithmetic subtraction: the result is not saved; only the status bits are set as a result of the operation. Similarly, a test instruction performs the logical AND of two operands and updates certain status bits.

Continued
The status register stores the values of the status bits (the status register is composed of the status bits). Bits of the status register are modified as a result of operations performed in the ALU. Status bits are also known as condition-code bits or flag bits.

Continued
E.g. (8-bit ALU with a 4-bit status register):
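The figure itself is not reproduced in this text. The Python sketch below shows one way the four status bits C, S, Z, and V could be derived from an 8-bit addition; it is a hedged model consistent with the description on the next slide, not the circuit from the figure:

def add8_flags(a, b):
    # Add two 8-bit operands and derive the C, S, Z, V status bits.
    total  = a + b
    result = total & 0xFF
    c = (total >> 8) & 1                          # C: carry out of bit 7
    s = (result >> 7) & 1                         # S: sign bit of the result
    z = 1 if result == 0 else 0                   # Z: result is all zeros
    v = 1 if ((a ^ result) & (b ^ result) & 0x80) else 0   # V: 2's-complement overflow
    return result, {"C": c, "S": s, "Z": z, "V": v}

print(add8_flags(0x70, 0x70))   # 112 + 112 = 224 > 127, so V = 1
print(add8_flags(0x80, 0x80))   # (-128) + (-128): C = 1, Z = 1, V = 1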


Continued
Status bits can be checked after an ALU operation to determine certain relationships between the values of A and B. V indicates overflow, i.e. for an 8-bit ALU the result is greater than +127 or less than -128. If Z is set, the result is zero: we can use, e.g., an XOR operation to compare two numbers (the result is zero if and only if A = B), and Z then indicates the result of the comparison. A single bit of A can be checked by ANDing A with a mask that contains a 1 in that particular bit position (and 0s elsewhere).

Continued
Conditional branch instructions use the status bits for checking conditions for branching:


Subroutine Call
For subroutine calls, different computers use different temporary locations for storing the return address:
some computers use the first memory location of the subroutine (like the basic computer);
some store the return address in a fixed memory location;
some computers use a processor register;
a memory stack is yet another possibility, and the most efficient one: when a succession of subroutines is called (nested calls), the sequential return addresses can be pushed onto the stack. The return-from-subroutine instruction pops the return address from the top of the stack (and assigns it to the program counter), so we always have the return address of the last called subroutine.

Continued
Subroutine call (stack based) microoperations:
SP ← SP - 1              decrement the stack pointer
M[SP] ← PC               push the content of PC onto the stack
PC ← effective address   transfer control to the subroutine
... and return:
PC ← M[SP]               pop the stack and transfer to PC
SP ← SP + 1              increment the stack pointer

By using subroutine stack each return address (in nested calls) can be pushed into the stack without destroying any previous values
e.g. in the basic computer a recursive subroutine call would destroy the previous return address stored in the first memory location of the subroutine. A small sketch of nested stack-based calls follows below.
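A minimal Python sketch of the stack-based call and return microoperations listed above, including a nested call (the addresses are invented, and the return address is simply the value of PC at the time of the call, since instruction advance is not modelled):

mem = [0] * 4096
sp  = 4000                      # stack pointer (stack grows toward lower addresses)
pc  = 100

def call(target):
    global sp, pc
    sp -= 1                     # SP <- SP - 1
    mem[sp] = pc                # M[SP] <- PC: push the return address
    pc = target                 # PC <- effective address

def ret():
    global sp, pc
    pc = mem[sp]                # PC <- M[SP]: pop the return address
    sp += 1                     # SP <- SP + 1

call(500)                       # main program calls subroutine A
call(800)                       # A calls subroutine B (nested call)
ret(); print(pc)                # 500: the return address pushed by the nested call
ret(); print(pc)                # 100: the return address pushed by the first call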


Program Interrupt
A program interrupt refers to the transfer of program control from a currently running program to another service program as a result of an externally or internally generated request. It is otherwise similar to a subroutine call, except:
1. The interrupt is (usually) initiated by an internal or external signal rather than by the execution of an instruction (software interrupts are the exception).
2. The address of the interrupt service program (routine) is determined by hardware rather than by the address field of an instruction: the CPU must possess some form of hardware procedure for selecting a branch address for servicing the interrupt.
3. The interrupt routine stores all the information (not just PC) necessary to recover the state of the CPU prior to the return from the interrupt routine.

Continued
After the interrupt routine, the CPU must return to exactly the same state it was in when the interrupt occurred. The state of the CPU at the end of the execute cycle (the interrupt is recognized in this phase) is determined by:
1. the content of PC,
2. the content of all processor registers, and
3. the content of the status conditions: the status bits (the program status word, PSW) stored in a separate status register. The PSW contains status information about the state of the CPU: bits from ALU operations, interrupt-enable bits, and the CPU operation mode (system mode or user mode), for example.


Continued
Some computers store only the program counter (and the PSW) before entering an interrupt routine:
the interrupt routine must then take care of storing and restoring the rest of the CPU status.
The CPU does not respond to an interrupt until the end of an instruction execution:
if an interrupt is pending, control goes to an interrupt cycle;
the contents of PC and PSW are pushed onto the stack;
the branch address is transferred to PC and a new PSW is loaded into the status register;
the interrupt routine can now be executed starting from the branch address (which may contain a branch instruction to a user-defined service routine);
the last instruction of the interrupt routine is a return from interrupt: the stack is popped to restore the PSW to the status register and the return address to PC, so the CPU state is restored and the interrupted program can proceed as if nothing had happened.

Interrupt Types
1. External interrupts
from I/O, timing, or any other external source. e.g.: I/O device requesting new data, elapsed time of an event, power failure, etc.

2. Internal interrupts (traps)


from illegal or erroneous use of an instruction or data. e.g.: overflow, division by zero, invalid operation code, stack overflow, and protection violation. usually occur as a result of a premature termination of the instruction execution: the service program determines the corrective measure to be taken (e.g. terminates the program).

3. Software interrupts
initiated by an instruction (rather than HW signals) a special call instruction that behaves like an interrupt. can be used by a programmer to initiate an interrupt routine at any desired point in the program. can be used for accessing operating system services, for example.

Control Unit
The control unit is a part of the CPU. Its purpose is to issue the control signals that provide the control inputs for the multiplexers on the common bus, the control inputs of the processor registers, and the microoperations for the accumulator. There are two major types of control organization: hardwired control and microprogrammed control.


Hardwired & Microprogrammed Control


In the hardwired organization, the control signals are generated by hardware using conventional logic design techniques; the control unit is then said to be hardwired. In the microprogrammed organization, the control information is stored in a control memory, which is programmed to initiate the required sequence of microoperations. The principle of microprogramming is an elegant and systematic method for controlling the microoperation sequences in a digital computer.


Continued
The control function that specifies a microoperation is a binary variable: when it is in the active state, the corresponding microoperation is executed; in the opposite binary state, it does not change the state of the registers in the system. The values of the control variables at any given time can be represented by a string of 1's and 0's called the control word. In a bus-organized system, the control signals that specify microoperations are groups of bits that select the paths through multiplexers, decoders, and the ALU.

Basic Concepts of Microprogramming:


Control word (CW): a word with one bit for each control signal. Each step of the instruction execution is represented by a control word in which the bits corresponding to the control signals needed for that step are set to one.
Microinstruction: each step in the sequence of steps executing a certain machine instruction is considered a microinstruction, and it is represented by a control word. All of the bits corresponding to the control signals that must be asserted in this step are set to 1, and all others are set to 0 (horizontal organization).
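As an illustration of this horizontal organization, a control word can be modelled with one bit per control signal. The signal names below are invented for the example and are not taken from the slides:

from enum import IntFlag

class Ctl(IntFlag):
    # One bit per control signal (horizontal microinstruction); names are illustrative.
    PC_TO_BUS = 1 << 0
    LD_AR     = 1 << 1
    MEM_READ  = 1 << 2
    LD_IR     = 1 << 3
    INR_PC    = 1 << 4
    LD_AC     = 1 << 5

# Each step (microinstruction) sets exactly the signals needed for that step.
step0 = Ctl.PC_TO_BUS | Ctl.LD_AR              # AR <- PC
step1 = Ctl.MEM_READ | Ctl.LD_IR | Ctl.INR_PC  # IR <- M[AR], PC <- PC + 1

print(f"{step0.value:06b}")   # 000011
print(f"{step1.value:06b}")   # 011100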


Continued
Microprogram: a sequence of microinstructions corresponding to the sequence of steps in the execution of a given machine instruction. Since alterations of the microprogram are not needed once the control unit is in operation, the control memory can be a read-only memory (ROM). With dynamic microprogramming, the microprogram is instead kept in a writable control memory so that it can be loaded or changed.


Control Memory
A computer that employs a microprogrammed control unit will have two separate memories: a main memory and a control memory. The content of main memory may alter when the data are manipulated and every time that the program is changed. The control memory holds a fixed microprogram that cannot be altered by the occasional user.

Microprogrammed control unit


Continued
Sequencer or Next-address generator
Used to generate the address of the next microinstruction to be retrieved from the control memory.

Control Address Register


(CAR) Holds the address of the microinstruction generated by the sequencer; provides address inputs to the control memory.

Control Memory
(CM) Usually a ROM; holds the control words which make up the microprogram for the MCU.

Control Data Register (pipeline register)
(CDR) Holds the control word currently being retrieved; it is used to generate and propagate the control function values in the MCU. Because the CAR and CDR are registers, they can be used and modified in parallel: the CDR can be causing the execution of a collection of microops at the same time that it is being used to generate the next address (via the sequencer) for the CAR.

Sequencing
Each machine instruction is executed through the application of a sequence of microinstructions; clearly, we must be able to sequence these. The collection of microinstructions which implements a particular machine instruction is called a routine. The MCU typically determines the address of the first microinstruction which implements a machine instruction based on that instruction's opcode. Upon machine power-up, the CAR should contain the address of the first microinstruction to be executed. The MCU must be able to execute microinstructions sequentially (e.g. within routines), but must also be able to branch to other microinstructions as required; hence the need for a sequencer. The microinstructions executed in sequence can be found sequentially in the CM, or can be found by branching to another location within the CM. Sequential retrieval of microinstructions can be done by simply incrementing the current CAR contents; branching requires determining the desired CW address and loading that into the CAR.


Microprogramming Vs Hardwired Control


Hardwired control:
composed of combinational and sequential circuits that generate the complete timing corresponding to the execution of each instruction;
time-consuming and expensive to design;
difficult to modify, but fast.


Continued
Microprogrammed control:
the design is simpler: the problem of timing each instruction is broken down, and the microinstruction cycle handles timing in a simple and systematic way;
easier to modify;
slower than hardwired control.
In microprogrammed control, any required changes or modifications can be done by updating the microprogram in control memory. Once the hardware configuration is established, there should be no need for further hardware or wiring changes.

RISC/CISC
The instruction set determines the way machine language programs are constructed. Early computers had small and simple instruction sets in order to minimize the (expensive) hardware needed for their implementation. Today many computers have instruction sets that include 100 to 200 instructions, with
a variety of data types and
a large number of addressing modes.

RISC/CISC
A complex instruction set computer (CISC) has complex hardware and a large instruction set: functions are moved from software into hardware. In contrast, a reduced instruction set computer (RISC) uses fewer and simpler instructions, which can be executed faster within the CPU. RISC chips require fewer transistors (than CISC), which makes them cheaper to design and produce. There is still considerable controversy among experts about the ultimate value of RISC architectures. Its proponents argue that RISC machines are both cheaper and faster, and are therefore the machines of the future. Skeptics note that by making the hardware simpler, RISC architectures put a greater burden on the software; they argue that this is not worth the trouble because conventional microprocessors are becoming increasingly fast and cheap anyway.


CISC
One reason for the trend toward a complex instruction set is to simplify the translation from high-level language to machine language programs. Typical CISC features include variable-length instruction formats (register operands need fewer bits whereas memory operands need more bits) and instructions that manipulate operands in memory.


Characteristics of CISC architecture:


1. A large instruction set.
2. Instructions that perform special tasks and are used infrequently.
3. A large variety of addressing modes (5-20 different modes).
4. Variable-length instruction formats.
5. Instructions that manipulate operands in memory.

Characteristics of RISC architecture:


1. Relatively few instructions, mostly register-to-register operations.
2. Relatively few addressing modes (because of 1).
3. Memory access limited to load and store instructions.
4. All operations done within the registers of the CPU.
5. Fixed-length, easily decoded instruction format aligned to word boundaries, which simplifies the control logic.
6. Single-cycle instruction execution: the fetch, decode, and execute phases of two to three instructions overlap (pipelining); memory references may take more clock cycles.
7. Hardwired rather than microprogrammed control, which is faster.

Other RISC characteristics:


1. A large number of registers:
useful for storing intermediate results and for optimizing operand references, which are much faster than memory references;
the most frequently accessed operands are kept in registers.
2. Use of overlapped register windows to speed up procedure call and return.
3. An efficient instruction pipeline.
4. Compiler support for efficient translation of high-level language programs into machine language programs.
A characteristic of some RISC processors is their use of overlapped register windows to provide for parameter passing and to avoid the need for saving and restoring register values: this speeds up procedure calls and returns.

Input/Output Devices
When using a computer the text of programs, commands to the computer and data for processing have to be entered. Also information has to be returned from the computer to the user. This interaction requires the use of input and output devices. The most common input devices used by the computer are the keyboard and the mouse. The keyboard allows the entry of textual information while the mouse allows the selection of a point on the screen by moving a screen cursor to the point and pressing a mouse button. Using the mouse in this way allows the selection from menus on the screen etc. and is the basic method of communicating with many current computing systems. Alternative devices to the mouse are tracker balls, light pens and touch sensitive screens. The most common output device is a monitor which is usually a Cathode Ray Tube device which can display text and graphics. If hardcopy output is required then some form of printer is used.


Keyboard
A computer keyboard is a peripheral modeled after the typewriter keyboard. Keyboards are designed for the input of text and characters, and also to control the operation of the computer. Physically, computer keyboards are an arrangement of rectangular or near-rectangular buttons, or "keys". Keyboards typically have characters engraved or printed on the keys; in most cases, each press of a key corresponds to a single written symbol. However, to produce some symbols requires pressing and holding several keys simultaneously, or in sequence; other keys do not produce any symbol, but instead affect the operation of the computer, or the keyboard itself.


Mouse
The mouse is a device that allows the user to control the movement of the insertion point on the screen. The operator places the palm of the hand over the mouse and moves it across a mouse pad, which provides traction for the rolling ball inside the device. Movement of the ball determines the location of the I-beam pointer on the computer screen. When the operator clicks the mouse, the I-beam becomes an insertion point indicating the area of the screen being worked on. The mouse can also be clicked to activate icons, or dragged to move objects and select text.


Monitor
The front of a monitor is called the screen, and a cathode ray tube (CRT) is attached behind it. The CRT contains an electron gun that sends an electron beam to a phosphorescent screen at the front of the tube. To produce a pattern on the screen, a grid inside the CRT receives a variable voltage that causes the beam to hit the screen and make it glow at selected spots.


Printer
Printers provide a permanent record of computer output data or text on paper. There are three basic types of printers:
Daisywheel: contains a wheel with the characters placed along its circumference. To print a character, the wheel rotates to the proper position and an energized magnet then presses the letter against the ribbon.
Dot matrix: contains a set of dots along the printing mechanism. Each dot can be printed or not, depending on the specific characters that are printed on the line.
Laser printer: uses a rotating photographic drum on which the character images are imprinted. The pattern is then transferred onto the paper in the same manner as a copying machine.


Magnetic tape
Magnetic tapes are used mostly for storing files of data. Access is sequential: the records can be accessed one after another as the tape moves past a stationary read-write mechanism. Tape is one of the cheapest and slowest storage methods, and it has the advantage that tapes can be removed when not in use.


Magnetic disks
Magnetic disks have high-speed rotating surfaces coated with magnetic material. Access is achieved by moving a read-write mechanism to a track on the magnetized surface. Disks are used mainly for bulk storage of programs and data.


Input-Output Interface
There are three ways that computer buses can be used to communicate with memory and I/O:

1. Use two separate buses, one for memory and one for I/O.
2. Use a common bus for memory and I/O but separate control lines for each: the isolated I/O configuration.
3. Use one common bus for memory and I/O with common control lines: memory-mapped I/O.
Follow this link for more study material: http://www.ustudy.in/ce/arch/u2

Continued
Some computers use one common bus to transfer information between memory or I/O devices and the CPU

Isolated I/O configuration


The CPU has distinct input and output instructions (e.g. IN, OUT).
The address associated with the I/O instruction is placed on the common address bus.
An I/O read or I/O write signal is enabled to initiate a data transfer between the CPU and the selected register of an I/O device.
There are distinct read/write lines for I/O and for memory.


Memory-mapped I/O
There is one set of read and write signals for both I/O and memory.
There is no way to distinguish a memory access from an I/O access, so memory and I/O devices share the available address space.
There are no distinct I/O instructions: the same instructions are used to manipulate memory words and I/O data.


Example of I/O Interface


Continued
The interface registers communicate with the CPU through the bidirectional data bus. The address bus is used to select the interface unit and the register. An external circuit must be provided to generate the chip select (CS) signal (e.g. from the address). The register select inputs are usually connected to the two least significant lines of the address bus. The content of the selected register is transferred to the CPU when the I/O read signal is enabled; the CPU transfers data to the selected register when I/O write is enabled.

Asynchronous Data Transfer


If the registers in the I/O interface share a common clock with the CPU, transfers between the two units are said to be synchronous. In most cases the internal timing of each unit is independent of the other, each unit using its own private clock; the units are then said to be asynchronous. Asynchronous data transfer requires control signals to be transmitted between the communicating units to indicate the time at which data is being transmitted.

Continued
One way is to use a strobe signal: one of the units indicates to the other when the transfer occurs.


Continued
Another way is to use handshaking: the data is acknowledged by the receiving unit, so the sender knows whether the data has been successfully received or not.


Continued
A destination-initiated transfer using handshaking is also possible:


Continued
Handshaking allows arbitrary delays from one state to the next and permits each unit to respond at its own data rate.
If one unit is faulty, the data transfer cannot be completed; a timeout can be used to detect this kind of error: an internal timer measures the time, and if the other unit does not respond within a given period, the unit assumes that an error has occurred.


Continued
The transfer of data can be parallel or serial:
parallel transfer is fast but requires many wires; it is used over short distances and when speed is important;
serial transfer is slower but requires only one pair of conductors.
A serial asynchronous data transmission technique used in many interactive terminals employs special bits that are inserted at both ends of the character code. Each transmitted character consists of three parts:
1. a start bit,
2. the character bits (data), and
3. stop bits.
When the transmitter is idle, the data line remains in the high state (logic 1).


Continued
The first bit, called the start bit, is always 0 and is used to indicate the beginning of a character. The last bits, called stop bits, are always 1. A transmitted character can be detected by the receiver from knowledge of the transmission rules:
1. When data is not being sent, the line is kept in the 1-state.
2. The initiation of a data transmission is detected from the start bit, which is always 0.
3. The data bits always follow the start bit.
4. After the last data bit has been transmitted, a stop bit is detected when the line returns to the 1-state for at least one bit time; the line remains in the 1-state until the next start bit.
The receiver knows the transfer rate and the number of data bits, so it can examine the line at the proper times and receive valid bits.


Continued
E.g.: asynchronous serial transmission (8 data bits, 2 stop bits):


Continued
The baud rate is defined as the rate at which serial information is transmitted and is equivalent to the data transfer rate in bits per second. Assume 10 characters per second with 11 bits per character (1 start + 8 data + 2 stop): the transfer rate is 10 * 11 = 110 bits per second, i.e. a baud rate of 110.
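The framing and the arithmetic of this example can be checked with a short Python sketch (illustrative only; the LSB-first ordering of the data bits is an assumption):

def frame(char, data_bits=8, stop_bits=2):
    # Build the bit sequence for one asynchronously transmitted character.
    bits = [0]                                                 # start bit (always 0)
    bits += [(ord(char) >> i) & 1 for i in range(data_bits)]   # data bits, LSB first (assumed)
    bits += [1] * stop_bits                                    # stop bits (always 1)
    return bits

chars_per_second = 10
bits_per_char = len(frame("A"))                # 1 start + 8 data + 2 stop = 11
print(bits_per_char * chars_per_second)        # 110 -> a baud rate of 110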


Modes of Transfer
Data transfer between the central computer and I/O devices may be handled in a variety of modes:
1. Programmed I/O
2. Interrupt-initiated I/O
3. Direct memory access (DMA)


Programmed I/O
Each data item transfer is initiated by an instruction in the program: data moves from the peripheral to the CPU and from the CPU to memory, or vice versa. Transferring data under program control requires constant monitoring of the peripheral by the CPU. Once a data transfer is initiated, the CPU is required to monitor the interface to see when a transfer can again be made.

Interrupt-initiated I/O
Programmed I/O is a time-consuming method, since it keeps the CPU in a loop until the I/O unit indicates that it is ready for a data transfer. This can be avoided by using an interrupt facility and special commands that tell the interface to issue an interrupt request signal when data are available from the device. In the meantime the CPU can go on processing other programs. When the interface, which keeps monitoring the device, finds it ready, it generates an interrupt signal; the CPU then stops its current task, branches to a service program to process the I/O transfer, and afterwards returns to the task it was performing.


Direct Memory Access


In DMA, the interface transfers data into and out of the memory unit through the memory bus. The CPU initiates the transfer by supplying the interface with the starting address and the number of words to be transferred, and then proceeds to execute other tasks. When a transfer is to be made, the DMA controller requests memory cycles through the memory bus; after the memory controller grants the request, the DMA controller transfers the data directly to memory.
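A sketch of how the CPU might set up such a transfer, assuming hypothetical register names (address, word count, control) on the DMA controller; the real register layout is device-specific:

    def start_dma_transfer(dma, start_address, word_count, to_memory=True):
        """Program a hypothetical DMA controller, then return so the CPU can do other work."""
        dma["address"] = start_address            # where in memory the block starts
        dma["word_count"] = word_count            # how many words to move
        dma["control"] = {"direction": "to_memory" if to_memory else "from_memory",
                          "start": 1}             # setting 'start' lets the controller take over
        # CPU returns immediately; the controller interrupts when word_count reaches 0

    dma_registers = {}
    start_dma_transfer(dma_registers, start_address=0x2000, word_count=256)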

DMA
The CPU and the DMA controller cannot use the system bus at the same time, so some way must be found to share the bus between them. One of two methods is normally used.

1. Burst mode
The DMA controller transfers a block of data by halting the CPU and controlling the system bus for the duration of the transfer. The transfer will be as quick as the weakest link in the I/O module/bus/memory chain, since the data do not pass through the CPU, but the CPU must remain halted while the transfer takes place.

2. Cycle stealing
The DMA controller transfers data one word at a time, by using the bus during parts of the instruction cycle when the CPU is not using it, or by pausing the CPU for a single clock cycle when it needs the bus. This may slow the CPU down slightly overall, but is still very efficient.


DMA Controller
A DMA controller is used to transfer data between memory and an I/O device. The DMA controller needs the usual circuits to communicate with the CPU and the I/O device. In addition, it needs an address register, an address bus buffer, a word count register, and a control register. The address register contains the address of the desired location in memory, the word count register holds the number of words to be transferred, and the control register specifies the mode of transfer. The DMA controller communicates with the I/O device through the DMA request and DMA acknowledge lines, and with the CPU through the data bus and control lines. The RD (read) and WR (write) signals are bidirectional. When the BG (bus grant) signal is 0, the CPU can communicate with the DMA registers through the data bus. When BG is 1, the CPU has relinquished the buses and the DMA controller can communicate directly with the memory.


The connection between the DMA controller and other components in a computer system for DMA transfer is shown in figure.


DMA Transfer
The DMA request line is used to request a DMA transfer. The bus request (BR) signal is used by the DMA controller to ask the CPU to relinquish control of the buses. The CPU activates the bus grant (BG) output to inform the DMA controller that its buses are in a high-impedance state (so that they can be used for the DMA transfer). The address bus is used to address the DMA controller and the memory at the given location. The device select (DS) and register select (RS) lines are activated by addressing the DMA controller. The RD and WR lines specify either a read (RD) or a write (WR) operation on the given memory location. The DMA acknowledge line is set when the system is ready to initiate the data transfer. The data bus is used to transfer data between the I/O device and memory. When the last word of the DMA transfer has been moved, the DMA controller informs the CPU of the termination of the transfer by means of the interrupt line.


Channel I/O
This is a system traditionally used on mainframe computers, but is becoming more common on smaller systems. It is an extension of the DMA concept, where the DMA controller becomes a full-scale computer system itself which handles all communication with the I/O modules.


Memory System
Memory is the internal storage area of the computer. The term memory identifies data storage that comes in the form of chips, while the word storage is used for memory that exists on tapes or disks. Moreover, the term memory is usually used as shorthand for physical memory, which refers to the actual chips capable of holding data. Some computers also use virtual memory, which expands physical memory onto a hard disk.

Different types of memory


1. Main memory
2. Auxiliary memory
3. Associative memory
4. Virtual memory
5. Cache memory


Memory Hierarchy
Hierarchy diagram (figure): magnetic tapes and magnetic disks connect through the I/O processor to main memory; main memory communicates with the cache memory and the CPU.

Continued
The memory hierarchy system consists of all storage devices employed in a computer system from the slow but high capacity auxiliary memory to a relatively faster main memory, to an even smaller and faster cache memory accessible to the high speed processing logic. At the bottom of the hierarchy are the relatively slow magnetic tapes used to store removable files. Next are the magnetic disks used as backup storage.

Continued
The main memory occupies a central position, since it can communicate directly with the CPU and with auxiliary memory devices through an I/O processor. When programs not residing in main memory are needed by the CPU, they are brought in from auxiliary memory. Programs not currently needed in main memory are transferred to auxiliary memory to provide space for currently used programs and data. Cache is a very high speed memory: it increases the speed of processing by making current programs and data available to the CPU at a rapid rate. Cache is employed in the computer system to compensate for the speed differential between main memory access time and processor logic, which is usually much faster.


Memory Management
A program's machine language code must be in the computer's main memory in order to execute. Assuring that at least the portion of code to be executed is in memory when a processor is assigned to a process is the job of the memory manager of the operating system. This task is complicated by two other aspects of modern computing systems:
The first is multiprogramming. The second is the need to allow the programmer to use a range of program addresses which may be larger, perhaps significantly larger, than the range of memory locations actually available.


Multiprogramming
Multiprogramming means that several (at least two) processes can be active within the system during any particular time interval. These multiple active processes result from various jobs entering and leaving the system in an unpredictable manner. Pieces, or blocks, of memory are allocated to these processes when they enter the system and are subsequently freed when they leave it. Therefore, at any given moment, the computer's memory, viewed as a whole, consists of a collection of blocks, some allocated to processes active at that moment, and others free and available to a new process which may, at any time, enter the system. In general, then, programs designed to execute in this multiprogramming environment must be compiled so that they can execute from any block of storage available at the time of the program's execution. Such programs are called relocatable programs, and the idea of placing them into any currently available block of storage is called relocation.


2nd Aspect

The second aspect of modern computing systems affecting memory management is the need to allow the programmer to use a range of program addresses which may be larger, perhaps significantly larger, than the range of memory locations actually available. That is, we want to provide the programmer with a virtual memory, with characteristics (especially size) different from actual memory, and to provide it in a way that is invisible to the programmer. This is accomplished by extending the actual memory with secondary memory such as disk. Providing an efficiently operating virtual memory is another task for the memory management facility.


Relocation
Relocation of currently active programs is called dynamic relocation. If a currently executing process could be relocated, both the computer's response time and resource utilization could be improved. The actual implementation of dynamic relocation is not trivial: the compiler cannot possibly assign the correct addresses, because a program must be compiled before it can be loaded and executed. Thus program relocation, especially dynamic relocation (the moving around of currently active processes), must be done by the operating system's memory management facility. Given the needs for multiprogramming and virtual memory, and having the mechanism of dynamic relocation, it is time to take a serious look at how one might design the actual memory manager.


Actual memory management
Creating and maintaining an environment which will sustain both multiprogramming and virtual memory consists basically of designing a memory management program which will facilitate the timely movement of blocks of program code into portions of main memory when they are about to be executed, and out of main memory to secondary memory (disk) when they are no longer needed. There are basically three approaches to this problem. In the first approach, called swapping, all of the code for a particular process is transferred into main storage prior to dispatching the processor to the process. When the process becomes blocked or its time slice is used up, the entire block of code is swapped out to secondary storage, to be replaced by the block of code representing the next process to assume control of the processor, and so on. This approach, while reasonable when the size of main memory is limited, obviously causes a substantial execution delay overhead during the swapping itself. This overhead cost can sometimes be ameliorated by alternative approaches, which move parts of the code for processes rather than code for entire processes.


Continued
The other two approaches are segmentation and paging. Both recognize the fact that only the portion of a process's code which is about to execute actually needs to be in main storage at any particular time. These approaches have two major advantages over swapping.
First, if just a part of a currently executing process needs to be in main memory at a given time, then parts of more processes can be in main storage simultaneously, and thus a greater degree of multiprogramming can be supported by the system. The term degree of multiprogramming refers to the number of processes currently active within the system.

Continued
The second advantage of segmentation or paging is that the capability to move just part of a program allows one part to be loaded into memory and executed and then be replaced by another part, permitting the execution of programs requiring a very large amount of memory, perhaps in total more than the capacity of the computer's main store. This, then, is an implementation of virtual memory as it has been described above.


Continued
Segmentation and paging differ from one another primarily in the way the code for a particular process is divided. In segmentation, a program's code is divided into a number of variable-sized blocks corresponding to the logical structure of the program, such as procedures, functions, and data segments. Paging, on the other hand, divides the program code into fixed-sized blocks, called pages. It is evident that the more logical subdivision of segmentation makes program linking easier, while the fixed blocks of paging, each being interchangeable with the others, make memory management easier. In either case, since portions of a program's code are being moved around during its execution, something like a hardware relocation register will be needed to compute actual addresses in order to avoid an unacceptable slowdown in program execution times.


Main Memory (RAM)


The memory unit that communicates directly with the CPU is called the main memory. It is a relatively large and fast memory used to store programs and data during computer operation. The principal technology used for main memory is based on semiconductor integrated circuits. Integrated-circuit RAM (Random Access Memory) is a read/write memory: data can be read from or written into the memory in a random access mode. However, it is a volatile device; data are lost if power is switched off. There are two basic types of RAM in use today: dynamic RAM and static RAM.


Dynamic RAM
Dynamic RAMs (DRAM) are designed for high capacity, moderate speed, and low power consumption. Their memory cells are basically charge-storage capacitors with driver transistors. The presence or absence of charge in a capacitor is interpreted by the sense line of the RAM as 1 or 0. Since the charge in a capacitor tends to leak away, dynamic RAMs require periodic recharging to maintain the stored data. This periodic recharging is called refreshing.


Static RAM
Static RAMs (SRAM) are made of flip-flops and logic gates. Since a flip-flop is a bistable element, it can be used to store binary values. Flip-flop operation is fast and does not require refreshing. However, due to the complexity of each cell, the capacity of static RAM is low compared with dynamic RAM. For the same reason, the power consumption and the cost per unit of storage are high as well.

Comparison of the different characteristics between dynamic & static RAMs.


PARAMETER            DRAM        SRAM
Power consumption    Low         High
Capacity             High        Low
Price                Cheap       Expensive
Refreshing           Required    Not required
Speed                Slow        Fast


More types of RAM


SDRAM (synchronous dynamic RAM):
SDRAM can handle bus speeds of up to 100 MHz, and such speeds are fast approaching. SDRAM is synchronized with the system clock itself, a technical feat that had eluded PC engineers until now. SDRAM technology allows two pages of memory to be opened simultaneously.


More types of RAM


DDR (Double Data Rate SDRAM): DDR basically doubles the data transfer rate of standard SDRAM by transferring data on both the rising and falling edges of the clock. DDR memory rated at 333 MHz actually operates at 166 MHz * 2 (aka PC333 / PC2700) or at 133 MHz * 2 (PC266 / PC2100). DDR is a 2.5-volt technology that uses 184 pins in its DIMMs. It is physically incompatible with SDRAM, but uses a similar parallel bus, making it easier to implement than RDRAM, which is a different technology.
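As a rough check on the naming, a sketch of the bandwidth arithmetic, assuming the usual 64-bit (8-byte) DIMM data path:

    clock_mhz = 166                      # base clock
    transfers_per_clock = 2              # double data rate
    bus_bytes = 8                        # 64-bit DIMM width (an assumption)
    mb_per_s = clock_mhz * transfers_per_clock * bus_bytes
    print(mb_per_s)                      # ~2656 MB/s, marketed as PC2700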


Read Only Memory (ROM)


Read Only Memories (ROM) are those in which the data are permanently programmed, either at the time of manufacture or by the user prior to the memory being installed. They are non-volatile memories, yet they are still random access devices. There are different types of ROM. The term read only refers to use from the end user's point of view; writing the information into the ROM is required prior to use.

Types of ROM
1. Standard ROMs: Standard ROMs are programmed by the manufacturer. Users can only read the data or execute programs in the ROM. Usually, standard ROMs store certain standard applications for general user applications.
2. Programmable ROMs: Programmable ROMs (PROM) can be programmed permanently by the user or distributor using special equipment. They can only be programmed once. Before the data are written into the PROM, users should verify the correctness of the contents.

Types of ROM (Cont)


3. Erasable Programmable ROMs: Erasable programmable ROMs (EPROM) can be programmed and erased by the user many times. Erasure is carried out by shining high-intensity ultraviolet light through a special transparent window at the top of the memory IC. Erasing and writing to the ROM are handled by a special device called an EPROM writer.

4. Electrically Erasable Programmable ROMs: Electrically erasable programmable ROMs (EEPROM) are similar to the EPROM, but instead of erasing the entire chip, the user can erase a single bit electrically in one operation. Again, the operations require special equipment.


Auxiliary Memory
Auxiliary memory devices provide backup storage. When information not residing in main memory is required by the CPU, it is brought in from auxiliary memory; programs not currently needed in main memory are transferred to auxiliary memory to provide space for currently used programs and data. The most common auxiliary memory devices used in computer systems are magnetic disks and tapes. Other components used, but not as frequently, are magnetic drums, magnetic bubble memory, and optical disks.


Magnetic disks
A magnetic disk is a circular plate constructed of metal or plastic coated with magnetizable material. Often both sides of the disk are used, and several disks may be stacked on one spindle with read/write heads available for each surface. Bits are stored on the magnetized surface in spots along concentric circles called tracks. The tracks are commonly divided into sections called sectors. In most systems, the minimum quantity of information which can be transferred is a sector.

Continued
In units using single read/write head for each disk surface, the track address bits are used by a mechanical assembly to move the head into the specified track position before reading or writing. In other disk systems, separate read/write heads are provided for each track in each surface.

Magnetic Disk


Continued
A disk system is addressed by address bits that specify the disk number, the disk surface, the track number, and the sector within the track. After the read/write heads are positioned over the specified track, the system has to wait until the rotating disk brings the specified sector under the read/write head. Information transfer is very fast once the beginning of a sector has been reached.
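A sketch of how such an address might be decoded into fields, assuming hypothetical widths of 2 bits for the surface, 8 bits for the track, and 5 bits for the sector (real drives use different layouts):

    SECTOR_BITS, TRACK_BITS, SURFACE_BITS = 5, 8, 2   # assumed field widths

    def decode_disk_address(addr):
        """Split a packed disk address into (surface, track, sector)."""
        sector = addr & ((1 << SECTOR_BITS) - 1)
        track = (addr >> SECTOR_BITS) & ((1 << TRACK_BITS) - 1)
        surface = (addr >> (SECTOR_BITS + TRACK_BITS)) & ((1 << SURFACE_BITS) - 1)
        return surface, track, sector

    print(decode_disk_address(0b01_00001100_10110))   # (1, 12, 22)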

Continued
Disks that are permanently attached to the unit assembly and cannot be removed by the occasional user are called hard disks. A disk drive with removable disks is called a floppy disk drive.


Magnetic Tape
Magnetic tape is a strip of plastic coated with a magnetic recording medium. Bits are recorded as magnetic spots on the tape along several tracks. Read/write heads are mounted one on each track so that data can be recorded and read as a sequence of characters. Tape units can be stopped, started moving forward or in reverse, or rewound, but they cannot be started or stopped fast enough between individual characters. Information is therefore recorded in blocks referred to as records; gaps of unrecorded tape are inserted between records where the tape can be stopped. Each record on tape has an identification bit pattern at the beginning and end.


Associative memory
The time required to find an item stored in memory can be reduced considerably if stored data can be identified for access by the content of the data itself rather than by an address. A memory unit accessed by content is called an associative memory or content addressable memory (CAM). When a word is written into a CAM, no address is given; the memory is capable of finding an empty, unused location in which to store the word. When a word is read from the CAM, the content of the word, or part of the word, is specified; the memory locates all words which match the specified content and marks them for reading.


Continued
Associative memories are more expensive than a RAM because each cell must have storage capability as well as logic circuits for matching its content with an external argument.


Hardware Organization
Block diagram of associative memory (figure): an argument register (A) and a key register (K) feed an associative memory array and logic of m words with n bits per word; a match register (M) with m bits records the matching words; input, read, write, and output lines connect to the array.

Continued
The block diagram of the associative memory consists of a memory array and logic for m words with n bits per word. The argument register A and the key register K each have n bits, one for each bit of a word. The match register M has m bits, one for each memory word. Each word in memory is compared in parallel with the content of A; the words that match the bits of A set a corresponding bit in M. Reading is accomplished by sequential access to memory for those words whose corresponding bits in M have been set. K provides a mask for choosing a particular field or key in the argument word. The entire argument is compared with each memory word if K contains all 1's; otherwise, only those bits of the argument that have 1's in their corresponding positions of K are compared.


Continued
Example:
A        101 111100
K        111 000000
word 1   100 111100   no match
word 2   101 000001   match
Only the leftmost three bits (where K contains 1's) take part in the comparison, so word 2 matches the argument while word 1 does not.
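A minimal sketch of the masked comparison the CAM performs, reproducing the example above in software (the hardware compares every word in parallel; here we simply loop):

    def cam_match(argument, key, words):
        """Return a match-register list: 1 where a word equals the argument under the key mask."""
        return [1 if (w & key) == (argument & key) else 0 for w in words]

    A = 0b101111100
    K = 0b111000000                      # only the leftmost 3 bits take part
    words = [0b100111100, 0b101000001]   # word 1, word 2
    print(cam_match(A, K, words))        # [0, 1] -> word 2 matches, word 1 does not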


Continued
Match logic: the match logic for each word can be derived from the comparison algorithm.
Read operation: the bits of the match register are scanned one at a time, and the matched words (those with a 1 in the corresponding bit position of M) are read in sequence by applying a read signal to each word line whose corresponding M bit is 1.
Write operation: a special register, called the tag register, is used for this purpose.

Cache memory
Locality of Reference
During the course of the execution of a program, memory references tend to cluster, e.g. within loops.


Cache
A cache is a small amount of fast memory that sits between normal main memory and the CPU. It may be located on the CPU chip or module.


Cache operation - overview


1. The CPU requests the contents of a memory location.
2. The cache is checked for this data.
3. If present, the data is delivered from the cache (fast).
4. If not present, the required block is read from main memory into the cache, and then delivered from the cache to the CPU.
The cache includes tags to identify which block of main memory is in each cache slot; a lookup sketch follows.
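A sketch of that lookup for a direct-mapped cache, assuming a hypothetical read_main_memory_block() helper and an assumed block size; this illustrates the tag check, not any particular hardware:

    BLOCK_SIZE = 16   # bytes per block (an assumed value)

    def cache_read(cache, address, num_slots, read_main_memory_block):
        """Direct-mapped lookup: the slot is chosen by block number, the tag identifies the block."""
        block = address // BLOCK_SIZE
        slot = block % num_slots
        tag = block // num_slots
        entry = cache.get(slot)
        if entry is not None and entry["tag"] == tag:   # hit: deliver from the cache
            return entry["data"]
        data = read_main_memory_block(block)            # miss: read the block from main memory
        cache[slot] = {"tag": tag, "data": data}        # place it in the slot, then deliver
        return data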

Cache Design
Size
Mapping function
Replacement algorithm
Write policy
Block size
Number of caches


Size does matter


Cost: cache is expensive.
Speed: cache is faster (up to a point), but checking the cache for data takes time.
The cache should be small enough that the average cost is close to that of main memory (RAM) alone, and large enough that the average access time is close to that of the cache alone.

Write Policy
A cache block must not be overwritten unless main memory is up to date. This is complicated because multiple CPUs may have individual caches, and I/O may address main memory directly.


Write through
All writes go to main memory as well as to the cache, so multiple CPUs can monitor main memory traffic to keep their local caches up to date. The drawbacks are a lot of memory traffic and slower writes.


Write back
Updates are initially made in the cache only, and an update (dirty) bit for the cache slot is set when an update occurs. If a block is to be replaced, it is written back to main memory only if its update bit is set. The drawbacks are that other caches can get out of sync and that I/O must access main memory through the cache.
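A sketch of the write-back bookkeeping, assuming a hypothetical write_block_to_main_memory() helper; the point is that main memory is touched only when a dirty block is evicted:

    def cache_write(cache, slot, tag, data, write_block_to_main_memory):
        """Write into the cache and mark the slot dirty; defer the memory write."""
        old = cache.get(slot)
        if old and old["dirty"] and old["tag"] != tag:
            write_block_to_main_memory(old["tag"], old["data"])   # evict: flush the dirty block
        cache[slot] = {"tag": tag, "data": data, "dirty": True}   # update the cache only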


Virtual Memory
Virtual memory is a system by which the machine or operating system fools processes running on the machine into thinking that they have much more memory to work with than the capacity of RAM would indicate. It does this by keeping the most recently used items in RAM, storing the less used items on the slower disk, and interchanging data between the two whenever a disk access is made. In this way, memory appears to programs to be a full 32-bit address space, when in fact the physical memory is probably only a fraction of that.


Address space and Memory space


An address used by the programmer will be called a virtual address, and the set of such addresses the Address space. An address in main memory is called a physical address. The set of such locations is called the Memory space.


Address mapping using pages

The physical memory is broken down into groups of equal size called blocks, which may range from 64 to 4096 words each. The term page refers to groups of address space of the same size. Portions of programs are moved from auxiliary memory to main memory in records equal to the size of a page.

Continued
Mapping from address space to memory space is facilitated if each virtual address is considered to be represented by two numbers: a page number and a line within the page. In a computer with 2^p words per page, p bits are used to specify a line address and the remaining high-order bits of the virtual address specify the page number.
Example (figure): an address space of N = 8K = 2^13 words divided into pages 0 through 7, and a memory space of M = 4K = 2^12 words divided into blocks 0 through 3.

Continued
In the example, a virtual address has 13 bits. If each page contains 1K words (1K = 2^10 = 1024), the high-order three bits of a virtual address specify one of the eight pages and the low-order 10 bits give the line address within the page. The only mapping required is from a page number to a block number; a sketch follows.
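A sketch of this address split and of the page-to-block mapping, using the figures above; the page-table contents are made up for illustration:

    PAGE_BITS = 10                         # 2^10 = 1K words per page

    def translate(virtual_address, page_table):
        """Map a 13-bit virtual address to a physical address via a page table."""
        page = virtual_address >> PAGE_BITS                # high-order 3 bits
        line = virtual_address & ((1 << PAGE_BITS) - 1)    # low-order 10 bits
        block = page_table[page]                           # page number -> block number
        return (block << PAGE_BITS) | line

    page_table = {0: 3, 2: 0, 5: 1, 7: 2}              # hypothetical resident pages
    print(translate(0b101_0000000011, page_table))     # page 5 -> block 1, line 3 -> 1027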


Continued
The organization of the memory mapping table in a paged system is shown in figure below:


Page Replacement
Page replacement is required on a page fault, when a referenced page is not in main memory and a resident page must be chosen for removal. Common replacement algorithms are FIFO and LRU.
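A minimal sketch of the FIFO policy, counting page faults over a reference string; LRU would differ only in which resident page is chosen as the victim:

    from collections import deque

    def fifo_page_faults(references, num_frames):
        """Count page faults for a reference string under FIFO replacement."""
        frames, order, faults = set(), deque(), 0
        for page in references:
            if page in frames:
                continue                      # hit: nothing to do
            faults += 1
            if len(frames) == num_frames:     # memory full: evict the oldest page
                frames.remove(order.popleft())
            frames.add(page)
            order.append(page)
        return faults

    print(fifo_page_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))  # 9 faults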

