Unit 1
mahes@nitt.edu
Time Table
Slot: C1
Credits: 3
Venue: G9, Orion Day Time
Function and structure of a computer
Functional components of a computer
Interconnection of components
Performance of a computer
Syllabus
Unit 2: Representation of Instructions
Machine instructions
Memory locations & Addresses, Operands
Addressing modes
Instruction formats, Instruction sets
Instruction set architectures - CISC and RISC architectures
Superscalar Architectures
Fixed point and floating point operations
Unit 3: Basic Processing Unit
Fundamental concepts
ALU, Control unit
Multiple bus organization
Hardwired control, Micro programmed control
Pipelining
Data hazards, Instruction hazards
Influence on instruction sets
Data path and control considerations, Performance considerations
Unit 4: Memory organization
Basic concepts: Semiconductor RAM memories, ROM, Speed - Size and cost
Memory Interfacing circuits
Cache memory, Improving cache performance
Memory management unit
Shared/Distributed Memory
Cache coherency in multiprocessor
Segmentation, Paging
Concept of virtual memory, Address translation
Secondary storage devices.
Unit 5: I/O Organization
Accessing I/O devices
Input/output programming
Interrupts, Exception Handling
Direct Memory Access
Buses, I/O interfaces:
Serial port, Parallel port
PCI bus, SCSI bus, USB bus
FireWire and InfiniBand
I/O peripherals.
Function of a computer:
A computer is a complex system.
The hierarchical nature of most complex systems is the key to describe them.
A hierarchical system is a set of interrelated subsystems.
The subsystems are themselves hierarchical in structure until we reach some lowest level of
elementary subsystem.
The hierarchical nature is essential to both design and description.
The designer need only deal with a particular level of the system.
At each level, the system consists of a set of components and their
interrelationships.
The behavior at each level depends only on a simplified, abstracted
characterization of the system at the next lower level.
At each level, the designer is concerned with structure and function.
Structure: The way in which the components are interrelated
Function: The operation of each individual component as part of the structure
Description of the computer system:
Top-down: Begin with a top view and decompose the system into
its subparts. This is the clearest and most effective approach.
Bottom-up: Start at the bottom and build up to a complete description.
The basic functions that a computer can perform are shown in
Fig. 1.1.
There are only four functions:
Data processing
Data storage
Data movement
Control
The computer must be able to process data.
The computer may process data on the fly.
The computer must temporarily store at least those pieces of
data that are being processed, so a short-term data storage
function is needed.
Fig. 1.1 A Functional View of the Computer
The computer may perform a long-term data storage function
also.
Files of data are stored on the computer for subsequent
retrieval and update.
The computer must be able to move data between itself and the
outside world.
Input–Output (I/O): Process when data are received from or
delivered to a device that is directly connected to the computer.
I/O devices are called peripherals.
When data are moved over longer distances, to or from a remote
device, the process is known as data communications.
There must be control of store, process, and move functions.
This control is exercised by the individual(s) who provide the
computer with instructions.
A control unit manages the computer’s resources and
orchestrates the performance of its functional parts in response
to those instructions.
Figure depicts four possible operations.
The simplest possible depiction of a computer.
The computer interacts with its external environment through:
Peripheral devices or communication lines.
There may be one or more of each of the aforementioned
components.
Traditionally, there has been just a single processor.
Multiple processors in a single computer are used in recent
times.
The most complex component is the CPU. Its major
structural components are:
Information in a computer: Data or Instructions.
Instructions command:
Transfer of information within a computer or between computers.
Specify the arithmetic and logic operations to be performed
Program: A list of instructions which performs a task.
Processor fetches the program instructions from the memory and performs the desired operations.
Data: Numbers and characters that are used as operands by the instructions.
Each instruction, number, or character is encoded as a string of binary digits called bits.
Functional components of a computer:
Input Unit:
Computers accept coded information through input units.
Some of the input devices are:
Keyboard
Touchpad, Mouse, Joystick, and Trackball.
Microphones
Cameras
Memory Unit:
Stores programs and data.
Two classes of storage: primary and secondary.
Functional components of a computer: Memory unit
Primary Memory: (aka main memory)
A fast memory that operates at electronic speeds.
Programs must be stored in this memory while they are being executed.
Consists of semiconductor storage cells.
Handled in groups of fixed size called words.
The number of bits in each word is the word length, typically 16, 32, or 64 bits.
A distinct address is associated with each word location.
Addresses are consecutive numbers, starting from 0.
A particular word is accessed by:
Specifying its address.
Issuing a control command to the memory.
Random access memory: A memory in which any location can be accessed in a short and fixed
amount of time by specifying its address.
Memory access time: The time required to access one word. 1 to 100 ns for RAMs.
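The word-addressed model described above can be sketched in a few lines of Python. This is an illustrative sketch only: the class name `WordMemory` and its `read` and `write` methods are hypothetical, and the 16-bit word length is just one of the sizes listed above.

```python
class WordMemory:
    """Toy word-addressable memory: addresses run 0 .. num_words - 1."""

    def __init__(self, num_words, word_length=16):
        self.word_length = word_length
        self.mask = (1 << word_length) - 1   # keeps values within one word
        self.cells = [0] * num_words         # every word starts at 0

    def _check(self, address):
        if not 0 <= address < len(self.cells):
            raise IndexError(f"address {address} out of range")

    def read(self, address):
        """Read command: return the word stored at the given address."""
        self._check(address)
        return self.cells[address]

    def write(self, address, value):
        """Write command: store value, truncated to the word length."""
        self._check(address)
        self.cells[address] = value & self.mask


memory = WordMemory(num_words=8, word_length=16)
memory.write(3, 0x1234)
memory.write(4, 0x1FFFF)      # does not fit in 16 bits, so it is truncated
print(hex(memory.read(3)))    # 0x1234
print(hex(memory.read(4)))    # 0xffff
```

Any address can be read or written in a single step, which is the defining property of random access.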
Functional components of a computer: Memory unit
Cache Memory:
Used to hold sections of a program that are currently being executed.
Associated data are also stored.
The cache is tightly coupled with the processor.
Usually contained on the same integrated-circuit chip.
The cache is to facilitate high instruction execution rates.
As instructions are fetched into the processor, a copy is placed in the cache.
Data needed by instructions are also copied from main memory into the cache.
Repeatedly used instructions and data are directly fetched from cache.
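A minimal sketch of this behavior, assuming a cache of unlimited size with no replacement policy (real caches are small and must evict entries). The class `SimpleCache` and its counters are hypothetical names used only for illustration.

```python
class SimpleCache:
    """Toy cache in front of a main memory (a plain list of words)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.lines = {}      # address -> cached copy of the word
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            self.hits += 1       # served directly from the cache
        else:
            self.misses += 1     # fetched from main memory and copied
            self.lines[address] = self.main_memory[address]
        return self.lines[address]


main_memory = list(range(100, 116))   # 16 words holding arbitrary values
cache = SimpleCache(main_memory)

# A loop that touches the same four addresses three times.
for _ in range(3):
    for address in (0, 1, 2, 3):
        cache.read(address)

print(cache.misses, cache.hits)   # 4 8
```

Only the first pass through the loop goes to main memory. The two later passes are served from the cache, which is why repeatedly used instructions and data benefit most.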
Functional components of a computer: Memory unit
Secondary Storage:
Primary memory is expensive and does not retain information when power is turned off.
Secondary storage is less expensive and retains information without power.
It is used when large amounts of data and many programs have to be stored,
particularly data that are accessed infrequently.
Access times for secondary storage are longer than for primary memory.
Secondary storage devices:
Magnetic disks.
Optical disks (CD, DVD).
Flash memory.
Functional components of a computer: ALU
Computer operations are executed in the arithmetic and logic unit (ALU) of the processor.
Arithmetic or logic operation: addition, subtraction, multiplication, division, or comparison of numbers.
Operations are performed by bringing operands into the processor’s ALU.
To add two numbers:
The operands are brought into the processor from memory.
The addition is carried out by the ALU.
The sum may then be stored in the memory or retained in the processor for immediate use.
Operands brought into the processor are stored in high-speed storage elements called registers.
Each register can store one word of data.
Access times to registers are even shorter than access times to the cache unit on the processor chip.
Functional components of a computer: Control Unit
The memory, arithmetic and logic, and I/O units store and process information
and perform input and output operations.
Their operations need to be coordinated.
The control unit sends control signals to other units and senses their states.
I/O transfers are controlled by program instructions.
These instructions identify the devices involved and the information to be transferred.
Control circuits are responsible for generating the timing signals that:
Govern the transfers.
Determine when a given action is to take place.
The control circuitry is physically distributed throughout the computer.
Functional components of a computer:
A configuration of logic components can be constructed for a specific
computation.
Connecting the various components in the desired
configuration is a form of ‘hardwired’ programming.
With general-purpose hardware, the system accepts data and
control signals.
Only a new set of control signals is needed, instead of rewiring the
hardware for each new program.
Data and instructions must be put into the system.
Input modules are needed.
Output module serves the results.
Functional components of a computer:
I/O devices are sequential.
Programs may require non-sequential execution.
Memory is used for non-sequential data access and
instruction execution.
The von Neumann architecture stores both instructions
and data in the same memory.
MAR (memory address register): specifies the address in memory for the next
read or write.
MBR (memory buffer register): contains the data to be written into memory
or receives the data read from memory.
I/O AR (I/O address register): specifies a particular I/O device.
I/O BR (I/O buffer register): used for the exchange of data between an
I/O module and the CPU.
Computer function: In detail
Computer executes instructions.
Instruction processing consists of two steps:
The processor reads (fetches) instructions from memory one at a time (fetch cycle).
Executes each instruction (execute cycle).
Program execution consists of repeating the process of instruction fetch and instruction execution.
The processing required for a single instruction is called an instruction cycle.
Program execution halts only if:
The machine is turned off.
Unrecoverable error occurs.
Halt signal is encountered.
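The fetch and execute cycles can be illustrated with a toy machine. This is a sketch under stated assumptions: the four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`), the single accumulator, and the tuple encoding of instructions are hypothetical and do not correspond to any real instruction set.

```python
def run(memory):
    """Repeat the fetch and execute cycles until HALT is encountered."""
    pc = 0     # program counter: address of the next instruction
    acc = 0    # accumulator register
    while True:
        opcode, operand = memory[pc]   # fetch cycle
        pc += 1
        if opcode == "LOAD":           # execute cycle
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


# Instructions (addresses 0-3) and data (addresses 4-6) share one memory.
memory = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", None),
    3,    # address 4: first operand
    2,    # address 5: second operand
    0,    # address 6: result
]
run(memory)
print(memory[6])   # 5
```

Each pass through the `while` loop is one instruction cycle: one fetch followed by one execute.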
The oac state appears twice: an instruction may involve a read, a write, or both.
The diagram allows for multiple operands and multiple results (some machines require this).
The PDP-11 instruction ADD A,B results in the following sequence of states:
iac, if, iod, oac, of, oac, of, do, oac, os.
On some machines, a single instruction can specify an operation to be performed on a vector:
One-dimensional array of numbers
A string (one-dimensional array) of characters
This would involve repetitive operand fetch and/or store operations.
Interrupts:
This is a mechanism by which other modules (I/O, memory) may interrupt the normal processing of the
processor.
The most common classes of interrupts are shown in the table.
Interrupts are provided primarily as a way to improve processing efficiency.
Most external devices are much slower than the processor.
Suppose the processor is transferring data to a printer using the instruction cycle scheme of Fig. 3.3.
After each write operation, the processor must pause and remain idle until the printer catches up.
The length of this pause may be on the order of many hundreds or even thousands of instruction
cycles that do not involve memory.
This is a very wasteful use of the processor.
Interrupts:
The user program performs a series of WRITE calls interleaved with
processing.
Code segments 1, 2, and 3 refer to sequences of instructions that do not
involve I/O.
The WRITE calls are to an I/O program that is a system utility.
System utility will perform the actual I/O operation.
The I/O program consists of three sections:
(a) No Interrupts
Interrupts and instruction cycle:
With interrupts, the processor can be engaged in executing other instructions
while an I/O operation is in progress.
The user program makes a system call in the form of a WRITE call.
The I/O program invoked in this case consists only of the preparation code
and the actual I/O command.
After these few instructions have been executed, control returns to the user program.
Meanwhile, the external device is busy accepting data from computer
memory and printing it.
I/O operation is conducted concurrently with the execution of instructions in
the user program.
The processor now proceeds to the fetch cycle and
fetches the first instruction in the interrupt handler program.
This will serve the interrupt.
The interrupt handler program is generally part of
the operating system.
Interrupts and instruction cycle:
The user program reaches the second WRITE call.
The I/O operation spawned by the first call is incomplete.
The user program is hung up at that point.
The new WRITE call can be processed only after the preceding I/O operation is completed.
I/O operation overlaps with the execution of user instructions.
Gain in efficiency.
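The gain can be made concrete with a small timing model. The sketch below assumes a single device, one WRITE between consecutive code segments, and no interrupt-handling overhead. The function name `completion_time` and all time values are hypothetical.

```python
def completion_time(segments, io_time, use_interrupts):
    """Time at which all code segments and all WRITE operations are done.

    A WRITE is issued after every segment except the last one.
    Interrupt-handling overhead is ignored in this sketch.
    """
    clock = 0         # current processor time
    device_free = 0   # time at which the device finishes its current WRITE
    for index, segment in enumerate(segments):
        clock += segment                      # run the code segment
        if index == len(segments) - 1:
            break                             # no WRITE after the last segment
        if use_interrupts:
            clock = max(clock, device_free)   # hung up only if device is busy
            device_free = clock + io_time     # I/O proceeds in the background
        else:
            clock += io_time                  # processor idles during the I/O
            device_free = clock
    return max(clock, device_free)


segments = [10, 10, 10]   # hypothetical times for code segments 1, 2, 3

# Short I/O wait: each WRITE finishes before the next one is issued.
print(completion_time(segments, 5, False), completion_time(segments, 5, True))    # 40 30
# Long I/O wait: the second WRITE must wait for the first to complete.
print(completion_time(segments, 15, False), completion_time(segments, 15, True))  # 60 40
```

In both cases the interrupt-driven version finishes sooner. With the long I/O wait the program is still hung up at the second WRITE, but part of the I/O time overlaps with useful work.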
Interrupts and instruction cycle:
A revised instruction cycle state diagram that includes interrupt cycle processing is shown in Fig. 3.12.
Multiple interrupts:
Multiple interrupts can occur in a system.
A program may be receiving data from a communications line and printing results.
The printer will generate an interrupt every time that it completes a print operation.
The communication line controller will generate an interrupt every time a unit of data arrives.
It is possible for a communications interrupt to occur while a printer interrupt is being processed.
Two approaches to deal with multiple interrupts:
First approach: disable interrupts while an interrupt is being processed. The processor ignores any new
interrupt request signal.
An interrupt that occurs during this time remains pending. The processor checks for it after interrupts
are re-enabled.
Second approach: define priorities for interrupts.
Multiple interrupts:
When a user program is executing and an interrupt occurs, interrupts are disabled immediately.
After the interrupt handler routine completes, interrupts are enabled before resuming the user
program.
The processor checks to see if additional interrupts have occurred.
This approach does not take into account relative priority or time-critical needs.
Example:
When input arrives from the communications line,
it may need to be absorbed rapidly to make room for more input.
If the first batch of input has not been processed before
the second batch arrives, data may be lost.
Multiple interrupts:
Second approach is to define priorities for interrupts.
This allows an interrupt of higher priority to cause a lower-priority interrupt handler to be itself
interrupted (Fig. 3.13 (b)).
Consider a system with three I/O devices:
A printer (priority 2), a disk (priority 4), and a communications line (priority 5).
Multiple interrupts:
A user program begins at t = 0.
At t = 10, a printer interrupt occurs.
User information is placed on the system stack and execution continues at the printer interrupt
service routine (ISR).
While the printer ISR is still executing, at t = 15, a communications interrupt occurs.
This interrupt is honored because the communications line has the higher priority (5 versus 2).
The printer ISR is interrupted.
Its state is pushed onto the stack.
Execution continues at the communications ISR.
A disk interrupt occurs at t = 20.
This interrupt is held because the disk's priority (4) is lower than that of the communications line (5).
The communications ISR runs to completion.
Multiple interrupts:
The communications ISR is complete (t = 25).
The previous processor state (printer ISR) is restored.
Before executing any instruction in printer ISR (P2), processor honors disk interrupt (P4).
Control transfers to the disk ISR.
When disk ISR is complete at t = 35, printer ISR is resumed.
When printer ISR completes at t = 40, control finally returns to the user program.
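The timeline above can be reproduced with a short simulation of prioritized, nested interrupts. This is a sketch: the function name `simulate` is hypothetical, and the ISR length of 10 time units is an assumption inferred from the timeline (for example, the communications ISR runs from t = 15 to t = 25), not a figure stated in the text.

```python
def simulate(arrivals, service_time=10):
    """arrivals: list of (time, name, priority). Returns {name: finish_time}."""
    pending = sorted(arrivals)   # interrupts that have not occurred yet
    held = []                    # occurred but not yet honored: (priority, name)
    stack = []                   # nested ISRs: [priority, name, remaining]
    finish = {}
    t = 0
    while pending or held or stack:
        # Register interrupts that occur at time t.
        while pending and pending[0][0] <= t:
            _, name, priority = pending.pop(0)
            held.append((priority, name))
        # Honor a held request only if it outranks the ISR now running.
        held.sort(reverse=True)
        while held and (not stack or held[0][0] > stack[-1][0]):
            priority, name = held.pop(0)
            stack.append([priority, name, service_time])
        # Run the ISR on top of the stack for one time unit.
        if stack:
            stack[-1][2] -= 1
            if stack[-1][2] == 0:
                finish[stack.pop()[1]] = t + 1
        t += 1
    return finish


arrivals = [(10, "printer", 2), (15, "communications", 5), (20, "disk", 4)]
finish = simulate(arrivals)
print(finish)   # {'communications': 25, 'disk': 35, 'printer': 40}
```

The stack plays the role of the system stack in the text: an interrupted ISR stays on it until every higher-priority ISR above it has run to completion.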
I/O Function:
So far, the discussion has covered the operation of the computer as controlled by the processor
and the interaction between processor and memory.
An I/O module (e.g., a disk controller) can exchange data directly with the processor.
The processor can read from or write to memory by specifying the address of a location.
The processor can also read data from or write data to an I/O module.
The processor identifies a specific device that is controlled by a particular I/O module.
I/O instructions rather than memory-referencing instructions are executed in this mode.
It is desirable to allow I/O exchanges to occur directly with memory.
The processor grants to an I/O module the authority to read from or write to memory.
Such an I/O-memory transfer can occur without tying up the processor.
During the transfer, the I/O module issues read or write commands to memory,
relieving the processor of responsibility for the exchange.
This operation is known as direct memory access (DMA).
Interconnection Structures:
A computer consists of a set of components or modules of three basic types:
Processor, memory, I/O.
The set of components communicate with each other.
A computer is a network of basic modules.
There must be paths for connecting the modules.
Interconnection Structure: The collection of paths connecting the various modules.
Fig. 3.15 suggests the types of exchanges needed by indicating the major forms of input and output for
each module type.
The wide arrows represent multiple signal lines carrying multiple bits of information in parallel.
Each narrow arrow represents a single signal line.
Interconnection Structures:
Memory:
A memory module will consist of N words of equal length.
Each word is assigned a unique numerical address (0, 1, . . . , N – 1).
A word of data can be read from or written into the memory.
The nature of the operation is indicated by read and write control signals.
The location for the operation is specified by an address.
I/O module:
From an internal (to the computer system) point of view, I/O is functionally similar to memory.
There are two operations, read and write.
An I/O module may control more than one external device.
Each of the interfaces to an external device is referred to as a port.
Each port is given a unique address (e.g., 0, 1, . . . , M – 1).
There are external data paths for the input and output of data with an external device.
An I/O module may be able to send interrupt signals to the processor.
Interconnection Structures:
Processor:
The processor reads in instructions and data.
Writes out data after processing.
Uses control signals to control the overall operation of the system.
It also receives interrupt signals.
The interconnection structure must support the following types of transfers:
Both machines may execute the original high-level language instruction in about the same time.
In this example, if the CISC machine is rated at 1 MIPS, the RISC machine would be rated at 4 MIPS.
But both do the same amount of high-level language work in the same amount of time.
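The arithmetic behind these ratings can be sketched as follows. The instruction counts (one CISC instruction versus four RISC instructions per high-level statement) and the 1 microsecond execution time are assumptions chosen to match the 1 MIPS and 4 MIPS figures above. The function `mips_rate` is a hypothetical helper.

```python
def mips_rate(instruction_count, execution_time_us):
    """MIPS = instructions / (time in seconds * 10^6) = instructions per microsecond."""
    return instruction_count / execution_time_us


statement_time_us = 1    # assumed: both machines need 1 microsecond per statement
cisc_instructions = 1    # assumed: one CISC instruction per statement
risc_instructions = 4    # assumed: four RISC instructions per statement

cisc_mips = mips_rate(cisc_instructions, statement_time_us)
risc_mips = mips_rate(risc_instructions, statement_time_us)
print(cisc_mips, risc_mips)   # 1.0 4.0
```

The RISC rating is four times higher although the statement takes the same time on both machines, which is why a MIPS rating alone says little about high-level language performance.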
The performance of a processor on a program may not be useful in determining its performance on a very
different type of application.
Performance of a computer: Benchmarks
A set of benchmark programs were developed for this purpose.
The same set of programs can be run on different machines and the execution times compared.
The desirable characteristics of a benchmark program are:
It is written in a high-level language, making it portable across different machines.
It is representative of a particular kind of programming style, such as systems programming, numerical
programming, or commercial programming.
It can be measured easily.
It has wide distribution.
A benchmark suite is a collection of programs.
Defined in a high-level language.
These programs together attempt to provide a representative test of a computer in a particular application
or system programming area.
The best known collection of benchmark suites is defined and maintained by the System Performance
Evaluation Corporation (SPEC).
SPEC CPU2006 is for processor-intensive applications.
Appropriate for measuring performance for applications that spend most of their time doing computation
rather than I/O.
Consists of 17 floating-point programs written in C, C++, and Fortran, and
12 integer programs written in C and C++.
Contains over 3 million lines of code.
Older SPEC CPU: SPEC CPU2000, SPEC CPU95, SPEC CPU92, and SPEC CPU89
Other SPEC suites:
Performance of a computer: Averaging Results
Run a number of different benchmark programs on a machine and then average the results.
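With m different benchmark programs, the arithmetic mean of the rates is:
$R_A = \frac{1}{m} \sum_{i=1}^{m} R_i$
Ri is the high-level language instruction execution rate for the ith benchmark program.
An alternative is to take the harmonic mean:
$R_H = \frac{m}{\sum_{i=1}^{m} \frac{1}{R_i}}$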
The user is concerned with the execution time of a system, not its execution rate.
The arithmetic mean rate is proportional to the sum of the inverses of the execution times.
It is not inversely proportional to the sum of the execution times.
The harmonic mean instruction rate is the inverse of the average execution time.
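The difference between the two means can be checked numerically. The rates below are hypothetical, and the sketch assumes every benchmark executes the same number of instructions, so that execution time is simply the instruction count divided by the rate.

```python
rates = [100.0, 200.0, 400.0]   # hypothetical R_i, in millions of instructions/s
m = len(rates)

arithmetic_mean = sum(rates) / m
harmonic_mean = m / sum(1.0 / r for r in rates)

# Assume each benchmark executes 1000 million instructions.
instructions = 1000.0
times = [instructions / r for r in rates]     # 10.0, 5.0, 2.5 seconds
average_time = sum(times) / m
rate_from_average_time = instructions / average_time

print(round(arithmetic_mean, 1))          # 233.3
print(round(harmonic_mean, 1))            # 171.4
print(round(rate_from_average_time, 1))   # 171.4
```

The harmonic mean matches the rate derived from the average execution time, while the arithmetic mean overstates it because the fastest benchmark dominates the sum.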
Performance of a computer: Averaging Results
Two fundamental metrics are of interest from SPEC:
Speed metric: Measures the ability of a computer to complete a single task.
Rate metric: Measures the throughput or rate of a machine carrying out a number of tasks.
SPEC defines a base runtime for each benchmark program using a reference machine.
Results are reported as the ratio of the reference run time to the system run time:
$r_i = \frac{Tref_i}{Tsut_i}$
Trefi is the execution time of benchmark program i on the reference system.
Tsuti is the execution time of benchmark program i on the system under test.
Example: Sun Blade 6250
One of the SPEC CPU2006 integer benchmarks is 464.h264ref.
Tsuti = 934 sec, Trefi = 22136 sec, so the ratio is 22136/934 = 23.7
Overall performance measure for the system under test is average values for the ratios for all 12 integer
benchmarks.
SPEC specifies the use of a geometric mean:
$r_G = \left( \prod_{i=1}^{n} r_i \right)^{1/n}$
ri is the ratio for the ith benchmark program, and n is the number of benchmark programs.
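A short sketch of the ratio and geometric mean calculations. The 464.h264ref run times are the ones quoted above. The list `example_ratios` is hypothetical and is not measured data for any real system. The helper names are also hypothetical.

```python
import math


def spec_ratio(t_ref, t_sut):
    """Ratio of reference run time to run time on the system under test."""
    return t_ref / t_sut


def geometric_mean(ratios):
    """n-th root of the product of n ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


h264ref_ratio = spec_ratio(22136, 934)
print(round(h264ref_ratio, 1))    # 23.7

example_ratios = [2.0, 4.0, 8.0]  # hypothetical ratios, for illustration only
print(round(geometric_mean(example_ratios), 6))   # 4.0
```

Summing logarithms instead of multiplying the ratios directly avoids overflow when many large ratios are combined.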
For the Sun Blade 6250: