
Topic 12-Computer Evolution and Performance Issues


Topic 1 & 2: Basic Concepts, Computer Evolution and Performance Issues
CSC1104
+ Learning Objectives
After studying this chapter, you should be able to:

 Explain the general functions and structure of a digital computer.
 Present an overview of the evolution of computer technology from early digital computers to the latest microprocessors.
 Present an overview of the evolution of the x86 architecture.
 Define embedded systems.
 Understand the key performance issues that relate to computer design.
 Explain the reasons for the move to multicore organization, and understand the trade-off between cache and processor resources on a single chip.
 Distinguish among multicore, MIC, and GPGPU organizations.

+ COMPUTER ARCHITECTURE
 The structure and behaviour of the various
functional modules of the computer and how
they interact to provide the processing needs
of the user.
 Architectural attributes include:
 Instruction set of the computer
 Number of bits used to represent
various data types
 I/O mechanisms
 Addressing Techniques
 etc.
+ Computer Organisation
 The way the hardware components are
connected together to form a computer
system.

 Organisational attributes include hardware details visible to the user, e.g. the interfaces and memory technology.

N.B. Many manufacturers offer a number of different computer models (organizations) that all share the same architecture and thus differ only in cost.
Computer Architecture
• Attributes of a system visible to the programmer
• Have a direct impact on the logical execution of a program
Architectural attributes include:
• Instruction set, number of bits used to represent various data types, I/O mechanisms, techniques for addressing memory

Computer Organization
• The operational units and their interconnections that realize the architectural specifications.
Organizational attributes include:
• Hardware details transparent to the programmer, control signals, interfaces between the computer and peripherals, memory technology used

+ IBM System/370 Architecture
 Many computer manufacturers offer a family of computer models, all with the same architecture but with differences in organization.
 Consequently, the different models in the family have different price and performance characteristics.
 A particular architecture may span many years and encompass a number of different computer models, its organization changing with changing technology.
 IBM System/370 architecture
 Was introduced in 1970
 Included a number of models
 Could upgrade to a more expensive, faster model without having to
abandon original software
 New models are introduced with improved technology, but retain the
same architecture so that the customer’s software investment is
protected
 Architecture has survived to this day as the architecture of IBM’s
mainframe product line.
+ Structure and Function
 Hierarchical system
 Set of interrelated subsystems
 Hierarchical nature of complex systems is essential to both their design and their description
 Designer need only deal with a particular level of the system at a time.
 Concerned with structure and function at each level.
 At each level, the system consists of a set of components and their interrelationships.
 Structure
 The way in which components relate to each other.
 Function
 The operation of individual components as part of the structure

+
Function
 There are four basic functions that a computer can perform:
 Data processing
 Data may take a wide variety of forms and the range of processing
requirements is broad
 Data storage
 Short-term
 Long-term
 Data movement
 The computer’s operating environment consists of devices that serve as either sources or
destinations of data
 Input-output (I/O) - when data are received from or delivered to a device
(peripheral) that is directly connected to the computer
 Data communications – when data are moved over longer distances, to or from
a remote device.
 Control
 A control unit manages the computer’s resources and coordinates the
performance of its functional parts in response to instructions.

[Figure 1.1 A Top-Down View of a Computer. The computer consists of I/O, main memory, the system bus, and the CPU. The CPU consists of registers, the ALU, the internal bus, and the control unit. The control unit consists of sequencing logic, control unit registers and decoders, and control memory.]

+ There are four main structural components of the computer:
 CPU – controls the operation of the computer and performs its data processing functions
 Main Memory – stores data
 I/O – moves data between the computer and its external environment
 System Interconnection – some mechanism that provides for communication among CPU, main memory, and I/O

+ CPU
Major structural components:
 Control Unit
 Controls the operation of the CPU and hence the computer
 Arithmetic and Logic Unit (ALU)
 Performs the computer’s data processing function
 Registers
 Provide storage internal to the CPU
 CPU Interconnection
 Some mechanism that provides for communication among the control unit, ALU, and registers

+
Multicore Computer Structure
 Contemporary computers generally have multiple processors.

 When these processors all reside on a single chip, the term multicore computer is used, and each processing unit is called a core.

 Central processing unit (CPU)


 Portion of the computer that fetches and executes instructions
 Consists of an ALU, a control unit, and registers
 Referred to as a processor in a system with a single processing unit

 Core
 An individual processing unit on a processor chip.
 May be equivalent in functionality to a CPU on a single-CPU system.
 Specialized processing units are also referred to as cores.

 Processor
 A physical piece of silicon containing one or more cores.
 Is the computer component that interprets and executes instructions.
 Referred to as a multicore processor if it contains multiple cores.
+ Cache Memory
 Multiple layers of memory between the processor and main memory.
 Is smaller and faster than main memory.
 Used to speed up memory access by placing in the cache data from main memory that is likely to be used in the near future.
 A greater performance improvement may be obtained by using multiple levels of cache, with level 1 (L1) closest to the core and additional levels (L2, L3, etc.) progressively farther from the core.
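
The benefit of keeping recently used blocks close to the processor can be illustrated with a minimal sketch. It assumes a single-level, direct-mapped cache; the function name, the four-line cache, and the 4-byte block size are hypothetical choices for illustration and are not taken from the slides.

```python
def simulate_direct_mapped(addresses, num_lines=4, block_size=4):
    """Count (hits, misses) for a sequence of byte addresses.

    Assumption: a direct-mapped cache, where each memory block can
    live in exactly one cache line (block number modulo num_lines).
    """
    lines = [None] * num_lines  # block number currently held by each line
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size   # which memory block the byte is in
        index = block % num_lines    # the only line that block may occupy
        if lines[index] == block:
            hits += 1
        else:
            misses += 1
            lines[index] = block     # fetch the block from main memory
    return hits, misses


# Sequential bytes 0..15: each block misses once, then hits three times.
print(simulate_direct_mapped(range(16)))        # (12, 4)
# Two blocks that map to the same line evict each other every time.
print(simulate_direct_mapped([0, 16, 0, 16]))   # (0, 4)
```

The second call shows why real designs add associativity and further cache levels: a simple mapping can thrash even when the cache is mostly empty.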

[Figure 1.2 Simplified View of Major Elements of a Multicore Computer. The motherboard carries main memory chips, I/O chips, and the processor chip. The processor chip contains eight cores and L3 cache. Each core contains instruction logic, an arithmetic and logic unit (ALU), load/store logic, an L1 I-cache, an L1 data cache, an L2 instruction cache, and an L2 data cache.]

[Figure 1.3 Motherboard with Two Intel Quad-Core Xeon Processors]

+ History of Computers
First Generation: Vacuum Tubes
 Used vacuum tubes for digital logic elements and memory.
 IAS computer
 Most famous first-generation computer.
 Fundamental design approach first implemented in the IAS computer was the stored-program concept.
 Is the prototype of all subsequent general-purpose computers.

[Figure 1.6 IAS Structure. The central processing unit (CPU) consists of the arithmetic-logic unit (CA), with the AC, MQ, and MBR registers and the arithmetic-logic circuits, and the program control unit (CC), with the PC, IBR, MAR, and IR registers and the control circuits. Instructions and data move between the CPU, main memory (M), with locations M(0) through M(4095), and the input-output equipment (I, O). The control circuits issue control signals, and addresses are sent to main memory.]
AC: accumulator register
MQ: multiply-quotient register
MBR: memory buffer register
IBR: instruction buffer register
PC: program counter
MAR: memory address register
IR: instruction register


+ Registers
Memory buffer register (MBR)
• Contains a word to be stored in memory or sent to the I/O unit
• Or is used to receive a word from memory or from the I/O unit
Memory address register (MAR)
• Specifies the address in memory of the word to be written from or read into the MBR
Instruction register (IR)
• Contains the 8-bit opcode instruction being executed
Instruction buffer register (IBR)
• Employed to temporarily hold the right-hand instruction from a word in memory
Program counter (PC)
• Contains the address of the next instruction pair to be fetched from memory
Accumulator (AC) and multiplier quotient (MQ)
• Employed to temporarily hold operands and results of ALU operations
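(a) Number word (40 bits): bit 0 is the sign bit; bits 1 to 39 hold the value.
(b) Instruction word (40 bits):
 Left instruction (20 bits): opcode in bits 0 to 7 (8 bits), address in bits 8 to 19 (12 bits)
 Right instruction (20 bits): opcode in bits 20 to 27 (8 bits), address in bits 28 to 39 (12 bits)

Figure 1.7 IAS Memory Formats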
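
The instruction-word layout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and a word is modelled as a Python integer in which bit 0 of the figure is the most significant of the 40 bits.

```python
def encode_instruction_word(left_op, left_addr, right_op, right_addr):
    """Pack two (8-bit opcode, 12-bit address) instructions into 40 bits."""
    return (left_op << 32) | (left_addr << 20) | (right_op << 12) | right_addr


def decode_instruction_word(word):
    """Return (left_op, left_addr, right_op, right_addr) from a 40-bit word."""
    if not 0 <= word < (1 << 40):
        raise ValueError("IAS words are 40 bits wide")
    left = (word >> 20) & 0xFFFFF    # bits 0..19 in the figure's numbering
    right = word & 0xFFFFF           # bits 20..39
    return (left >> 12, left & 0xFFF, right >> 12, right & 0xFFF)


# LOAD M(100) in the left half, ADD M(101) in the right half.
word = encode_instruction_word(0b00000001, 100, 0b00000101, 101)
print(decode_instruction_word(word))   # (1, 100, 5, 101)
```

The 12-bit address field is what limits the IAS to 4,096 addressable words.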

+ IAS operation
 The IAS operates by repetitively performing an instruction
cycle, as shown in Figure 1.8.
 Each instruction cycle consists of two sub-cycles.
 During the fetch cycle, the opcode of the next instruction is
loaded into the IR and the address portion is loaded into the
MAR.
 This instruction may be taken from the IBR, or it can be
obtained from memory by loading a word into the MBR, and
then down to the IBR, IR, and MAR.
 Once the opcode is in the IR, the execute cycle is performed.
 Control circuitry interprets the opcode and executes the
instruction by sending out the appropriate control signals to
cause data to be moved or an operation to be performed by the
ALU.
[Figure 1.8 Partial Flowchart of IAS Operation]
Fetch cycle:
 Is the next instruction in the IBR?
 Yes (no memory access required): IR ← IBR (0:7), MAR ← IBR (8:19)
 No: MAR ← PC, MBR ← M(MAR), then: is the left instruction required?
 Yes: IBR ← MBR (20:39), IR ← MBR (0:7), MAR ← MBR (8:19)
 No: IR ← MBR (20:27), MAR ← MBR (28:39)
 PC ← PC + 1
Execution cycle (decode instruction in IR):
 AC ← M(X): MBR ← M(MAR), AC ← MBR
 Go to M(X, 0:19): PC ← MAR
 If AC > 0 then go to M(X, 0:19): if AC > 0, PC ← MAR
 AC ← AC + M(X): MBR ← M(MAR), AC ← AC + MBR

M(X) = contents of memory location whose address is X
(i:j) = bits i through j


Table 1.1 The IAS Instruction Set

Instruction Type      Opcode    Symbolic Representation  Description
Data transfer         00001010  LOAD MQ                  Transfer contents of register MQ to the accumulator AC
                      00001001  LOAD MQ,M(X)             Transfer contents of memory location X to MQ
                      00100001  STOR M(X)                Transfer contents of accumulator to memory location X
                      00000001  LOAD M(X)                Transfer M(X) to the accumulator
                      00000010  LOAD –M(X)               Transfer –M(X) to the accumulator
                      00000011  LOAD |M(X)|              Transfer absolute value of M(X) to the accumulator
                      00000100  LOAD –|M(X)|             Transfer –|M(X)| to the accumulator
Unconditional branch  00001101  JUMP M(X,0:19)           Take next instruction from left half of M(X)
                      00001110  JUMP M(X,20:39)          Take next instruction from right half of M(X)
Conditional branch    00001111  JUMP+ M(X,0:19)          If number in the accumulator is nonnegative, take next instruction from left half of M(X)
                      00010000  JUMP+ M(X,20:39)         If number in the accumulator is nonnegative, take next instruction from right half of M(X)
Arithmetic            00000101  ADD M(X)                 Add M(X) to AC; put the result in AC
                      00000111  ADD |M(X)|               Add |M(X)| to AC; put the result in AC
                      00000110  SUB M(X)                 Subtract M(X) from AC; put the result in AC
                      00001000  SUB |M(X)|               Subtract |M(X)| from AC; put the remainder in AC
                      00001011  MUL M(X)                 Multiply M(X) by MQ; put most significant bits of result in AC, put least significant bits in MQ
                      00001100  DIV M(X)                 Divide AC by M(X); put the quotient in MQ and the remainder in AC
                      00010100  LSH                      Multiply accumulator by 2; i.e., shift left one bit position
                      00010101  RSH                      Divide accumulator by 2; i.e., shift right one position
Address modify        00010010  STOR M(X,8:19)           Replace left address field at M(X) by 12 rightmost bits of AC
                      00010011  STOR M(X,28:39)          Replace right address field at M(X) by 12 rightmost bits of AC

(Table can be found on page 17 in the textbook.)
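
To see the instruction cycle and a few of these opcodes working together, here is a minimal Python sketch of an IAS-like interpreter. It is a teaching illustration, not a faithful emulator. Assumptions: only LOAD M(X), ADD M(X), SUB M(X), and STOR M(X) are handled; data words are plain Python integers rather than 40-bit sign-magnitude values; memory is a dictionary; and `HALT` (opcode 00000000) is a hypothetical addition used only to stop the loop.

```python
# Opcodes from Table 1.1; HALT is a hypothetical addition for this sketch.
LOAD = 0b00000001   # LOAD M(X): AC <- M(X)
ADD = 0b00000101    # ADD M(X):  AC <- AC + M(X)
SUB = 0b00000110    # SUB M(X):  AC <- AC - M(X)
STOR = 0b00100001   # STOR M(X): M(X) <- AC
HALT = 0b00000000   # not an IAS opcode; ends the run


def pack(left_op, left_addr, right_op, right_addr):
    """Build a 40-bit instruction word from two 20-bit instructions."""
    return (left_op << 32) | (left_addr << 20) | (right_op << 12) | right_addr


def run(memory, max_steps=1000):
    """Execute from address 0 and return the final accumulator value."""
    pc, ac, ibr = 0, 0, None
    for _ in range(max_steps):
        # Fetch cycle
        if ibr is not None:
            # The right-hand instruction is buffered: no memory access.
            instruction, ibr = ibr, None
            pc += 1
        else:
            word = memory[pc]
            instruction = (word >> 20) & 0xFFFFF   # left-hand instruction
            ibr = word & 0xFFFFF                   # buffer the right-hand one
        opcode, x = instruction >> 12, instruction & 0xFFF
        # Execute cycle
        if opcode == HALT:
            break
        elif opcode == LOAD:
            ac = memory[x]
        elif opcode == ADD:
            ac += memory[x]
        elif opcode == SUB:
            ac -= memory[x]
        elif opcode == STOR:
            memory[x] = ac
        else:
            raise ValueError(f"opcode {opcode:08b} not handled in this sketch")
    return ac


# Program: M(12) <- M(10) + M(11)
memory = {
    0: pack(LOAD, 10, ADD, 11),
    1: pack(STOR, 12, HALT, 0),
    10: 7,
    11: 35,
}
result = run(memory)
print(result, memory[12])   # 42 42
```

Because program and data share one memory, the same `STOR` that saves a result could also overwrite an instruction word, which is the stored-program idea in practice.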

+ History of Computers
Second Generation: Transistors
 Smaller
 Cheaper
 Dissipates less heat than a vacuum tube
 Is a solid state device made from silicon
 Was invented at Bell Labs in 1947
 It was not until the late 1950s that fully transistorized computers were commercially available

+ Second Generation Computers
 Introduced:
 More complex arithmetic and logic units and control units
 The use of high-level programming languages
 Provision of system software which provided the ability to:
 Load programs
 Move data to peripherals
 Use libraries to perform common computations

+ Table 1.2 Computer Generations

Generation  Approximate Dates  Technology                          Typical Speed (operations per second)
1           1946–1957          Vacuum tube                         40,000
2           1957–1964          Transistor                          200,000
3           1965–1971          Small and medium scale integration  1,000,000
4           1972–1977          Large scale integration             10,000,000
5           1978–1991          Very large scale integration        100,000,000
6           1991-              Ultra large scale integration       >1,000,000,000

History of Computers
Third Generation: Integrated Circuits
 Early second-generation computers contained about 10,000 transistors.
 This figure grew to the hundreds of thousands, making the manufacture of newer, more powerful machines increasingly difficult.
 The entire manufacturing process, from transistor to circuit board, was expensive and cumbersome.
 These facts of life were beginning to create problems in the computer industry.
 It is the integrated circuit that defines the third generation of computers.
 The two most important members of the third generation were the IBM System/360 and the DEC PDP-8.
+ Integrated circuits:
 The basic elements of a digital computer must perform storage, movement, processing, and control functions.
 Only two fundamental types of components are required: gates and memory cells.
 A gate is a device that implements a simple Boolean or logical function, such as IF A AND B ARE TRUE THEN C IS TRUE (AND gate).
 Such devices are called gates because they control data flow in much the same way that canal gates control the flow of water.
 The memory cell is a device that can store one bit of data; that is, the device can be in one of two stable states at any time.
 By interconnecting large numbers of these fundamental devices, we can construct a computer.

[Figure 1.10 Fundamental Computer Elements. (a) Gate: input, Boolean logic function, output, with an activate signal. (b) Memory cell: input, binary storage cell, output, with read and write signals.]

+ Integrated Circuits
 Data storage – provided by memory cells.
 Data processing – provided by gates.
 Data movement – the paths among components are used to move data from memory to memory and from memory through gates to memory.
 Control – the paths among components can carry control signals.
 For example, a gate will have one or two data inputs plus a control signal input that activates the gate.
 When the control signal is ON, the gate performs its function on the data inputs and produces a data output.
 Similarly, the memory cell will store the bit that is on its input lead when the WRITE control signal is ON and will place the bit that is in the cell on its output lead when the READ control signal is ON.
 A computer consists of gates, memory cells, and interconnections among these elements.
 The gates and memory cells are constructed of simple digital electronic components.
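
The behaviour of the two fundamental elements can be mimicked in software. The sketch below is a rough model only: `and_gate` and `MemoryCell` are hypothetical names, bits are the integers 0 and 1, and "no output" is represented by `None`.

```python
def and_gate(a, b, activate=True):
    """AND gate: produces an output only while the control signal is ON."""
    return (a & b) if activate else None


class MemoryCell:
    """One-bit storage cell with separate WRITE and READ control signals."""

    def __init__(self):
        self.bit = 0  # one of two stable states

    def cycle(self, data_in=0, write=False, read=False):
        if write:                 # WRITE ON: store the bit on the input lead
            self.bit = data_in & 1
        return self.bit if read else None   # READ ON: drive the output lead


cell = MemoryCell()
cell.cycle(data_in=and_gate(1, 1), write=True)   # store the gate's output
print(cell.cycle(read=True))                     # 1
print(and_gate(1, 0))                            # 0
```

Chaining the gate's output into the cell's input is a small instance of moving data "from memory through gates to memory".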

[Figure 1.11 Relationship Among Wafer, Chip, and Gate. A wafer is divided into chips, each chip contains many gates, and a chip is packaged for use.]


+ Moore’s law and Integrated circuits
 Initially, only a few gates or memory cells could be reliably manufactured and packaged together.
 These early integrated circuits are referred to as small-scale integration (SSI).
 As time went on, it became possible to pack more and more components on the same chip.
 Moore's law is the observation that the number of transistors in an integrated circuit (IC) doubles about every two years.
 Moore observed that the number of transistors that could be put on a single chip was doubling every year, and correctly predicted that this pace would continue into the near future.
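
The doubling rule is easy to turn into arithmetic. The sketch below is a back-of-the-envelope projection, not a fit to real data: the function name and the two-year default period are assumptions, and the starting point is the 2,300-transistor Intel 4004 of 1971.

```python
def projected_transistors(start_count, start_year, year,
                          doubling_period_years=2.0):
    """Project a transistor count assuming steady exponential doubling."""
    doublings = (year - start_year) / doubling_period_years
    return start_count * 2 ** doublings


# 42 years at one doubling every two years is 21 doublings.
estimate = projected_transistors(2300, 1971, 2013)
print(f"{estimate:,.0f}")   # 4,823,449,600
```

The projection lands within a small factor of the billion-transistor chips of 2013, which is about as much precision as an empirical observation like this can be expected to give.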

[Figure 1.12 Growth in Transistor Count on Integrated Circuits (DRAM memory). Transistor count per chip, on a logarithmic scale from 1 to 100 bn, plotted against year from 1947 to 2011. Annotations mark the first working transistor, the invention of the integrated circuit, and the promulgation of Moore’s law.]

Moore’s Law
1965; Gordon Moore – co-founder of Intel
Observed number of transistors that could be put on a single chip was doubling every year
The pace slowed to a doubling every 18 months in the 1970’s but has sustained that rate ever since

Consequences of Moore’s law:
• The cost of computer logic and memory circuitry has fallen at a dramatic rate
• The electrical path length is shortened, increasing operating speed
• Computer becomes smaller and is more convenient to use in a variety of environments
• With more circuitry on each chip, there are fewer interchip connections

+ IBM System/360
 The System/360, a new family of computer products, was announced in 1964.
 Product line was incompatible with older IBM machines
 Was the success of the decade and cemented IBM as the overwhelmingly dominant computer vendor
 The architecture remains to this day the architecture of IBM’s mainframe computers
 Was the industry’s first planned family of computers
 Models were compatible in the sense that a program written for one model should be capable of being executed by another model in the series

+ Family Characteristics
• Similar or identical instruction set
• Similar or identical operating system
• Increasing speed
• Increasing number of I/O ports
• Increasing memory size
• Increasing cost

[Figure 1.13 PDP-8 Bus Structure. The console controller, CPU, main memory, and I/O modules are all attached to a single shared bus, the Omnibus.]

+ Later Generations
 Beyond the third generation there is less general agreement on defining generations of computers.
 Later generations are based on advances in integrated circuit technology.
 With the introduction of large-scale integration (LSI), more than 1,000 components can be placed on a single integrated circuit chip.
 Very-large-scale integration (VLSI) achieved more than 10,000 components per chip, while current ultra-large-scale integration (ULSI) chips can contain more than one billion components.
 With the rapid pace of technology, the high rate of introduction of new products, and the importance of software and communications as well as hardware, the classification by generation becomes less clear and less meaningful.
+ Later Generations
 LSI: Large Scale Integration
 VLSI: Very Large Scale Integration
 ULSI: Ultra Large Scale Integration
 Semiconductor Memory
 Microprocessors

Semiconductor Memory
In 1970 Fairchild produced the first relatively capacious semiconductor memory
 Chip was about the size of a single core
 Could hold 256 bits of memory
 Non-destructive
 Much faster than core
In 1974 the price per bit of semiconductor memory dropped below the price per bit of core memory
 There has been a continuing and rapid decline in memory cost accompanied by a corresponding increase in physical memory density
 Developments in memory and processor technologies changed the nature of computers in less than a decade
Since 1970 semiconductor memory has been through 13 generations
 Each generation has provided four times the storage density of the previous generation, accompanied by declining cost per bit and declining access time

+ Microprocessors
 The density of elements on processor chips continued to rise
 More and more elements were placed on each chip so that fewer and fewer chips were needed to construct a single computer processor
 1971 Intel developed 4004
 First chip to contain all of the components of a CPU on a single chip
 Birth of microprocessor
 1972 Intel developed 8008
 First 8-bit microprocessor
 1974 Intel developed 8080
 First general purpose microprocessor
 Faster, has a richer instruction set, has a large addressing capability
Evolution of Intel Microprocessors
(a) 1970s Processors

                       4004       8008     8080    8086                  8088
Introduced             1971       1972     1974    1978                  1979
Clock speeds           108 kHz    108 kHz  2 MHz   5 MHz, 8 MHz, 10 MHz  5 MHz, 8 MHz
Bus width              4 bits     8 bits   8 bits  16 bits               8 bits
Number of transistors  2,300      3,500    6,000   29,000                29,000
Feature size (µm)      10         8        6       3                     6
Addressable memory     640 Bytes  16 KB    64 KB   1 MB                  1 MB


Evolution of Intel Microprocessors
(b) 1980s Processors

                       80286             386TM DX         386TM SX         486TM DX CPU
Introduced             1982              1985             1988             1989
Clock speeds           6 MHz - 12.5 MHz  16 MHz - 33 MHz  16 MHz - 33 MHz  25 MHz - 50 MHz
Bus width              16 bits           32 bits          16 bits          32 bits
Number of transistors  134,000           275,000          275,000          1.2 million
Feature size (µm)      1.5               1                1                0.8 - 1
Addressable memory     16 MB             4 GB             16 MB            4 GB
Virtual memory         1 GB              64 TB            64 TB            64 TB
Cache                  —                 —                —                8 kB


Evolution of Intel Microprocessors
(c) 1990s Processors

                       486TM SX         Pentium           Pentium Pro            Pentium II
Introduced             1991             1993              1995                   1997
Clock speeds           16 MHz - 33 MHz  60 MHz - 166 MHz  150 MHz - 200 MHz      200 MHz - 300 MHz
Bus width              32 bits          32 bits           64 bits                64 bits
Number of transistors  1.185 million    3.1 million       5.5 million            7.5 million
Feature size (µm)      1                0.8               0.6                    0.35
Addressable memory     4 GB             4 GB              64 GB                  64 GB
Virtual memory         64 TB            64 TB             64 TB                  64 TB
Cache                  8 kB             8 kB              512 kB L1 and 1 MB L2  512 kB L2


Evolution of Intel Microprocessors
(d) Recent Processors

                       Pentium III    Pentium 4      Core 2 Duo      Core i7 EE 4960X
Introduced             1999           2000           2006            2013
Clock speeds           450 - 660 MHz  1.3 - 1.8 GHz  1.06 - 1.2 GHz  4 GHz
Bus width              64 bits        64 bits        64 bits         64 bits
Number of transistors  9.5 million    42 million     167 million     1.86 billion
Feature size (nm)      250            180            65              22
Addressable memory     64 GB          64 GB          64 GB           64 GB
Virtual memory         64 TB          64 TB          64 TB           64 TB
Cache                  512 kB L2      256 kB L2      2 MB L2         1.5 MB L2/15 MB L3
Number of cores        1              1              2               6


+ The Evolution of the Intel x86 Architecture
 Two processor families are the Intel x86 and the ARM architectures
 Current x86 offerings represent the results of decades of design effort on complex instruction set computers (CISCs)
 An alternative approach to processor design is the reduced instruction set computer (RISC)
 ARM architecture is used in a wide variety of embedded systems and is one of the most powerful and best-designed RISC-based systems on the market

Highlights of the Evolution of the Intel Product Line:
8080
• World’s first general-purpose microprocessor
• 8-bit machine, 8-bit data path to memory
• Was used in the first personal computer (Altair)

8086
• A more powerful 16-bit machine
• Has an instruction cache, or queue, that prefetches a few instructions before they are executed
• The first appearance of the x86 architecture
• The 8088 was a variant of this processor and used in IBM’s first personal computer (securing the success of Intel)

80286
• Extension of the 8086 enabling addressing a 16-MB memory instead of just 1 MB

80386
• Intel’s first 32-bit machine
• First Intel processor to support multitasking

80486
• Introduced the use of much more sophisticated and powerful cache technology and sophisticated instruction pipelining
• Also offered a built-in math coprocessor

Highlights of the Evolution of the
Intel Product Line:
Pentium
• Intel introduced the use of superscalar techniques, which allow multiple instructions to execute in parallel

Pentium Pro
• Continued the move into superscalar organization with aggressive use of register renaming, branch
prediction, data flow analysis, and speculative execution

Pentium II
• Incorporated Intel MMX technology, which is designed specifically to process video, audio, and graphics
data efficiently

Pentium III
• Incorporated additional floating-point instructions
• Streaming SIMD Extensions (SSE)

Pentium 4
• Includes additional floating-point and other enhancements for multimedia

Core
• First Intel x86 microprocessor with a dual core

Core 2
• Extends the Core architecture to 64 bits
• Core 2 Quad provides four cores on a single chip
• More recent Core offerings have up to 10 cores per chip
• An important addition to the architecture was the Advanced Vector Extensions instruction set

+ Embedded Systems
 The use of electronics and software within a product, as opposed to a general-purpose computer such as a laptop or desktop system.
 Billions of computer systems are produced each year that are embedded within larger devices.
 Today many devices that use electric power have an embedded computing system
 Often embedded systems are tightly coupled to their environment
 This can give rise to real-time constraints imposed by the need to
interact with the environment
 Constraints such as required speeds of motion, required precision
of measurement, and required time durations, dictate the timing of
software operations
 If multiple activities must be managed simultaneously this imposes
more complex real-time constraints

[Figure 1.14 Possible Organization of an Embedded System. Elements shown: processor, memory, custom logic, human interface, diagnostic port, A/D conversion fed by sensors, and D/A conversion driving actuators/indicators.]

+ The Internet of Things (IoT)
 Term that refers to the expanding interconnection of smart devices, ranging
from appliances to tiny sensors.

 Is primarily driven by deeply embedded devices

 Generations of deployment culminating in the IoT:


 Information technology (IT)
 PCs, servers, routers, firewalls, and so on, bought as IT devices by enterprise IT
people and primarily using wired connectivity.
 Operational technology (OT)
 Machines/appliances with embedded IT built by non-IT companies, such as medical
machinery, SCADA, process control, and kiosks, bought as appliances by enterprise
OT people and primarily using wired connectivity
 Personal technology
 Smartphones, tablets, and eBook readers bought as IT devices by consumers
exclusively using wireless connectivity and often multiple forms of wireless
connectivity
 Sensor/actuator technology
 Single-purpose devices bought by consumers, IT, and OT people exclusively using
wireless connectivity, generally of a single form, as part of larger systems

 It is the fourth generation that is usually thought of as the IoT and it is marked
by the use of billions of embedded devices.
+ Embedded Operating Systems
 There are two general approaches to developing an embedded operating system (OS):
 Take an existing OS and adapt it for the embedded application. For example, there are embedded versions of Linux, Windows, and Mac.
 Design and implement an OS intended solely for embedded use. An example of the latter is TinyOS, widely used in wireless sensor networks.

+ Application Processors versus Dedicated Processors
 Application processors
 Defined by the processor’s ability to execute complex operating systems
 General-purpose in nature
 An example is the smartphone – the embedded system is designed to support numerous apps and perform a wide variety of functions
 Dedicated processor
 Is dedicated to one or a small number of specific tasks required by the host device
 Because such an embedded system is dedicated to a specific task or tasks, the processor and associated components can be engineered to reduce size and cost
+ Cloud Computing
 NIST defines cloud computing as:
 “A model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
 You get economies of scale, professional network management, and professional security management
 The individual or company only needs to pay for the storage capacity and services they need.
 The user, be it company or individual, doesn’t have the hassle of setting up a database system, acquiring the hardware they need, doing maintenance, and backing up the data. All these are part of the cloud service.
 Cloud provider takes care of security.

Cloud Networking
 Refers to the networks and network management functionality that must be in place to enable cloud computing
 One example is the provisioning of high-performance and/or high-reliability networking between the provider and subscriber
 The collection of network capabilities required to access a cloud, including making use of specialized services over the Internet, linking enterprise data center to a cloud, and using firewalls and other network security devices at critical points to enforce access security policies

Cloud Storage
 Subset of cloud computing
 Consists of database storage and database applications hosted remotely on cloud servers
 Enables small businesses and individual users to take advantage of data storage that scales with their needs and to take advantage of a variety of database applications without having to buy, maintain, and manage the storage assets
+ Cloud Services
 Virtually all cloud service is provided using one of three models: SaaS, PaaS, and IaaS.

1. Software as a service (SaaS).
 A SaaS cloud provides service to customers in the form of software, specifically application software, running on and accessible in the cloud.
 It enables the customer to use the cloud provider’s applications running on the provider’s cloud infrastructure.
 The applications are accessible from various client devices through a simple interface such as a Web browser.
 Instead of obtaining desktop and server licenses for software products it uses, an enterprise obtains the same functions from the cloud service.
 SaaS saves the complexity of software installation, maintenance and upgrades.
 Examples of services at this level are Gmail, Google’s e-mail service, and Salesforce.com, which helps firms keep track of their customers.
 Typically, subscribers use specific applications on demand.


+ Cloud Services
2. Platform as a service (PaaS)
 A PaaS cloud provides service to customers in the form of a platform on which the customer’s applications can run.
 PaaS enables the customer to deploy customer-created or acquired applications onto the cloud infrastructure.
 A PaaS cloud provides useful software building blocks, plus a number of development tools, such as programming languages, run-time environments, and other tools that assist in deploying new applications.
 In effect, PaaS is an operating system in the cloud.
 PaaS is useful for an organization that wants to develop new or tailored applications while paying for the needed computing resources only as needed and only for as long as needed.
 Google App Engine and the Salesforce1 Platform from Salesforce.com are examples of PaaS.
+ Cloud Services
3. Infrastructure as a service (IaaS)
 With IaaS, the customer has access to the underlying cloud infrastructure. IaaS provides virtual machines and other abstracted hardware and operating systems, which may be controlled through a service application programming interface (API).
 IaaS offers the customer processing, storage, networks, and other fundamental computing resources so that the customer is able to deploy and run arbitrary software, which can include operating systems and applications.
 IaaS enables customers to combine basic computing services, such as number crunching and data storage, to build highly adaptable computer systems.
 Examples of IaaS are Amazon Elastic Compute Cloud (Amazon EC2) and Windows Azure.

+ Designing for Performance
 The cost of computer systems continues to drop dramatically, while the performance and capacity of those systems continue to rise equally dramatically.

 Today’s laptops have the computing power of an IBM mainframe from 10 or 15 years
ago.

 Processors are so inexpensive that we now have microprocessors we throw away.

 Desktop applications that require the great power of today’s microprocessor-based systems include:
 Image processing
 Speech recognition
 Videoconferencing
 Multimedia authoring
 Voice and video annotation of files
 Simulation modeling

 Businesses are relying on increasingly powerful servers to handle transaction and database processing and to support massive client/server networks that have replaced the huge mainframe computer centers of yesteryear.

 Cloud service providers use massive high-performance banks of servers to satisfy high-volume, high-transaction-rate applications for a broad spectrum of clients.
+ Microprocessor Speed
Techniques built into contemporary processors include:
• Pipelining: enables a processor to work simultaneously on multiple instructions by performing a different phase for each of the multiple instructions at the same time.
• Branch prediction: the processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next.
• Superscalar execution: the ability to issue more than one instruction in every processor clock cycle. (In effect, multiple parallel pipelines are used.)
• Data flow analysis: the processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions.
• Speculative execution: using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations and keeping execution engines as busy as possible.
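The benefit of pipelining can be sketched with a small cycle count. The example below is only an illustration under idealized assumptions: every stage takes exactly one clock cycle, there are no stalls or branches, and the four stage names are hypothetical rather than taken from any particular processor.

```python
def sequential_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles needed if each instruction completes every stage before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles needed on an ideal pipeline: one fill, then one completion per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def pipeline_schedule(n_instructions: int, stages: list[str]) -> dict[int, list[tuple[str, str]]]:
    """Map each clock cycle (starting at 1) to the (instruction, stage) pairs active in it."""
    schedule: dict[int, list[tuple[str, str]]] = {}
    for i in range(n_instructions):
        for s, stage in enumerate(stages):
            # Instruction i enters stage s exactly s cycles after it was fetched.
            schedule.setdefault(i + s + 1, []).append((f"I{i + 1}", stage))
    return schedule


if __name__ == "__main__":
    stages = ["fetch", "decode", "execute", "write"]  # hypothetical stage names
    n = 6
    for cycle, active in sorted(pipeline_schedule(n, stages).items()):
        print(cycle, active)
    print("sequential:", sequential_cycles(n, len(stages)), "cycles")
    print("pipelined: ", pipelined_cycles(n, len(stages)), "cycles")
```

With 6 instructions and 4 stages, the ideal pipeline needs 9 cycles instead of 24, because from cycle 4 onward a different phase of four different instructions is being performed in the same cycle. Real processors fall short of this ideal whenever dependencies or mispredicted branches stall the pipeline.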
+ Performance Balance
 Adjust the organization and architecture to compensate for the mismatch among the capabilities of the various components.
 Architectural examples include:
   Increase the number of bits that are retrieved at one time by making DRAMs “wider” and by using wide bus data paths.
   Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory.
   Change the DRAM interface to make it more efficient by including a cache or other buffering scheme on the DRAM chip.
   Increase the interconnect bandwidth between processors and memory by using higher speed buses and a hierarchy of buses to buffer and structure data flow.
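To see why a cache between the processor and main memory helps, a rough calculation is useful. The sketch below uses the common textbook model of average memory access time (hit time plus miss rate times miss penalty); this model and all the numbers in it are assumptions made for illustration and do not come from the slides.

```python
def average_access_time_ns(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time: every access pays the hit time, misses also pay the penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time_ns + miss_rate * miss_penalty_ns


if __name__ == "__main__":
    # Hypothetical timings: 1 ns to check the cache, 60 ns extra to go to main memory.
    hit_time_ns = 1.0
    miss_penalty_ns = 60.0
    for miss_rate in (1.0, 0.10, 0.02):
        t = average_access_time_ns(hit_time_ns, miss_rate, miss_penalty_ns)
        print(f"miss rate {miss_rate:.2f}: average access time {t:.1f} ns")
```

Under these assumed numbers, lowering the miss rate from 10% to 2% cuts the average access time from 7.0 ns to 2.2 ns, which is the sense in which a more efficient cache reduces the frequency (and cost) of main memory access.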
Figure 2.1 Typical I/O Device Data Rates (horizontal axis: data rate in bps, logarithmic scale from 10^1 to 10^11; devices shown: Ethernet modem (max speed), graphics display, Wi-Fi modem (max speed), hard disk, optical disc, laser printer, scanner, mouse, keyboard)


+ Improvements in Chip Organization and Architecture
Approaches to achieving increased processor speed:
 Increase the hardware speed of the processor
   Fundamentally due to shrinking logic gate size
   More gates, packed more tightly, increasing clock rate
   Propagation time for signals reduced, enabling speeding up of the processor
 Increase the size and speed of caches
   Dedicating part of the processor chip to cache
   Cache access times drop significantly
 Change the processor organization and architecture in ways that increase the effective speed of instruction execution
   Use parallelism in one form or another
Figure 2.2 Processor Trends (logarithmic vertical axis; horizontal axis: years 1970 to 2010; series: transistors (thousands), frequency (MHz), power (W), cores)


+ Multicore
 The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate.
 The strategy is to use two simpler processors on the chip rather than one more complex processor.
 With two processors, larger caches are justified.
 As caches became larger, it made performance sense to create two and then three levels of cache on a chip.
+ Many Integrated Core (MIC)
Graphics Processing Unit (GPU)
MIC
 Chip manufacturers are now in the process of making a huge leap forward in the number of cores per chip, with more than 50 cores per chip.
 The leap in performance, as well as the challenges in developing software to exploit such a large number of cores, led to the introduction of a new term: many integrated core (MIC).
 The multicore and MIC strategy involves a homogeneous collection of general-purpose processors on a single chip.
GPU
 Core designed to perform parallel operations on graphics data.
 Traditionally found on a plug-in graphics card, it is used to encode and render 2D and 3D graphics as well as process video.
 When a broad range of applications are supported by such a processor, the term general-purpose computing on GPUs (GPGPU) is used.
+ Basic Measures of Computer Performance
1. Clock speed:
 Operations performed by a processor, such as fetching an instruction, decoding the instruction, performing an arithmetic operation, and so on, are governed by a system clock.
 Thus, at the most fundamental level, the speed of a processor is measured in cycles per second, or hertz (Hz).
 For example, a 1-GHz processor receives 1 billion pulses per second.
 The rate of pulses is known as the clock rate, or clock speed.
 Some instructions may take only a few cycles, while others require dozens.
 In addition, when pipelining is used, multiple instructions are being executed simultaneously.
 Thus, a straight comparison of clock speeds on different processors does not tell the whole story about performance.
Figure 2.5 System Clock (diagram labels: quartz crystal; analog to digital conversion. From Computer Desktop Encyclopedia, 1998, The Computer Language Co.)
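The point that clock speed alone does not tell the whole story can be made concrete with a short calculation. The sketch below assumes the usual relation T = Ic × CPI × τ, where Ic is the instruction count, CPI the average cycles per instruction, and τ = 1/f the clock period; the two processors and their figures are hypothetical.

```python
def clock_period_s(clock_rate_hz: float) -> float:
    """Length of one clock cycle in seconds (tau = 1 / f)."""
    return 1.0 / clock_rate_hz


def execution_time_s(instruction_count: float, avg_cpi: float, clock_rate_hz: float) -> float:
    """Processor time for a program: T = Ic * CPI * tau."""
    return instruction_count * avg_cpi * clock_period_s(clock_rate_hz)


if __name__ == "__main__":
    print("1 GHz clock period:", clock_period_s(1e9), "s")  # one pulse every nanosecond

    instruction_count = 1e9  # hypothetical program of one billion instructions
    # Hypothetical processors: A has the faster clock but needs more cycles per instruction.
    time_a = execution_time_s(instruction_count, avg_cpi=2.5, clock_rate_hz=2.0e9)
    time_b = execution_time_s(instruction_count, avg_cpi=1.5, clock_rate_hz=1.5e9)
    print(f"A (2.0 GHz, CPI 2.5): {time_a:.2f} s")
    print(f"B (1.5 GHz, CPI 1.5): {time_b:.2f} s")
```

Under these assumed figures, processor B finishes the same program in 1.00 s against 1.25 s for processor A, even though A has the higher clock rate.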


+ Basic Measures of Computer Performance
2. Instruction Execution Rate
 A common measure of performance for a processor is the rate at which instructions are executed, expressed as millions of instructions per second (MIPS) and referred to as the MIPS rate.
 The machine cycle time is the time it takes to fetch and execute one instruction.
 Another common performance measure deals only with floating-point instructions.
 These are common in many scientific and game applications. Floating-point performance is expressed as millions of floating-point operations per second (MFLOPS). This is the measure of the arithmetical speed of a processor.
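The two rates can be worked through numerically. The sketch below assumes the usual definitions: MIPS rate = Ic / (T × 10^6) = f / (CPI × 10^6), and MFLOPS rate = floating-point operations / (T × 10^6). The instruction mix, clock rate, and operation counts are illustrative values, not measurements of a real processor.

```python
def average_cpi(mix: list[tuple[float, float]]) -> float:
    """Average cycles per instruction from (cycles, fraction_of_instructions) pairs."""
    return sum(cycles * fraction for cycles, fraction in mix)


def mips_from_clock(clock_rate_hz: float, avg_cpi: float) -> float:
    """MIPS rate = f / (CPI * 10^6)."""
    return clock_rate_hz / (avg_cpi * 1e6)


def mips_rate(instruction_count: float, exec_time_s: float) -> float:
    """MIPS rate = Ic / (T * 10^6)."""
    return instruction_count / (exec_time_s * 1e6)


def mflops_rate(fp_operations: float, exec_time_s: float) -> float:
    """MFLOPS rate = floating-point operations / (T * 10^6)."""
    return fp_operations / (exec_time_s * 1e6)


if __name__ == "__main__":
    # Illustrative mix: (cycles per instruction, share of executed instructions).
    mix = [(1, 0.60), (2, 0.18), (4, 0.12), (8, 0.10)]
    clock_rate_hz = 400e6       # assumed 400-MHz processor
    instruction_count = 2e6     # assumed program length

    cpi = average_cpi(mix)
    exec_time_s = instruction_count * cpi / clock_rate_hz
    print(f"average CPI: {cpi:.2f}")
    print(f"MIPS rate (from clock): {mips_from_clock(clock_rate_hz, cpi):.1f}")
    print(f"MIPS rate (from time):  {mips_rate(instruction_count, exec_time_s):.1f}")
    # Assumed workload of 50 million floating-point operations finishing in 0.5 s.
    print(f"MFLOPS rate: {mflops_rate(50e6, 0.5):.1f}")
```

Both MIPS formulas give the same answer (about 178.6 here) because the execution time itself is Ic × CPI / f. Note that the result depends on the instruction mix, so a MIPS figure is only meaningful for a stated workload.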
+ Try these out!!
For each of the following examples, determine whether this is an embedded system, explaining why or why not.

a. Are programs that understand physics and/or hardware embedded? For example, one that uses finite-element methods to predict fluid flow over airplane wings?

b. Is the internal microprocessor controlling a disk drive an example of an embedded system?

c. I/O drivers control hardware, so does the presence of an I/O driver imply that the computer executing the driver is embedded?

d. Is a PDA (Personal Digital Assistant) an embedded system?

e. Is the microprocessor controlling a cell phone an embedded system?

f. Is the computer controlling a pacemaker in a person’s chest an embedded computer?

g. Is the computer controlling fuel injection in an automobile engine embedded?
+ Weekly Assignment 1: (Individual)
 Questions:
1. Briefly explain the history of the development of computers, from the first generation up to the latest developments.
2. Briefly explain the evolution of the Intel x86 and ARM architectures.

 Instructions:

 Make a 20-minute YouTube recording explaining your answers to the above questions and submit a link to your video through the MUELE platform.

 Make sure you submit your work by Friday, 15/09/2023.

+ Summary
Basic Concepts and Computer Evolution
Chapter 1
 Organization and architecture
 Structure and function
 Brief history of computers
 The evolution of the Intel x86 architecture
 Embedded systems
   The Internet of things
   Embedded operating systems
   Application processors versus dedicated processors
 Cloud computing
   Basic concepts
   Cloud services
 Designing for performance
   Microprocessor speed
   Performance balance
   Improvements in chip organization and architecture
 Multicore
 Basic measures of computer performance
   Clock speed
   Instruction execution rate

+ References
 William Stallings. Computer Organization and Architecture: Designing for Performance, 10th edition. Pearson Education, 2016.
   Chapter 1 (Sections 1.1, 1.2, 1.3, 1.4, 1.5, and 1.7)
   Chapter 2 (Sections 2.1, 2.2, and 2.4)

+ References for next topic
 William Stallings. Computer Organization and Architecture: Designing for Performance, 10th edition. Pearson Education, 2016. Chapter 3.
