02-x86ProcessorArchitecture-v02
INTRODUCTION-GENERAL CONCEPTS
General Concepts
• Basic microcomputer design
• Instruction execution cycle
• Reading from memory
• How programs run
Irvine, Kip R. Assembly Language for Intel-Based
Computers, 2003.
Basic Microcomputer Design
• clock synchronizes CPU operations
• control unit (CU) coordinates sequence of execution
steps
• ALU performs arithmetic and bitwise processing
[Figure: block diagram of a basic microcomputer; labels: registers, data bus, control bus, address bus]
Clock
• synchronizes all CPU and BUS operations
• machine (clock) cycle measures time of a single
operation
• clock is used to trigger events
[Figure: clock waveform with one cycle marked]
Instruction Execution Cycle
• Fetch
• Decode
• Fetch operands
• Execute
• Store output
[Figure: instruction execution cycle; labels: PC, program (I-1 to I-4), memory, op1, op2, registers, instruction register, flags, ALU; steps: fetch, decode, read, execute, write (output)]
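The five steps of the instruction execution cycle can be sketched as a loop over a toy machine. This is only an illustration, not a model of any real x86 processor: the three-field instruction format, the opcodes `ADD` and `SUB`, and the register names `r0` to `r2` are all hypothetical.

```python
# Toy instruction execution cycle (hypothetical machine, not real x86).
# Each instruction is a tuple: (opcode, destination register, source register).

def run(program, registers):
    pc = 0                                    # program counter
    flags = {"zero": False}
    while pc < len(program):
        instruction = program[pc]             # 1. fetch
        pc += 1
        opcode, dest, src = instruction       # 2. decode
        op1 = registers[dest]                 # 3. fetch operands
        op2 = registers[src]
        if opcode == "ADD":                   # 4. execute (the ALU's job)
            result = op1 + op2
        elif opcode == "SUB":
            result = op1 - op2
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        registers[dest] = result              # 5. store output
        flags["zero"] = (result == 0)         # status flag updated by the ALU
    return registers, flags


regs, flags = run([("ADD", "r0", "r1"), ("SUB", "r0", "r2")],
                  {"r0": 1, "r1": 2, "r2": 3})
print(regs["r0"], flags["zero"])
```

Here `r0` becomes 1 + 2 = 3 after the first instruction and 3 - 3 = 0 after the second, so the toy zero flag ends up set.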
Multi-Stage Pipeline
• Pipelining makes it possible for the processor to execute
instructions in parallel
• Instruction execution divided into discrete stages
Example of a non-pipelined processor. Many wasted cycles.

              Stages
Cycle   S1    S2    S3    S4    S5    S6
1       I-1
2             I-1
3                   I-1
4                         I-1
5                               I-1
6                                     I-1
7       I-2
8             I-2
9                   I-2
10                        I-2
11                              I-2
12                                    I-2
Pipelined Execution
• More efficient use of cycles, greater throughput of instructions:
              Stages
Cycle   S1    S2    S3    S4    S5    S6
1       I-1
2       I-2   I-1
3             I-2   I-1
4                   I-2   I-1
5                         I-2   I-1
6                               I-2   I-1
7                                     I-2

For k stages and n instructions, the number of required cycles is:
k + (n - 1)
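As a quick check of the cycle count, here is a small sketch comparing the two cases. The pipelined formula is the one given on this slide; the non-pipelined count `k * n` is inferred from the earlier non-pipelined example (2 instructions, 6 stages, 12 cycles), and the function names are made up for illustration.

```python
def cycles_non_pipelined(k, n):
    """Each instruction passes through all k stages before the next one starts."""
    if k < 1 or n < 1:
        raise ValueError("k and n must be at least 1")
    return k * n


def cycles_pipelined(k, n):
    """The first instruction takes k cycles; each later one finishes one cycle after."""
    if k < 1 or n < 1:
        raise ValueError("k and n must be at least 1")
    return k + (n - 1)


print(cycles_non_pipelined(6, 2))   # 12, matching the non-pipelined table
print(cycles_pipelined(6, 2))       # 7, matching the pipelined table
```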
Wasted Cycles (pipelined)
• When one of the stages requires two or more clock
cycles, clock cycles are again wasted.
              Stages
                          exe
Cycle   S1    S2    S3    S4    S5    S6
1       I-1
2       I-2   I-1
3       I-3   I-2   I-1

For k stages and n instructions, the
Superscalar
A superscalar processor has multiple execution pipelines.
In the following, note that Stage S4 has left and right
pipelines (u and v).
              Stages
                          S4
Cycle   S1    S2    S3    u     v     S5    S6
1       I-1
2       I-2   I-1
3       I-3   I-2   I-1

For k stages and n instructions, the number of required cycles is:
Reading from Memory
• Multiple machine cycles are required when reading from
memory, because it responds much more slowly than the CPU.
The steps are:
• address placed on address bus
• Read Line (RD) set low
• CPU waits one cycle for memory to respond
• Read Line (RD) goes to 1, indicating that the data is on the
data bus
[Figure: timing diagram over Cycle 1 to Cycle 4 showing the CLK, ADDR (address), RD, and DATA (data) signals]
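The read steps listed on this slide can be traced as a small table of bus states. This is a simplified sketch: mapping exactly one step to each clock cycle is an assumption made for illustration, and the function name `read_cycle` and the list-based `memory` are hypothetical.

```python
def read_cycle(address, memory):
    """Return the assumed bus state at the end of each clock cycle of one read."""
    return [
        # Cycle 1: address placed on the address bus (RD still inactive)
        {"cycle": 1, "ADDR": address, "RD": 1, "DATA": None},
        # Cycle 2: Read Line (RD) set low to request the read
        {"cycle": 2, "ADDR": address, "RD": 0, "DATA": None},
        # Cycle 3: CPU waits one cycle for memory to respond
        {"cycle": 3, "ADDR": address, "RD": 0, "DATA": None},
        # Cycle 4: RD goes to 1, indicating that the data is on the data bus
        {"cycle": 4, "ADDR": address, "RD": 1, "DATA": memory[address]},
    ]


trace = read_cycle(2, [10, 20, 30])
for state in trace:
    print(state)
```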
Cache Memory
• High-speed expensive static RAM both inside
and outside the CPU.
• Level-1 cache: inside the CPU
• Level-2 cache: outside the CPU
• Cache hit: when data to be read is already in
cache memory
• Cache miss: when data to be read is not in
cache memory.
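Cache hits and misses can be counted with a very small model. The sketch below is an assumption-heavy illustration: the class name `TinyCache`, the per-address storage, and the first-in first-out eviction rule are all hypothetical choices, not how any particular CPU cache is organized.

```python
from collections import OrderedDict


class TinyCache:
    """Toy cache holding a fixed number of addresses (FIFO eviction assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()    # address -> cached value, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, address, memory):
        if address in self.lines:     # cache hit: data is already in the cache
            self.hits += 1
            return self.lines[address]
        self.misses += 1              # cache miss: go to slower main memory
        value = memory[address]
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict the oldest entry
        self.lines[address] = value
        return value


memory = list(range(100, 110))
cache = TinyCache(capacity=2)
for address in [0, 1, 0, 2, 0]:
    cache.read(address, memory)
print(cache.hits, cache.misses)
```

With a capacity of two entries, only the third read (address 0 again) is a hit; the final read of address 0 misses because that entry was evicted when address 2 was loaded.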
How a Program Runs
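[Figure: how a program runs. The user sends the program name to the system; the system gets the starting cluster from the directory entry, then loads and starts the program. The figure also carries the label "path".]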
Multitasking
• OS can run multiple programs at the same time.
• Multiple threads of execution within the same program.
• Scheduler utility assigns a given amount of CPU time to
each running program.
• Rapid switching of tasks
• gives illusion that all programs are running at once
• the processor must support task switching.
• When suspending a task, the processor automatically saves
the state of the EFLAGS register in the task state segment
(TSS) for the task being suspended. When binding itself to a
new task, the processor loads the EFLAGS register with
data from the new task’s TSS.
• When multitasking is implemented, individual tasks can use
different memory models.
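The idea of a scheduler handing out fixed amounts of CPU time can be sketched with a round-robin queue. Round-robin is only one possible policy and is an assumption here, since the slide does not name one; the function name and the task names are hypothetical.

```python
from collections import deque


def round_robin(tasks, time_slice):
    """tasks maps a task name to the CPU time it still needs.

    Returns the order in which tasks ran and how long each turn lasted.
    """
    if time_slice <= 0:
        raise ValueError("time_slice must be positive")
    queue = deque(tasks.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        used = min(time_slice, remaining)
        timeline.append((name, used))
        if remaining > used:
            # Task is suspended and rejoins the queue to run again later.
            queue.append((name, remaining - used))
    return timeline


print(round_robin({"A": 3, "B": 1, "C": 2}, time_slice=2))
```

Task A needs more than one time slice, so it is suspended after its first turn and resumes after B and C have each had theirs.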
CS223COA
Modes of Operation
• Protected mode
• native mode (Windows, Linux)
• Real-address mode
• native MS-DOS
• System management mode
• power management, system security, diagnostics
• Virtual-8086 mode
• hybrid of Protected mode
• each program has its own 8086 computer
Basic Execution Environment
• Addressable memory
• General-purpose registers
• Index and base registers
• Specialized register uses
• Status flags
• Floating-point, MMX, XMM registers
Addressable Memory
• Protected mode
• 4 GB
• 32-bit address
• Real-address and Virtual-8086 modes
• 1 MB space
• 20-bit address
General-Purpose Registers
Named storage locations inside the CPU, optimized for
speed.
32-bit General-Purpose Registers
EAX    EBP
EBX    ESP
ECX    ESI
EDX    EDI

EFLAGS    EIP

16-bit Segment Registers
CS    ES
SS    FS
DS    GS
Accessing Parts of Registers
• Use 8-bit name, 16-bit name, or 32-bit name
• Applies to EAX, EBX, ECX, and EDX
AH | AL    8 bits + 8 bits
AX         16 bits
EAX        32 bits
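The relationship between EAX, AX, AH, and AL can be mimicked with bit masks and shifts. This is only an illustration of the layout; the function name `register_parts` is made up, and the same slicing would apply to EBX, ECX, and EDX.

```python
def register_parts(eax):
    """Split a 32-bit EAX value into its AX, AH, and AL parts."""
    eax &= 0xFFFFFFFF          # keep 32 bits
    ax = eax & 0xFFFF          # low 16 bits of EAX
    ah = (ax >> 8) & 0xFF      # high 8 bits of AX
    al = ax & 0xFF             # low 8 bits of AX
    return ax, ah, al


ax, ah, al = register_parts(0x12345678)
print(hex(ax), hex(ah), hex(al))   # 0x5678 0x56 0x78
```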
Index and Base Registers
• Some registers have only a 16-bit name for their
lower half:
32-bit    16-bit
ESI       SI
EDI       DI
EBP       BP
ESP       SP
Some Specialized Register Uses (1 of 2)
• General-Purpose
• EAX – accumulator
• ECX – loop counter
• ESP – stack pointer
• ESI, EDI – index registers
• EBP – extended frame pointer (stack)
• Segment
• CS – code segment
• DS – data segment
• SS – stack segment
• ES, FS, GS - additional segments
Some Specialized Register Uses (2 of 2)
Status Flags
• Carry
• unsigned arithmetic out of range
• Overflow
• signed arithmetic out of range
• Sign
• result is negative
• Zero
• result is zero
• Auxiliary Carry
• carry from bit 3 to bit 4
• Parity
• number of 1 bits in the low byte of the result is even
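To make the flag definitions concrete, the sketch below computes all six status flags for an 8-bit addition. It is a plain arithmetic model written for illustration, not processor code; the function name `add8_flags` and the dictionary layout are hypothetical.

```python
def add8_flags(a, b):
    """Add two 8-bit values and report the status flags the addition would set."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        # Carry: unsigned result does not fit in 8 bits
        "CF": total > 0xFF,
        # Overflow: both operands have the same sign, but the result's sign differs
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,
        # Sign: top bit of the result is set
        "SF": (result & 0x80) != 0,
        # Zero: result is zero
        "ZF": result == 0,
        # Auxiliary Carry: carry from bit 3 to bit 4
        "AF": ((a & 0x0F) + (b & 0x0F)) > 0x0F,
        # Parity: even number of 1 bits in the (8-bit) result
        "PF": bin(result).count("1") % 2 == 0,
    }
    return result, flags


print(add8_flags(0xFF, 0x01))   # unsigned out of range: CF set, result 0
print(add8_flags(0x7F, 0x01))   # signed out of range: OF set, result 0x80
```

The two calls show the difference between Carry and Overflow: 0xFF + 1 wraps to 0 as an unsigned value, while 0x7F + 1 crosses from +127 to -128 as a signed value.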
Floating-Point, MMX, XMM Registers
• Eight 80-bit floating-point data registers
• ST(0), ST(1), . . . , ST(7)
• arranged in a stack
• used for all floating-point arithmetic
• Eight 64-bit MMX registers
• Eight 128-bit XMM registers for single-instruction
multiple-data (SIMD) operations
• SIMD (Single Instruction Multiple Data): an
architecture with a single point of control that
executes the same instruction simultaneously
on multiple data values, e.g., vector
processors and array processors.
[Figure: floating-point register stack, ST(0) through ST(7)]
Early Intel Microprocessors
• Intel 8080
• 64K addressable RAM
• 8-bit registers
• CP/M (Control Program for Microcomputers)
operating system
• S-100 BUS architecture
• 8-inch floppy disks!
• Intel 8086/8088
• IBM-PC used the 8088
• 1 MB addressable RAM
• 16-bit registers
• 16-bit data bus (8-bit for 8088)
• separate floating-point unit (8087)
The IBM-AT
• Intel 80286
• 16 MB addressable RAM
• Protected memory
• several times faster than 8086
• introduced IDE bus architecture
• 80287 floating point unit
Intel IA-32 Family
• Intel386
• 4 GB addressable RAM, 32-bit registers, paging
(virtual memory)
• Intel486
• instruction pipelining
• Pentium
• superscalar, 32-bit address bus, 64-bit internal data path
Intel P6
• Pentium Pro Family
• advanced optimization techniques in
microcode
• Pentium II
• MMX (multimedia) instruction set
• Pentium III
• SIMD (streaming extensions)
instructions
• Pentium 4
• NetBurst micro-architecture (Complete
redesign), tuned for multimedia
CISC and RISC
• CISC – complex instruction set
• large instruction set
• high-level operations
• requires microcode interpreter
• examples: Intel 80x86 family
• RISC – reduced instruction set
• simple, atomic instructions
• small instruction set
• directly executed by hardware
• examples:
• ARM (Advanced RISC
Machines)
• DEC Alpha (now Compaq)
• Real-address mode
• Calculating linear addresses
• Protected mode
• Multi-segment model
• Paging
Real-Address mode
• 1 MB RAM maximum
addressable
• Application programs can
access any area of memory
• Single tasking
• Supported by MS-DOS
operating system
Memory Units
Segmented Memory
Here's the Problem:
• A 16-bit register can only hold values from 0 to 65,535,
so it can address at most 2¹⁶ = 65,536 bytes (64 KB).
• But early Intel CPUs (like 8086) were designed to
address up to 1 MB of memory (i.e., 2²⁰ = 1,048,576
bytes).
• So a single 16-bit register isn’t enough to uniquely
address anywhere in 1 MB space.
How They Solved It: Segmented Memory
• To overcome this, Intel used segmentation:
• Memory addresses are formed using two 16-bit
values:
• A segment (e.g., CS, DS, SS, etc.)
• An offset
Segmented Memory
Segmented memory addressing: the absolute (linear)
address is formed by multiplying a 16-bit segment value
by 16 and adding a 16-bit offset
[Figure: 1 MB memory map from 00000 to F0000 in 64 KB steps; one segment spans 8000:0000 to 8000:FFFF, and the address 8000:0250 is marked as segment (seg) 8000, offset (ofs) 0250]
Calculating Linear Addresses
• Given a segment address, multiply it by
16 (append a hexadecimal zero), and add
the offset
• Example: convert 08F1:0100 to a linear
address
Adjusted segment value:  08F10
Add the offset:           0100
Linear address:          09010
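The same calculation can be written as a one-line function. This is a sketch for checking worked examples by hand: the function name `linear_address` is hypothetical, and masking the result to 20 bits is an assumption that reflects the 20-bit address space of Real-address mode.

```python
def linear_address(segment, offset):
    """Combine a 16-bit segment and 16-bit offset into a 20-bit linear address."""
    # Shifting left by 4 bits is the same as multiplying by 16,
    # i.e. appending one hexadecimal zero to the segment value.
    return ((segment << 4) + offset) & 0xFFFFF


print(f"{linear_address(0x08F1, 0x0100):05X}")   # 09010
```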
Your turn . . .
What linear address corresponds to the
segment/offset address 028F:0030?
Calculating Linear Addresses
Protected Mode (1 of 2)
• 4 GB addressable RAM
• (00000000h to FFFFFFFFh)
• Each program assigned a memory
partition which is protected from other
programs
• Designed for multitasking
• Supported by Linux & MS-Windows
Protected mode (2 of 2)
Flat Segment Model
• Single global descriptor table (GDT).
• All segments mapped to entire 32-bit address space
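[Figure: flat segment model. A segment descriptor in the Global Descriptor Table (values shown: 00000000, 00040, ----) maps to physical RAM from 00000000 to 00040000; the range from 00040000 up to FFFFFFFF (4 GB) is marked "not used".]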
Multi-Segment Model
• Each program has a local descriptor table (LDT)
• holds descriptor for each segment used by the program
Local Descriptor Table
base        limit   access
00026000    0010
00008000    000A
00003000    0002
[Figure: each descriptor points to a segment in RAM, located at 26000, 8000, and 3000]
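A descriptor-table lookup can be sketched as "find the segment's base, check the offset, add". The base and limit values below are the ones on this slide, but treating the limit as a size in bytes is an assumption made only for this illustration (real IA-32 descriptors encode the limit together with a granularity setting), and the names `LDT` and `translate` are hypothetical.

```python
# (base, limit) pairs taken from the slide's descriptor table.
# ASSUMPTION: limit is treated as a segment size in bytes, for illustration only.
LDT = [
    (0x00026000, 0x0010),
    (0x00008000, 0x000A),
    (0x00003000, 0x0002),
]


def translate(index, offset, table=LDT):
    """Translate (descriptor index, offset) into a linear address."""
    base, limit = table[index]
    if offset < 0 or offset >= limit:
        # A program may not reach outside its own segment.
        raise ValueError("offset outside segment")
    return base + offset


print(hex(translate(1, 0x4)))   # 0x8004
```

The limit check is what gives each program a protected partition: an offset past the end of the segment is rejected instead of reaching another segment's memory.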
Paging
• Supported directly by the CPU
• Divides each segment into 4096-byte blocks called
pages
• Example: if memory is 1 MB, it holds
1024 * 1024 / 4096 = 256 pages
• Combined size of all running programs can be larger
than physical memory
• Part of running program is in memory, part is on disk
• Virtual memory manager (VMM) – OS utility that
manages the loading and unloading of pages
• Page fault – issued by CPU when a page must be
loaded from disk
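The page arithmetic can be checked with a few lines of code. This sketch covers only the 4096-byte page size from the slide; it ignores page tables and the virtual memory manager, and the function names are hypothetical.

```python
PAGE_SIZE = 4096   # bytes per page, as stated on the slide


def page_count(memory_bytes):
    """Number of whole pages that fit in the given amount of memory."""
    return memory_bytes // PAGE_SIZE


def split_address(address):
    """Split an address into (page number, offset within that page)."""
    return address // PAGE_SIZE, address % PAGE_SIZE


print(page_count(1024 * 1024))   # 256
print(split_address(0x12345))    # (18, 837), i.e. page 12h, offset 345h
```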