Cse410 10 Pipelining A

This document summarizes the execution cycle and pipelining in a 5-stage MIPS processor. It discusses how instructions are fetched, decoded, executed, access memory, and write results back in separate pipeline stages. Pipelining allows overlapping execution of multiple instructions to improve throughput. However, data and structural hazards can occur when instructions depend on results not yet available or compete for hardware resources. Solutions like deeper pipelines, separate caches, and forwarding of register values between stages help address these hazards and maximize performance.


Pipelining
CSE 410, Spring 2005
Computer Systems
http://www.cs.washington.edu/410

Execution Cycle

IF  ID  EX  MEM  WB

1. Instruction Fetch
2. Instruction Decode
3. Execute
4. Memory
5. Write Back

IF and ID Stages

1. Instruction Fetch
» Get the next instruction from memory
» Increment the Program Counter value by 4
2. Instruction Decode
» Figure out what the instruction says to do
» Get values from the named registers
» The simple instruction format means we know which registers we may need before the instruction is fully decoded

Simple MIPS Instruction Formats

R: op code | source 1 | source 2 | dest   | shamt  | function
   6 bits  | 5 bits   | 5 bits   | 5 bits | 5 bits | 6 bits

I: op code | base reg | src/dest | offset or immediate value
   6 bits  | 5 bits   | 5 bits   | 16 bits

J: op code | word offset
   6 bits  | 26 bits
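Because the fields sit at fixed bit positions, a decoder can slice out every register number before it even knows which instruction it has. A minimal Python sketch of R-type field extraction (field names follow the format table above; this is an illustrative model, not real decode hardware):

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its fixed fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs":     (word >> 21) & 0x1F,  # source 1
        "rt":     (word >> 16) & 0x1F,  # source 2
        "rd":     (word >> 11) & 0x1F,  # dest
        "shamt":  (word >> 6)  & 0x1F,  # shift amount
        "funct":  word         & 0x3F,  # function code
    }

# add $s0, $s1, $s2 encodes as 000000 10001 10010 10000 00000 100000
word = 0b00000010001100101000000000100000
fields = decode_r_type(word)  # rs=17 ($s1), rt=18 ($s2), rd=16 ($s0)
```

The register fields can be sent to the register file immediately, which is why ID can read $s1 and $s2 before decode finishes.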
EX, MEM, and WB Stages

3. Execute
» On a memory reference, add up base and offset
» On an arithmetic instruction, do the math
4. Memory Access
» If load or store, access memory
» If branch, replace PC with the destination address
» Otherwise do nothing
5. Write Back
» Place the results in the appropriate register

Example: add $s0, $s1, $s2

op code | source 1 | source 2 | dest  | shamt | function
000000  | 10001    | 10010    | 10000 | 00000 | 100000

• IF: get the instruction at PC from memory
• ID: determine what the instruction is and read registers
» op code 000000 with function 100000 is the add instruction
» get contents of $s1 and $s2 (eg: $s1=7, $s2=12)
• EX: add 7 and 12 = 19
• MEM: do nothing for this instruction
• WB: store 19 in register $s0
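The stage-by-stage walk above can be sketched as a tiny simulation. This is an illustrative model of the five stages, not the real datapath; the register values are the ones from the example:

```python
def run_add(regs, rd, rs, rt):
    """Trace add rd, rs, rt through the five stages (simplified model)."""
    trace = ["IF: fetch the instruction, PC += 4"]
    a, b = regs[rs], regs[rt]                  # ID: read the source registers
    trace.append(f"ID: read {rs}={a}, {rt}={b}")
    result = a + b                             # EX: do the math
    trace.append(f"EX: {a} + {b} = {result}")
    trace.append("MEM: nothing for an arithmetic instruction")
    regs[rd] = result                          # WB: write the destination register
    trace.append(f"WB: {rd} = {result}")
    return trace

regs = {"$s0": 0, "$s1": 7, "$s2": 12}
trace = run_add(regs, "$s0", "$s1", "$s2")     # leaves $s0 = 19
```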

Example: lw $t2, 16($s0)

op code | base reg | src/dest | offset or immediate value
100011  | 10000    | 01010    | 0000000000010000

• IF: get the instruction at PC from memory
• ID: determine what 100011 is
» 100011 is lw
» get contents of $s0 and $t2 (the hardware doesn't yet know that it won't need $t2), eg: $s0=0x200D1C00, $t2=77763
• EX: add 16 to 0x200D1C00 = 0x200D1C10
• MEM: load the word stored at 0x200D1C10
• WB: store the loaded value in $t2

Latency & Throughput

        1  2  3  4  5   6  7  8  9  10
inst 1  IF ID EX MEM WB
inst 2                  IF ID EX MEM WB

Latency: the time it takes for an individual instruction to execute
• What's the latency for this implementation?
» One instruction takes 5 clock cycles: Cycles Per Instruction (CPI) = 5
Throughput: the number of instructions that execute per unit time
• What's the throughput of this implementation?
» One instruction is completed every 5 clock cycles: average CPI = 5
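The lw walk-through can be sketched the same way as the add example. This is a simplified model with a dictionary standing in for memory; the value stored at the target address (42 here) is made up for illustration:

```python
def run_lw(regs, mem, rt, offset, base):
    """Walk lw rt, offset(base) through EX/MEM/WB (simplified model)."""
    addr = regs[base] + offset        # EX: compute the effective address
    value = mem[addr]                 # MEM: read the word at that address
    regs[rt] = value                  # WB: write the destination register
    return addr

regs = {"$s0": 0x200D1C00, "$t2": 77763}
mem = {0x200D1C10: 42}                # hypothetical memory contents
addr = run_lw(regs, mem, "$t2", 16, "$s0")   # addr = 0x200D1C10
```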
A Case for Pipelining

• If execution is non-overlapped, the functional units are underutilized because each unit is used only once every five cycles
• If the Instruction Set Architecture is carefully designed, the functional units can be arranged so that they execute in parallel
• Pipelining overlaps the stages of execution so every stage has something to do each cycle

Pipelined Latency & Throughput

        1  2  3  4  5   6   7   8   9
inst 1  IF ID EX MEM WB
inst 2     IF ID EX  MEM WB
inst 3        IF ID  EX  MEM WB
inst 4           IF  ID  EX  MEM WB
inst 5               IF  ID  EX  MEM WB

• What's the latency of this implementation?
• What's the throughput of this implementation?
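The two timelines can be compared with a one-line cycle count: non-overlapped execution needs 5 cycles per instruction, while the pipeline needs 5 cycles to fill and then retires one instruction per cycle. A sketch, assuming an ideal pipeline with no stalls:

```python
def total_cycles(n_instructions, stages=5, pipelined=True):
    """Clock cycles to run an instruction stream (ideal, no stalls)."""
    if pipelined:
        # stages cycles to fill the pipe, then one completion per cycle
        return stages + (n_instructions - 1)
    return stages * n_instructions    # non-overlapped execution

# Latency per instruction is still 5 cycles either way; throughput
# of the pipelined machine approaches one instruction per cycle.
five_pipelined = total_cycles(5)                      # 9 cycles, as in the diagram
five_sequential = total_cycles(5, pipelined=False)    # 25 cycles
```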

Pipelined Analysis

• A pipeline with N stages could improve throughput by N times, but
» each stage must take the same amount of time
» each stage must always have work to do
» there may be some overhead to implement
• Also, latency for each instruction may go up
» within some limits, we don't care

Throughput is good!

[Figure: total time versus an increasing number of instructions; the overlapped (pipelined) line rises far more slowly than the sequential line.]
MIPS ISA: Born to Pipeline

• Instructions are all one length
» simplifies the Instruction Fetch stage
• Regular format
» simplifies Instruction Decode
• Few memory operands, only registers
» only lw and sw instructions access memory
• Aligned memory operands
» only one memory access per operand

Memory Accesses

• An efficient pipeline requires each stage to take about the same amount of time
• The CPU is much faster than the memory hardware
• Cache is provided on chip
» i-cache holds instructions
» d-cache holds data
» a critical feature for a successful RISC pipeline
» more about caches next week

The Hazards of Parallel Activity

• Any time you get several things going at once, you run the risk of interactions and dependencies
» juggling doesn't take kindly to irregular events
• Unwinding activities after they have started can be very costly in terms of performance
» drop everything on the floor and start over

Design for Speed

• Most of what we talk about next relates to the CPU hardware itself
» problems keeping a pipeline full
» solutions that are used in the MIPS design
• Some programmer-visible effects remain
» many are hidden by the assembler or compiler
» the code that you write tells what you want done, but the tools rearrange it for speed
Pipeline Hazards

• Structural hazards
» instructions in different stages need the same resource, eg memory
• Data hazards
» data not available to perform the next operation
• Control hazards
» data not available to make a branch decision

Structural Hazards

• Concurrent instructions want the same resource
» a lw instruction in stage four (memory access)
» an add instruction in stage one (instruction fetch)
» both of these actions require access to memory; they would collide if the hardware were not designed for it
• Add more hardware to eliminate the problem
» separate instruction and data caches
• Or stall (cheaper and easier), but that is not usually done

Data Hazards

• When an instruction depends on the results of a previous instruction still in the pipeline
• This is a data dependency

add $s0, $s1, $s2   IF ID EX MEM WB        ← $s0 is written here (WB)
add $s4, $s3, $s0      IF ID EX MEM WB     ← $s0 is read here (ID)

Stall for Register Data Dependency

• Stall the pipeline until the result is available
» this would create a 3-cycle pipeline bubble

add s0,s1,s2   IF ID EX MEM WB
add s4,s3,s0      IF stall stall stall ID EX MEM WB
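The dependency above is a read-after-write (RAW) hazard, and it can be detected by comparing register names. A minimal sketch, using a simplified (dest, sources) tuple per instruction:

```python
def raw_hazard(producer, consumer):
    """True if consumer reads a register that producer writes.

    Each instruction is modeled as (dest_reg, [source_regs]).
    """
    dest, _ = producer
    _, sources = consumer
    return dest in sources

i1 = ("$s0", ["$s1", "$s2"])     # add $s0, $s1, $s2
i2 = ("$s4", ["$s3", "$s0"])     # add $s4, $s3, $s0  -- reads $s0
```

Real hazard-detection logic compares the rs/rt fields of the younger instruction against the rd field latched in the pipeline registers, but the comparison is the same idea.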
Read & Write in the Same Cycle

• Write the register in the first part of the clock cycle
• Read it in the second part of the clock cycle
• A 2-cycle stall is still required

add s0,s1,s2   IF ID EX MEM WB             ← write $s0 (first half of cycle 5)
add s4,s3,s0      IF stall stall ID EX MEM WB   ← read $s0 (second half of cycle 5)

Solution: Forwarding

• The value of $s0 is known internally after cycle 3 (after the first instruction's EX stage)
• The value of $s0 isn't needed until cycle 4 (before the second instruction's EX stage)
• If we forward the result there isn't a stall

add s0,s1,s2   IF ID EX MEM WB
add s4,s3,s0      IF ID EX MEM WB
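The stall counts discussed so far can be summarized in one small function. This is a simplified model of this 5-stage pipeline, not a general rule for all designs:

```python
def stalls_needed(producer_is_load, forwarding=True):
    """Bubble cycles between a producer and a dependent next instruction
    in the 5-stage MIPS pipeline (simplified model from the slides)."""
    if not forwarding:
        return 2   # wait for WB, with split-cycle register write/read
    # An ALU result can be forwarded from EX to the next EX with no stall;
    # a loaded value exists only after MEM, so one bubble remains (see below).
    return 1 if producer_is_load else 0
```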

Another Data Hazard

• What if the first instruction is lw?
• $s0 isn't known until after the MEM stage
» we can't forward back into the past
• Either stall or reorder instructions

lw  s0,0(s2)   IF ID EX MEM WB
add s4,s3,s0      IF ID EX MEM WB     ← NO! the add needs $s0 in EX before MEM produces it

Stall for lw Hazard

• We can stall for one cycle, but we hate to stall

lw  s0,0(s2)   IF ID EX MEM WB
add s4,s3,s0      IF ID stall EX MEM WB
Instruction Reorder for lw Hazard

• Try to execute an unrelated instruction between the two instructions

lw  s0,0(s2)   IF ID EX MEM WB
sub t4,t2,t3      IF ID EX  MEM WB
add s4,s3,s0         IF ID  EX  MEM WB

Reordering Instructions

• Reordering instructions is a common technique for avoiding pipeline stalls
• Static reordering
» the programmer, compiler, and assembler do this
• Dynamic reordering
» modern processors can see several instructions at once
» they execute any that have no dependency
» this is known as out-of-order execution and is complicated to implement, but effective
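The payoff of reordering can be checked with a small scheduler sketch. This simplified model assumes full forwarding, inserts one bubble on a load-use dependency, and only checks each instruction against its immediate predecessor:

```python
def ex_cycles(instrs):
    """Cycle in which each instruction enters EX, with one bubble
    inserted for a load-use dependency (simplified model).
    Each instruction is (is_load, dest, sources)."""
    cycle, out, prev = 3, [], None     # the first instruction reaches EX in cycle 3
    for is_load, dest, sources in instrs:
        if prev is not None and prev[0] and prev[1] in sources:
            cycle += 1                 # load-use bubble: stall one cycle
        out.append(cycle)
        prev, cycle = (is_load, dest, sources), cycle + 1
    return out

# lw s0,0(s2) followed directly by add s4,s3,s0: one bubble
stalled = ex_cycles([(True, "s0", ["s2"]),
                     (False, "s4", ["s3", "s0"])])
# move the unrelated sub t4,t2,t3 in between: no bubble at all
reordered = ex_cycles([(True, "s0", ["s2"]),
                       (False, "t4", ["t2", "t3"]),
                       (False, "s4", ["s3", "s0"])])
```

In the stalled schedule the add enters EX a cycle late; in the reordered one every instruction enters EX back to back, which is exactly what the timeline above shows.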
