Unit 6
Pipeline and vector processing
Contents
• Parallel Processing
• General Considerations for Pipelining
• Pipelining
• Arithmetic Pipeline
• Instruction level Pipelining
• Pipeline Hazards and their Solution
• Array and Vector Processing
• RISC Pipelining
Parallel Processing
• A technique used to provide simultaneous data-processing tasks for the purpose of
increasing the computational speed of a computer system.
• The system may have two or more ALUs and be able to execute two or
more instructions at the same time.
• The purpose of parallel processing is to speed up the computer's processing
capability and increase its throughput (the amount of processing that can be
completed during a given interval of time).
Flynn’s Classification of Computers
• There are different ways to classify parallel computers.
• One of the more widely used classifications, in use since 1966, is Flynn's
classification.
• Flynn's classification distinguishes multi-processor computer
architectures according to the two independent dimensions of
Instruction and Data.
• The sequence of instructions read from memory constitutes an
instruction stream.
• The operations performed on the data in the processor constitute a
data stream.
SISD – Single Instruction, Single Data
SIMD – Single Instruction, Multiple Data
MISD – Multiple Instruction, Single Data
MIMD – Multiple Instruction, Multiple Data
1. Single Instruction, Single Data (SISD):
• A serial (non-parallel) computer.
• Single Instruction: Only one instruction stream is being acted on by the CPU
during any one clock cycle.
• Single Data: Only one data stream is being used as input during any one clock
cycle.
• Single instruction is performed on a single set of data in a sequential form.
• Deterministic execution.
• This is the oldest and even today, the most common type of computer.
• Examples: Older generation mainframes, minicomputers and workstations;
most modern day PCs.
2. Single Instruction, Multiple Data (SIMD):
• A type of parallel computer.
• Single Instruction: All processing units execute the same instruction at any
given clock cycle.
• Multiple data: Each processing unit can operate on a different data element.
• Best suited for specialized problems characterized by a high degree of regularity,
such as graphics/image processing.
• A single instruction is performed on multiple data. A good example is a "for"
loop over an array: the instruction is the same, but the data stream is different
(see the sketch after the examples below).
• Example: Array processors: Connection Machine CM-2, MasPar
MP-1 & MP-2, ILLIAC IV
• Vector Processors: IBM 9000, Cray X-MP, Y-MP & C90, Fujitsu VP,
NEC SX-2, Hitachi S820, ETA10
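The SIMD idea can be sketched in a few lines by contrasting a scalar loop with a single vectorized operation. The sketch below uses Python with NumPy purely as an illustration; the library choice and the array values are assumptions, not part of these notes:

```python
import numpy as np

# SISD-style view: one instruction acts on one data element per step.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
c = [0.0] * 4
for i in range(4):        # the "for" loop from the notes: the instruction is the same,
    c[i] = a[i] + b[i]    # but a different data element is used on every iteration

# SIMD-style view: a single vector add is applied to all elements at once.
va, vb = np.array(a), np.array(b)
vc = va + vb              # one instruction, multiple data elements
print(c, vc)
```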
3. Multiple Instruction, Single Data (MISD):
• A type of parallel computer.
• Multiple Instruction: Each processing unit operates on the data independently
via separate instruction streams.
• Single Data: A single data stream is fed into multiple processing units.
• N processors work on the same set of data, each executing a different instruction
stream.
• Few actual examples of this class of parallel computer have ever existed. One is
the experimental Carnegie-Mellon C.mmp computer (1971).
• Some conceivable uses might be:
 Multiple frequency filters operating on a single signal stream.
 Multiple cryptography algorithms attempting to crack a single coded message.
4. Multiple Instruction, Multiple Data (MIMD):
• A type of parallel computer.
• Multiple Instruction: Every processor may be executing a different instruction
stream.
• Multiple Data: Every processor may be working with a different data stream.
• N processors interact with one another, each executing its own instruction stream
on its own data stream.
• Execution can be synchronous or asynchronous, deterministic or non-
deterministic.
• Currently, the most common type of parallel computer - most modern
supercomputers fall into this category.
• Examples: Most current supercomputers, networked parallel computer clusters
and "grids", multi-processor SMP computers, multi-core PCs.
• Note: many MIMD architectures also include SIMD execution sub-components.
Pipelining
• A technique of decomposing a sequential process into sub operations,
with each sub process being executed in a special dedicated segment
that operates concurrently with all other segments.
• Each segment performs partial processing dictated by the way the task is
partitioned.
• The result obtained from the computation in each segment is
transferred to the next segment in the pipeline.
• The final result is obtained after the data have passed through all
segments.
• It is characteristic of pipelines that several computations can be in
progress in distinct segments at the same time.
• The registers provide isolation between each segment so that each can
operate on distinct data simultaneously.
Suppose we want to perform the combined multiply and add operations with a
stream of numbers. E.g. Ai * Bi + Ci for i = 1 , 2, 3,…….,7
The sub operations performed in each segment of the pipeline are as follows:
R1 ← Ai, R2 ← Bi          Input Ai and Bi
R3 ← R1 * R2, R4 ← Ci     Multiply and input Ci
R5 ← R3 + R4              Add Ci to product
Fig2: Example of pipeline processing
Clock Pulse   Segment 1 (R1, R2)   Segment 2 (R3, R4)   Segment 3 (R5)
    1         A1, B1               -                    -
    2         A2, B2               A1*B1, C1            -
    3         A3, B3               A2*B2, C2            A1*B1+C1
    4         A4, B4               A3*B3, C3            A2*B2+C2
    5         A5, B5               A4*B4, C4            A3*B3+C3
    6         A6, B6               A5*B5, C5            A4*B4+C4
    7         A7, B7               A6*B6, C6            A5*B5+C5
    8         -                    A7*B7, C7            A6*B6+C6
    9         -                    -                    A7*B7+C7
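The segment contents in the table above can be reproduced with a short simulation. This is only an illustrative sketch; the Python register modelling and the sample values of Ai, Bi and Ci are assumptions:

```python
# Sketch of the 3-segment pipeline computing Ai*Bi + Ci for i = 1..7.
A = [1, 2, 3, 4, 5, 6, 7]
B = [2, 2, 2, 2, 2, 2, 2]
C = [5, 5, 5, 5, 5, 5, 5]

R1 = R2 = R3 = R4 = R5 = None           # segment registers, empty at start
results = []

for clock in range(len(A) + 2):         # n tasks need n + (k - 1) clock pulses, k = 3
    # Segment 3: R5 <- R3 + R4 (uses the values clocked in during the previous pulse)
    if R3 is not None:
        R5 = R3 + R4
        results.append(R5)
    else:
        R5 = None
    # Segment 2: R3 <- R1 * R2, R4 <- Ci
    if R1 is not None:
        R3, R4 = R1 * R2, C[clock - 1]
    else:
        R3 = R4 = None
    # Segment 1: R1 <- Ai, R2 <- Bi
    if clock < len(A):
        R1, R2 = A[clock], B[clock]
    else:
        R1 = R2 = None

print(results)   # the Segment 3 column of the table, one result per clock pulse
```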
General Considerations for Pipelining
Fig3: General structure of a 4-segment pipeline.
Fig: Space Time Diagram
Speedup Equation
Consider a K-segment pipeline with a clock cycle time tp used to execute n tasks. The first task T1
requires time K*tp to complete. The remaining (n - 1) tasks finish at the rate of one task per clock
cycle and will be completed after time (n - 1)*tp. The total time to complete the n tasks is
K*tp + (n - 1)*tp = (K + n - 1)*tp.
Consider a non–pipeline unit that performs the same operation and takes tn time to complete each
task. The total time required for n tasks would be ntn.
Speedup(S) = (time without pipeline) / (time with pipeline)

S = n*tn / [(K + n - 1)*tp]

As the number of tasks increases, n becomes much larger than K - 1, and (K + n - 1)
approaches n. Then,

S = tn / tp

If we assume that the time to complete a task is the same in both units, tn = K*tp, the
maximum speedup becomes S = K*tp / tp = K, equal to the number of pipeline segments.
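A quick numerical check of the speedup formula, with illustrative (assumed) values of K, tp and n:

```python
# Pipeline speedup S = n*tn / ((K + n - 1) * tp)
def speedup(n, k, tp, tn):
    """n tasks, k segments, tp = pipeline clock period, tn = non-pipelined task time."""
    return (n * tn) / ((k + n - 1) * tp)

k, tp = 4, 20            # 4 segments, 20 ns clock (illustrative numbers only)
tn = k * tp              # assume the non-pipelined unit takes K*tp per task
for n in (1, 10, 100, 1000):
    print(n, round(speedup(n, k, tp, tn), 2))   # approaches K = 4 as n grows
```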
Arithmetic Pipeline
• Usually found in very high speed computers.
• Used to implement floating point operations, multiplication of fixed
point numbers, and similar computations encountered in scientific
problems.
• Floating point addition and subtraction:
• Let us consider two normalized floating-point binary numbers as inputs to the adder
pipeline:
X = A × 2^a
Y = B × 2^b
• Can be performed in four segments, as shown in figure.
Ex:  X = 0.9504 × 10^3
     Y = 0.8200 × 10^2
Segment 1: The exponent difference is 3 - 2 = 1. The larger exponent, 3, is chosen
as the exponent of the result.
Segment 2: Align the mantissa of Y by shifting it right once:
     X = 0.9504 × 10^3
     Y = 0.0820 × 10^3
Segment 3: Add the mantissas: Sum = 0.9504 + 0.0820 = 1.0324, so Z = 1.0324 × 10^3
Segment 4: Normalize the result: Z = 0.10324 × 10^4
Fig4: Pipeline for floating-point addition and subtraction
• The registers labeled R are placed between the segments to store
intermediate results.
• The sub-operations that are performed in the four segments are:
i. Compare the exponents.
ii. Align the mantissas.
iii. Add or subtract the mantissas.
iv. Normalize the result.
• The exponents are compared by subtracting them to determine their
difference.
• The larger exponent is chosen as the exponent of the result.
• The exponent difference determines how many times the mantissa
associated with the smaller exponent must be shifted to the right.
• This produces an alignment of the two mantissas.
• The two mantissas are added or subtracted in segment 3. The result is
normalized in segment 4.
• When an overflow occurs, the mantissa of the sum or difference is
shifted right and the exponent incremented by one.
• If an underflow occurs, the number of leading zeros in the mantissa
determines the number of left shifts in the mantissa and the number
that must be subtracted from the exponent.
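The four sub-operations can be sketched as one function, checked against the decimal example above. The function name and the base-10, fixed-precision handling are assumptions made purely for illustration:

```python
def pipelined_fp_add(a_mant, a_exp, b_mant, b_exp):
    """Add X = a_mant*10^a_exp and Y = b_mant*10^b_exp in four 'segments'."""
    # Segment 1: compare the exponents (by subtracting them).
    diff = a_exp - b_exp
    exp = max(a_exp, b_exp)
    # Segment 2: align the mantissa associated with the smaller exponent.
    if diff > 0:
        b_mant /= 10 ** diff
    else:
        a_mant /= 10 ** (-diff)
    # Segment 3: add (or subtract) the mantissas.
    mant = a_mant + b_mant
    # Segment 4: normalize the result (shift right on overflow, left on underflow).
    while abs(mant) >= 1.0:
        mant /= 10
        exp += 1
    while mant != 0 and abs(mant) < 0.1:
        mant *= 10
        exp -= 1
    return mant, exp

print(pipelined_fp_add(0.9504, 3, 0.8200, 2))   # ≈ (0.10324, 4)
```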
Instruction pipeline
• Pipeline processing can occur not only in the data stream but in the
instruction as well.
• Generally, an instruction fetch unit and an instruction execution unit are
designed to provide a two-segment pipeline.
• Computers with complex instructions require additional phases beyond these two
to process an instruction completely.
• In the most general case, the computer needs to process each instruction
with the following sequence of steps.
• Fetch the instruction from memory.
• Decode the instruction.
• Calculate the effective address.
• Fetch the operands from memory.
• Execute the instruction.
• Store the result in the proper place.
• Difficulties that will prevent the instruction pipeline from operating at
its maximum rate:
• Different segments may take different times to operate on available data.
• Some segments are skipped for certain operations.
• Two or more segments may require memory access at the same time, causing
one segment to wait until another is finished with the memory.
4-stage Pipeline: For further speedup, the pipeline can be divided into the following
four stages:
a. Fetch instruction (FI)
b. Decode instruction (DI)
c. Fetch operands (FO)
d. Execute instruction (EI)
4-segment instruction pipeline timing (the original figure contrasts conventional,
non-overlapped execution with pipelined execution; the pipelined schedule is shown below):

Clock cycle:    1    2    3    4    5    6
Instr i:        FI   DI   FO   EI
Instr i+1:           FI   DI   FO   EI
Instr i+2:                FI   DI   FO   EI
Fig: Flowchart of 4-segment CPU pipeline
Fig: Space time diagram of instruction pipeline
• Each segment operates on a different instruction at the same time.
• In step 4, instruction 1 is being executed in segment EI; the operand for
instruction 2 is being fetched in segment FO; instruction 3 is being decoded in
segment DI; and instruction 4 is being fetched from memory in segment FI.
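The space-time diagram for the four stages can also be generated programmatically; a minimal sketch (the print layout is an assumption):

```python
# Print a space-time diagram for a 4-stage instruction pipeline (FI, DI, FO, EI).
STAGES = ["FI", "DI", "FO", "EI"]

def space_time(num_instructions):
    cycles = num_instructions + len(STAGES) - 1        # n + k - 1 clock cycles in total
    print("instr " + " ".join(f"{c + 1:>3}" for c in range(cycles)))
    for i in range(num_instructions):
        row = ["   "] * cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name                          # instruction i enters stage s at cycle i + s
        print(f"{i + 1:>5} " + " ".join(f"{x:>3}" for x in row))

space_time(4)
```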
Pipeline Hazards and their Solution
Hazards
1. Resource conflicts
• If one common memory is used for both data and instructions, and there is a need to
read/write data and fetch an instruction at the same time, a resource conflict occurs.
2. Data dependency conflict
• Data dependency conflicts arise when an instruction depends on the result of a
previous instruction, but that result is not yet available.
3. Branch difficulties
• Branch difficulties arise from program control instructions that may change the value
of PC.
• A program is not a straight flow of sequential instructions. There may be branch
instructions that alter the normal flow of the program, which delay pipeline
execution and affect performance.
Solution of pipeline hazards
1. Resource conflicts can be solved by using separate instruction and
data memory.
2. To solve the data dependency conflict, we have following methods:
i) Hardware interlock
• A circuit that detects the data dependency.
• After detecting that an instruction needs data from an instruction that is still being executed,
the hardware interlock delays the dependent instruction until the data is available.
ii) Operand forwarding
• Operand forwarding uses special hardware to detect a conflict and then avoids it by routing the
data through special paths between pipeline segments.
iii) Delayed load
• Delayed load is a procedure that gives the responsibilities for solving data conflicts to the
compiler.
• The compiler is designed to detect a conflict and reorder the instructions as necessary to delay
the loading of the conflicting data, inserting no-operation instructions where needed.
3. Solution to branch difficulties
i. Prefetch target instruction
• Both the branch target instruction and the instruction following the branch are prefetched and
are saved until the branch instruction is executed.
• If the branch is taken, execution continues with the prefetched branch target instruction.
ii. Branch target buffer (BTB)
• An associative memory included in the fetch segment of the pipeline that stores the target
instruction for a previously executed branch.
• Also stores the next few instructions after the branch target instruction.
iii. Loop buffer
• Similar to the BTB but faster.
• Program loops are stored in the loop buffer and can be executed directly without having access to
memory.
iv. Branch prediction
• Special hardware is used to predict the outcome of a conditional branch instruction. On the
basis of the prediction, instructions are prefetched (a small sketch follows this list).
v. Delayed branch
• The compiler detects the branch instructions and rearranges the instruction sequence to fill the
delay slots, inserting no-operation instructions when necessary.
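The branch-prediction item above does not fix a particular scheme; a common hardware choice is a 2-bit saturating counter per branch, sketched below purely as an assumed illustration:

```python
# Sketch of a 2-bit saturating-counter branch predictor (one counter per branch address).
# States 0, 1 -> predict not taken; states 2, 3 -> predict taken.
counters = {}

def predict(branch_addr):
    return counters.get(branch_addr, 1) >= 2          # new branches start "weakly not taken"

def update(branch_addr, taken):
    c = counters.get(branch_addr, 1)
    c = min(c + 1, 3) if taken else max(c - 1, 0)     # saturate at 0 and 3
    counters[branch_addr] = c

# A loop branch that is taken 9 times, then falls through once.
outcomes = [True] * 9 + [False]
correct = 0
for taken in outcomes:
    if predict(0x100) == taken:
        correct += 1
    update(0x100, taken)
print(f"{correct}/{len(outcomes)} predictions correct")   # 8/10 for this pattern
```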
RISC pipeline
• More efficient; most operations need only a single clock cycle.
• Data transfer instructions in RISC - load and store.
• Use register indirect addressing.
• Usually need three or four stages in the pipeline.
• Uses two separate buses, one for instructions and one for data, to avoid conflicts.
• Cache memory is available for high speed operation.
• Most of the instruction executes in single cycle.
• A typical set of instructions are:
Data manipulation,
Data transfer and
Program control
RISC 3-segment instruction pipelining:
• There are three types of instructions:
The data manipulation instructions: operate on data in processor registers
The data transfer instructions.
The program control instructions.
• The instruction cycle can be divided into three sub-operations and
implemented in three segments:
I: Instruction fetch
A: ALU operation
E: Execute instruction
Delayed Load
• Consider the operation of the following four instructions:
1. LOAD:  R1 ← M[address 1]
2. LOAD:  R2 ← M[address 2]
3. ADD:   R3 ← R1 + R2
4. STORE: M[address 3] ← R3
• There will be a data conflict in instruction 3 because the operand in
R2 is not yet available in the A segment (fig: a).
• The E segment in clock cycle 4 is in the process of placing the memory
data into R2.
• The A segment in clock cycle 4 is using the data from R2.
Fig: 3-segment pipelining timing
• It is up to the compiler to make sure that the instruction immediately following a load
instruction does not use the data being fetched from memory.
• This concept of delaying the use of the data loaded from memory is referred
to as delayed load (fig: b).
• The no-op instruction is used to advance one clock cycle in order to
compensate for the data conflict in the pipeline.
• The advantage of the delayed load approach is that the data
dependency is taken care of by the compiler rather than the hardware.
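A compiler-side sketch of delayed load: scan the instruction sequence and insert a no-op whenever an instruction uses a register that the immediately preceding LOAD is still fetching. The tuple representation of instructions is an assumption made for illustration:

```python
# Each instruction: (opcode, destination register, set of source registers).
program = [
    ("LOAD",  "R1", set()),            # R1 <- M[address 1]
    ("LOAD",  "R2", set()),            # R2 <- M[address 2]
    ("ADD",   "R3", {"R1", "R2"}),     # R3 <- R1 + R2   (uses R2 one cycle too early)
    ("STORE", None, {"R3"}),           # M[address 3] <- R3
]

def insert_delayed_load_nops(prog):
    out = []
    for instr in prog:
        prev = out[-1] if out else None
        # If the previous instruction is a LOAD whose destination is a source here,
        # the data is not yet available: delay one cycle with a no-op.
        if prev and prev[0] == "LOAD" and prev[1] in instr[2]:
            out.append(("NOP", None, set()))
        out.append(instr)
    return out

for op, dst, srcs in insert_delayed_load_nops(program):
    print(op, dst, sorted(srcs))       # LOAD, LOAD, NOP, ADD, STORE
```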
Delayed branch
• RISC machines use different techniques to redefine the branches so
that they take effect at the proper time in the pipeline.
• This method is referred to as delayed branch.
The program for this example consists of five instructions:
i. Load from memory to R1
ii. Increment R2
iii. Add R3 to R4
iv. Subtract R5 from R6
v. Branch to address X
Vector Processing
• In many science and engineering applications, the problems can be
formulated in terms of vectors and matrices that lend themselves to
vector processing.
• Applications:
Long-range weather forecasting
Petroleum explorations
Seismic data analysis
Medical diagnosis
Aerodynamics and space flight simulations
Artificial intelligence and expert systems
Mapping the human genome
Image processing
• To achieve the required level of high performance, it is necessary to utilize
the fastest and most reliable hardware and apply innovative procedures from
vector and parallel processing techniques.
Vector Operation
• Many scientific problems require arithmetic operations on large arrays of
numbers.
• A vector is an ordered set of a one-dimensional array of data items.
• A vector V of length n is represented as a row vector by V = [V1, V2, ..., Vn].
• It may also be represented as a column vector if the data items are listed in a column.
Fig: Vector instruction format
• A computer capable of vector processing eliminates the overhead associated
with the time it takes to fetch and execute the instructions in the program.
Matrix multiplication
• One of the most computational intensive operations performed in
computers with vector processors.
• The multiplication of two n × n matrices consists of n² inner products
or n³ multiply-add operations.
• An n x m matrix of numbers has n rows (vectors) and m columns
(vectors). Consider, for example, the multiplication of two 3 x 3
matrices A and B.
𝑎11 𝑎12 𝑎13 𝑏11 𝑏12 𝑏13 𝑐11 𝑐12 𝑐13
𝑎21 𝑎22 𝑎23 × 𝑏21 𝑏22 𝑏23 = 𝑐21 𝑐22 𝑐23
𝑎31 𝑎32 𝑎33 𝑏31 𝑏32 𝑏33 𝑐31 𝑐32 𝑐33
• In general, the inner product consists of the sum of k product terms of the form
C= A1B1+A2B2+A3B3+…+AkBk.
In a typical application k may be equal to 100 or even 1000.
The inner product calculation on a pipeline vector processor is shown below:
Fig: Pipeline for calculating an inner product
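One common arrangement (assumed here, since the figure itself is not reproduced) is to accumulate the inner product as several interleaved partial sums, matching the number of adder-pipeline segments, and to combine them at the end:

```python
# Inner product C = A1*B1 + A2*B2 + ... + Ak*Bk, accumulated as four interleaved
# partial sums, mimicking a 4-segment pipelined adder that keeps four sums in flight.
def inner_product(A, B, segments=4):
    partial = [0.0] * segments
    for i, (a, b) in enumerate(zip(A, B)):
        partial[i % segments] += a * b      # products flow into the adder round-robin
    return sum(partial)                     # final additions combine the partial sums

A = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
B = [0.5] * 8
print(inner_product(A, B))                  # 18.0
```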
Memory Interleaving
• Pipeline and vector processors often require simultaneous access to memory from
two or more sources.
An instruction pipeline may require the fetching of an instruction and an operand at the
same time from two different segments.
An arithmetic pipeline usually requires two or more operands to enter the
pipeline at the same time.
• Instead of using two memory buses for simultaneous access, the memory can be
partitioned into a number of modules connected to common address and data buses.
A memory module is a memory array together with its own address and data registers.
• The advantage of a modular memory is that it allows the use of a technique called
interleaving.
Fig: Multiple module memory organization
• In an interleaved memory, different sets of addresses are assigned to different
memory modules.
• By staggering the memory access, the effective memory cycle time can be reduced
by a factor close to the number of modules.
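A sketch of low-order interleaving, where the least significant address bits select the module; the module count of four is an assumption:

```python
# Low-order interleaving across 4 memory modules:
# module number = address mod 4, word within the module = address div 4.
NUM_MODULES = 4

def locate(address):
    module = address % NUM_MODULES          # low-order bits pick the module
    offset = address // NUM_MODULES         # remaining bits address within the module
    return module, offset

for addr in range(8):
    module, offset = locate(addr)
    print(f"address {addr} -> module {module}, offset {offset}")
# Consecutive addresses hit different modules, so a pipeline can start a new
# memory access every cycle while earlier modules are still busy.
```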
Super Computers
• A commercial computer with vector instructions and pipelined
floating-point arithmetic operations is referred to as a supercomputer.
• To speed up the operation, the components are packed tightly together
to minimize the distance that the electronic signals have to travel.
• This is augmented by instructions that process vectors and
combinations of scalars and vectors.
• A computer system best known for its high computational speed, fast
and large memory systems, and the extensive use of parallel
processing.
• It is equipped with multiple functional units and each unit has its own
pipeline configuration.
• It is specifically optimized for the type of numerical calculations involving
vectors and matrices of floating-point numbers.
• They are limited in their use to a number of scientific applications, such as
numerical weather forecasting, seismic wave analysis, and space research.
• A measure used to evaluate computers in their ability to perform a given number
of floating-point operations per second is referred to as flops.
• A typical supercomputer has a basic cycle time of 4 to 20 ns.
• The examples of supercomputer:
• Cray-1: It uses vector processing with 12 distinct functional units in parallel;
a large number of registers (over 150); multiprocessor configuration (Cray X-
MP and Cray Y-MP)
• Fujitsu VP-200: 83 vector instructions and 195 scalar instructions; 300
megaflops
Array Processing
• An array processor is a processor that performs computations on large
arrays of data.
• The term is used to refer to two different types of processors.
• Attached array processor:
• Is an auxiliary processor.
• It is intended to improve the performance of the host computer in specific
numerical computation tasks.
• SIMD array processor:
• Has a single-instruction multiple-data organization.
• It manipulates vector instructions by means of multiple functional units
responding to a common instruction.
Attached Array Processor
• Its purpose is to enhance the performance of the computer by
providing vector processing for complex scientific applications.
• For example, when the FPS-164/MAX (an attached array processor for the VAX
system, from Floating Point Systems) is attached to a VAX 11 computer, it
increases the computing power of the VAX to 100 megaflops.
• The objective of the attached array processor is to provide vector
manipulation capabilities to a conventional computer at a fraction of
the cost of a supercomputer.
Fig: Attached array processor with host computer
megaFLOPS (millions of floating-point operations per second): a measure of computer performance,
useful in fields of scientific computation that require floating-point calculations.
For such cases it is a more accurate measure than instructions per second.
SIMD Array Processor
• An SIMD array processor is a computer with multiple processing units
operating in parallel.
• Contains a set of identical processing elements (PEs), each having a local memory
M.
• Each PE includes an ALU, a floating-point arithmetic unit, and working registers.
• Vector instructions are broadcast to all PEs simultaneously.
• Masking schemes are used to control the status of each PE during the execution of
vector instructions (a small sketch appears at the end of this section).
• Each PE has a flag that is set when the PE is active and reset when the PE is
inactive.
• For example, the ILLIAC IV computer developed at the University of Illinois and
manufactured by the Burroughs Corp.
• SIMD array processors are highly specialized computers.
• They are suited primarily for numerical problems that can be expressed in vector
or matrix form.
Fig: SIMD array processor organization
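A sketch of the masking idea described above: a vector instruction is broadcast to all PEs, but only PEs whose flag is set update their result. NumPy and the sample data are assumptions used only for illustration:

```python
import numpy as np

# Local memories of 8 processing elements (one data element per PE, for simplicity).
data = np.array([3.0, -1.0, 4.0, -1.5, 9.0, -2.0, 6.0, 5.0])

# Mask: PE flag set (True) = active, reset (False) = inactive.
active = data > 0                      # e.g. only PEs holding positive values participate

# Broadcast a single vector instruction ("multiply by 2") to all PEs;
# only the active PEs update their local result.
result = np.where(active, data * 2, data)
print(result)                          # [ 6.  -1.   8.  -1.5 18.  -2.  12.  10. ]
```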