Unassessed Tutorial Exercise 2: Assessing A Vector Processing Enhancement
for i := 1 to 64 do
    LV V2,X(R1)      ; load row of X (R1 is i*N, where N is row length)
    MULTSV V3,a,V2   ; multiply each element of the row vector by scalar 'a'
    LV V1,Y(R2)      ; load k'th row of Y (R2 is k*N)
    ADDV V4,V3,V1    ; add Y[i] and a*X[i]
    SV V4,Y(R3)      ; store the vector into array Y
    R1 := R1+N
    R2 := R2+N
end do
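As a rough scalar reference for what the vector loop above computes, the following Python sketch applies the same a*X + Y update one row at a time. The names axpy_rows, X, Y and a are illustrative only, and the sketch assumes that each result row corresponds to the row of Y that was loaded, which the listing does not state explicitly.

```python
def axpy_rows(a, X, Y):
    """Return a new matrix whose row i is a*X[i] + Y[i].

    Each inner list comprehension stands in for one MULTSV/ADDV pair
    operating on a whole vector register.
    """
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same number of rows")
    result = []
    for x_row, y_row in zip(X, Y):
        if len(x_row) != len(y_row):
            raise ValueError("matching rows must have the same length")
        # Element-wise scale-and-add, done in one step by the vector unit.
        result.append([a * x + y for x, y in zip(x_row, y_row)])
    return result


if __name__ == "__main__":
    print(axpy_rows(2, [[1, 2], [3, 4]], [[10, 20], [30, 40]]))
```

A vector machine performs the inner element-wise work with a single instruction per row, which is where the speedup over scalar execution comes from.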
The classic vector pipeline machine is the Cray 1, and there are many later designs from Cray and others (e.g. the NEC
SX-6 used in the Earth Simulator). The main advantages of vector registers and vector instructions are that pipelined
floating point arithmetic can be organised very efficiently, and the maximum throughput of a highly-interleaved memory
system can be exploited.
We will study vector processors in more detail shortly. No further details are needed for this exercise.
This exercise concerns Amdahl's Law, which is discussed at length in the textbook:
Assume that we are considering accelerating a machine by adding a vector mode to it. When a computation is run in
vector mode, it is 20 times faster than the normal mode of execution. We call the percentage of time that could be spent
using vector mode the percentage of vectorisation.
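The relationship between the percentage of vectorisation and the overall speedup can be sketched in a few lines of Python. This is only an illustration of Amdahl's Law under the definitions above; the function name amdahl_speedup and its parameter names are chosen here and do not come from the exercise.

```python
def amdahl_speedup(fraction_enhanced, enhancement_factor):
    """Overall speedup by Amdahl's Law.

    fraction_enhanced: share of the ORIGINAL execution time that can use
        the enhancement (the percentage of vectorisation divided by 100).
    enhancement_factor: how much faster the enhanced mode runs
        (20 for the vector mode in this exercise).
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie between 0 and 1")
    if enhancement_factor <= 0:
        raise ValueError("enhancement_factor must be positive")
    unenhanced_time = 1.0 - fraction_enhanced
    enhanced_time = fraction_enhanced / enhancement_factor
    return 1.0 / (unenhanced_time + enhanced_time)


if __name__ == "__main__":
    VECTOR_FACTOR = 20  # vector mode is 20 times faster than normal mode
    for percent in (0, 25, 50, 75, 90, 100):
        speedup = amdahl_speedup(percent / 100, VECTOR_FACTOR)
        print(f"{percent:3d}% vectorisation: speedup {speedup:.2f}")
```

Tabulating a few percentages this way shows how strongly the unvectorised remainder limits the achievable speedup, even with a 20-fold faster vector mode.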
(Each part should take only a few minutes; you do not really need a calculator to uncover the basic point of this exercise:
back-of-the-envelope estimates are quite adequate.)
So
2. With the original vector mode design, 70% vectorisation yields a net speedup of 1/(0.3 + 0.7/20) ≈ 2.99. Increasing
To achieve this net speedup by improving the compiler, we must increase the percentage of vectorisation. We have
Exercise 1.2
1. To compute the speedup obtained from the fast mode we must work out the execution time without the
enhancement. We know that the accelerated execution time consisted of two halves: the unaccelerated phase
(50%) and the accelerated phase (50%).
Without the enhancement, the unaccelerated phase would have taken just as long (50%), but the accelerated phase
would take 10 times as long, i.e. 500%. So the relative execution time without the enhancement would be
50% + 500% = 550%.
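The arithmetic in the paragraph above can be checked with a short Python sketch. It assumes, as the text does, that the 50% share is measured on the accelerated run and that the fast mode is 10 times faster; the function name time_without_enhancement is hypothetical.

```python
def time_without_enhancement(accelerated_share, factor):
    """Execution time without the enhancement, relative to the enhanced run.

    accelerated_share: share of the ENHANCED execution time spent in the
        fast mode (0.5 in this exercise).
    factor: how much faster the fast mode is than normal execution (10 here).
    The enhanced run is taken as 1.0 (100%), so the returned value is also
    the overall speedup delivered by the enhancement.
    """
    if not 0.0 <= accelerated_share <= 1.0:
        raise ValueError("accelerated_share must lie between 0 and 1")
    unaccelerated_share = 1.0 - accelerated_share
    # The unaccelerated phase is unchanged; the accelerated phase is
    # stretched by the enhancement factor.
    return unaccelerated_share + accelerated_share * factor


if __name__ == "__main__":
    relative_time = time_without_enhancement(0.5, 10)
    print(f"time without enhancement: {relative_time:.0%} of the enhanced time")
```

Because the enhanced run is normalised to 100%, the relative time without the enhancement is also the speedup that the fast mode delivers.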