Eecs112 hw1
Homework 1
Due: April 15, 2022, 11:59PM on Canvas
1. (30 points) Consider the following two processors P1 and P2 and a given task T.
P1: clock rate of 4 GHz, average CPI of 0.75, and requires 5 × 10^9 instructions to run T.
P2: clock rate of 3 GHz, average CPI of 0.9, and requires 1 × 10^9 instructions to run T.
(a) (10 points) One common fallacy is to consider the computer with the highest clock rate
as having the highest performance. Check if this is true for P1 and P2.
(b) (10 points) Another fallacy is to consider that the processor executing the largest number
of instructions will need more CPU time. Assume that processor P1 executes a sequence
of 10^9 instructions and that the CPI values of processors P1 and P2 do not change when
executing these instructions. Determine the number of instructions that P2 can execute
in the same time that P1 needs to execute 10^9 instructions.
(c) (10 points) Another common performance measure is GFLOPS (giga (10^9) floating-point
operations per second). Assume that 40% of the instructions executed on both P1 and
P2 are floating-point instructions. Calculate and compare the GFLOPS measure for both
processors. Does higher GFLOPS mean better overall performance?
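The quantities in Problem 1 all follow from the relation CPU time = instruction count × CPI / clock rate. A minimal sketch of that relation is given below; the function names are hypothetical helpers introduced for illustration, and the GFLOPS helper assumes one floating-point operation per floating-point instruction, which the problem does not state explicitly.

```python
def cpu_time(instruction_count, cpi, clock_hz):
    """CPU time in seconds: instructions x cycles per instruction / cycles per second."""
    return instruction_count * cpi / clock_hz


def instructions_in(time_s, cpi, clock_hz):
    """Number of instructions completed in time_s seconds at a given CPI and clock rate."""
    return time_s * clock_hz / cpi


def gflops(instruction_count, fp_fraction, time_s):
    """Giga floating-point operations per second.

    Assumption: each floating-point instruction performs exactly one operation.
    """
    return instruction_count * fp_fraction / time_s / 1e9


# Values taken from the problem statement.
t1 = cpu_time(5e9, 0.75, 4e9)  # P1 running task T
t2 = cpu_time(1e9, 0.9, 3e9)   # P2 running task T
print(t1, t2)
```

Comparing t1 and t2 directly, rather than the clock rates, is what part (a) asks for.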
2. (30 points) Assume we have a processor P1 running at 2 GHz clock rate. Assume a program
requires the execution of 50 × 10^6 FP (Floating Point) instructions, 80 × 10^6 INT (integer)
instructions, 110 × 10^6 L/S (Load/Store) instructions, and 16 × 10^6 branch instructions. The
CPI for each type of instruction is 1, 1, 4, 2, respectively.
(a) (5 points) What is the average CPI for this program on this processor P1? What is the
CPU time for P1 to run this program?
(b) (5 points) By how much must we improve the CPI of FP (Floating Point) instructions if
we want the program to run two times faster?
(c) (5 points) By how much must we improve the CPI of L/S (Load/Store) instructions if we
want the program to run two times faster?
(d) (5 points) By how much is the execution time of the program improved if the CPIs of INT
(integer) and FP (Floating Point) instructions are reduced by 40% and the CPIs of L/S
(Load/Store) and branch instructions are reduced by 30%?
(e) (10 points) Assume we have another processor P2 that has the same ISA as P1, running at
2.5 GHz. The CPI for each type of instruction (FP, INT, L/S, branch) is 3, 3, 3, 2,
respectively. Which processor runs this program faster? What is its speedup over the
other processor?
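For Problem 2, the average CPI is the total cycle count divided by the total instruction count, and CPU time is total cycles divided by the clock rate. The sketch below encodes the instruction mix from the problem statement; the dictionary layout and function names are hypothetical choices made for illustration.

```python
def total_cycles(counts, cpis):
    """Sum of (instruction count x CPI) over all instruction classes."""
    return sum(counts[kind] * cpis[kind] for kind in counts)


def average_cpi(counts, cpis):
    """Weighted-average CPI: total cycles / total instructions."""
    return total_cycles(counts, cpis) / sum(counts.values())


def cpu_time(counts, cpis, clock_hz):
    """CPU time in seconds: total cycles / clock rate."""
    return total_cycles(counts, cpis) / clock_hz


# Instruction mix and per-class CPIs from the problem statement.
counts = {"FP": 50e6, "INT": 80e6, "L/S": 110e6, "branch": 16e6}
cpi_p1 = {"FP": 1, "INT": 1, "L/S": 4, "branch": 2}
cpi_p2 = {"FP": 3, "INT": 3, "L/S": 3, "branch": 2}

print(average_cpi(counts, cpi_p1), cpu_time(counts, cpi_p1, 2e9))
print(cpu_time(counts, cpi_p2, 2.5e9))
```

Parts (b) through (d) can be explored by editing a copy of cpi_p1 and recomputing cpu_time.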
3. (10 points) Assume that we are considering enhancing a machine by adding a vector mode to
it. When a computation is performed in vector mode, it is 20 times faster than the normal
mode of execution. We call the percentage of time that the machine spends using vector mode
the percentage of vectorization.
(a) (5 points) What percentage of vectorization is needed to achieve one-half of the maximum
speedup attainable from using vector mode?
(b) (5 points) Suppose the percentage of vectorization for a program is 60%. The hardware
design group could double the speed of vector mode with a significant additional engineering
investment. The compiler crew could increase the use of vector mode as another approach
to increasing performance. How much of an increase in the percentage of vectorization
(relative to current usage) would the compiler team need to obtain the same performance
gain?
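Problem 3 is an application of Amdahl's law. The sketch below assumes the percentage of vectorization f is measured as a fraction of the original (non-vectorized) execution time; that reading is an assumption, and the function names are hypothetical.

```python
def speedup(f, k):
    """Amdahl's law: overall speedup when a fraction f of the original
    execution time is accelerated by a factor k."""
    return 1.0 / ((1.0 - f) + f / k)


def fraction_for(target_speedup, k):
    """Invert Amdahl's law: fraction f needed to reach target_speedup
    when the accelerated mode is k times faster."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / k)


# Vector mode is 20 times faster than normal mode (from the problem statement).
print(speedup(0.6, 20))      # current overall speedup at 60% vectorization
print(fraction_for(10, 20))  # fraction needed for half of the maximum speedup of 20
```

For part (b), comparing speedup(0.6, 40) with fraction_for applied to that value at k = 20 gives the two alternatives on equal footing.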
4. (20 points) Assume we have a processor design that runs at a clock rate of 3.6 GHz and a
voltage of 1.25 V. On average, it consumes 10 W of static power and 90 W of dynamic power.
(a) (10 points) We are trying to improve the performance of this processor by increasing the
clock rate to 5 GHz. To avoid overheating the processor, we reduce the voltage so that the
dynamic power stays the same. What is the new voltage? (Assume that the capacitive
load of the processor does not change.)
(b) (10 points) If the total dissipated power is to be reduced by 10%, how much should the
voltage be reduced to maintain the same leakage current? (Note: power is defined as the
product of voltage and current)
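Part (a) relies on the standard dynamic power relation, P_dynamic ∝ C × V^2 × f. With the capacitive load C fixed and dynamic power held constant, V^2 × f must stay constant. A minimal sketch under that assumption follows; the function name is hypothetical, and part (b) is not covered here.

```python
import math


def voltage_for_same_dynamic_power(v_old, f_old, f_new):
    """New supply voltage that keeps C * V^2 * f unchanged when the clock
    rate moves from f_old to f_new (capacitive load assumed constant)."""
    return v_old * math.sqrt(f_old / f_new)


# 1.25 V at 3.6 GHz, scaled to 5 GHz (values from the problem statement).
v_new = voltage_for_same_dynamic_power(1.25, 3.6e9, 5e9)
print(v_new)
```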
5. (10 points) The result of the SPEC CPU2006 bzip2 benchmark running on an AMD Barcelona
has an instruction count of 2.389 × 10^12, an execution time of 750 s, and a reference time of 9650 s.
(a) (5 points) Find the CPI if the clock cycle time is 0.333 ns.
(b) (5 points) Find the SPECratio.
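Both parts of Problem 5 are direct ratios: CPI = execution time / (instruction count × clock cycle time), and SPECratio = reference time / execution time. A short sketch with hypothetical function names:

```python
def cpi(execution_time_s, instruction_count, cycle_time_s):
    """Cycles per instruction: total cycles (time / cycle time) per instruction."""
    return execution_time_s / (instruction_count * cycle_time_s)


def spec_ratio(reference_time_s, execution_time_s):
    """SPECratio: reference time divided by measured execution time."""
    return reference_time_s / execution_time_s


# Values from the problem statement (cycle time of 0.333 ns).
print(cpi(750, 2.389e12, 0.333e-9))
print(spec_ratio(9650, 750))
```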