

CPS3340 - Computer Architecture

Assignment 1: Computer Performance

Solution
1. Conversion of number representations. Show the conversion procedure.
a. Convert the following decimal numbers to binary, octal, and hexadecimal. (10 pts)
174, 3781
Answer: $174 = 10101110_2 = 256_8 = \text{AE}_{16}$
Answer: $3781 = 111011000101_2 = 7305_8 = \text{EC5}_{16}$
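The question asks for the conversion procedure, which the answers above do not spell out. One common procedure is repeated division by the target base, collecting remainders from least to most significant digit. The sketch below illustrates that approach; the function name `to_base` is a hypothetical helper, not part of the assignment.

```python
def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer to a digit string by repeated division."""
    digits = "0123456789ABCDEF"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        # The remainder is the next digit, least significant first.
        n, remainder = divmod(n, base)
        out.append(digits[remainder])
    # Remainders were collected in reverse order, so flip them.
    return "".join(reversed(out))


for value in (174, 3781):
    print(value, to_base(value, 2), to_base(value, 8), to_base(value, 16))
```

For 174 the divisions by 16 give remainders 14 (E) and then 10 (A), which read in reverse as AE.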
b. Convert the following numbers to decimal. (10 pts)
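$10001011_2$, $234_8$, $\text{FEA}_{16}$
$10001011_2 = 1 \times 2^7 + 1 \times 2^3 + 1 \times 2^1 + 1 \times 2^0 = 139$
$234_8 = 2 \times 8^2 + 3 \times 8^1 + 4 \times 8^0 = 156$
$\text{FEA}_{16} = 15 \times 16^2 + 14 \times 16^1 + 10 \times 16^0 = 4074$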
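The positional expansion used in these three conversions can be sketched in Python as follows. The helper name `from_base` is an assumption made for illustration; the built-in `int(text, base)` would give the same results.

```python
def from_base(text: str, base: int) -> int:
    """Sum digit * base**position, with position 0 at the rightmost digit."""
    digits = "0123456789ABCDEF"
    total = 0
    for position, ch in enumerate(reversed(text.upper())):
        total += digits.index(ch) * base ** position
    return total


print(from_base("10001011", 2))  # binary input
print(from_base("234", 8))       # octal input
print(from_base("FEA", 16))      # hexadecimal input
```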
2. Consider the following two processors P1 and P2 executing the same instruction set with the
clock rates and CPIs specified in the following table.
Processor     P1       P2
Clock rate    2 GHz    3 GHz
CPI           1.0      2.5
a) Which processor has the higher performance? How much faster is it than the other
processor? (10 pts)
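P1: Cycle time = $1/(2\,\text{GHz}) = 500\,\text{ps}$; CPU time = $\text{IC} \times 1.0 \times 500\,\text{ps} = 500\,\text{ps} \times \text{IC}$
P2: Cycle time = $1/(3\,\text{GHz}) \approx 333\,\text{ps}$; CPU time = $\text{IC} \times 2.5 \times 333\,\text{ps} = 832.5\,\text{ps} \times \text{IC}$
P1 is faster.
How much faster?
Performance is the inverse of CPU time, so
$\text{Performance}(P1)/\text{Performance}(P2) = \text{CPU time}(P2)/\text{CPU time}(P1) = 832.5/500 = 1.665$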
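As a cross-check, the same comparison can be computed without rounding the P2 cycle time to 333 ps. This is a minimal sketch; `time_per_instruction` is a hypothetical helper name. It relies on CPU time = IC × CPI / clock rate, so the instruction count cancels in the ratio.

```python
def time_per_instruction(clock_hz: float, cpi: float) -> float:
    """Average seconds per instruction: CPI divided by clock rate."""
    return cpi / clock_hz


p1 = time_per_instruction(2e9, 1.0)  # 500 ps per instruction
p2 = time_per_instruction(3e9, 2.5)  # about 833.3 ps per instruction

# Performance ratio P1/P2 equals the CPU time ratio P2/P1.
ratio = p2 / p1
print(f"P1 is {ratio:.3f} times as fast as P2")
```

With the unrounded cycle time the ratio is 5/3, about 1.667; the value 1.665 above comes from using 333 ps.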
b) If the processors each execute a program in 100 seconds, find the number of cycles and
the number of instructions for each processor. (10 pts)
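Number of cycles (execution time / cycle time):
P1: $100\,\text{s} / (500 \times 10^{-12}\,\text{s}) = 2 \times 10^{11}$
P2: $100\,\text{s} / (333 \times 10^{-12}\,\text{s}) \approx 3 \times 10^{11}$

Number of instructions (cycles / CPI):
P1: $2 \times 10^{11} / 1.0 = 2 \times 10^{11}$
P2: $3 \times 10^{11} / 2.5 = 1.2 \times 10^{11}$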
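The two steps (cycles from execution time, then instructions from cycles) can be sketched as below. The function name is illustrative only; multiplying by the clock rate is equivalent to dividing by the cycle time.

```python
def cycles_and_instructions(exec_time_s: float, clock_hz: float, cpi: float):
    """Return (clock cycles, instruction count) for a given run."""
    cycles = exec_time_s * clock_hz   # same as exec_time / cycle_time
    instructions = cycles / cpi
    return cycles, instructions


for name, clock_hz, cpi in (("P1", 2e9, 1.0), ("P2", 3e9, 2.5)):
    cycles, instructions = cycles_and_instructions(100.0, clock_hz, cpi)
    print(f"{name}: {cycles:.2e} cycles, {instructions:.2e} instructions")
```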
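c) For processor P2, we are trying to reduce the execution time by 40%, but this leads to a
20% increase in CPI. What clock rate is needed to achieve this time reduction? (10 pts)
New CPU time = $\text{IC} \times 832.5\,\text{ps} \times (1 - 40\%) = \text{IC} \times \text{Cycle time} \times (1.2 \times 2.5)$
Solving this equation for the cycle time:
Cycle time = $832.5\,\text{ps} \times 0.6 / 3.0 = 166.5\,\text{ps}$
Clock rate = $1/166.5\,\text{ps} \approx 6.006\,\text{GHz}$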
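The same result follows from scaling the original clock rate directly. Since CPU time = IC × CPI / clock rate and IC is unchanged, the new clock rate is the old one multiplied by the CPI scale factor and divided by the time scale factor. The sketch below assumes that relationship; the function and parameter names are hypothetical.

```python
def required_clock_hz(old_clock_hz: float, cpi_scale: float, time_scale: float) -> float:
    """Clock rate needed when CPI is scaled by cpi_scale and the
    execution time must be scaled by time_scale (same instruction count)."""
    return old_clock_hz * cpi_scale / time_scale


# P2: 3 GHz, CPI rises by 20% (x1.2), time must drop by 40% (x0.6).
new_clock = required_clock_hz(3e9, 1.2, 0.6)
print(f"{new_clock / 1e9:.3f} GHz")
```

This gives 6 GHz; the 6.006 GHz above reflects the 333 ps rounding carried through from part a).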
3. Consider two different implementations of the same ISA. There are four classes of
instructions, Arithmetic, Store, Load, and Branch. The clock rate and CPI of each
implementation are given in the following table.
Implementation    Clock rate    CPI Arithmetic    CPI Store    CPI Load    CPI Branch
P1                2.0 GHz       1                 2            3           4
P2                2.5 GHz       2                 2            2           2
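a) Given a program with $10^6$ instructions divided into classes as follows: 10% Arithmetic,
20% Store, 50% Load, and 20% Branch, which implementation is faster? (10 pts)
P1: Cycle time = $1/(2.0 \times 10^9\,\text{Hz}) = 500\,\text{ps}$
$10^6 \times (1 \times 10\% + 2 \times 20\% + 3 \times 50\% + 4 \times 20\%) \times 500 \times 10^{-12}\,\text{s} = 1.4 \times 10^{-3}\,\text{s}$
P2: Cycle time = $1/(2.5 \times 10^9\,\text{Hz}) = 400\,\text{ps}$
$10^6 \times (2 \times 10\% + 2 \times 20\% + 2 \times 50\% + 2 \times 20\%) \times 400 \times 10^{-12}\,\text{s} = 0.8 \times 10^{-3}\,\text{s}$
P2 is faster.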
b) What is the global CPI for each implementation? (10 pts)
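P1:
total number of clock cycles: $10^6 \times (1 \times 10\% + 2 \times 20\% + 3 \times 50\% + 4 \times 20\%) = 2.8 \times 10^6$
total number of instructions: $10^6$
global CPI = $2.8 \times 10^6 / 10^6 = 2.8$
P2:
total number of clock cycles: $10^6 \times (2 \times 10\% + 2 \times 20\% + 2 \times 50\% + 2 \times 20\%) = 2.0 \times 10^6$
total number of instructions: $10^6$
global CPI = $2.0 \times 10^6 / 10^6 = 2.0$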
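Parts a) and b) both rest on the weighted average of the per-class CPIs. The sketch below computes the global CPI and the resulting CPU time for each implementation; the dictionary layout and function names are illustrative choices, not part of the assignment.

```python
MIX = {"arithmetic": 0.10, "store": 0.20, "load": 0.50, "branch": 0.20}

IMPLEMENTATIONS = {
    "P1": {"clock_hz": 2.0e9,
           "cpi": {"arithmetic": 1, "store": 2, "load": 3, "branch": 4}},
    "P2": {"clock_hz": 2.5e9,
           "cpi": {"arithmetic": 2, "store": 2, "load": 2, "branch": 2}},
}


def global_cpi(cpi_by_class: dict, mix: dict) -> float:
    """Average CPI weighted by each class's share of the instructions."""
    return sum(mix[c] * cpi_by_class[c] for c in mix)


def cpu_time(instruction_count: float, cpi: float, clock_hz: float) -> float:
    """CPU time in seconds: IC * CPI / clock rate."""
    return instruction_count * cpi / clock_hz


for name, impl in IMPLEMENTATIONS.items():
    cpi = global_cpi(impl["cpi"], MIX)
    seconds = cpu_time(1e6, cpi, impl["clock_hz"])
    print(f"{name}: global CPI = {cpi:.1f}, CPU time = {seconds * 1e3:.1f} ms")
```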
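c) For P1, if we can improve the performance of the Branch instructions by halving their CPI
with a branch predictor, what is the speedup of the program? (10 pts)
Speedup = old CPU time / new CPU time
$= \dfrac{10^6 \times (1 \times 10\% + 2 \times 20\% + 3 \times 50\% + 4 \times 20\%) \times 500 \times 10^{-12}\,\text{s}}{10^6 \times (1 \times 10\% + 2 \times 20\% + 3 \times 50\% + 2 \times 20\%) \times 500 \times 10^{-12}\,\text{s}} = \dfrac{2.8}{2.4} \approx 1.17$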
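The same speedup can be cross-checked with Amdahl's law, which is a different route than the direct time ratio used in the solution. Branch instructions account for 0.8 of the 2.8 cycles per instruction on P1, and that portion is made twice as fast. The function name `amdahl_speedup` is an illustrative assumption.

```python
def amdahl_speedup(enhanced_fraction: float, enhancement_factor: float) -> float:
    """Overall speedup when a fraction of the execution time is sped up."""
    return 1.0 / ((1.0 - enhanced_fraction) + enhanced_fraction / enhancement_factor)


# Fraction of P1's cycles spent in branches: (4 * 20%) out of a CPI of 2.8.
branch_fraction = (4 * 0.20) / 2.8
speedup = amdahl_speedup(branch_fraction, 2.0)
print(f"speedup = {speedup:.2f}")
```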

4. This exercise will explore the impact of compilers on execution time.


a) Consider a program compiled using compilers A and B running on the same processor.
Find out the average CPI for the two executables compiled by compilers A and B given
that the processor has a clock cycle time of 1ns. (10 pts)
              # of Instructions    Program Execution Time
Compiler A    1.00E+09             1 s
Compiler B    1.40E+09             1.6 s
A: $1\,\text{s} / 1\,\text{ns} = 10^9$ cycles
CPI = $10^9$ cycles / $10^9$ instructions = 1.0
B: $1.6\,\text{s} / 1\,\text{ns} = 1.6 \times 10^9$ cycles
CPI = $1.6 \times 10^9$ cycles / $1.4 \times 10^9$ instructions $\approx 1.14$
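A minimal sketch of the calculation above, assuming the measured execution time is entirely CPU time; the helper name `average_cpi` is hypothetical.

```python
def average_cpi(exec_time_s: float, cycle_time_s: float, instructions: float) -> float:
    """Average CPI = total cycles / instruction count."""
    cycles = exec_time_s / cycle_time_s
    return cycles / instructions


CYCLE_TIME_S = 1e-9  # 1 ns clock cycle, as given in the question

cpi_a = average_cpi(1.0, CYCLE_TIME_S, 1.0e9)
cpi_b = average_cpi(1.6, CYCLE_TIME_S, 1.4e9)
print(f"A: CPI = {cpi_a:.2f}, B: CPI = {cpi_b:.2f}")
```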

b) Consider the program compiled by compiler A running on Processor PA and the program
compiled by compiler B running on Processor PB. Assuming the number of instructions
executed in a certain program is divided equally among the classes of instructions of
Arithmetic, Store, Load, and Branch, what is the CPU time of these two executions?
Which program is faster? (The number of instructions of the program compiled by
compilers A and B are given in the previous table) (10 pts)
Processor    Clock rate    CPI Arithmetic    CPI Store    CPI Load    CPI Branch
PA           2.0 GHz       1                 2            3           8
PB           2.0 GHz       2                 2            2           2
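A: $1.0 \times 10^9 \times (25\% \times 1 + 25\% \times 2 + 25\% \times 3 + 25\% \times 8) / (2.0 \times 10^9\,\text{Hz}) = 1.75\,\text{s}$
B: $1.4 \times 10^9 \times (25\% \times 2 + 25\% \times 2 + 25\% \times 2 + 25\% \times 2) / (2.0 \times 10^9\,\text{Hz}) = 1.40\,\text{s}$
B is faster.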
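The sketch below recomputes both CPU times, taking the instruction counts from the table in part a) (1.00E+09 for compiler A and 1.40E+09 for compiler B) and an equal 25% share per instruction class, as the question states. The data layout and names are assumptions made for illustration.

```python
RUNS = {
    # name: (instruction count, clock rate in Hz, CPIs for the four classes)
    "A on PA": (1.0e9, 2.0e9, [1, 2, 3, 8]),
    "B on PB": (1.4e9, 2.0e9, [2, 2, 2, 2]),
}


def cpu_time_equal_mix(instructions: float, clock_hz: float, class_cpis: list) -> float:
    """CPU time when instructions are split equally among the classes."""
    mean_cpi = sum(class_cpis) / len(class_cpis)
    return instructions * mean_cpi / clock_hz


times = {name: cpu_time_equal_mix(*args) for name, args in RUNS.items()}
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
print("faster:", min(times, key=times.get))
```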
