A Common Bus Is Used For Data As Well As Instructions. The System Can Become Bus Bound
[Figure: state diagrams for instruction execution over the common bus. States shown: fetch op code (ROM), fetch variable (RAM), fetch constant (ROM). Signals shown: address from PC (request instruction, receive instruction), ROM address, ROM data (instruction, constant), RAM address, RAM data.]
[Figure: four resource usage charts, each on a time axis 0 to 10 with rows ROM, RAM and ALU. Entries are operation numbers.]
Chart 1: ROM 0 0; RAM 0 0; ALU 0.
Chart 2: ROM 0 0 1 1 2 2; RAM 0 0 1 1 2 2; ALU 0 1 2.
Chart 3: ROM 0 0 1 1 2 2; RAM 0 0 1 1 2 2; ALU 0 1 2.
Chart 4: ROM 0 1 0 1 2 3 2 3 4 5 4; RAM 0 1 0 1 2 3 2 3 4 5; ALU 0 1 2 3 4.
In this example, the row vector for ROM is {0,1}, for RAM is
{1,3} and for ALU is {2}.
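As a rough illustration of how such row vectors can be read off a usage chart, here is a minimal Python sketch. The chart layout (one list per resource, with None for an idle slot) and the function name row_vectors are assumptions made for this example; the slot positions are chosen only so that the result reproduces the vectors quoted above.

```python
def row_vectors(chart, op=0):
    """Return, for each resource, the time slots in which operation `op` uses it."""
    return {
        resource: [t for t, who in enumerate(slots) if who == op]
        for resource, slots in chart.items()
    }

# Hypothetical chart: index = time slot, value = operation number (None = idle).
chart = {
    "ROM": [0, 0, None, None],
    "RAM": [None, 0, None, 0],
    "ALU": [None, None, 0, None],
}

vectors = row_vectors(chart)
print(vectors)  # {'ROM': [0, 1], 'RAM': [1, 3], 'ALU': [2]}
```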
[Figure: timeline marking the vector D = {0,2,5,6}.]

If R(i) < D(i):
Add D(i) - R(i) delays to all members of R at position i and beyond.

Example, on a time axis 0 to 8:
R = {1,3,4,6}, D = {1,3,6,7}. Here R3 = 4 < D3 = 6, so 2 delays are added to R3 and R4.
Result: R = {1,3,6,8}, D = {1,3,6,7}.
Periodicity p = 2

If D(i) < R(i):
1 Add sufficient multiples of p to D(i) such that it is ≥ R(i).
2 Add the same number to members of D beyond i.
3 Now if R(i) < D(i), add D(i) - R(i) delays to all members of R at position i and beyond.

For example, p = 2, R = {1,3,4,6}, D = {1,2,5,6}.
Now D2 < R2. Break here and move D2 and beyond forward by p (= 2) steps: R = {1,3,4,6}, D = {1,4,7,8}.
Now align R: R2 < D2, so move R2 and beyond forward by 1. R = {1,4,5,7}, D = {1,4,7,8}.
Now R3 < D3. Move R3 and beyond forward by 2. So R = {1,4,7,9} and D = {1,4,7,8}.
D4 < R4. Move D4 forward by 2 to 10.
Now R4 < D4. Move R4 forward by 1 to 10.
Vectors are now aligned at 1,4,7,10.
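The alignment rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name align is hypothetical, both vectors are assumed to have the same length, D is assumed to move only in whole multiples of p, and R is assumed to accept arbitrary delays, as in the rules above.

```python
def align(r, d, p):
    """Align vectors r and d position by position.

    d may only be pushed forward in multiples of p;
    r may be delayed by any number of steps.
    """
    r, d = list(r), list(d)
    for i in range(len(r)):
        if d[i] < r[i]:
            # Smallest k with d[i] + k*p >= r[i] (ceiling division).
            k = -(-(r[i] - d[i]) // p)
            for j in range(i, len(d)):
                d[j] += k * p
        if r[i] < d[i]:
            # Delay r[i] and everything after it to catch up with d[i].
            shift = d[i] - r[i]
            for j in range(i, len(r)):
                r[j] += shift
    return r, d

aligned_r, aligned_d = align([1, 3, 4, 6], [1, 2, 5, 6], p=2)
print(aligned_r, aligned_d)  # [1, 4, 7, 10] [1, 4, 7, 10]
```

Running it on the example with R = {1,3,4,6}, D = {1,2,5,6} and p = 2 passes through the same intermediate vectors as the worked steps and ends with both vectors at 1,4,7,10.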
Since the ROM and the RAM are used for 2 cycles each in
every operation, MASP = 2.
However, as we saw earlier, ASP = 3 in this case.
Therefore, the schedule needs improvement.
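The comparison can be written out as a small Python sketch. It assumes, following the sentence above, that MASP is the largest number of cycles any single resource is busy per operation; the function name masp and the dictionary layout are hypothetical, and the ASP value is simply the figure quoted in the text.

```python
def masp(row_vectors):
    """Largest number of cycles any one resource is used per operation."""
    return max(len(slots) for slots in row_vectors.values())

# Row vectors from the example: ROM and RAM are each used for 2 cycles.
vectors = {"ROM": [0, 1], "RAM": [1, 3], "ALU": [2]}

bound = masp(vectors)          # 2
asp = 3                        # value quoted in the text for this schedule
needs_improvement = asp > bound
print(bound, needs_improvement)  # 2 True
```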
[Figure: resource usage chart on a time axis 0 to 3. ROM 0 0; RAM 0 0; ALU 0.]
So no alignment is required.
One can trade off power for speed when designing the
ALU.
By using optimization techniques, we are able to reach a
higher throughput, even with a slower ALU!