Lectures wk7


Latch-Based Sense Amplifier

[Figure: latch-based sense amplifier: cross-coupled latch on BL/BL' powered from VDD, with equalization device EQ and sense-enable devices SE]

• Initialized in its meta-stable point with EQ
• Once an adequate voltage gap is created, the sense amp is enabled with SE
• Positive feedback quickly forces the output to a stable operating point
114
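The regeneration described above can be sketched with a toy discrete-time model. This is not from the slides; the supply, time constant, and initial gap are illustrative assumptions, but the exponential growth of a small differential until the outputs hit the rails is the essential behavior of the cross-coupled latch.

```python
# Toy model of sense-amp regeneration (illustrative values, not from the slides).
# After equalization leaves the latch at its meta-stable point, a small
# differential dV on the bitlines grows roughly exponentially once SE fires:
#   dV(t) ~ dV0 * exp(t / tau), until the outputs saturate at the rails.
import math

VDD = 2.5        # supply rail (V), assumed
TAU = 50e-12     # regeneration time constant (s), assumed
DV0 = 0.1        # bitline voltage gap when SE is asserted (V), assumed

def regenerate(dv0, t):
    """Differential output voltage t seconds after sense enable."""
    dv = dv0 * math.exp(t / TAU)
    return min(dv, VDD)   # positive feedback saturates at the rails

# Time for the differential to reach a full rail-to-rail swing:
t_full = TAU * math.log(VDD / DV0)
```

The log dependence on the initial gap is why waiting for an adequate voltage gap before asserting SE matters: a larger starting differential resolves faster and more reliably.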
Sense Amplifier
[Figure: sense amplifier on a column: bit/bit’ bitlines, wordline access, isolation transistors clocked by sense clk, regenerative amplifier stage]
115
Clocked Sense Amp

• Clocked sense amp saves power
• Requires asserting sense_clk only after enough bitline swing has developed
• Isolation transistors cut off the large bitline capacitance

[Figure: clocked sense amp: bit/bit_b enter through isolation transistors gated by sense_clk; a regenerative feedback stage drives sense/sense_b]
116
Alpha Differential Amplifier/Latch
[Figure: Alpha differential amplifier/latch: a column decoder drives select signals S0-S3 to steer the chosen column onto mux_out/!mux_out; precharge devices P1/P2; sense-amplifier transistors N1-N5 are enabled when sense goes 0->1 (precharge -> closed); output stage P3/P4 drives sense_out/!sense_out]
117
Sense Amp Waveforms
[Waveforms, 1 ns/div: low-swing bitlines bit/bit’ develop a ~200 mV split after the wordline rises; full-swing outputs BIT/BIT’ reach 2.5 V once sense clk fires; the wordline then falls and the bit lines begin precharging]
118
Sense Amp (Caveat)

• For SRAM, one S/A per 4-8 columns
• Placed after the column mux
• For DRAM, one S/A for every column
• Needed for refresh!
• Need to precharge the S/A before opening the isolation transistors
• To avoid discharging the bit lines
• Requires 3 timing phases
• Typically self-timed

121
Write Driver Circuits

122
Twisted Bitlines
• Sense amplifiers also amplify noise
• Coupling noise is severe in modern processes
• Try to couple equally onto bit and bit_b
• Done by twisting bitlines
[Figure: twisted bitline layout for pairs b0/b0_b, b1/b1_b, b2/b2_b, b3/b3_b]

123
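The cancellation argument above can be checked with a toy calculation. The noise magnitude and coupling split are made-up numbers, not from the slides; the point is that equal coupling onto bit and bit_b leaves the differential signal untouched.

```python
# Toy coupling model (made-up values, not from the slides).
# An aggressor line injects noise onto whichever bitline runs beside it.
# Untwisted: all coupling lands on 'bit', corrupting the differential.
# Twisted: half of each bitline runs beside the aggressor, so bit and
# bit_b pick up equal noise, which cancels in (bit - bit_b).

def differential_noise(noise, frac_on_bit):
    """Noise seen differentially when frac_on_bit of the coupling hits
    'bit' and the remainder hits 'bit_b'."""
    on_bit   = noise * frac_on_bit
    on_bit_b = noise * (1.0 - frac_on_bit)
    return on_bit - on_bit_b

untwisted = differential_noise(0.1, 1.0)   # all noise on one line
twisted   = differential_noise(0.1, 0.5)   # equal split cancels differentially
```

The sense amp only amplifies the difference between the pair, so common-mode noise injected equally on both lines is rejected; twisting converts what would be differential noise into common-mode noise.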
Transposed-Bitline Architecture

[Figure (a): straightforward bitline routing; cross-coupling capacitance Ccross between adjacent bitlines BL’/BL/BL" couples noise onto one input of each sense amp SA]

[Figure (b): transposed bitline architecture; twisting spreads Ccross equally across both inputs of each SA] 124


Review of Memory Hierarchies

[Figure: memory hierarchy: CPU at the top, then Cache (SRAM), Main Memory (DRAM), and Virtual Memory (hard disk); capacity increases and speed decreases moving away from the CPU]
126
Who Cares About the Memory Hierarchy?

Processor-DRAM Memory Gap (latency)

[Figure: performance vs. time, 1980-2000, log scale. µProc/CPU performance ("Moore's Law") improves ~60%/yr, i.e. 2X every 1.5 years; DRAM improves ~9%/yr, i.e. 2X every 10 years. The processor-memory performance gap grows about 50% per year.]

127
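A quick back-of-the-envelope check of the plot's rates shows how large the gap becomes over the 20 years shown. The growth rates come from the slide; the arithmetic is just compounding.

```python
# Compounding the slide's growth rates over the plot's 1980-2000 span.
# CPU performance grows ~60%/yr, DRAM ~9%/yr; their ratio is the
# processor-memory gap, compounding at roughly 1.60/1.09 - 1 ~ 47%/yr.
cpu_rate, dram_rate = 1.60, 1.09
years = 20

cpu  = cpu_rate  ** years   # relative CPU performance after 20 years
dram = dram_rate ** years   # relative DRAM performance after 20 years
gap  = cpu / dram           # processor-memory performance gap
```

Even though both curves improve every year, the ratio between them grows by three orders of magnitude over the span of the plot, which is why the gap dominates system design.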
Cache Memory Motivation

• Processor speeds are increasing much faster than memory speeds


• Memory speed matters
• Each instruction / data needs to be fetched from memory
• Loads, stores are a significant fraction of instructions
• Amdahl’s Law tells us that increasing processor performance without speeding up memory won’t help much overall
• Locality of references
• Locations that are close together in the address space tend to get referenced close together in time (spatial locality)
• Tend to reference the same memory locations over and over again (temporal locality)

128
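The "memory speed matters" point is usually quantified with the standard average memory access time (AMAT) formula. The formula is standard; the hit time, miss rate, and miss penalty below are illustrative numbers, not from the slides.

```python
# Average memory access time (AMAT), the standard figure of merit:
#   AMAT = hit_time + miss_rate * miss_penalty
def amat(hit_time, miss_rate, miss_penalty):
    """All arguments in cycles except miss_rate (a fraction)."""
    return hit_time + miss_rate * miss_penalty

# Illustrative numbers (assumed): 1-cycle cache hit, 5% miss rate,
# 100-cycle DRAM access on a miss.
cycles = amat(1, 0.05, 100)   # 1 + 0.05 * 100 = 6 cycles on average
```

This is Amdahl's Law in miniature: even a 5% miss rate makes the average access 6x slower than a hit, so shaving the miss rate or miss penalty often helps more than a faster core.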
Memory Reference Patterns
[Figure: memory address (one dot per access) vs. time, from a program with bad locality behavior; horizontal bands show temporal locality, sloped runs show spatial locality]

Donald J. Hatfield, Jeanette Gerald: Program Restructuring for Virtual Memory. IBM Systems Journal 10(3): 168-192 (1971)

129
Caches

Caches exploit both types of predictability:

– Exploit temporal locality by remembering the contents of recently accessed locations.

– Exploit spatial locality by fetching blocks of data around recently accessed locations.

130
2/8/2024 CS252-Fall’07 130
Cache Memories
• Relatively small SRAM memories located physically
close to the processor
• SRAMs have low access times
• Physical proximity reduces wire delay

• Similar in concept to virtual memory
• Keep commonly-accessed data in a smaller, faster memory
• Inclusion property
• Any data in a given level of the hierarchy must be contained in all levels below it in the hierarchy
• This means we never have to worry about whether we have space for data we need to evict from one level into a lower level

131
Where can a block be placed ?
• Block 12 placed in 8 block cache:
• Fully associative, direct mapped, 2-way set associative
• Set-associative mapping: set = block number mod (number of sets)

Fully associative: block 12 can go anywhere
Direct mapped: block 12 can go only into block 4 (12 mod 8)
2-way set associative: block 12 can go anywhere in set 0 (12 mod 4)

[Figure: three 8-block caches (block numbers 0-7; sets 0-3 in the set-associative case) and a 32-block memory (block-frame addresses 0-31) with block 12 highlighted]

132
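The slide's block-12 example reduces to two modulo operations, sketched below (the cache size comes from the slide; the helper names are just for illustration).

```python
# Where can memory block 12 go in an 8-block cache, under each policy?
NUM_BLOCKS = 8

def direct_mapped(block):
    """Direct mapped: exactly one legal frame."""
    return block % NUM_BLOCKS

def set_associative(block, ways):
    """Set associative: one legal set; the block may use any way in it."""
    num_sets = NUM_BLOCKS // ways
    return block % num_sets

dm_frame = direct_mapped(12)        # 12 mod 8 = 4
sa_set   = set_associative(12, 2)   # 2-way -> 4 sets, 12 mod 4 = 0
# Fully associative: block 12 may occupy any of the 8 frames.
```

Direct mapped is the degenerate 1-way case (8 sets of one block), and fully associative is the 8-way case (one set), so the same modulo formula covers the whole spectrum.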
Data Address is used to organise cache storage strategy

• The byte bits select a byte within a word
• The block bits select a word within a block
• The index bits select a row in the cache
• The tag identifies which memory block occupies a cache row

[Figure: word address bit fields: Tag | Index | Block | Byte]
133
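The field split above is a few shifts and masks. The field order (Tag | Index | Block | Byte) follows the slide; the specific widths below are assumptions for illustration (32-bit address, 4-byte words, 4-word blocks, 128 cache rows).

```python
# Splitting an address into the slide's fields: Tag | Index | Block | Byte.
# Field widths are illustrative assumptions, not from the slides.
BYTE_BITS  = 2   # log2(4 bytes per word)
BLOCK_BITS = 2   # log2(4 words per block)
INDEX_BITS = 7   # log2(128 cache rows)

def split_address(addr):
    """Return (tag, index, word, byte) for a 32-bit address."""
    byte  = addr & ((1 << BYTE_BITS) - 1)
    word  = (addr >> BYTE_BITS) & ((1 << BLOCK_BITS) - 1)
    index = (addr >> (BYTE_BITS + BLOCK_BITS)) & ((1 << INDEX_BITS) - 1)
    tag   = addr >> (BYTE_BITS + BLOCK_BITS + INDEX_BITS)
    return tag, index, word, byte

tag, index, word, byte = split_address(0x1234_5678)
```

The index picks the cache row to probe, and the stored tag is compared against the address tag to decide hit or miss; the block and byte bits only matter after a hit, to select the data within the row.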
