Final VLSI Project Report


Synchronous 16x8 SRAM Design

Bhavya Daya, Shu Jiang, Piotr Nowak, Jaffer Sharief


Electrical Engineering Department, University of Florida

Abstract: Memory arrays are an essential building block in any
digital system, and the considerations involved in designing an
SRAM carry over to the design of other digital circuits. Memory
takes up the majority of the area in an integrated circuit. Key
considerations in SRAM design include increased speed and
reduced layout area. The hope for this project was to create an
efficient and compact SRAM. Due to time limitations, the goal
became to create a working SRAM design and to learn how the
SRAM functions. Design choices were made and justified
appropriately.
I. DESIGN OVERVIEW
The high level SRAM design, as shown in Figure 1, was divided into
four parts. These four parts were distributed among the four
members of the team. The work distribution is as follows:

1. SRAM cell design and analysis: Shu Jiang
2. Row decoder and wordline driver: Bhavya Daya
3. Column decoder and column circuitry: Jaffer Sharief
4. Precharge circuitry and sense amplifier: Piotr Nowak

The integration of the components and testing was performed by the
entire team. Recommendations and comments were given to each
person on his/her part of the project.

While designing, time limitations and ease of layout were the main
factors considered. The design leaves a lot of room for optimization
and therefore is not an optimal SRAM design. The goal was,
foremost, to establish a working SRAM design and, if time permitted,
to adjust the design for speed and area.

The design process and choices are emphasized in this report. Since
many aspects of proper SRAM design were learned after beginning the
project, some of the design choices would have been adjusted if time
had allowed.

Figure 1: High Level SRAM Block Diagram
II. SRAM 6T CELL DESIGN
The SRAM design consisted of sizing the transistors and
determining the read and write stability. The layout was performed to
create as compact a cell as possible. The capacitances added to the
wordlines and bitlines per SRAM cell were estimated.

6T SRAM CELL SIZING

Figure 2: 6T SRAM Cell

To ensure read stability of the 6T cell shown above in Figure 2, the
voltage ($V_Q$) across $N_1$ should be less than the threshold voltage
($V_{Tn} \approx 0.4$ V) when the charge on BL is discharged through
$N_1$ and $N_5$. Intuitively, read stability can be met by choosing
the size of $N_1$ to be greater than that of $N_5$. The exact size of
$N_1$ can be determined from the cell ratio (CR), where

$$CR = \frac{W_1/L_1}{W_5/L_5} \quad \text{or} \quad CR = \frac{W_1}{W_5} \ \text{when } L \text{ is fixed}$$

As shown in [2], CR has to be greater than 1.2 to ensure read
stability. A CR value of 1.5 is chosen for the design of 6T cell.
Simulation showed that CR = 1.5 gives a $V_{Q,\max}$ of about 0.33 V
to 0.35 V, which is below $V_{Tn}$.

To ensure write stability, the voltage ($V_Q$) across $N_6$ should be
less than the threshold voltage ($V_{Tn} \approx 0.4$ V) when BL is
pulled low to write a 0 into the 6T cell. As with read stability, the
exact size of $N_6$ can be determined from the pull-up ratio (PR),
where

$$PR = \frac{W_4/L_4}{W_6/L_6} \quad \text{or} \quad PR = \frac{W_4}{W_6} \ \text{when } L \text{ is fixed}$$

As shown in [2], PR has to be less than 1.8 to ensure write
stability. A PR value of 1 is chosen for the design of the 6T cell.
Simulation showed that PR = 1 gives $V_{Q,\max} = 0.22$ V $< V_{Tn}$.

The end result of transistor sizing after stability analysis is shown
below:

$$W_{M2} = W_{M4} = W_{M5} = W_{M6} = \text{minimum layout width} = 0.48\ \text{um}$$
$$W_{M1} = W_{M3} = 1.5\,W_{M5} = 0.72\ \text{um}$$
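
The sizing rule above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' sizing flow: the function name `size_cell` and the dictionary layout are hypothetical, and it assumes all six transistors share the same channel length so that the ratios reduce to width ratios.

```python
# Sketch: derive 6T cell transistor widths from the chosen ratios.
W_MIN_UM = 0.48  # minimum layout width stated in the report
CR = 1.5         # cell ratio W1/W5 chosen in the report
PR = 1.0         # pull-up ratio W4/W6 chosen in the report
CR_MIN = 1.2     # read-stability bound the report takes from [2]
PR_MAX = 1.8     # write-stability bound the report takes from [2]


def size_cell(w_min_um, cr, pr):
    """Return widths in um per transistor, assuming all lengths are equal."""
    w_access = w_min_um          # pass transistors M5, M6
    w_pull_down = cr * w_access  # pull-down transistors M1, M3
    w_pull_up = pr * w_access    # pull-up transistors M2, M4
    return {
        "M1": w_pull_down, "M3": w_pull_down,
        "M2": w_pull_up, "M4": w_pull_up,
        "M5": w_access, "M6": w_access,
    }


widths = size_cell(W_MIN_UM, CR, PR)
read_ok = CR > CR_MIN    # read-stability bound
write_ok = PR < PR_MAX   # write-stability bound

for name in sorted(widths):
    print(f"{name}: {widths[name]:.2f} um")
print("read bound met:", read_ok, "| write bound met:", write_ok)
```

With the report's values this reproduces the 0.72 um pull-down width and leaves every other device at the 0.48 um minimum.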

Figures 3 to 5 show the hold, read, and write stabilities (static
noise margins, SNMs) of the 6T cell designed with the above transistor
sizes. The SNMs shown below confirm the design choices made
following the analysis above.

Figure 3: Hold Stability

Figure 4: Read Stability


Figure 5: Write Stability



GATE AND DIFFUSION CAPACITANCE

Gate capacitance of the pass transistors ($N_5$ and $N_6$) adds
capacitance to wordlines, while their diffusion capacitance adds
capacitance to bitlines. In order to estimate how much capacitance
each 6T SRAM cell contributes to wordlines and bitlines, the gate
capacitance and diffusion capacitance of each transistor need to be
determined first.

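Gate capacitance ($C_g$):

$$C_g = C_{gc} + C_{overlap}$$
$$C_g = W L C_{ox} + 2 W C_0$$
$$C_g = 0.48\,\text{um} \cdot 0.24\,\text{um} \cdot 6\,\frac{\text{fF}}{\text{um}^2} + 2 \cdot 0.24\,\text{um} \cdot 0.31\,\frac{\text{fF}}{\text{um}} = 0.84\,\text{fF}$$

Diffusion capacitance ($C_{diff}$):

$$C_{diff} = C_{bottom} + C_{sw}$$
$$C_{diff} = C_j W L_s + C_{jw}(2 L_s + W)$$
$$C_{diff} = 2\,\frac{\text{fF}}{\text{um}^2} \cdot 0.36\,\text{um} \cdot 0.6\,\text{um} + 0.28\,\frac{\text{fF}}{\text{um}} \cdot (2 \cdot 0.6\,\text{um} + 0.48\,\text{um}) = 0.90\,\text{fF}$$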
Each SRAM cell contains two pass transistors. As a result, the
capacitance added to wordlines per cell is $2C_g$ (1.68 fF). The
capacitance added to bitlines per cell is $C_{diff}$ (0.90 fF).
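
The per-cell figures can be reproduced numerically. The sketch below copies the dimensions exactly as they appear in the report's arithmetic (including the 0.24 um used in the overlap term and the 0.36 um used in the bottom-plate term); the function names are hypothetical, and the cell count passed to `line_cap_fF` is an illustrative assumption, since the report does not state how many cells share each line.

```python
# Process constants as quoted in the report's arithmetic.
C_OX = 6.0    # gate oxide capacitance, fF/um^2
C_OV = 0.31   # overlap capacitance, fF/um
C_J = 2.0     # junction bottom capacitance, fF/um^2
C_JSW = 0.28  # junction sidewall capacitance, fF/um


def gate_cap_fF(w_um, l_um, overlap_um):
    """Channel term W*L*Cox plus two overlap terms."""
    return w_um * l_um * C_OX + 2 * overlap_um * C_OV


def diff_cap_fF(w_bottom_um, l_s_um, w_sidewall_um):
    """Bottom-plate term plus sidewall term over the diffusion perimeter."""
    return C_J * w_bottom_um * l_s_um + C_JSW * (2 * l_s_um + w_sidewall_um)


def line_cap_fF(per_cell_fF, n_cells):
    """Total cell loading on a line shared by n_cells cells (wire cap ignored)."""
    return per_cell_fF * n_cells


c_g = gate_cap_fF(0.48, 0.24, 0.24)     # dimensions as used in the report
c_diff = diff_cap_fF(0.36, 0.6, 0.48)   # dimensions as used in the report
c_wordline_per_cell = 2 * c_g           # two pass-transistor gates per cell
c_bitline_per_cell = c_diff             # one pass-transistor diffusion per bitline

print(f"C_g = {c_g:.2f} fF, C_diff = {c_diff:.2f} fF")
print(f"wordline load per cell = {c_wordline_per_cell:.2f} fF")
# Assumption for illustration only: 8 cells on one wordline.
print(f"wordline load for 8 cells = {line_cap_fF(c_wordline_per_cell, 8):.2f} fF")
```

Wire capacitance is left out here, so the totals are a lower bound on what the wordline driver and precharge circuit actually see.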

6T SRAM CELL LAYOUT


Figure 6: 6T SRAM Cell Layout

Cell Area = 5.28 um x 7.26 um = 44λ x 60.5λ
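
The two forms of the cell dimensions are consistent with a layout unit of 0.12 um. That unit is an inference from the ratio 5.28 / 44 and is not stated in the report, so the sketch below treats it as an assumption. It also estimates the area taken by the cells alone for a 16x8 array (128 cells), ignoring decoders, drivers, and routing.

```python
LAMBDA_UM = 0.12   # assumed layout unit, inferred from 5.28 um / 44
CELL_W_UNITS = 44.0
CELL_H_UNITS = 60.5
N_CELLS = 16 * 8   # 16x8 array from the report title

cell_w_um = CELL_W_UNITS * LAMBDA_UM
cell_h_um = CELL_H_UNITS * LAMBDA_UM
cell_area_um2 = cell_w_um * cell_h_um
cells_only_area_um2 = N_CELLS * cell_area_um2

print(f"cell: {cell_w_um:.2f} um x {cell_h_um:.2f} um = {cell_area_um2:.2f} um^2")
print(f"{N_CELLS} cells: {cells_only_area_um2:.1f} um^2")
```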

III. ROW DECODER AND WORDLINE DRIVER


For an n:2^n decoder, 2^n n-input gates need to be built. Large
decoders therefore require large fan-in gates. A large fan-in gate
forms a long series stack, which slows the decoder. When the number
of inputs is greater than four, the decoder becomes very slow and the
large gates have to be broken into smaller gates. Pre-decoding
factors out common gates; it saves area while keeping the same path
effort as a decoder without the pre-decoding step. A pre-decoder was
used in the design to reduce the fan-in and the size of the NAND
gates needed to create the decoder. This results in a faster decoder.
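
The idea of pre-decoding can be illustrated with a small behavioral model. This is a sketch under assumptions, not the report's netlist: it assumes a 4:16 row decoder split into two 2:4 pre-decoders, and the function names are hypothetical. It also compares total gate input counts as a rough proxy for transistor count.

```python
def predecode_2to4(a1, a0):
    """One-hot 2:4 pre-decoder: output i is 1 when (a1, a0) encodes i."""
    index = (a1 << 1) | a0
    return [int(index == i) for i in range(4)]


def decode_4to16(addr_bits):
    """4:16 decoder from two 2:4 pre-decoders and sixteen 2-input AND gates."""
    a3, a2, a1, a0 = addr_bits
    hi = predecode_2to4(a3, a2)   # decodes the upper address pair
    lo = predecode_2to4(a1, a0)   # decodes the lower address pair
    return [hi[i >> 2] & lo[i & 3] for i in range(16)]


def flat_gate_inputs(n):
    """Total inputs for a flat n:2^n decoder: 2^n gates of n inputs each."""
    return (2 ** n) * n


def predecoded_gate_inputs():
    """Total inputs for the 4:16 split: 8 two-input gates + 16 two-input gates."""
    return 2 * 4 * 2 + 16 * 2


wordlines = decode_4to16([1, 0, 1, 1])   # address 11
print("selected wordline:", wordlines.index(1))
print("flat gate inputs:", flat_gate_inputs(4))
print("pre-decoded gate inputs:", predecoded_gate_inputs())
```

In this model the maximum fan-in drops from four to two, which is the property the paragraph above relies on for speed.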

NAND and NOR gates were considered for the decoder design. Delay
analysis showed that the NAND gate is faster than the NOR gate.
When designing the decoder, it was assumed that the inverters would
have to be large along the entire decoder in order to drive the large
wordline capacitances. Proper inverter sizing was learned towards the
end of the project, and no time was available to make the necessary
adjustments.
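
One common way to see why a static NAND tends to beat a static NOR is the logical-effort model described in [1]. The sketch below is that textbook model, not the authors' delay analysis: it assumes static CMOS gates with PMOS devices half as conductive as NMOS devices, so an n-input NAND has logical effort (n + 2)/3 and an n-input NOR has (2n + 1)/3, both with parasitic delay n. The function names are hypothetical.

```python
def logical_effort(gate, n_inputs):
    """Textbook logical effort for static CMOS, assuming a 2:1 mobility ratio."""
    if gate == "nand":
        return (n_inputs + 2) / 3
    if gate == "nor":
        return (2 * n_inputs + 1) / 3
    raise ValueError(f"unknown gate type: {gate}")


def stage_delay(gate, n_inputs, electrical_effort):
    """Normalized stage delay d = g * h + p, with parasitic delay p = n_inputs."""
    return logical_effort(gate, n_inputs) * electrical_effort + n_inputs


for n in (2, 3, 4):
    d_nand = stage_delay("nand", n, electrical_effort=4)
    d_nor = stage_delay("nor", n, electrical_effort=4)
    print(f"{n}-input: NAND {d_nand:.2f}, NOR {d_nor:.2f}")
```

The gap widens with fan-in because the NOR stacks the slower PMOS devices in series. This comparison applies to static gates only; dynamic gates follow different effort values.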

The decoders are controlled by a clock because the SRAM is
synchronous. The positive edge of a clock will allow the address to
be read into the decoders, both column and row, and enable the
correct wordline and bitline. A register is considered instead of a
latch because the latch reacts to the input whenever the clock signal
is high. The register, being positive edge-triggered, reacts to the
input whenever the clock goes from low to high. Instead of using a
register, the clock can be an input into the decoder. The decoder will
then only decode the address when the clock input is high. The row
decoder output can be connected to a driver that drives the wordlines.
Including the clock in the decoder is not a good approach because it
can lead to a large skew. However, for ease of design and layout and
for area reduction, it is more appealing than creating registers for
each input. A register was also designed and considered. Figure 7
shows the wordline driver block diagram. A dynamic NAND was not used;
instead, an ordinary NAND was used, with a large inverter of size 8:4
(8 for the pmos and 4 for the nmos) to drive the large load
capacitance.

Figure 7: Wordline Driver

After performing a simulation of the row decoder, the lowest clock
frequency was found to be 400 MHz. The decoder would probably be
faster if the inverter sizes were changed to be incremental along the
entire decoder. The layout and block diagram of the row decoder is
shown in Figure 8. Poly lines were used to connect the pre-decoder
output to the decoder input. Metal2 could have been used to speed up
the row decoder. Figure 8 also shows the poly lines that can be
replaced with Metal2.

Figure 8: Decoder Block Diagram

IV. COLUMN DECODER AND COLUMN CIRCUITRY

A normal decoder built using logic gates has some drawbacks. The
main problem is that the decoder will require a very large number of
transistors, leading to a layout which is unnecessarily complex and
so more time-consuming. Also, the capacitance associated with the
long runs of wires and the high gate input count will add to long
delays. The address inputs will also have to be buffered to drive this
huge capacitance load. Another problem is that the power consumption
of such a decoder will be very high due to the large number of gates.
To overcome these problems, we have used a dynamic NOR decoder. This
structure reduces the number of transistors by half. It also
increases the speed of the decoder and makes the layout simple and
less time-consuming.

The NOR decoder works like a dynamic circuit. It requires a
pre-charge cycle followed by the evaluation stage. The pre-charge
input is asserted, which turns ON the PMOS devices, and all the
outputs of the decoder go to Vdd (Logic 1). The decoder should not be
read at this instant, since all the outputs will be at Vdd. Once the
pre-charge is done, the PMOS device is turned OFF. The outputs will
still be at Logic 1, because the charge will be stored on the
capacitors. Now the inputs are applied on the three address lines.
The corresponding NMOS devices will be turned ON and the charged
capacitor on that line will be discharged to ground. Thus all except
one line will go to ground (Logic 0). The line which remains high will
be the decoded line, and this line will drive all the NMOS transistors
on that line. Thus the right data will be available on the data line.
This data will be given to the output data block, which will return
the correct data to the outside world when the OE (Output Enable)
signal is asserted.

The selected line will have to drive a number of transistors and so
we will need a series of buffers to drive such a heavy load. The
timing is a very important parameter here because during the
pre-charge cycle, all the outputs will go high. However, the necessary
data is present at the line only after the precharge cycle is over and
the pre-charge input is de-asserted (made 1). This is done using a
simple two-input AND gate, followed by a Mux circuit which interfaces
this with the control circuitry.

Figure 9: Column Circuitry

We have to make a trade-off between the decoder delay time (and
hence the maximum speed of the circuit) and the size of the PMOS
transistors used for pre-charging. The larger the PMOS transistor,
the faster the pre-charging and so the faster the decoder. Here it is
feasible to fabricate large PMOS transistors, since we have only
eight of them.

V. PRECHARGE CIRCUITRY AND SENSE AMPLIFIER

The precharge circuit uses two small pmos transistors controlled by
the phi2_b clock to precharge the bitlines whenever phi2 is high. A
third transistor is placed connecting the bitlines to help equalize
the voltage in case of any mismatch between the bitlines. When laying
out the circuit, the design left room for expanding the precharging
transistors in case they were not fast enough. However, after
finishing simulations, the precharge circuit was more than fast
enough with the transistor sizes used, and was left alone.

Figure 10: Precharge Circuit Schematic

The sense amplifier uses a pair of cross-coupled inverters to
produce regenerative feedback to achieve a quick output based on a
small differential voltage on the bitlines. It is controlled by a
sense_clk derived from phi1 and RD. The sense amplifier is connected
to the bitlines through large transistors with low resistance in
order to allow reading from the SRAM.

Figure 11: Sense Amplifier

Attached to the sense amplifier are its control circuitry as well as
the large write transistors and their control circuits, as well as
transistors that precharge the pass transistors used by the column
multiplexer, preventing them from affecting the sense amplifier.

VI. SRAM INTEGRATION AND RESULTS

The test measurements were taken at a clock period of T=3ns,
corresponding to 333MHz. Before extraction of parasitic capacitances
from the layout, the circuit was functional at 400MHz. The address to
wordline delay seems to be the delay ultimately limiting the SRAM
performance. The row decoder/wordline driver circuit also dominates
the total area. Replacing it with a dynamic version like the column
decoder would likely reduce both area and minimum clock period.

Overall SRAM Area:      97um x 122um
Read Access Time:       .27ns (1), 1.51ns (0)
Write Access Time:      .80ns (1), .62ns (0)
Power dissipation:      3.38mW
Energy Delay Product:   3.04 x 10^-20 Js

Figure 12: Full Layout

VII. CONCLUSION

The SRAM design project required many design choices and an
in-depth understanding of how the SRAM functions. The experimental
knowledge obtained by implementing the SRAM will be useful in any
integrated circuit (IC) design. The Cadence tool is used very often in
the development of ICs. The design considerations and struggles
assisted in understanding the design and testing process.

REFERENCES

[1] N. Weste, D. Harris, A. Banerjee, CMOS VLSI Design: A Circuits and
Systems Perspective.

[2] J. Rabaey, A. Chandrakasan, B. Nikolic, Digital Integrated
Circuits: A Design Perspective.



Bhavya Daya is pursuing BSEE, BSCEN and
M.Eng degrees at the University of Florida. She is
considering a Ph.D. as the next step in her academic
development. Her research interests include Parallel
Computer Architecture, Wireless Communications
and VLSI/RF Technology and Design.





Shu Jiang is currently pursuing the BS and MS in
Electrical Engineering at the University of Florida.
His research interests include robotics, prosthetics,
RF, and VLSI IC design. Mr. Jiang is planning to
pursue a PhD in one of his research areas of interest.






Piotr Nowak received his BS in electrical and
computer engineering from the University of
Florida in 2008 and is currently working towards
an MS. His academic interests include digital and
analog circuits, as well as computer architecture
and devices.




Jaffer Sharief received his BE in Electronics and
Communications from University Visvesvaraya
College of Engineering, India in 2005. He has
worked with Robert Bosch in Automotive
Electronics in the field of Active and Passive
Safety Systems in cars and is currently working
towards an MS. His research interests include VLSI,
MEMS, and Embedded Systems. Mr. Sharief also plans
to pursue a PhD in one of the above.
