Architecture
training@mindshare.com
1-800-633-1440
¾ In-House classroom
¾ Virtual classroom
¾ eLearning Courses
www.mindshare.com
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Register on MindShare’s Website 8
www.mindshare.com
DRAM Topics 9
2007 revenue by maker ($ billions)

DRAM maker    Revenue
Samsung       8.7
Hynix         6.7
Qimonda       4.0
Elpida        3.8
Micron        3.2
Nanya         1.6
Powerchip     1.4
ProMos        1.1
Etron         0.4
Elite         0.2
Others        0.4
Total         31.5

Logic maker   Revenue
Intel         39.2
AMD           6.3

So who has the greatest responsibility for new features such as fly-by? Follow the money.
[Figure: dual-CPU system block diagram. Each CPU has an integrated DRAM controller driving DDR3 main memory. QPI links connect the CPUs to each other and to QPI-to-PCIe bridges. ESI connects to the ICH, which provides PCIe, PCI, IDE, LAN, USB, AC'97, SMBus, and LPC (BIOS and Super IO: keyboard, mouse, floppy, COM1/2, printer).]
DRAM Feature Summary
http://www.jedec.org
¾ What is RAM?
¾ Random Access Memory.
¾ Why is it called Random access memory?
¾ Earlier kinds of memory, usually magnetic tape or drum,
were accessed sequentially. To my knowledge the
term SAM was never coined.
¾ SRAM Cell
¾ An SRAM cell is composed of six transistors (the 6T cell).
¾ The cell consists of two cross-coupled CMOS inverters that store one bit of
information, and two N-type transistors that connect the cell to the bit lines.
¾ To read the information, the word line is activated while the external
bit line drivers are switched off. The inverters inside the SRAM cell then
drive the bit lines, whose value can be read out by external logic.
¾ To write new data into the cell, the big (external) tristate drivers are
activated to drive the bit lines. Next, the word line transistors are enabled.
Because the external drivers are much bigger than the small transistors used
in the 6T SRAM cell, they easily override the previous state of the
cross-coupled inverters.
[Figure: DRAM cell array. Word lines N through N3 cross complementary bit line pairs (Bit Line, Bit Line#), which run to the sense amplifiers.]
Sense Amplifier Architecture 39
Each time a DRAM location is read, the capacitor associated with that cell
is discharged. Before the device is read again, DRAM logic must recharge
these locations, which takes some time, called the precharge delay.
There's no handshake! The DRAM controller has to know EVERYTHING about the DRAM and it has to micromanage the DRAM.
[Figure: DRAM controller block diagram. The host physical address is split three ways: A[n:20] feeds the memory address decoder, which generates the select; A[19:10] and A[9:0] pass through the address mux onto the DRAM memory address bus. A timing generator drives RAS#, CAS#, and WE# (Write Enable). The data bus is, e.g., 64 bits wide.]
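The address split in the DRAM controller figure can be sketched in a few lines. This is only an illustration under stated assumptions: the bit ranges A[n:20], A[19:10], and A[9:0] come from the figure labels, while the names `DramAddress` and `split_address` are hypothetical, and the row-then-column order on the multiplexed bus is the usual convention rather than something the slide states.

```python
from typing import NamedTuple


class DramAddress(NamedTuple):
    select: int  # A[n:20], input to the memory address decoder
    row: int     # A[19:10], assumed to go out first on the muxed bus
    column: int  # A[9:0], assumed to go out second on the muxed bus


def split_address(addr: int) -> DramAddress:
    """Split a host physical address using the bit ranges in the figure."""
    column = addr & 0x3FF        # low 10 bits
    row = (addr >> 10) & 0x3FF   # next 10 bits
    select = addr >> 20          # everything above bit 19
    return DramAddress(select, row, column)


print(split_address(0x12345))
```

Because there is no handshake, the controller must apply this split itself and time each phase; the DRAM never reports back that it has accepted an address.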
8088 System Memory Bus 46
[Figure: 8088 system block diagram showing the system bus controller, the DRAM controller and DRAM memory, error handling, reset logic, clock logic, buffers, and the PC expansion slots.]
[Figure: DRAM chips, each with a row decoder (4096 rows), a column decoder (1024 columns), and a sense amplifier; ADDR, CS, and DATA connections; one chip per bit lane.]
[Figure: burst line fill addresses 0 through 2C (hex, in steps of 4) multiplexed onto the 32-bit 486 data bus.]
Intel's toggle-mode burst line fill address sequence optimized DRAM performance
regardless of whether DRAM was single-bank 32-bit memory (single SIMM channel),
dual-bank 32-bit memory, or 64-bit memory. The sequence is shown in the Commands
and Waveforms section of this slide show.
[Figure: two DRAM arrays, each with its own row decoder, column decoder, and sense amplifier, with CAS and DATA connections to a shared bit lane.]
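The burst sequence itself is not reproduced in this part of the deck. As a rough sketch, and assuming the ordering commonly documented for the 486 (each transfer address is the starting address XORed with the transfer offset inside the aligned 16-byte line), the order can be generated as follows. The function name `burst_order` and its parameters are hypothetical.

```python
from typing import List


def burst_order(start: int, transfers: int = 4, width: int = 4) -> List[int]:
    """Return the transfer addresses of one burst line fill.

    Assumption: transfer i accesses (start XOR i*width) within the
    aligned line, the ordering usually documented for the 486.
    """
    line_bytes = transfers * width
    base = start & ~(line_bytes - 1)                  # aligned line address
    first = start & (line_bytes - 1) & ~(width - 1)   # offset of first transfer
    return [base | (first ^ (i * width)) for i in range(transfers)]


for start in (0x0, 0x4, 0x8, 0xC):
    print([hex(a) for a in burst_order(start)])
```

With this ordering the second transfer always lands in the other half of a 64-bit pair, which is one plausible reading of why the sequence suits 32-bit, dual-bank, and 64-bit memory alike.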
¾ Chips can be organized into data bus widths of 4, 8, 16, and 32.
¾ An organization is written as locations per bank X (by) number of
data lines X (by) number of banks. The total number of bits is
referred to as the density.
¾ Here is an exercise of how the same technology can yield different chip
organizations.
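As a small worked example of that exercise, the sketch below multiplies out an organization string such as 32M X 4 X 4. The helper name `density_bits` and the alternative organizations listed are illustrative assumptions; only the 32M X 4 X 4 part and the 8192 x 4096 bank array appear in the slides.

```python
MEG = 1 << 20  # "M" in DRAM organizations means 2**20


def density_bits(locations_per_bank: int, data_lines: int, banks: int) -> int:
    """Total chip density in bits for a 'locations X width X banks' part."""
    return locations_per_bank * data_lines * banks


# The 512 Mb part from the slides: 32M X 4 X 4
print(density_bits(32 * MEG, 4, 4) // MEG, "Mb")

# Hypothetical alternative organizations with the same density
for locations, width in ((16 * MEG, 8), (8 * MEG, 16)):
    print(f"{locations // MEG}M X {width} X 4 ->",
          density_bits(locations, width, 4) // MEG, "Mb")

# A bank array of 8192 rows by 4096 columns gives the 32M locations
print(8192 * 4096 == 32 * MEG)
```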
[Figure: block diagram of a x4 DRAM chip with four banks (Bank 0-3). Inputs CKE, CK, CS#, WE#, CAS#, RAS#, A0-12, and BA 0,1 feed the control logic, command decode, mode registers, refresh counter, address registers, row address mux, and bank control logic. Each bank has a row address latch and decoder (8192 rows), an 8192x4096 X4 memory array, and sense amps. I/O gating with DM mask logic and a column decoder (4096 columns, X4), fed by the column address counter/latch, connects to the data output and data input registers on DQ 0-3, with DM.]
[Figure: Architecture of a 512 Mb DDR1 chip organized as 32M X 4 X 4. Compared with the previous diagram it adds CK#, a DLL, a read latch and MUX selected by Col 0, a DQS generator and drivers, DQS receivers, and a write FIFO with input registers. Each bank is an 8192x2048 X8 memory array, with a 2048-column X8 column decoder. The purple area of the diagram runs on both edges of the clock.]
[Figure: Architecture of a 512 Mb DDR2 chip organized as 32M X 4 X 4. Inputs are CKE, CK, CK#, CS#, WE#, CAS#, RAS#, A0-13, and BA 0-2. Each bank is a 16384x512 memory array with a 512-column X16 column decoder. The read latch and MUX are selected by Col 0,1, and the data path uses differential DQS/DQS# strobes, DM, receivers, input registers, and a write FIFO. The purple area of the diagram runs on both edges of the clock.]
[Figure: physical layout of an array block with its row decoder, row address path, sense amplifier, and column decoder.]
[Figure: top views of DDR2 FBGA ball-outs. Ball rows carry power and reference (VDD, VDDQ, VDDL, VSS, VSSQ, VSSDL, VREF), clock (CK, CK#, CKE), command (RAS, CAS#, WE#, CS#, ODT), bank (BA0-BA2), address (A0-A13, RFU), and, on the x16 package, data balls (DQ, UDM/LDM, UDQS/LDQS).]
Text surviving from the slides:
¾ ... standard and is referred to as MO-207.
¾ Ball-out depends on die organization.
¾ ... used for x16 DDR2 devices.
¾ ... shown in the On-DIMM presentation.
¾ To speed the Pentium and K5 to market, the same 32-bit SIMMs were used.
¾ Because the Pentium and K5 have a 64-bit memory data bus, the SIMMs
were used in pairs.
[Figure: Pentium system with the 430 HX/TX chipset and graphics on PCI. The processor and the optional L2 cache SRAM sit on the FSB. The north bridge (Intel 430 TX/HX) connects Fast-Page or EDO SIMMs and the 33 MHz PCI bus with its PCI slots. ISA carries COM1 and COM2.]
Dual Inline Memory Module (DIMM) 72
[Figure: DIMM rank layouts: dual rank x8 (Rank 1 and Rank 2), single rank x4 (Rank 1), and single rank x8 (Rank 1). Each module carries the DRAMs and an SPD device on the substrate.]
Shown with optional ECC chip. Typically unbuffered DIMMs do not have ECC support.
Unbuffered Logical Example 78
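[Figure: unbuffered DIMM logical example. CS0#, BA0-2, A0-A13, RAS#, CAS#, WE#, CKE, and ODT fan out to the DRAMs. The clock pairs are CK0/CK0#, CK1/CK1#, and CK2/CK2#. The SPD device connects to SDA, SCL, SA0-2, and WP/GND. All resistors are 22 Ohm unless stated; one is marked 5.1 Ohms.]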
Registered DIMM configurations (server, workstation)

DRAM width   DRAMs   Ranks    Capacity
x8           8       Single   512MB
x8           16      Dual     1GB
x4           16      Single   1GB
x4           32      Dual     2GB
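The four registered configurations are consistent with simple arithmetic, sketched below. The assumptions are mine, not the slide's: every DRAM is a 512 Mb device, the data bus is 64 bits wide, and the device counts exclude any ECC devices. The function name `describe_dimm` is hypothetical.

```python
from typing import Tuple


def describe_dimm(width: int, devices: int,
                  device_mbit: int = 512, bus_bits: int = 64) -> Tuple[int, int]:
    """Return (ranks, capacity in MB) for a DIMM built from identical DRAMs.

    Assumes 512 Mb devices and a 64-bit data bus with no ECC devices counted.
    """
    ranks = (devices * width) // bus_bits   # one rank fills the data bus once
    capacity_mb = devices * device_mbit // 8
    return ranks, capacity_mb


for width, devices in ((8, 8), (8, 16), (4, 16), (4, 32)):
    ranks, mb = describe_dimm(width, devices)
    print(f"x{width}, {devices} DRAMs: {ranks} rank(s), {mb} MB")
```

A x4 module needs twice as many devices as a x8 module to fill one rank, which is why 16 x4 devices are single rank while 16 x8 devices are dual rank.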
DIMM Configuration Unbuffered 80
[Figure: top views of DIMM form factors showing the DRAMs, SPD, registers, PLL, and ECC devices on the substrate. Labeled form factors: SODIMM (notebook standard), MiniDIMM (computing and networking), and VLP MiniDIMM (computing and networking).]
ODT
¾ Input High
¾ On Die Termination enables the internal termination resistors for
the following signals: DQ, DQS, DQS#, CB and Data Mask.
¾ Every Rank has its own ODT signal.
¾ ODT is discussed in more detail later.
DDR2 Module Pin Description 91
CS[0-3]#
Address (A 0-15)
¾ Input High
¾ Defines the Row Address for Active commands.
¾ Defines the Column Address and auto precharge bit for
read/write commands.
¾ A10 is a Row Address bit during an Active (Activate) command.
¾ A10 is sampled during a Precharge command.
   ¾ A10 Low = Precharge one bank
   ¾ A10 High = Precharge all banks
¾ A10 is Auto Precharge (AP) during a Read or Write command.
   ¾ A10 Low = No Precharge
   ¾ A10 High = Auto Precharge when command is complete.
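The three roles of A10 listed above can be captured in a small lookup. This is an illustrative sketch only; the function name `a10_meaning` and the command strings are hypothetical labels, not part of any specification.

```python
def a10_meaning(command: str, a10: int) -> str:
    """Interpret the A10 pin for a given command, per the list above."""
    if command == "ACTIVE":
        return f"row address bit 10 = {a10}"
    if command == "PRECHARGE":
        return "precharge all banks" if a10 else "precharge one bank"
    if command in ("READ", "WRITE"):
        return "auto precharge when command completes" if a10 else "no precharge"
    raise ValueError(f"A10 has no special meaning listed for {command!r}")


for cmd in ("ACTIVE", "PRECHARGE", "READ", "WRITE"):
    for level in (0, 1):
        print(f"{cmd:9} A10={level}: {a10_meaning(cmd, level)}")
```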
DDR2 Module Pin Description 93
[Figure: two DIMMs on the same channel. On the DIMM with x4 devices, DQS0/DQS0# strobes one x4 DRAM carrying DQ 0-3 and DQS9/DQS9# strobes a second x4 DRAM carrying DQ 4-7. On the DIMM with x8 devices, one x8 DRAM carries DQ 0-7 with DQS0/DQS0# on its DQS/DQS# pins and DQS9/DQS9# on its RDQS/RDQS# pins.]
RDQS balances the load on DQS 9-17 for channels that have DIMMs with x4
devices as well as DIMMs with x8 devices.
DDR2 Module Pin Description 95
Power Down Exit: CKE previous cycle = L, CKE current cycle = H, CS# = L, RAS# = H, CAS# = H, WE# = H, bank and address inputs = X
DDR3 Command Additions 101
Command                   CKE prev  CKE curr  CS#  RAS#  CAS#  WE#  BA0-BA3  A13-A15  A12/BC#  A10/AP  A0-9, A11
Write BL=4 (Burst Chop)   H         H         L    H     L     L    BA       CA       L        L       CA
Write BL=8                H         H         L    H     L     L    BA       CA       H        L       CA
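To show how a command row of this table is read, here is a minimal decoder sketch. The Write encoding and the A12/BC# distinction come from the table above; the other encodings are the standard DDR3 command truth table as I understand it, and the names `COMMANDS` and `decode` are hypothetical. Pin levels are given as 0 for Low and 1 for High.

```python
COMMANDS = {
    # (RAS#, CAS#, WE#) with CS# low
    (0, 0, 0): "MRS",
    (0, 0, 1): "REFRESH",
    (0, 1, 0): "PRECHARGE",
    (0, 1, 1): "ACTIVE",
    (1, 0, 0): "WRITE",
    (1, 0, 1): "READ",
    (1, 1, 0): "ZQ CALIBRATION",
    (1, 1, 1): "NOP",
}


def decode(cs_n: int, ras_n: int, cas_n: int, we_n: int, a12_bc_n: int = 1) -> str:
    """Decode one command from the control pins sampled on a clock edge."""
    if cs_n:
        return "DESELECT"  # CS# high: the other pins are don't-care
    name = COMMANDS[(ras_n, cas_n, we_n)]
    if name == "WRITE":
        # A12/BC# low selects burst chop, as in the two table rows above
        name += " BL=8" if a12_bc_n else " BL=4 (Burst Chop)"
    return name


print(decode(0, 1, 0, 0, a12_bc_n=0))
print(decode(0, 1, 0, 0, a12_bc_n=1))
```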
[Figure: DDR2 initialization waveform showing CK/CK#, A[13:0], DQS/DQS# (Hi-Z), and DQ (Hi-Z). After power-up with VDD and a stable CK there is a 200 us (min) wait, then a 400 ns (min) wait, a precharge with A10 = 1, EMR(2), EMR(3), EMR with DLL Enable, and MR with DLL Reset. A second precharge with A10 = 1 and MR without DLL Reset follow, and normal operation begins 200 CK clocks later.]
¾ Besides the frequency, three timings are typically quoted for DIMM speed:
¾ CL - Column Address Strobe (CAS) Latency is the time in base
clocks from when CAS is asserted until data should be valid.
¾ DDR2 requires the CAS latency to be a whole number of clocks.
¾ CAS latency is a function of the DRAM's internal speed. The
faster the DRAM, the lower the CAS latency.
¾ RCD - RAS-to-CAS Delay is the time in base clocks required from
an Activate to a Read or Write.
¾ RP - Row Precharge is the time in base clocks required to precharge
or write back a row.
¾ Example: a DIMM might be labeled 4 - 4 - 4, meaning
CL - RCD - RP.
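Because these timings are counted in base clocks, the same label means different absolute times at different speeds. The sketch below converts clocks to nanoseconds, assuming the base clock is half the data rate (as in DDR2-400 running a 200 MHz clock). The function name `clocks_to_ns` and the DDR2-800 example are illustrative.

```python
def clocks_to_ns(clocks: int, data_rate_mts: float) -> float:
    """Convert a timing in base clocks to nanoseconds.

    Assumes a double-data-rate part: base clock (MHz) = data rate (MT/s) / 2.
    """
    base_clock_mhz = data_rate_mts / 2
    tck_ns = 1000 / base_clock_mhz
    return clocks * tck_ns


# DDR2-400 at 3-3-3, the example used in the waveforms of this deck
for name in ("CL", "RCD", "RP"):
    print(name, clocks_to_ns(3, 400), "ns")

# A hypothetical DDR2-800 module labeled 4-4-4
print("CL", clocks_to_ns(4, 800), "ns")
```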
[Figure: DDR2-400 (3-3-3) waveform. ACTIVE to bank 0 at T0, ACTIVE to bank 1 at T3 (RRD=3), and READ to bank 1 at T6 (RCD=3).]
tRRD is the minimum time interval from one Bank Activate to another Bank Activate (RAS to RAS delay).
tRCD is the minimum time from an Activate to a Read or Write command (RAS to CAS delay).
Read Command 108
[Figure: DDR2-400 (3-3-3) read waveform. ACTIVE at T0 and READ at T3 (RCD=3). After the DQS preamble the SDRAM returns data D0-D7 (CL=3, RL=3). Burst Length = 8.]
CL = CAS# Latency
AL = Additive Latency
RCD = RAS#-to-CAS# delay
RL = Read Latency
Read Burst with Additive Latency 114
[Figure: read burst with additive latency. ACTIVE at T0 and READ at T1. With AL=2 and CL=3 the read latency is RL=5 (RCD=3 is still met), and after the DQS preamble the SDRAM returns data D0-D7. Burst Length = 8.]
CL = CAS# Latency
AL = Additive Latency
RCD = RAS#-to-CAS# delay
RL = Read Latency
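The two read waveforms are related by RL = AL + CL. A short sketch, using hypothetical helper names and assuming AL is less than tRCD, shows that posting the READ early does not change when data arrives; it only frees the command slot.

```python
def read_latency(al: int, cl: int) -> int:
    """Read latency in clocks: additive latency plus CAS latency."""
    return al + cl


def earliest_read_slot(trcd: int, al: int) -> int:
    """Clock (relative to ACTIVE at T0) at which a READ may be issued.

    Assumes AL < tRCD, so the READ is posted AL clocks early.
    """
    return trcd - al


for al in (0, 2):
    slot = earliest_read_slot(trcd=3, al=al)
    rl = read_latency(al, cl=3)
    print(f"AL={al}: READ at T{slot}, RL={rl}, first data at T{slot + rl}")
```

With the values from the waveforms (RCD=3, CL=3), both AL=0 and AL=2 deliver the first data at T6.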
[Figure: command scheduling without additive latency. The command/address sequence is ACT B0,Rx; ACT B1,Rx; RD AP B0,Cx; ACT B2,Rx; RD AP B1,Cx; RD AP B2,Cx, constrained by tRRD and tRCD. Three read data bursts follow after the CAS Latency (CL).]
¾ The Read Command for Bank0 was sent during cycle T4.
¾ The memory controller would have liked to send the Activate Bank2 command
on cycle T4 instead of T5.
¾ This scheduling conflict delays the Bank2 activate, which leaves a gap in the
data stream returned by the SDRAM.
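[Figure: command scheduling with additive latency. The command/address sequence is ACT B0,Rx; RD AP B0,Cx; ACT B1,Rx; RD AP B1,Cx; ACT B2,Rx; RD AP B2,Cx, with activates spaced by tRRD. Read Latency (RL) spans Additive Latency (AL) plus CAS Latency (CL), and the three read data bursts return with no data gap.]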
¾ DDR2 SDRAM can queue commands and schedule them at the appropriate
time based on the programmed value of Additive Latency (AL).
¾ The memory controller no longer has a scheduling conflict because the
commands can be sent to SDRAM back-to-back and the SDRAM will schedule
them at the appropriate time.
¾ The data gap seen in the previous example no longer exists.
[Figure: back-to-back reads. READ commands at T1 and T3 with CL=3. After the DQS preamble the SDRAM returns two consecutive bursts, D0-D3 and D0-D3, each with Burst Length = 4.]
CL = CAS# Latency
AL = Additive Latency
RCD = RAS#-to-CAS# delay
[Figure: write waveform. ACTIVE at T0 and WRITE at T3 (RCD=3). After a write latency of CL-1=2 clocks the memory controller drives the DQS preamble and then data D0-D7 on DQ and DM. Burst Length = 8.]
CL = CAS# Latency
AL = Additive Latency
RCD = RAS#-to-CAS# delay
WL = Write Latency
[Figure: DDR2-400 (3-3-3) read with auto precharge. A READ AP command returns data D0-D7 after CL=3. The internal precharge starts RTP=2 clocks after the read, and the next ACTIVE follows RP=3 clocks later.]
tRP is the time required to internally precharge an active row before the next command can be issued.
tRTP is the time from Read to Precharge.
Types of Refresh and Some History 123
Refresh
    The controller told async DRAM chips which row to refresh.
Auto Refresh
    SDRAM feature. CAS-before-RAS refresh.
    Row counters are inside the SDRAM chips.
    The controller only tells chips when to refresh, not which row.
    Simply called “refresh” now.
Self Refresh
    System powered off; DRAM fully powered.
    Refresh interval timer inside SDRAM chips.
    Interval is fixed before starting self refresh.
Auto Self Refresh
    Self refresh where DDR3 chips automatically choose their
    refresh interval based on their temperature.
Longest DDR3 tRFC is 350ns.
Refresh Waveforms 126
[Figure: refresh waveform. PRECRG at T1 with A10 high (*1), REF at T4 after RP=3, then NOPs while tRFC=75 ns elapses.]
*1 A10 must be high so that all banks, not just one, are precharged. Precharge All must be
done before entering the Refresh mode.
tRFC is the refresh cycle time.
[Figure: self refresh entry waveform. PRECRG at T1, then REF at T4 after RP=3, with CKE taken low.]
All banks must be precharged before entering the Self Refresh mode.
Refresh with CKE low will cause the DRAM to go into a Self Refresh state.
tRP is the time required to internally precharge an active row before the next command can be issued.
[Figure: power down waveform. PRECRG at T1, RP=3, then NOPs with CKE low for up to tREFI X 8.]
Precharge Power Down shown. If any banks are left open, this becomes Active Power Down.
tREFI is the refresh interval time. Because 8 refreshes can be posted, exit from Power Down is required after tREFI X 8.
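The refresh numbers above can be put into perspective with a little arithmetic. In this sketch tRFC = 75 ns comes from the waveform, while tREFI = 7.8 us is my assumption (the commonly quoted normal-temperature value); the function names are hypothetical.

```python
TRFC_NS = 75.0      # refresh cycle time from the waveform above
TREFI_NS = 7800.0   # assumed average refresh interval (7.8 us)


def refresh_overhead(trfc_ns: float, trefi_ns: float) -> float:
    """Fraction of time the device is busy refreshing."""
    return trfc_ns / trefi_ns


def max_power_down_ns(trefi_ns: float, posted_refreshes: int = 8) -> float:
    """Longest stay in power down before refreshes must be caught up."""
    return trefi_ns * posted_refreshes


print(f"refresh overhead: {refresh_overhead(TRFC_NS, TREFI_NS):.2%}")
print(f"max power down:   {max_power_down_ns(TREFI_NS) / 1000:.1f} us")
```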
¾ 1T and 2T timings
¾ Depending on how many DIMMs the system needs to
support, the DRAM controller may use one of 2 different
address and command timing schemes.
¾ 1T: The address and command signals are held active by
the controller for 1 clock. This allows for faster turnaround
times.
¾ 2T: The address and command signals are held active by
the controller for 2 clocks. For systems with more loads, this
allows for longer setup and hold times. Control signals (CS#,
CKE, ODT) must still obey 1T timing.
[Figure: read waveform showing CK/CK# and CS#, with RCD=3, AL=1, and CL=3. After the DQS preamble the SDRAM returns data D0-D7. Burst Length=8.]
¾ Step 1
¾ Apply power. Keep CKE below 0.2 x VDDQ and ODT
LOW. All other inputs may be undefined.
¾ Refer to the JESD79-2C standard for details about the
allowable voltage ramp times.
¾ Step 2
¾ Start clock and maintain stable condition with CKE
held low.
¾ Step 3
¾ Wait a minimum of 200 us after power and clock
(CK, CK#) are stable, then apply NOP or
Deselect and take CKE HIGH.
¾ Step 4
¾ Wait minimum of 400 ns then issue precharge all
command. NOP or Deselect applied during 400 ns
period.
¾ Step 5
¾ Issue MRS command to EMR2.
¾ Step 6
¾ Issue MRS command to EMR3.
¾ Step 7
¾ Issue MRS command to EMR1 to enable DLL.
¾ Step 8
¾ Issue MRS command to MR0 to reset DLL.
¾ Step 9
¾ Issue a Precharge All command.
¾ Step 10
¾ Issue 2 or more Refresh commands.
¾ Step 11
¾ Issue a MRS command to MR0 with LOW to A8 to
program the desired device operation without
resetting the DLL.
¾ Step 12
¾ At least 200 clocks after resetting the DLL,
execute OCD Calibration (Off Chip Driver
impedance adjustment). If OCD calibration is not
used, issue a MRS command to EMR1 to set OCD
Calibration Default followed by issuing a MRS
command to EMR1 to exit OCD Calibration Mode
while also setting other operating parameters of
EMR1.
¾ Done!
¾ The standard says the DRAM is now ready for
normal operation, but the controller still needs to
train the timing.
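The command portion of the twelve steps above (steps 3 through 12) can be written down as a table that a controller model might walk through. This is a sketch of the sequence as listed in these slides, not a substitute for JESD79-2C; the tuple layout and the names `DDR2_INIT` and `mrs_targets` are hypothetical.

```python
from typing import List, Tuple

# (step number, command, note), following the steps listed above
DDR2_INIT: List[Tuple[int, str, str]] = [
    (3, "NOP", "after 200 us of stable power and clock; take CKE high"),
    (4, "PRECHARGE ALL", "after at least 400 ns of NOP or Deselect"),
    (5, "MRS", "EMR2"),
    (6, "MRS", "EMR3"),
    (7, "MRS", "EMR1: enable DLL"),
    (8, "MRS", "MR0: reset DLL"),
    (9, "PRECHARGE ALL", ""),
    (10, "REFRESH", "first of 2 or more"),
    (10, "REFRESH", "second of 2 or more"),
    (11, "MRS", "MR0: A8 low, no DLL reset"),
    (12, "MRS", "EMR1: OCD calibration default, 200+ clocks after DLL reset"),
    (12, "MRS", "EMR1: exit OCD calibration"),
]


def mrs_targets() -> List[str]:
    """Mode registers in the order they are written."""
    return [note.split(":")[0] for _, cmd, note in DDR2_INIT if cmd == "MRS"]


for step, cmd, note in DDR2_INIT:
    print(f"step {step:2}: {cmd:14} {note}")
print(mrs_targets())
```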
Step by Step DDR3 Initialization 152
¾ Step 1
¾ Apply power. Assert RESET# for at least 200 us with stable power.
RESET# is recommended to be less than 0.2 x VDD. All other inputs may
be undefined. Pull CKE Low at least 10 ns before deasserting
RESET#. The VDD ramp time from 300 mV to VDDmin must be no
more than 200 ms. During the ramp, VDD > VDDQ and (VDD - VDDQ) <
0.3 volts.
¾ The voltage levels on all pins other than VDD, VDDQ, VSS, VSSQ must
be less than or equal to VDDQ and VDD on one side and must be larger
than or equal to VSSQ and VSS on the other side.
¾ VDD and VDDQ are driven from a single power converter output, AND
¾ VTT is limited to 0.95 V max once power ramp is finished, AND
¾ Vref tracks VDDQ/2.
OR
¾ Apply VDD, without any slope reversal, with or before VDDQ.
¾ Apply VDDQ, without any slope reversal, with or before VTT & Vref.
Step by Step DDR3 Initialization cont. 153
¾ Step 2
¾ After RESET# is de-asserted, wait 500 us until CKE
becomes active. During this time, the DRAM will start
internal state initialization independently of external
clocks.
¾ Step 3
¾ Start and stabilize CK and CK# for at least 10 ns or 5
tCK (whichever is larger) before CKE goes active.
Since CKE is a synchronous signal, the
corresponding set up time to clock (tIS) must be met.
Also, a NOP or Deselect command must be
registered (with tIS set up time to clock) before CKE
goes active. Once the CKE is registered “High” after
RESET#, CKE needs to be continuously registered
“High” until the initialization sequence is finished,
including expiration of tDLLK and tZQinit.
¾ Step 4
¾ The DDR3 SDRAM keeps its on-die termination
(ODT) in high-impedance state as long as RESET# is
asserted. Further, the SDRAM keeps its ODT in high
impedance state after RESET# deassertion until CKE
is registered “High”. The ODT input signal may be in
undefined state until tIS before CKE is registered
“High”. When CKE is registered “High”, the ODT input
signal may be statically held at either “Low” or “High”.
If RTT_NOM is to be enabled in MR1, the ODT input
signal must be statically held “Low”. In all cases, the
ODT input signal remains static until the power up
initialization sequence is finished, including the
expiration of tDLLK and tZQinit.
Step by Step DDR3 Initialization cont. 156
¾ Step 5
¾ After CKE is registered “High”, wait a minimum of the
Reset CKE Exit time, tXPR, before issuing the first
MRS command to load a mode register.
(tXPR = max(tXS, 5 x tCK))
DDR3 Mode Registers
[Figures: bit layouts of Mode Register 0, Mode Register 1, Mode Register 2, and Mode Register 3. New/changed features compared to DDR2 are marked in red.]
¾ Step 6
¾ Issue MRS Command to load MR2 with all
application settings.
¾ Step 7
¾ Issue MRS Command to load MR3 with all
application settings.
¾ Step 8
¾ Issue MRS Command to load MR1 with all
application settings and DLL enabled.
¾ Step 9
¾ Issue MRS Command to load MR0 with all
application settings and “DLL reset”.
Step by Step DDR3 Initialization cont. 164
¾ Step 10
¾ Issue ZQCL command to start ZQ calibration.
¾ Step 11
¾ Wait until both tDLLK and tZQinit complete.
¾ Done!
¾ The standard says the DRAM is now ready for normal
operation, but the controller still needs to train the
timing. This is covered in the Read Calibration and
Write Leveling sections later.
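Two details of the DDR3 sequence lend themselves to a short sketch: the tXPR wait from step 5 and the mode register load order from steps 6 through 9. The names `txpr_ns`, `DDR3_INIT`, and `mode_register_order` are hypothetical, and the timing values in the usage lines are made-up inputs, not datasheet numbers.

```python
from typing import List, Tuple


def txpr_ns(txs_ns: float, tck_ns: float) -> float:
    """Reset CKE exit time: tXPR = max(tXS, 5 x tCK), as given in step 5."""
    return max(txs_ns, 5 * tck_ns)


# (step number, command, target) for the command portion of the sequence
DDR3_INIT: List[Tuple[int, str, str]] = [
    (6, "MRS", "MR2"),
    (7, "MRS", "MR3"),
    (8, "MRS", "MR1"),   # with DLL enabled
    (9, "MRS", "MR0"),   # with DLL reset
    (10, "ZQCL", ""),    # start ZQ calibration, then wait tDLLK and tZQinit
]


def mode_register_order() -> List[str]:
    """Mode registers in the order they are loaded."""
    return [target for _, cmd, target in DDR3_INIT if cmd == "MRS"]


print(txpr_ns(txs_ns=120.0, tck_ns=2.5))  # hypothetical inputs
print(mode_register_order())
```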
[Figure: dual Pentium 4 system. The processors share the PSB. The root complex provides x16 PCI Express graphics (GFX) and DDR2 memory. SMBus reaches the DIMMs, which are distinguished by their SA0, SA1, and SA2 inputs.]
¾ Ack/Nack overview:
¾ An Ack or Nack must be generated for every byte
that is transferred.
¾ The transmitter must release SMBDAT during
the Ack/Nack clock period.
¾ An Ack is signaled by driving the SMBDAT
line low. If SMBDAT is not actively driven low, the
external pull-ups keep it high, signaling a Nack.
¾ An Ack merely indicates that a byte was received
within the correct timing window. It does not
mean that any error checking, such as
Parity, ECC, or CRC, was done.
Legend (the number above each field is its width in bits):
S    Start Condition
Sr   Repeat Start Condition
Rd   Read (bit value of 1)
Wr   Write (bit value of 0)
A    Acknowledge (this bit position may be 0 for Ack and 1 for Nack)
P    Stop Condition
Fields are marked as either Slave to Master or Master to Slave.

Read with repeated start:
1   7               1    1   8                1   1    7               1    1
S   SLAVE ADDRESS   Wr   A   ADDRESS OFFSET   A   Sr   SLAVE ADDRESS   Rd   A   ...

8             1   8             1         8             1   1
DATA BYTE 1   A   DATA BYTE 2   A   ...   DATA BYTE X   N   P

Write byte:
1   7               1    1   8              1   8           1   1
S   SLAVE ADDRESS   Wr   A   COMMAND BYTE   A   DATA BYTE   A   P
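A minimal sketch of how the fields of the read sequence line up on the wire follows. The Rd/Wr bit values come from the legend; the 7-bit slave address 0x50 used in the example is an assumption (the address range commonly used by SPD EEPROMs), and the helper names are hypothetical.

```python
from typing import List

RD, WR = 1, 0  # bit values from the legend


def address_byte(slave_addr7: int, rw: int) -> int:
    """Combine the 7-bit slave address and the Rd/Wr bit into one byte."""
    return ((slave_addr7 & 0x7F) << 1) | rw


def read_fields(slave_addr7: int, offset: int, count: int) -> List[str]:
    """List the fields of a read with repeated start, in wire order."""
    fields = ["S", f"{address_byte(slave_addr7, WR):#04x}", "A",
              f"{offset:#04x}", "A",
              "Sr", f"{address_byte(slave_addr7, RD):#04x}", "A"]
    for i in range(1, count + 1):
        fields.append(f"DATA BYTE {i}")
        fields.append("A" if i < count else "N")  # master Nacks the last byte
    fields.append("P")
    return fields


print(" ".join(read_fields(0x50, 0x00, 2)))
```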
¾ Quick Command
¾ Send Byte
¾ Receive Byte
¾ Read Byte/Word
¾ Write Byte/word
¾ Block Read
¾ Block Write
¾ Process call
¾ Host Notify
¾ All of these can support the PEC packet (Packet
Error Checking)
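Since an Ack by itself implies no error checking, PEC is what adds it. As far as I know, the SMBus PEC byte is a CRC-8 over all bytes of the message using the polynomial x^8 + x^2 + x + 1 (0x07) with a zero initial value; the sketch below implements that under this assumption. The function name `smbus_pec` and the example message bytes are hypothetical.

```python
def smbus_pec(data: bytes) -> int:
    """CRC-8 with polynomial 0x07, initial value 0, MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


# Hypothetical write-byte message: address+Wr, command byte, data byte
message = bytes([0xA0, 0x02, 0x55])
pec = smbus_pec(message)
print(f"PEC = {pec:#04x}")

# A receiver can run the CRC over message plus PEC and expect zero
print(smbus_pec(message + bytes([pec])) == 0)
```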
[Figure: slew rate measurement between VDDQ and VSS, showing the voltage crossing of clock or strobe, the VIH(AC) Min and VIL(AC) Min levels, and the Delta TF and Delta TR intervals.]
Symbol Conditions
IDD0 Operating one bank active-precharge current;
CKE is high, CS is high between valid commands;
Address bus inputs are switching;
Data bus inputs are switching
IDD1 Operating one bank active-read-precharge current;
BL (Burst Length)=4;
CKE is high, CS is high between valid commands;
Address bus inputs are switching;
Data bus inputs are switching
IDD2P Precharge power down current;
All banks idle;
CKE is low;
Address and control bus inputs are stable;
Data inputs are floating
IDD2Q Precharge quiet standby current;
All banks idle;
CKE is high, CS is high;
Address and control bus inputs are stable;
Data inputs are floating
IDD2N Precharge standby current;
All banks idle;
CKE is high, CS is high;
Address and control bus inputs are switching;
Data inputs are switching
Symbol Conditions
IDD3P Active power down current; (typically broken into fast or slow power down)
All banks open (active);
CKE is low;
Address and control bus inputs are stable;
Data inputs are floating
IDD3N Active standby current;
All banks open (active);
CKE is high, CS is high between valid commands;
Address and control bus inputs are switching;
Data inputs are switching
IDD4W Operating burst write current;
BL (Burst Length) =4
All banks open (active), Continuous burst writes;
CKE is high, CS is high between valid commands;
Address and control bus inputs are switching;
Data inputs are switching
IDD4R Operating burst read current;
BL (Burst Length) =4
All banks open (active), Continuous burst reads;
CKE is high, CS is high between valid commands;
Address and control bus inputs are switching;
Data inputs are switching
Symbol Conditions
IDD5B Burst refresh current;
Refresh command at every tRFC interval;
All banks open (active), Continuous burst reads;
CKE is high, CS is high between valid commands;
Address and control bus inputs are switching;
Data inputs are switching
Termination
[Figure sequence: DDR2 DIMM routing in a system with a memory controller (Mem Ctrl), DMI, and I/O controller (IO Ctrl); repeated animation frames omitted.]
¾ ... be length matched with their DQS.
¾ Command and address are also length matched to ensure that all ... routing.
¾ After RL, the DRAMs will drive the requested data back to the controller.
¾ Read data arrives at the controller at the same time.
• ODT Example: ODT reduces reflections from the stubs to DIMMs not being
addressed during write cycles. ODT is also used by most DRAM controllers
during read cycles. Settings are not intuitive when more than two DIMMs are
used. There are three settings: 50, 75, and 150 Ohms.
• One ODT signal per Rank.
[Figure sequence: DDR3 fly-by read in a system with a CPU, FSB, memory controller (Mem Ctrl), DMI, and I/O controller (IO Ctrl); repeated animation frames omitted.]
¾ Read Cycle
¾ Read command is sent to the DIMM over the address and command bus.
¾ The command gets to the first DRAM on the new fly-by routing chain.
¾ The command propagates to the remaining DRAMs ...
¾ ... lane arrives.
¾ The remaining data lanes arrive one after the other.
¾ What is the problem with this?
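The staggered arrival described in the read cycle above can be modeled with a few lines. All delay values here are made-up placeholders chosen only to show the effect, and the function names are hypothetical; the point is that each DRAM sees the command one hop later than its neighbor, so its data returns one hop later as well.

```python
from typing import List


def command_arrival_ps(num_drams: int, first_ps: int, hop_ps: int) -> List[int]:
    """Time at which each DRAM on the fly-by chain sees the command."""
    return [first_ps + i * hop_ps for i in range(num_drams)]


def data_arrival_ps(cmd_ps: List[int], read_latency_ps: int, flight_ps: int) -> List[int]:
    """Time at which each byte lane's read data reaches the controller."""
    return [t + read_latency_ps + flight_ps for t in cmd_ps]


# Hypothetical numbers: 8 DRAMs, 100 ps per hop along the chain
cmd = command_arrival_ps(num_drams=8, first_ps=500, hop_ps=100)
data = data_arrival_ps(cmd, read_latency_ps=12500, flight_ps=500)
print("lane arrivals (ps):", data)
print("lane-to-lane skew (ps):", max(data) - min(data))
```

With these placeholder values the last lane trails the first by 700 ps, skew that the controller has to absorb lane by lane.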
[Figure: read waveform with A[13:0] selecting the MPR. After the Read command and the DQS preamble, the SDRAM returns the predetermined pattern 1 0 1 0 1 0 1 0 on DQ.]
[Figure sequence: DDR3 fly-by write in a system with a CPU, FSB, memory controller (Mem Ctrl), DMI, and I/O controller (IO Ctrl); repeated animation frames omitted.]
¾ Write Cycle
¾ Write command is sent to the DIMM over the address and command bus.
¾ Command reaches the first DRAM due to the fly-by routing.
[Figure: DRAM with CK, DQS, and DQ connections and an internal flip-flop (D, Q).]
¾ In DDR2, the write data launch time is equal for all byte lanes of a DIMM,
sometimes even across two DIMMs within a channel.
¾ This is achieved through flight-time length matching on the motherboard and on the
DIMM.
¾ In DDR3, this is completely different due to the fly-by
command/address/control/clock bus topology on the DIMM:
¾ The write data launch time is different across the byte/nibble lanes of a DIMM.
¾ The write data launch time is different from one DIMM to another DIMM.
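[Figure: on-die termination between the DRAM controller and two DIMMs. Pull-up FETs under PU FET Control switch resistors SW1, SW2, ... SWN to VDDQ. The DIMM terminations to VTT are labeled RTT_Nom and RTT_WR.]
ODT Register Settings (surviving rows)

RTT_Nom bits   Setting   Ohms
1 0 1          RZQ / 8   30
1 1 0          RFU       RFU
1 1 1          RFU       RFU

RTT_WR bits    Setting   Ohms
0 1            RZQ / 4   60
1 0            RZQ / 2   120
1 1            RFU       RFU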
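The Ohm values in these rows all follow from a single reference value: RZQ / 8 = 30 implies RZQ = 240 Ohms. The sketch below encodes only the rows that survive in this extraction. The dictionary and function names are hypothetical, and assigning the three-bit rows to RTT_Nom and the two-bit rows to RTT_WR is my reading of the slide.

```python
from typing import Dict, Optional, Tuple

RZQ_OHMS = 240  # implied by the table: RZQ / 8 = 30 Ohms

# Divisors for the surviving table rows; None marks RFU (reserved)
RTT_NOM: Dict[Tuple[int, ...], Optional[int]] = {
    (1, 0, 1): 8, (1, 1, 0): None, (1, 1, 1): None,
}
RTT_WR: Dict[Tuple[int, ...], Optional[int]] = {
    (0, 1): 4, (1, 0): 2, (1, 1): None,
}


def rtt_ohms(table: Dict[Tuple[int, ...], Optional[int]],
             bits: Tuple[int, ...]) -> Optional[float]:
    """Termination value for a register encoding, or None if RFU."""
    divisor = table[bits]
    return None if divisor is None else RZQ_OHMS / divisor


print(rtt_ohms(RTT_NOM, (1, 0, 1)))
print(rtt_ohms(RTT_WR, (0, 1)), rtt_ohms(RTT_WR, (1, 0)))
```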
[Figure: impedance-compensated output driver. Weighted PMOS pull-up legs to VDDQ and NMOS pull-down legs to VSSQ (weights 256, 128, ... 4, 2, 1), selected by Rcomp[7:0], drive the transmit signal onto the output pad.]
[Figure: front and top views of a DIMM showing the SPD, pin 1, and a top DRAM and bottom DRAM mounted on opposite sides (pins A0, A5, and A6 marked), with the ball map of a x4 DDR3 SDRAM, looking through the package (columns 1-9; balls include VSS, VSSQ, VDD, VDDQ, VREFDQ, DQ0, DM, DQS#, and NC).]
... mirrored:
¾ A3 and A4
¾ A5 and A6
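[Figure: DRAM controller block diagram with a differential clock and PLL, refresh timer, address/control mux, timing state machine, row request compare, read and write buffers, control registers, and the address/control and data/DQS pads.]
[Figure: address decoder. The processor's physical address supplies A47:A32 to a decoder that generates CS0#, CS1#, CS2#, and CS3#, alongside the memory address and control signals.]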
¾ Allows more DIMMs (more memory) per channel while maintaining signal integrity
¾ Existing RDIMM solutions only support two DIMMs per channel. FBDIMMs allow more.
¾ Address, Control and Data buffered
¾ Data buffers isolate the DRAM voltage and data stubs
¾ All interface signalling is differential
¾ Performance
¾ Faster processors require higher memory throughput
¾ Simultaneous Read/Writes to two Fully-Buffered DIMMs
¾ Up to 36 devices behind each buffer
¾ 256, 512, 1 & 2 Gbit DRAM support
¾ Input clock is ½ the DRAM base clock. Bit rate per lane is 6X the DRAM data rate,
i.e. 533 MT/s x 6 = 3.2 Gb/s.
¾ Cost Sensitive Market
¾ No DRAM changes, uses commodity DRAM chips
¾ DDR2 DIMM form factor/connector with industry available reference design and Gerbers
[Figure: MCH with a narrow point-to-point interface; one link is labeled 14.]
GDDR 278