
Modern DRAM (DDR2/DDR3)

Architecture

Min Huang(min.huang@ lecroy.com) Presentation Rev h1


Do Not Distribute mindshare.com © 2009
Legal Notice 2

This presentation is copyrighted material and is


intended solely for the use of students who
have attended this MindShare course. Any
other distribution is not allowed.
Please do not copy or distribute our material
without permission.

training@mindshare.com
1-800-633-1440

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
MindShare Learning Options 3

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
MindShare Courses 4

Intel Architecture
¾ Intel Core 2 Processor (Penryn)
¾ Intel Core Processor (Nehalem)
¾ Intel QuickPath Interconnect (QPI)
¾ Intel 64 and IA-32 Architecture
¾ Intel PC and Chipset Architecture
¾ Intel Core 2 Processor and Chipset Combo

AMD Architecture
¾ AMD Opteron Processor (Barcelona)
¾ AMD64 Architecture

x86 Programming
¾ Assembly Language Programming
¾ System Programming for the x86 Architecture

IO Buses
¾ PCI Express 2.0
¾ USB 3.0
¾ Embedded USB 2.0 & Workshop
¾ PCI / PCI-X

Memory Technology
¾ Modern DRAM Architecture (DDR2/DDR3)

Virtualization Technology
¾ PC Virtualization
¾ IO Virtualization (IOV)

Storage Technology
¾ SAS Architecture
¾ Serial ATA Architecture
MindShare Training Options 5

¾ In-House classroom
¾ Virtual classroom
¾ eLearning Courses

¾ Check our website for Public course offerings


¾ PCI Express 2.0
¾ USB 3.0
¾ Embedded USB 2.0 & Workshop
¾ Protected / Long Mode Programming
¾ Modern DRAM Technology
¾ High-Speed Design

¾ We can customize any of our classes to meet


your budget and content needs
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
MindShare eLearning Courses 6

¾ Some of our eLearning


courses include:
¾ Comprehensive PCI Express
¾ Intro to PCI Express
Changes
¾ Intro to PCI Express IO
Virtualization
¾ Intro to Virtualization
Technology
¾ and more …..

¾ Visit www.mindshare.com for a full list of our


eLearning courses.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
MindShare Books / eBooks 7

www.mindshare.com
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Register on MindShare’s Website 8

¾ Receive our quarterly newsletter

¾ Download course presentations


and eBooks

¾ Shop for eLearning modules and


Books/eBooks

¾ Register for Public courses

www.mindshare.com
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DRAM Topics 9

System Architecture
DRAM Feature Summary
Intro to DRAM (Historical Background, Why DRAM?)
DRAM Cell Architecture
DRAM Chip Architecture
DRAM Modules
Commands and Waveforms
DDR Initialization
SMBus Overview
Alternative DRAM Solutions
Electrical Specifications (DDR1 and DDR2 Routing, On-Die Termination, Errors)
Additional DDR3 Topics (Timing and Electrical Differences, Fly-by Routing Read Example, Read Calibration, Fly-by Routing Write Example, Write Leveling, DDR3 On-Die Termination, ZQ Calibration, Reset, On-DIMM Address Mirroring)
DRAM Controller Basics (Addresses)
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Objectives 10

Upon completion, the student should (and remember the quick reference handout):

¾ Understand the DRAM bank state diagram
¾ Understand the JEDEC initialization procedure
¾ Understand the timing waveforms
¾ Know why the DRAM controller has to do so much of the work
¾ Know what motivates the move to DDR3 and the associated problems
¾ Know what elements of the system determine the address of a DRAM cell
¾ Know these terms and their concepts:
Prefetch width
Refresh vs. auto refresh vs. self refresh vs. auto self refresh
Activate
Precharge vs. auto precharge vs. precharge all
Additive latency
Bank vs. rank
SSTL
ODT, dynamic ODT
ZQ calibration
Fly-by routing, read calibration, and write leveling
SPD and Mode Registers
On-the-fly burst chop (the Austin Powers mode)
Unbuffered vs. registered vs. fully-buffered
1T vs. 2T timing
Where’s the Money? 11

2007 DRAM Revenue ($ billions)
Samsung 8.7
Hynix 6.7
Qimonda 4.0
Elpida 3.8
Micron 3.2
Nanya 1.6
Powerchip 1.4
ProMos 1.1
Etron 0.4
Elite 0.2
Others 0.4
Total 31.5

2007 Logic Revenue ($ billions)
Intel 39.2
AMD 6.3

So who has the greatest responsibility for new features such as fly-by?
Follow the money.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
System Architecture

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Intel Basic Desktop 13

Desktop Chipset PCIe Support


¾ MCH hosts a 16-Lane (x16) PCIe Link interface for high-performance graphics.
¾ DMI (Direct Media Interface)
is an Intel-specific version of
x4 PCIe.
¾ ICH (IO Controller Hub)
supports multiple x1 PCIe
Links, four of which may
operate as x1 Links or
ganged together as a single
x4 Link.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Intel Server System 14

Server Chipset PCIe Support


¾ MCH hosts 16 PCIe Lanes,
organized as 2x8, 4x4, etc.
¾ ESI (Enterprise Server Interface) is
a variant of x4 PCIe used for
general chipset traffic.
¾ An additional x4 or x8 Link can
connect the MCH and ICH for
improved DMA performance.
¾ The Enterprise Server Bridge ICH
supports two additional PCIe x4
Links.
¾ Note presence of both PCI and PCI-
X slots.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
AMD System Overview 15

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
AMD Quad Core Greyhound Processor 16

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Example 4-Way System 17

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Possible Intel Nehalem Topology 18

Main Main
Memory CPU CPU Memory

DRAM

DRAM
DDR3

Cntlr

Cntlr
DDR3

QPI

QPI
DDR3 QPI QPI DDR3
QPI QPI

QPI QPI

PCIe
PCIe
QPI to PCIe

ESI
ESI
SMBus
PCIe ICH
PCI
IDE
LAN USB
AC’97
LPC
Keybrd BIOS
Floppy Super
IO Com1,2
Mouse
Min Huang(min.huang@ lecroy.com) Printer
Do Not Distribute .com © 2009
DRAM Feature Summary

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
DRAM Overview 20

Device (speed grades) | Architecture Explanation | Clock Frequency (MHz) | Data Bus Rate (MT/s) | Bandwidth per 64-bit channel (MB/s)
FPM DRAM | Fast Page Mode DRAM | NA | - | -
EDO DRAM | Extended Data Out DRAM | NA | - | -
SDR SDRAM (PC66 – PC133) | Single Data Rate Synchronous DRAM | 66 – 133 | 66 – 133 | 533 – 1066
DDR SDRAM (DDR200 – DDR400, PC-1600 – PC-3200) | Double Data Rate Synchronous DRAM | 100 – 200 | 200 – 400 | 1600 – 3200
DDR2 SDRAM (DDR2-400 – DDR2-800, PC-3200 – PC-6400) | Double Data Rate 2 Synchronous DRAM | 200 – 400 | 400 – 800 | 3200 – 6400
DDR3 SDRAM (DDR3-800 – DDR3-1600, PC-6400 – PC-12800) | Double Data Rate 3 Synchronous DRAM | 400 – 800 | 800 – 1600 | 6400 – 12800
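As a quick sanity check on the bandwidth column: a 64-bit channel moves 8 bytes per transfer, so peak bandwidth is simply the data rate times 8. A minimal Python sketch (illustrative only, not part of the original material):

# Peak bandwidth of a 64-bit channel = data rate (MT/s) * 8 bytes per transfer.
def peak_bandwidth_mb_s(data_rate_mt_s, bus_width_bits=64):
    return data_rate_mt_s * (bus_width_bits // 8)

for name, rate in [("PC133", 133), ("DDR400", 400), ("DDR2-800", 800), ("DDR3-1600", 1600)]:
    print(name, peak_bandwidth_mb_s(rate), "MB/s")
# PC133 1064 MB/s (quoted as ~1066), DDR400 3200, DDR2-800 6400, DDR3-1600 12800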

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DRAM Overview 21

SDRAM DDR1 DDR2 DDR3


Performance 66-133MT/s 200-400MT/s 400-800MT/s 800-1600MT/s
VDDQ 3.3 Volts 2.5 Volts 1.8 Volts 1.5 Volts
VTT NA ½ VDDQ ½ VDDQ ½ VDDQ
IO Interface Logic LVTTL SSTL_2 SSTL_18 SSTL_15
Organization x4, x8, x16 x4, x8, x16 x4, x8, x16 x4, x8, x16
Density 16Mb-512Mb 64Mb-2Gb 256Mb-4Gb 512Mb-8Gb
Number of Banks 4 4 4 (256Mb-512Mb), 8 (1Gb-4Gb) 8
Package TSOP TSOP2/BGA BGA BGA (mirrored option)
Prefetch 1 2 4 8
Burst Length 1, 2, 4, 8, Page 2, 4, 8 4, 8 8 (chop 4)
Clock Single Ended Differential Differential Differential
Strobes NA Single-Ended (SE) SE or Differential Differential
DQ Driver Strength Wide Envelope Narrow Envelope 18 Ohm OCD ZQ cal
Termination NA Mother Board Mo Bo, Dyn ODT DIMM, Dyn ODT
Read Latency CL=1,2,3 CL=1.5, 2, 2.5, 3 CL=2, 3, 4, 5, 6 CL=5, 6, 7, 8, 9, 10, 11
Additive Latency NA NA AL=0, 1, 2, 3, 4 AL=0, CL-2, CL-1
Burst Interrupts Yes Yes R-R, W-W 4n only Burst Chop

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
JEDEC 22

Since the mid-1990s, JEDEC (formerly the Joint Electron Device Engineering Council, now the JEDEC Solid State Technology Association) has controlled the DRAM standards. Specs can be found at:

http://www.jedec.org

¾DDR1 spec is JESD79E


¾DDR2 spec is JESD79-2C
¾DDR3 spec is JESD79-3C
¾Package specs are MO-207 (for example)
¾DIMM and SPD specs are JESD21C
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Intro to DRAM

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Historical Background

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
History of DRAM 25

[Timeline, 1950-1980: 1949 pulse transfer core memory invented; vacuum tubes are the primary source of early memory; coincident-current core memory invented by Forrester for a flight simulator; ferrite core memory moves into high-volume production; first silicon memory devices invented by IBM; Intel announces the 1103 1K device; Page Mode introduced; core memory used in the first space shuttle for its robustness]

¾ Vacuum tubes were the first electronic devices used for


memory. Tubes were plagued with problems like excessive
heat, size, and voltage as well as being unreliable and
temperamental. This set the stage for a magnetic “core
memory”.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
History of DRAM 26

¾ Magnetic core memories


dominated the market for
more than two decades.
¾ The basic concept is
changing the magnetic
properties of a ferrite ring
using current.
¾ It takes 3 wires through
each ring to read, write,
and erase a bit.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
History of DRAM 27

[Timeline, 1980-2010: Fast Page Mode introduced at 28.5 MHz; Intel exits the DRAM business; EDO memory becomes available; SDRAM introduced, and JEDEC holds the memory standards from here on out; DDR introduced at 200 MT/s; DDR2 introduced at 400 MT/s; DDR3 spec released]

¾ With the invention of the PC in 1981, silicon-based memory demand skyrocketed. Other countries, notably Japan, began to manufacture low-cost DRAMs, and by 1985 Japan had nearly dominated the market. US lawmakers teamed up with silicon manufacturers to level the playing field, and new anti-dumping laws were instituted.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Why DRAM?

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
RAM 29

¾ What is RAM?
¾ Random Access Memory.
¾ Why is it called Random access memory?
¾ Previously there were other kinds of memory
that were sequentially accessed. These other
kinds of memory were usually in the form of
magnetic tape or drum. To my knowledge the
term SAM was never coined.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Types of RAM 30

¾ What are the 2 types of RAM?


¾ Static RAM and Dynamic RAM.
¾ Which is better?
¾ Well it depends on the application.

Type of RAM | Cost per cell | Size of each cell | Power Dissipated | Speed
Dynamic RAM | Low | Very small | High | Slow
Static RAM | High | Large | Low | High

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
SRAM Example 31

¾ SRAM Cell
¾ A SRAM cell is composed of many
transistors.
¾ The cell consists of two cross-coupled
CMOS inverters that store one bit of
information, and two N-type transistors that
connect the cell to the bitlines.
¾ To read the information, the word line is
activated while the external bit line drivers
are switched off. Therefore, the inverters
inside the SRAM cell drive the bitlines,
whose value can be read-out by external
logic.
¾ To write new data into the cell, the big
(external) tristate drivers are activated to
drive the bitlines. Next, the word line
transistors are enabled. Because the
external drivers are much bigger than the
small transistors used in the 6T SRAM cell,
they easily override the previous state of
the cross-coupled inverters.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DRAM Cell Architecture

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Cell Architecture 33

¾ DRAM Cell (conceptual)


¾ A DRAM cell is composed of a capacitor and a transistor.
¾ The data is stored in the capacitor.
¾ Capacitors lose charge over time due to leakage (dissipation), so it is necessary to recharge them periodically. This is called Refresh and is controlled by the chipset.
¾ The act of reading the capacitor is destructive because the charge in the cap is so small (<30 fF). The data in the cap is always read out to a Sense Amp and then written back at a later time (Precharge).
[Figure: one-transistor, one-capacitor cell connected to a Bit Line and gated by a Word Line]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Cell Architecture 34

¾ DRAM Cell Read


1. Bit line is precharged to the Sense Amp threshold voltage (VDD / 2).
2. Word line turns on the transistor to allow charge to flow from the capacitor to the bit line (read from cell).
3. Sense Amp latches electrical high or low.
[Figure: cell with Bit Line, Word Line, and connection to the Sense Amp]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Cell Architecture 35

¾ DRAM Cell Write or Precharge
1. Charge or discharge the bit line to core voltage or GND as needed to store 0 or 1. The DRAM manufacturer decides which state (1 or 0) is represented by a charged cell.
2. Word line turns on the transistor to allow charge to flow from the bit line to the capacitor.
3. During writes the word line may need to be forced above VDD to take care of the voltage drop.
[Figure: cell with Bit Line, Word Line, and connection to the Sense Amp]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Open Cell Array Architecture 36

[Figure: open cell array with word lines N, N+1, and N+2 crossing multiple bit lines, each bit line running to a sense amp]
Do Not Distribute .com © 2009
Folded Cell Array Architecture 37

¾ Folded DRAM Cell (conceptual)
¾ Adjacent bit lines are organized as differential inputs to the sense amplifier.
¾ During a read operation one bit line acts as the reference input to the sense amp.
¾ Folded bit lines have better noise immunity due to equal coupling of noise on the neighboring reference bit line.
[Figure: folded cell pair on Bit Line and Bit Line# with Word Lines N and N+1]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Folded Cell Array Architecture 38

[Figure: folded cell array with word lines N through N+3 and bit line pairs (Bit Line, Bit Line#) feeding differential sense amps]
Do Not Distribute .com © 2009
Sense Amplifier Architecture 39

¾ Sense Amplifier (conceptual)


¾ The sense amplifier is a differential latch.
¾ Example of reading a "0":
¾ Bit lines are charged to VDD/2, SAN (Sense Amp Negative) = VDD/2, SAP (Sense Amp Positive) = 0. The latch is off.
¾ When the word line is asserted, one of the bit lines changes by delta V while the other bit line remains unchanged.
¾ SAN is discharged to 0.
¾ As SAN falls below the higher of the bit lines, the corresponding NMOS transistor turns on and starts pulling the lower bit line to 0.
¾ After a small delay, SAP is pulled high. The cross-coupled PMOS transistor pulls the higher bit line voltage up to VDD.
¾ Isolation and equalization circuits are not shown.
[Figure: cross-coupled latch between BL and BL#, powered from SAP (PMOS side) and SAN (NMOS side)]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DRAM Chip Architecture

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
The First DRAMs 41

¾ The first LSI (Large Scale Integration)


DRAMs were manufactured by Intel in 1970
starting with the 1103 chip.
¾ The 1103 had a 1,024-bit array with one data
line. The following examples show the
evolution of the DRAM.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
The Array 42

¾ The core of the chip is the Array block.


¾ The total size of the chip is determined by multiplying the number of Rows by the number of Columns by the number of Banks.
¾ There may be more than one bit at each individual
column address.
¾ This total size is referred to as the Chip Technology
and is always stated in bits not bytes.
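To make the sizing rule concrete, here is a minimal Python sketch; the numbers model a hypothetical 512 Mb part organized as 32M x 4 x 4 (the same organization used in the SDRAM example later in this section):

# Total chip size (in bits) = rows * columns * bits-per-column-location * banks.
rows, columns, bits_per_column, banks = 8192, 4096, 4, 4
total_bits = rows * columns * bits_per_column * banks
print(total_bits, "bits")          # 536870912 bits
print(total_bits // 2**20, "Mb")   # 512 Mb -- always stated in bits, not bytes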

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Split Address Scheme 43

¾ 4096-bit Array, x1 example (2^12 bits)
[Figure: 64-row by 64-column array with a Row Address Decoder, Column Address Decoder, and Sense Amplifier; multiplexed ADD pads plus RAS, CAS, WE, and CS feed the part, and a single DATA bit lane comes out]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Problems: Muxed Address, Precharge, Refresh 44

Each time a DRAM location is read, the capacitor associated with that cell
is discharged. Before the device is read again, DRAM logic must recharge
these locations, which takes some time—called the precharge delay.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Simple Async DRAM Controller Design 45

Top Of Chip
Memory Select Timing
There’s no handshake!
A[n:20]
Address Generator The DRAM controller has to know
Decoder EVERYTHING about the DRAM and it
has to micromanage the DRAM.
Select
Host Physical Address

A[19:10] RAS#
CAS#
WE# (Write Enable)
DRAM
Address
Memory Address Bus
Mux DRAM

A[9:0]
Data Bus e.g. 64-bit

DRAM Controller
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
8088 System Memory Bus 46

[Diagram: 8088 CPU with INTR/NMI inputs and an 8-bit data bus; a bus controller, error handling and reset logic, clock logic, and buffers connect the CPU to the DRAM controller and system DRAM, to the PC expansion slots, and to the 8-bit X-Bus expansion bus carrying the 8259 interrupt controller, 8253 timers, 8255 keyboard interface, and 8237 DMA controller]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Increasing the Bus Size 47

¾ The 8088 memory bus was 8 bits wide.


¾ By using 8 chips with 1 data line (X1) and a single
chip select to all 8 chips, the chips would perform in
unison to drive the 8-bit bus.
¾ As technology evolved, the DRAM controllers
demanded wider buses to increase speed.
¾ In response, the DRAM manufacturers made the
arrays bigger and added more data lines to the chips.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Multiple Data Lines 48

16M bit Array example; 4M X 4


Sometimes shown as density X data width X #banks (4M X 4 X 1)

[Figure: 16 Mb array organized as 4M x 4, with a Row Decoder (4096 rows), Column Decoder (1024 columns), Sense Amplifier, multiplexed ADDR pins, RAS, CAS, WE, and CS inputs, and four DATA bit lanes]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
How to Increase Speed of the Chip 49

¾ One of the first changes in architecture to increase


the speed of modern chips was to split the array into
multiple banks.
¾ This allows more than one bank to be active at the
same time.
¾ This can effectively “hide” precharge/activate and
other timings related to single bank implementations.
¾ Refresh timing is not affected since all the banks are
refreshed at the same time.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Banks and Toggle-Mode Addressing 50

¾ The Intel 80486 has a 32-bit data bus and 16-byte cache lines.
¾ Intel's toggle-mode burst line fill address sequence optimized DRAM performance regardless of whether the DRAM was single-bank 32-bit memory (single SIMM channel), dual-bank 32-bit memory, or 64-bit memory. The sequence is shown in the Commands and Waveforms section of this slide show.
[Figure: Bank 1 (A2 = 1) holding line offsets 3C, 34, 2C, 24, 1C, 14, C, 4 and Bank 0 (A2 = 0) holding 38, 30, 28, 20, 18, 10, 8, 0, muxed onto the 32-bit 486 data bus]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Multiple Banks 51

16M bit Array example; 1M X 4 X 4


Shown as density X data width X #banks

[Figure: 16 Mb array organized as 1M x 4 x 4, with four banks that each have their own Row Decoder, Sense Amplifier, and Column Decoder; ADDR and BA pins, RAS, CAS, WE, and CS are shared, and four DATA bit lanes come out]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Multiple Data Lines 52

¾ Chips can be organized into data bus widths of 4, 8, 16, and 32.
¾ This was shown in the previous example as number of bits X (by)
number of data lines X (by) number of banks. The number of bits is
referred to as the density.
¾ Here is an exercise of how the same technology can yield different chip
organizations.

For this example, use a 512 Mb technology chip.

Density | Data Lines | Banks
32M | 4 | 4
16M | 8 | 4
8M | 16 | 4
4M | 32 | 4
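The same exercise expressed as a short Python sketch; "density" here is the addressable-locations figure used in the density x data-width x banks naming above (illustrative only):

TOTAL_BITS = 512 * 2**20   # one 512 Mb technology die
BANKS = 4
for data_lines in (4, 8, 16, 32):
    density_m = TOTAL_BITS // (data_lines * BANKS) // 2**20
    print(f"{density_m}M x {data_lines} x {BANKS}")
# 32M x 4 x 4, 16M x 8 x 4, 8M x 16 x 4, 4M x 32 x 4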
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
SDRAM Architecture 53

Features of an SDRAM Chip


¾ PC 100, PC 133
¾ Fully Synchronous
¾ Internally pipelined column address can be changed every
clock. (1T timing)
¾ Multiple internal banks for hiding row access/precharge
timing. Single bank precharge or all bank precharge.
¾ Programmable burst length 1, 2, 4, 8, or full page (Row)
¾ Self refresh and low power states
¾ LVTTL 3.3 volt operation

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
SDRAM Architecture 54

Architecture of a 512 Mb SDRAM chip organized as 32M X 4 X 4

[Block diagram: control logic with command decode (CS#, RAS#, CAS#, WE#, CKE, CK, DM), mode registers, a refresh counter, and a 13-bit address MUX feeding the row address latch/decoder and the column address counter/latch; four banks of 8192 x 4096 x4 memory arrays with sense amps; I/O gating and DM mask logic; and data output/input registers driving and receiving DQ0-3]
Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR1 SDRAM Architecture 55

Features of a DDR1 Chip


¾ DDR 200, 266, 333, and 400
¾ Bidirectional Data Strobes for source synchronous capture
(x16 has 2)
¾ DDR architecture two data accesses per clock
¾ 2n prefetch architecture
¾ Multiple internal banks for hiding row access/precharge
timing.
¾ Differential clock inputs (CK and CK#)
¾ Commands active on rising edge of clock.
¾ DQS edge aligned for data reads and center aligned for
writes.
¾ Addition of DLL (Delay Lock Loop) to align DQS and Data
¾ Programmable burst length of 2, 4, and 8
¾ Auto refresh and Self refresh modes
“Auto refresh” is called “Refresh” for DDR3.
¾ 2.5 volt operation for I/O (SSTL_2 compatible); 2.6 volts for 400 MT/s DRAMs

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR1 SDRAM Architecture 56

Architecture of a 512 Mb DDR1 chip organized as 32M x 4 x 4 (the shaded data path runs on both edges of the clock)
[Block diagram: command decode and control logic (CS#, RAS#, CAS#, WE#, CKE, CK/CK#), mode registers, refresh counter, DLL, 13-bit row address latch/decoder, four banks of 8192 x 2048 x8 arrays with sense amps, I/O gating and DM mask logic, a 2n-prefetch read latch and MUX with DQS generator and drivers for DQ0-3, and receivers plus a write FIFO with input registers for DQ, DQS, and DM]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR2 SDRAM Architecture 57

Features of a DDR2 Chip


¾ DDR 400, 533, 667 and 800
¾ Bidirectional Differential Data Strobes (DQS, DQS#) for
source synchronous capture (x16 has 2 pairs)
¾ Duplicate output strobes (RDQS) for x8 devices
¾ 4n prefetch architecture
¾ 4 Banks for concurrent operation; 1 Gb and larger devices have 8 Banks
¾ Differential clock inputs (CK and CK#)
¾ Programmable CAS latency
¾ Posted CAS additive latency
¾ On Die Termination (ODT)
¾ Programmable burst length of 4 or 8
¾ Auto refresh and Self refresh modes
“Auto refresh” is called “Refresh” for DDR3.
¾ 1.8 volt operation for I/O (SSTL_18 compatible)

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR2 SDRAM Architecture 58

Architecture of a 512 Mb DDR2 chip organized as 32M x 4 x 4 (the shaded data path runs on both edges of the clock)
[Block diagram: command decode and control logic (CS#, RAS#, CAS#, WE#, CKE, CK/CK#), mode registers, refresh counter, DLL, 14-bit row address latch/decoder, four banks of 16384 x 512 x16 arrays with sense amps, I/O gating and DM mask logic, a 4n-prefetch read latch and MUX with differential DQS/DQS# generator, an ODT resistor network on the data pins, and receivers plus a write FIFO for DQ0-3, DQS/DQS#, and DM]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 SDRAM Architecture 59

Features of a DDR3 Chip


¾ 8n prefetch architecture
¾ Fly-by routing which requires read delay calibration and write
leveling.
¾ On-die termination (ODT) improvements
¾ ZQ calibration (OCD) redefined
¾ Speeds up to 1600MT/s (800 MHz)
¾ Burst Chopping
¾ MRS definition changed
¾ VDDQ goes to 1.5 volts and VREF and VTT are ½ VDDQ
¾ Addition of asynchronous reset pin
¾ Package changes to accommodate pin mirroring and support
balls

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Core Speed vs. I/O Speed 60

Memory Input Prefetch DRAM Core


Bus Speed
Technology Clock Width Speed

SDRAM-100 100MHz 1 100MT/s 100MHz

DDR-200 100MHz 2 200MT/s 100MHz

DDR-400 200MHz 2 400MT/s 200MHz

DDR2-400 200MHz 4 400MT/s 100MHz

DDR2-800 400MHz 4 800MT/s 200MHz

DDR3-800 400MHz 8 800MT/s 100MHz

DDR3-1600 800MHz 8 1600MT/s 200MHz
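A short, illustrative Python sketch of the relationships in this table: the data bus runs at twice the input clock for DDR devices (1x for SDR), while the core runs at the data rate divided by the prefetch width:

def data_rate_mt_s(input_clock_mhz, ddr=True):
    return input_clock_mhz * (2 if ddr else 1)

def core_speed_mhz(rate_mt_s, prefetch):
    return rate_mt_s // prefetch

print(core_speed_mhz(data_rate_mt_s(400), 4))   # DDR2-800 core: 200 MHz
print(core_speed_mhz(data_rate_mt_s(400), 8))   # DDR3-800 core: 100 MHz
print(core_speed_mhz(data_rate_mt_s(800), 8))   # DDR3-1600 core: 200 MHz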


Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Manufacturing 61

¾ Many fabrication processes have evolved


over the last 35 years for DRAMs.
¾ One of the biggest challenges is creating the capacitors. Some of the basic approaches are:
¾ Planar
¾ Deep Trench

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Redundancy 62

¾ As with all silicon products, flaws occur during the manufacturing process.
¾ To keep costs low, DRAM manufacturers build redundant Rows into the array to increase yields.
¾ When the Sort/Test process is done, "bad" Rows are identified and each is reassigned to a new row in the redundant region of the array.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Redundancy 63

[Figure: array block with its Row Decoder, Sense Amplifier, and Column Decoder, plus a redundancy block with its own row decoder; a fuse block in the row address path steers failing row addresses to the redundant rows]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Package 64

¾ After silicon manufacturing the die are ready


for packaging.
¾ Package type is going to affect electrical
performance of the chip. TSOPs (Thin Small
Outline Package) have been used up until the
higher speed DDR required a change to BGA
(Ball Grid Array)

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
TSOPs Still in Production 65

¾ This 66-pin TSOP is used in


DDR1 for x4, x8, and x16.
¾ With DDR1, it was not
necessary to go to a BGA
package for electrical
performance issues.
¾ Many manufacturers did use
BGA packages for DDR1 in
order to build high density
modules.

Top View

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
BGA Package 66

¾ This is a 60-pin package used for x4 and x8 DDR2.
¾ JEDEC wrote in the DDR2 spec that TSOP packages were not to be used.
¾ This package is JEDEC standard and is referred to as MO-207.
¾ Ball-out depends on die organization.
[Figure: MO-207 ball-out grid (ball rows A-L, columns 1-9), looking through the package, showing DQ0-7, DQS/DQS#, DM/RDQS, power and reference balls (VDD, VSS, VDDQ, VSSQ, VDDL, VSSDL, VREF), CK/CK#/CKE, command and control (RAS, CAS#, WE#, CS#, ODT), BA0-2, and A0-A13]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
BGA Packages in Production 67

¾ This is the 84-pin version of the MO-207 package used for x16 DDR2 devices.
¾ A DDR3 BGA package is shown in the On-DIMM Mirroring section of this presentation.
[Figure: 84-ball grid (ball rows A-R, columns 1-9), looking through from the top, showing DQ0-15, upper and lower strobes and masks (UDQS/UDQS#, LDQS/LDQS#, UDM, LDM), power and reference balls, CK/CK#/CKE, command and control (RAS, CAS#, WE#, CS#, ODT), BA0-2, and A0-A13]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DRAM Modules

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
The First SIMMs 69

¾ In the 486 and earlier processors, as the memory


data bus width grew, the need to integrate monolithic
DRAMs onto a module grew with it.
¾ This started with the SIMM or Single Inline Memory
Module.
¾ SIMMs only have gold finger connections on one side
of the module as well as all of the necessary
Command/Address and PWR/GND.
¾ The module matched up with the 386 and 486 32-bit
processor buses.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
The First Modules 70

¾ To speed the
Pentium and K5 to
market, the same
32-bit SIMMs were
used.
¾ Due to the Pentium
and K5’s 64-bit
memory data bus
two SIMMs were
used.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Pentium System Overview 71

[Diagram: Pentium processor with optional L2 cache SRAM on the FSB to the Intel 430 TX/HX North Bridge, which controls Fast-Page or EDO SIMMs and the 33 MHz PCI bus (PCI slots, IDE CD/HDD, Ethernet, SCSI); the PIIX 3/4 South Bridge provides USB and the ISA bus, which carries the boot ROM, modem and audio chips, and the Super IO (COM1, COM2)]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Dual Inline Memory Module (DIMM) 72

¾ Gold fingers on both sides accommodating a full 64-


bit-wide data bus and 8 bits of ECC as well as all of
the necessary Command, Address, power and
ground pins.
¾ 168-pin DIMMs, used for SDRAM
¾ 184-pin DIMMs, used for DDR SDRAM
¾ 240-pin DIMMs, used for DDR2 SDRAM
¾ 240-pin DIMMs, used for DDR3 SDRAM
¾ Key is different than DDR2 DIMMs
¾ Beware that pins are numbered differently (such as
location of pin 2) for SO-DIMMs than for UDIMMs.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Unbuffered Modules 73

¾ The most common DIMMs are Unbuffered.


¾ These DIMMs have 3 basic components:
¾ The PCB substrate
¾ The DRAMs
¾ The SPD (Serial Presence Detect)
¾ These are mostly for the Desktop and Mobile markets.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Registered Modules 74

¾ The other very common type of DIMM used in server


and workstation PCs is the Registered DIMM. Unlike
an Unbuffered DIMM where all of the signals are
routed directly to the DRAMs, the Registered DIMM
uses registers and a PLL for the command, address,
and clocks.
¾ The PCB substrate
¾ The DRAMs
¾ The SPD (Serial Presence Detect)
¾ Registers
¾ PLL
¾ The registers and PLL reduce electrical loading on
the most heavily loaded signals.
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Ranks 75

¾ The DRAMs on DIMMs are ordered into Ranks.


¾ A Rank consists of enough DRAMs to drive the full
width of the system data bus. In most systems this is
64 bits wide or 72 bits with ECC.
¾ A quadword is 64 bits (eight bytes) using the Intel
Architecture definition that one word is 16 bits.
¾ This concept used to be referred to as single sided
and double sided.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Ranks 76

• Examples of Rank Configurations:


– Shown with ECC
Dual Rank x4

Rank 1 Rank 2

Dual Rank x8
Rank 1
Rank 2

Single Rank x4
Rank 1
Single Rank x8
Rank 1

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Module Types Unbuffered 77

SPD
DRAMs

Substrate

Shown with
optional ECC ECC
chip. Typically
unbuffered
DIMMs do not
have ECC
support.
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Unbuffered Logical Example 78

[Diagram: one rank of eight x8 DRAMs sharing CS0#; each DRAM has its own DQS/DQS# pair (DQS0-7), DM (DM0-7), and eight of the DQ0-63 lines, while BA0-2, A0-A13, RAS#, CAS#, WE#, ODT, CKE, and the three clock pairs (CK0-CK2) are bused to all devices; the SPD sits on the SMBus (SDA, SCL, SA0-2, WP). All series resistors are 22 Ohm unless stated; a 5.1 Ohm resistor is shown at WP/GND]

Do Not Distribute .com © 2009


Rank Example 79

• This example assumes 512Mb DRAMs on the DIMMs.


• Total size = number of chips X size of chip / 8

Configuration | Organization | # of chips | Ranks | Total Size
Unbuffered (Mobile/Desktop) | x16 | 4 | Single | 256MB
Unbuffered (Mobile/Desktop) | x16 | 8 | Dual | 512MB
Unbuffered (Mobile/Desktop) | x8 | 8 | Single | 512MB
Registered (Server/Workstation) | x8 | 16 | Dual | 1GB
Registered (Server/Workstation) | x4 | 16 | Single | 1GB
Registered (Server/Workstation) | x4 | 32 | Dual | 2GB
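The capacity and rank rules above in a minimal Python sketch, assuming 512 Mb chips and a 64-bit (non-ECC) rank; the function names are mine, for illustration only:

def dimm_size_mb(num_chips, chip_mbits=512):
    return num_chips * chip_mbits // 8            # total size = chips x chip size / 8

def num_ranks(num_chips, chip_width_bits):
    return num_chips * chip_width_bits // 64      # enough chips to span 64 data bits

print(dimm_size_mb(4),  num_ranks(4, 16))    # 256 MB, 1 rank  (x16, 4 chips)
print(dimm_size_mb(16), num_ranks(16, 8))    # 1024 MB, 2 ranks (x8, 16 chips)
print(dimm_size_mb(32), num_ranks(32, 4))    # 2048 MB, 2 ranks (x4, 32 chips)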
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DIMM Configuration Unbuffered 80

x8 Single Rank, 8 components

Front View Side View


SPD

Top View

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DIMM Configuration Unbuffered 81

x8 Dual Rank, 16 components

Front View Side View


SPD

Top View

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DIMM Configuration Unbuffered 82

x16 Single Rank, 4 components

Front View Side View


SPD

Top View

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Module Types Registered 83

SPD
DRAMs

Substrate
Register PLL Register Registers and
PLL

ECC

Register PLL Register

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DIMM Configuration Registered 84

x4 Single Rank, 18 components (with ECC)

Front View Side View


SPD

ECC

Register PLL Register

Top View

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DIMM Configuration Registered 85

x4 Dual Rank, 36 components (with ECC)

Front View Side View


SPD

ECC

Register PLL Register

Top View

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DIMM Configuration Registered 86

¾ x8 Single Rank, 8 components (with ECC)

Front View Side View


SPD

ECC
Register PLL Register

Top View

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DIMM Configuration Registered 87

¾ x8 Dual Rank, 18 components (with ECC)

Front View Side View


SPD

ECC

Register PLL Register

Top View

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Module Types 88

UDIMM: Unbuffered, Desktop standard
RDIMM: Registered, Server standard
SODIMM: Notebook standard
FBDIMM: Fully Buffered, Server
VLP RDIMM: Very Low Profile, Computing and Networking
MiniDIMM: Computing and Networking
VLP MiniDIMM: Computing and Networking

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR2 Module Pin Description 89

Clock (CK, CK#)


¾ Input Differential For DDR2 single ended for DDR1
¾ Address & control signals sampled on the crossing of the
positive edge of CK & the negative edge of CK#
¾ Output (read) data is referenced to the crossing of CK and
CK# (both directions)
¾ One pair to each DIMM in Registered DIMM and 3 pairs for
Unbuffered. For SODIMMs one of the clocks can be shut off

Clock Enable (CKE)


¾ Input High
¾ HIGH activates, LOW deactivates internal CK signals, input
buffers and output drivers.
¾ Must remain high throughout read/write accesses. Primarily
used for self refresh and sleep states like active power
down.
¾ Each Rank has its own CKE
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Module Pin Description 90

Chip Select (CS0#-CS3#)


¾ Input Low
¾ All commands are ignored when CS# is inactive (considered
part of each command)
¾ Each rank has its own chip select.
¾ Which CS# is asserted for a given physical address is based
on the programming of the memory controller (DRB’s –
DRAM Rank Boundary Registers)
¾ How many ranks can be supported on a single DIMM.

ODT
¾ Input High
¾ On Die Termination enables internal termination resistors to
the following signals: DQ, DQS, DQS#, CB and Data Mask.
¾ Every Rank has its own ODT signal.
¾ ODT will be discussed more later
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Module Pin Description 91

Row Address Strobe (RAS#), Column Address Strobe (CAS#),


Write Enable (WE#)
¾ Input Low
¾ Define (along with CS#) the command being entered.
¾ Command truth table defined in JEDEC spec
¾ RAS#, CAS#, and WE# are broadcast to each chip.

CS[0-3]#

RAS#, CAS#, WE#

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR2 Module Pin Description 92

Bank Address (BA 0-2)


¾ Input High
¾ Define to which bank an Active, Read, Write or Precharge
command is being applied.
¾ Determines if the mode register or extended mode register is
being accessed during an MRS or EMRS cycle.

Address (A 0-15)
¾ Input High
¾ Defines the Row Address for Active commands
¾ Defines the Column Address and auto precharge bit for
read/write commands.
¾ A10 is Row Address during an Active (Activate) command.
¾ A10 is sampled during a Precharge command.
¾ A10 Low = Precharge one bank
¾ A10 High = Precharge all banks
¾ A10 is Auto Precharge (AP) during a Read or Write command.
¾ A10 Low = No Precharge
¾ A10 High = Auto Precharge when command is complete.
DDR2 Module Pin Description 93

Data Input/Output (DQ 0-63)


¾ Bi-directional data bus.
¾ DQ is a bused signal so every rank is seen as a load on the bus.

Data Strobes (DQS, DQS#, UDQS, UDQS#, LDQS, LDQS#,


RDQS, RDQS#) 0-17, 0-17#
¾ Bi-directional Differential in some cases (EMRS Programming)
¾ Used to latch data signals at the receiver.
¾ Edge-aligned with data (DQ) during reads (DRAM driving)
¾ Center-aligned with data (DQ) on writes (Chipset driving)
¾ For x4 devices, 16 or 18 pairs are used, one pair per chip (DQS,
DQS#)
¾ For x8 devices, 8 or 9 pairs are used, one pair per chip (DQS,
DQS#), except that in RDQS mode, 16 or 18 pairs are used
(DQS, DQS#, RDQS, RDQS#)
¾ For x16 devices, 8 or 9 pairs are used, 2 pairs per chip (LDQS,
LDQS#, UDQS, UDQS#). For x16, the ECC chip is x16, not x8.
¾ DQS is bused, so every rank is seen as a load on the bus.
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Redundant DQS 94

[Diagram: two DIMMs on the same channel. On a DIMM with x4 DRAMs, DQ0-3 uses DQS0/DQS0# and DQ4-7 uses DQS9/DQS9#; on a DIMM with x8 DRAMs, DQ0-7 uses DQS0/DQS0# while RDQS/RDQS# is routed to DQS9/DQS9#. RDQS balances the load on DQS9-17 for channels that have DIMMs with x4 devices as well as DIMMs with x8 devices]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Module Pin Description 95

Data Mask (DM 0-7, UDM, LDM)


¾ Input High
¾ Input mask signal for write data. When DM is sampled high, the write data for that beat is ignored. When DM is low, the data is written.
¾ Same pins as DQS 9-17 on DIMMs.
¾ Pins are DM for DIMMs using x8 and x16 devices
¾ Pins are DQS for DIMMs using x4 devices. DM of x4 devices
are tied low on DIMMs.
¾ For x16 devices, these signals are named UDM and LDM.
On DIMMs with x16 devices, there is an unconnected set of
DM pins, which are the pins that would have been the DQS
of the x4 devices.
¾ DM is bused, so every rank is seen as a load on the bus.
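A hypothetical sketch (function and values are mine, not from the spec) of the effect DM has on one byte lane during a write burst: wherever DM is sampled high, the DRAM keeps its old data for that beat.

def apply_write(old_beats, new_beats, dm_bits):
    # one byte lane across a burst: keep the old data wherever DM is high
    return [old if dm else new for old, new, dm in zip(old_beats, new_beats, dm_bits)]

result = apply_write([0xAA, 0xBB, 0xCC, 0xDD],   # data already in the array
                     [0x11, 0x22, 0x33, 0x44],   # write data on DQ
                     [0,    1,    0,    1])      # DM sampled per beat
print([hex(b) for b in result])                  # ['0x11', '0xbb', '0x33', '0xdd']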

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR2 Module Pin Description 96

Check Bits (CB 0-7)


¾ Bi-directional data bus.
¾ Check bits used for ECC.
¾ CB 0-7 are strobed with DQS 8 on x8 devices, and with DQS
8 and 17 on x4 devices.
Par_In
¾ Parity for the address and control bus. Not used on most
Intel controllers. Implemented in the chipset and register.
Err_Out#, QERR#
¾ Indicates parity error found on the address or control bus.
¾ QERR# is register pin. Err_Out# is DIMM pin.
Serial Presence Detect (SPD) Clock Input (SCL)
¾ Serial clock used to synchronize the SPD logic.
SPD Data (SDA)
¾ Bidirectional pin to transmit address and data in and out of
the SPD.
SPD Address (SA 0-2)
¾ Must be hardwired to different addresses for each module.
DDR3 Pin Additions/Changes 97

RESET# - Active when low


¾ CKE must be pulled low first.
¾ When RESET# is deasserted, CKE may be asserted 500 us later. During this time internal initialization is started.
¾ RESET# forces DRAM into a defined state.
¾ RESET# can be asserted asynchronously during any operation.
¾ Data in DRAM may be lost and DRAM must be re-initialized,
which includes (but not limited to) load mode registers and DLL
reset.
A12/BC# Burst Chop - Active when low
¾ BC# during Read and Write commands.
¾ A12 during Active (Activate) command.
TDQS/TDQS# - Termination Data strobe
¾ Only applicable to x8 DRAMs (replaces RDQS/RDQS#).
VREFDQ, VREFCA – Separate VREF for DQ and CMD/ADDR
¾ This decoupling reduces noise on the reference planes.
ZQ – Reference pin for ZQ calibration
CK, CK# – Unbuffered DDR3 DIMMs only have 2 clock pairs instead of 3.
Commands and Waveforms

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
DDR3 Bank States 99

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR2 Commands 100

CKE BA2 A13


A10
Command Previous Current CS# RAS# CAS# WE# BA1 A12 A9-0
AP
Cycle Cycle BA0 A11

Mode Register Set H H L L L L MRn Op Code


Refresh H H L L L H X X X X
Self Refresh Entry H L L L L H X X X X
Self Refresh Exit L H L H H H X X X X
Single Bank H H L L H L BA X L X
Precharge
All Banks Precharge H H L L H L X X H X
Bank Activate H H L L H H BA Row Address
Write H H L H L L BA Col L Col
Addr Addr
Write with Auto H H L H L L BA Col H Col
Precharge Addr Addr
Read H H L H L H BA Col L Col
Addr Addr
Read with Auto H H L H L H BA Col H Col
Precharge Addr Addr
No Operation H X L H H H X X X X
Device Deselect H X H X X X X X X X
Power Down Entry H L L H H H X X X X

Power Down Exit L H L H H H X X X X
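A hedged Python sketch of the DDR2 truth table above: with CS# low, the RAS#/CAS#/WE# levels select the command, and CS# high is always Device Deselect. CKE history, bank address, and the A10 auto-precharge bit are omitted for brevity.

COMMANDS = {
    (0, 0, 0): "Mode Register Set",
    (0, 0, 1): "Refresh",
    (0, 1, 0): "Precharge (A10 high = all banks)",
    (0, 1, 1): "Bank Activate",
    (1, 0, 0): "Write (A10 high = auto precharge)",
    (1, 0, 1): "Read (A10 high = auto precharge)",
    (1, 1, 0): "ZQ Calibration (DDR3 only)",
    (1, 1, 1): "No Operation",
}

def decode(cs_n, ras_n, cas_n, we_n):
    # 0 = L, 1 = H on the command pins
    return "Device Deselect" if cs_n else COMMANDS[(ras_n, cas_n, we_n)]

print(decode(0, 0, 1, 1))   # Bank Activate
print(decode(0, 1, 0, 1))   # Read (A10 high = auto precharge)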
DDR3 Command Additions 101

CKE A13
BA0- A12 A10 A0-9,
Command Previous Current CS# RAS# CAS# WE# A14
BA3 BC# AP A11
Cycle Cycle A15
Write BL= 4 (Burst H H L H L L BA CA L L CA
Chop)
Write BL= 8 H H L H L L BA CA H L CA

Write with Auto H H L H L L BA CA L H CA


Precharge BL=4
Write with Auto H H L H L L BA CA H H CA
Precharge BL=8
Read BL= 4 (Burst H H L H L H BA CA L L CA
Chop)
Read BL= 8 H H L H L H BA CA H L CA

Read with Auto H H L H L H BA CA L H CA


Precharge BL=4
Read with Auto H H L H L H BA CA H H CA
Precharge BL=8
ZQ Calibration Long H H L H H L X X X H X
ZQ Calibration Short H H L H H L X X X L X

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Initialization Commands 102

¾ Mode Register Set – Sometimes referred to


as Load Mode Register.
¾ Mode Register Set issued to load the MRS
and EMRS values via the Bank Address and
Address lines.
¾ The MRS command can only be used when
all banks are idle. No command can be
issued after MRS command until tMRD is met.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Mode Register Set Waveforms 103

T0 T T T T T T T
CK#

CK

EMR w/ MR w/
A[13:0] A10 = 1 EMR(2) EMR(3) DLL
Enable
DLL
Reset
A10 = 1

COMMAND NOP PRE MRS* MRS MRS MRS PRE

DQS/DQS#
(Hi-Z)

DQ
200us (min) 400ns (min)
(Hi-Z) (Power-up,
VDD and stable CK)

*MRS = Mode Register Set Command used to initialize SDRAM registers


4 SDRAM registers to initialize consist of: MR (Mode Register), EMR (Extended-MR), EMR(2) and EMR(3)

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
MRS Waveforms cont. 104

T T T T T T
CK#

CK

MR w/o
A[13:0] DLL
Reset
A10 = 1

COMMAND NOP NOP MRS NOP NOP PRE

DQS/DQS#
(Hi-Z)

DQ Normal Operation
(Hi-Z) 200 CK clocks to Normal Operation

OCD = Off-Chip Driver Impedance Calibration.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Activate Command 105

¾ The Activate command is used to open (activate)


a row in a particular bank for a subsequent
access.
¾ The values of BA0 through BA2 select the bank to
be activated.
¾ The row remains active for accesses until a
Precharge (or Read or Write with Autoprecharge)
is issued to that bank.
¾ Only one row per bank can be open at a time.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Typical Timings Shown 106

¾ There are typically 3 timings that are referred to for DIMM speed
besides the frequency
¾ CL - Column Address Strobe Latency is the amount of time in base
clocks from when CAS is asserted until data should be valid.
¾ DDR2 requires the clock interval to be in whole clocks.
¾ CAS latency is a function of the DRAMs internal speed. The
faster the DRAM the lower the CAS Latency.
¾ RCD – RAS-to-CAS Delay is the time in base clocks required from
an Activate to a Read or Write.
¾ RP – Time in base clocks required to precharge or write back a
row.
¾ Example: DIMM might say on it 4 – 4 – 4
CL – RCD – RP
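An illustrative conversion of those clock counts into nanoseconds, assuming a hypothetical DDR2-800 part rated 4-4-4 (400 MHz base clock):

def clocks_to_ns(clocks, clock_mhz):
    return clocks * 1000.0 / clock_mhz

for name, clocks in [("CL", 4), ("tRCD", 4), ("tRP", 4)]:
    print(name, clocks_to_ns(clocks, 400.0), "ns")   # 10.0 ns each at 400 MHz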

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Activate Waveform 107

T0 T1 T2 T3 T4 T5 T6 T9 T10
CK#

CK

A[13:0] ROW A ROW B COL

BA[2:0] BA 0 BA 1 BA 1

COMMAND ACTIVE NOP NOP ACTIVE NOP NOP READ NOP NOP

RRD=3
DQS/DQS# RCD=3
(from SDRAM)

DQ
(from SDRAM)
tRRD is the Minimum time interval from one Bank Activate to another Bank (RAS to RAS delay)
tRCD is the Minimum time from activate to a read or write command (RAS to CAS delay)

DDR2 400 3 - 3 - 3
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Read Command 108

¾ The Read command is used to issue a burst read


access to an active row. The value of BA0 through
BA2 will determine the bank to be accessed.
¾ A0-Ax provide the column address.
¾ The value on A10 during the Read command
determines whether or not to use Auto Precharge.
¾ If Auto Precharge is selected, the row currently
active will be precharged at the end of the burst.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR1, DDR2, DDR3 Burst Orientation 109

¾ All DDR SDRAM (DDR1, DDR2, DDR3) are burst oriented.


¾ Accesses start at a selected location and continue for a
programmed number of locations in a programmed sequence.
¾ Two parameters define burst operation:
¾ Burst Type: Interleaved or Sequential
¾ Burst Length: 2, 4, or 8
¾ DDR1 supports burst of 2, 4, or 8.
¾ DDR2 supports burst of 4 or 8 only.
¾ DDR3 supports burst of 8 with option to chop to 4.
¾ When issuing back-to-back bursts of 4, CAS# can be asserted
every other clock. In back-to-back bursts of 8, CAS# can be
asserted every 4th clock.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Burst Type and Length 110

¾ Programmed during initialization


¾ Burst type is programmed to be sequential or interleaved by bit A3
in the Mode Register (MR0).
¾ Interleaved is used on the FSB of all Intel chipsets and is also
referred to as toggle mode.
¾ Sequential mode is used in all other PC-based designs, i.e., AMD.
¾ Burst length is programmed by A0-A2 in MR0 and determines the
maximum number of column locations that can be accessed for a
Read or Write command.
¾ All accesses for a burst take place within a block. The burst wraps
within the block if a boundary is reached.
¾ The least significant address bit(s) is (are) used to select the
starting location within the block.
¾ BL/2 is the minimum number of clocks to wait until the next CAS
command

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR2 Burst Order 111

Starting Order of Access within a Burst


Burst Length Column Burst Type = Sequential Burst Type = Interleaved
Address (A2,A1,A0) (Linear wrap) (Toggle Intel)
000 0,1,2,3 0,1,2,3
001 1,2,3,0 1,0,3,2
4
010 2,3,0,1 2,3,0,1
011 3,0,1,2 3,2,1,0
000 0,1,2,3,4,5,6,7 0,1,2,3,4,5,6,7
001 1,2,3,0,5,6,7,4 1,0,3,2,5,4,7,6
010 2,3,0,1,6,7,4,5 2,3,0,1,6,7,4,5
011 3,0,1,2,7,4,5,6 3,2,1,0,7,6,5,4
8
100 4,5,6,7,0,1,2,3 4,5,6,7,0,1,2,3
101 5,6,7,4,1,2,3,0 5,4,7,6,1,0,3,2
110 6,7,4,5,2,3,0,1 6,7,4,5,2,3,0,1
111 7,4,5,6,3,0,1,2 7,6,5,4,3,2,1,0
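The table can be reproduced with a short Python sketch (my own formulation of the wrap rules shown above): interleaved ("toggle") order is simply start XOR beat index, while sequential order wraps linearly within each 4-beat half, with the A2 bit flipping for the second half of a BL=8 burst.

def interleaved(start, burst_len):
    return [start ^ i for i in range(burst_len)]

def sequential(start, burst_len):
    return [((start & 0b100) ^ (i & 0b100)) | ((start + i) & 0b011)
            for i in range(burst_len)]

print(sequential(0b001, 8))    # [1, 2, 3, 0, 5, 6, 7, 4]
print(interleaved(0b001, 8))   # [1, 0, 3, 2, 5, 4, 7, 6]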

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Burst Order 112

Order of Access within a Burst


Starting Column
Burst Length Read/ Write Burst Type = Sequential Burst Type = Interleaved
Address (A2,A1,A0)
(Linear wrap) (Toggle)
000 0,1,2,3,Z,Z,Z,Z 0,1,2,3,Z,Z,Z,Z
001 1,2,3,0,Z,Z,Z,Z 1,0,3,2,Z,Z,Z,Z
010 2,3,0,1,Z,Z,Z,Z 2,3,0,1,Z,Z,Z,Z
011 3,0,1,2,Z,Z,Z,Z 3,2,1,0,Z,Z,Z,Z
Read
100 4,5,6,7,Z,Z,Z,Z 4,5,6,7,Z,Z,Z,Z
4 Chop
101 5,6,7,4,Z,Z,Z,Z 5,4,7,6,Z,Z,Z,Z
110 6,7,4,5,Z,Z,Z,Z 6,7,4,5,Z,Z,Z,Z
111 7,4,5,6,Z,Z,Z,Z 7,6,5,4,Z,Z,Z,Z
0VV 0,1,2,3,X,X,X,X 0,1,2,3,X,X,X,X
Write
1VV 4,5,6,7,X,X,X,X 4,5,6,7,X,X,X,X
000 0,1,2,3,4,5,6,7 0,1,2,3,4,5,6,7
001 1,2,3,0,5,6,7,4 1,0,3,2,5,4,7,6
010 2,3,0,1,6,7,4,5 2,3,0,1,6,7,4,5
011 3,0,1,2,7,4,5,6 3,2,1,0,7,6,5,4
Read
8 100 4,5,6,7,0,1,2,3 4,5,6,7,0,1,2,3
101 5,6,7,4,1,2,3,0 5,4,7,6,1,0,3,2
110 6,7,4,5,2,3,0,1 6,7,4,5,2,3,0,1
111 7,4,5,6,3,0,1,2 7,6,5,4,3,2,1,0
Write VVV 0,1,2,3,4,5,6,7 0,1,2,3,4,5,6,7

V = Valid (stable) 0 or 1
Read Burst without Additive Latency 113

T0 T1 T2 T3 T4 T5 T6 T9 T10
CK#

CK

A[13:0] ROW COL

COMMAND ACTIVE NOP NOP READ NOP NOP NOP NOP NOP

Preamble
RCD=3
DQS/DQS#
(from SDRAM)
CL= 3 D D D D
DQ 0 1 6 7
RL= 3
(from SDRAM)
Burst Length = 8

CL = CAS# Latency
AL = Additive Latency
RCD = RAS#-to-CAS# delay
RL = Read Latency
Burst Length = 8
DDR2 400 3 - 3 - 3
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Read Burst with Additive Latency 114

T0 T1 T2 T3 T4 T5 T6 T9 T10
CK#

CK

A[13:0] ROW COL

COMMAND ACTIVE READ NOP NOP NOP NOP NOP NOP NOP

Preamble
RCD=3
DQS/DQS#
(from SDRAM) AL=2
CL=3 D D D D
DQ 0 1 6 7
RL= 5
(from SDRAM)
Burst Length = 8

CL = CAS# Latency
AL = Additive Latency
RCD = RAS#-to-CAS# delay
RL = Read Latency
Burst Length = 8

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Why Have Additive Latency? 115

DDR2 with AL=0, CL=4


T0 T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 T14 T15
CK#

CK
tRRD
CMD/ ACT ACT RD AP ACT RD AP RD AP
ADD B0, Rx B1, Rx B0, Cx B2, Rx B1, Cx B2, Cx

tRCD CAS Latency (CL) Read Data Read Data Read Data
Data

Gap in Data due to


scheduling conflict

¾ The Read Command for Bank0 was sent during cycle T4.
¾ The memory controller would have liked to send the Activate Bank2 command
on cycle T4 instead of T5.
¾ This delay in activating Bank2 results in a gap in the data stream returned by the SDRAM.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Why Have Additive Latency? 116

DDR2 with AL=3, CL=4


T0 T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 T14 T15
CK#

CK
tRRD
CMD/ ACT RD AP ACT RD AP ACT RD AP
ADD B0, Rx B0, Cx B1, Rx B1, Cx B2, Rx B2, Cx

Additive Latency (AL) CAS Latency (CL) Read Data Read Data Read Data
Data
Read Latency (RL)

No Data Gap

¾ DDR2 SDRAM can queue commands and schedule them at the appropriate
time based on the programmed value of Additive Latency (AL).
¾ The memory controller no longer has a scheduling conflict because the
commands can be sent to SDRAM back-to-back and the SDRAM will schedule
them at the appropriate time.
¾ The data gap seen in the previous example no longer exists.
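The latency arithmetic behind these two examples, as a minimal sketch (DDR2 convention, where write latency trails read latency by one clock):

def read_latency(al, cl):
    return al + cl            # RL = AL + CL

def write_latency(al, cl):
    return read_latency(al, cl) - 1   # WL = RL - 1 for DDR2

print(read_latency(0, 4), write_latency(0, 4))   # 4 3   (previous example, AL=0)
print(read_latency(3, 4), write_latency(3, 4))   # 7 6   (this example, AL=3)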

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Read Burst Consecutive 117

T0 T1 T2 T3 T4 T5 T6 T7 T8
CK#

CK

A[13:0] COL COL

COMMAND NOP READ NOP READ NOP NOP NOP NOP NOP

Preamble
CL=3
DQS/DQS#
(from SDRAM)
D D D D D D D D
DQ 0 1 2 3 0 1 2 3

(from SDRAM)
Burst Length = 4 Burst Length = 4

CL = CAS# Latency
AL = Additive Latency
RCD = RAS#-to-CAS# delay
Burst Length = 4

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Write Command 118

¾ The Write command is used to issue a burst write to an


active row.
¾ The value of BA0 through BA2 will determine the bank to be
accessed. A0-Ax provide the column address.
¾ The value on A10 determines whether or not Auto Precharge
is to be used. If Auto Precharge is selected the row currently
accessed will be precharged at the end of the burst.
¾ Write data appearing on the DQ pins are written to the
memory array if DM is registered low. If DM is registered
high the data is ignored.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Write Waveform 119

T0 T1 T2 T3 T4 T5 T8 T9 T10
CK#

CK

A[13:0] ROW COL

COMMAND ACTIVE NOP NOP WRITE NOP NOP NOP NOP NOP

Preamble
RCD=3
DQS/DQS#
(from MC)
CL-1=2
D D D D
DQ and DM 0 1 6 7

(from MC)
Burst Length = 8

CL = CAS# Latency
AL = Additive Latency
RCD = RAS#-to-CAS# delay
WL = Write Latency
Burst Length = 8

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Precharge Command 120

¾ The Precharge command deactivates the open (active) row


in a particular bank, or in all banks.
¾ The bank(s) will be ready for an Activate command after tRP
is met. A10 determines whether all or one bank is to be
precharged.
¾ In the case where only one bank is to be precharged, BA0-
BA2 will select the bank. Once a bank is precharged it is in
the idle state.
¾ If no bank is active a Precharge is seen as a NOP.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Auto Precharge 121

¾ The same effect as with a separate Precharge


command, but doesn’t require a separate
command.
¾ Indicated by asserting A10 during a Read or
Write command.
¾ Auto Precharge ensures that the Precharge is
going to happen at the earliest possible time after
the Read or Write command completes.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Read with Auto Precharge 122

T0 T1 T2 T3 T4 T5 T6 T9 T10
CK#

CK

A[13:0] COL ROW

READ
COMMAND NOP NOP NOP NOP ACTIVE NOP NOP
AP

Preamble
CL=3
DQS/DQS#
(from SDRAM) RTP=2 RP=3
D D D D D D D D
DQ 0 1 2 3 4 5 6 7

(from SDRAM)

DDR2 400 3 - 3 - 3
tRP is the time required to internally precharge an active Row until the next command
tRTP is the time from Read to Precharge
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Types of Refresh and Some History 123

Refresh
Controller told async DRAM chips which row to refresh.
Auto Refresh
SDRAM feature. CAS-before-RAS refresh.
Row counters are inside the SDRAM chips.
Controller only tells chips when to refresh, not which row.
Simply called “refresh” now.
Self Refresh
System powered off; DRAM fully powered.
Refresh interval timer inside SDRAM chips.
Interval is fixed before starting self refresh.
Auto Self Refresh
Self refresh where DDR3 chips automatically choose their
refresh interval based on their temperature.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Refresh Commands 124

¾ Refresh is used in normal operation and is the


same as CAS-before-RAS (CBR) refresh in older
DRAM.
¾ This command is non-persistent and must be
issued every time a refresh is required. No
address is required.
¾ The address is internally generated.
¾ Refresh can be posted up to 8 times. This means
the max time can be as much as 9 x tREFI. That is
the Refresh Interval (tREFI) x 8 plus the current
Refresh command (giving 9).
¾ No more than 16 Refresh commands may be
issued within 2 x tREFI.
¾ The minimum time between Refresh commands is
Refresh Cycle Time (tRFC).
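The refresh arithmetic above in a small illustrative sketch (values model a device with 8192 rows and the standard 64 ms retention period):

refresh_period_ms = 64.0          # 32 ms at high temperature
refresh_commands  = 8192          # one Refresh command per row group
t_refi_us = refresh_period_ms * 1000.0 / refresh_commands
print(round(t_refi_us, 2), "us")                  # 7.81 us average interval (tREFI)
print(round(9 * t_refi_us, 1), "us worst case")   # 70.3 us: 8 postponed + 1 current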
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Refresh Commands 125

¾ The refresh period is 64ms (32ms for high temp) which


equals one refresh every 7.8us (tREFI) for a device with 8192
rows.
¾ A single refresh command may refresh one or more Rows.
For example here are the refresh requirements for different
densities.
¾ tRFC Refresh Cycle Time is how long it takes for the DRAM to
finish the refresh command.

Estimated number of Rows


Density DDR2 tRFC Refreshed, depending on
organization
256Mb 75ns 1 (8192 Rows)

512Mb 75-105ns 1-2 (8192,16384 Rows)

1Gb 105-127.5ns 1-2 (8192,16384 Rows)

2Gb 127.5-197.5ns 2-4 (16384, 32768 Rows)

The longest DDR3 tRFC is 350 ns.
Refresh Waveforms 126

T0 T1 T2 T3 T4 T5 T6 TA TB
CK#

CK

CKE

A[13:0] A10 *1

COMMAND NOP PRECRG NOP NOP REF NOP NOP NOP NOP

RP=3
DQS/DQS#
(from SDRAM) tRFC=75nS

DQ
(from SDRAM)

*1 A10 must be high for more than 1 bank to get Precharged. Precharge all must be
done before entering the Refresh mode.
tRFC is the refresh cycle time

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Self Refresh 127

¾ Self Refresh is the same command as Refresh except CKE is low.


This command is used to keep data integrity while the rest of the
system is powered down. The DLL is automatically disabled going
into self refresh and automatically enabled coming out of a Self
Refresh state. CKE must stay low during Self Refresh.
¾ Here is the procedure for coming out of Self Refresh:
¾ CK and CK# must be stable before CKE goes high.
¾ CKE goes high
¾ NOP command issued for tXSNR (Exit Self Refresh to a Non
Read Command) because time is required for internal
refreshes to complete
¾ Refresh command is recommended
¾ No DLL reset is required

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Self Refresh Waveform 128

[Timing diagram: PRECHARGE all at T1 (tRP=3), then REF at T4 with CKE taken low, which puts the DRAM into Self Refresh; DQ and DQS remain idle.]

All banks must be precharged before entering the Self Refresh mode.
Refresh with CKE low will cause the DRAM to go into a Self Refresh state.
tRP is the time required to internally precharge an active Row until the next command

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
NOP and Deselect 129

¾ No Operation - NOP command is used to perform


a NOP to the selected DRAM (DRAM whose CS#
= low). Operations in progress are not affected.
¾ Deselect - Deselect (CS# = high) prevents new
commands from being executed. DRAM is
effectively disabled.
¾ Prevents unwanted commands from being registered in
an idle state.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Power Down 130

¾ Power Down – Also called CKE power down. CKE is brought low
synchronously while CS# and the other command inputs are high (NOP or
Deselect). This low-power state can only be held for 8 x tREFI of the
device because no refreshes can happen in this state and refreshes can
only be posted up to 8 times.
¾ Clock must be active in this state.
¾ Fast exit refers to leaving the DLL on and Slow
Exit turns the DLL off. If Slow exit is used the DLL
must relock.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Power Down 131

[Timing diagram: PRECHARGE at T1 (tRP=3), then CKE taken low with NOPs on the command bus to enter Power Down; the state may be held for at most tREFI x 8 before exit.]

Precharge Power Down shown. If any banks are left open, this becomes Active Power Down.

The maximum time in Power Down is tREFI x 8, since up to 8 refreshes can be posted and must be issued after exit.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Address and Command Timing 132

¾ 1T and 2T timings
¾ Depending on how many DIMMs the system needs to
support, the DRAM controller may use one of 2 different
address and command timing schemes.
¾ 1T: The address and command signals are held active by
the controller for 1 clock. This allows for faster turnaround
times.
¾ 2T: The address and command signals are held active by
the controller for 2 clocks. For systems with more loads, this
allows for longer setup and hold times. Control signals (CS#,
CKE, ODT) must still obey 1T timing.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
2T Timing Example 133

[Timing diagram: 2T timing. The ACTIVE and READ commands are each driven for two clocks while CS# is asserted for only one clock of each pair; RCD=3, AL=1, CL=3, burst length 8, with D0-D7 returned on DQ after the DQS preamble.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR Initialization

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
DDR Initialization Routine 135

¾ DDR SDRAMs must be powered up and


initialized in a very specific sequence to
ensure stable working parts.
¾ Every time power is lost the procedure must
be repeated.
¾ The initialization code is typically held in
firmware and is often referred to as JEDEC
initialization.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization 136

¾ Step 1
¾ Apply power. Keep CKE below 0.2 x VDDQ and ODT
LOW. All other inputs may be undefined.
¾ Refer to the JESD79-2C standard for details about the
allowable voltage ramp times.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 137

¾ Step 2
¾ Start clock and maintain stable condition with CKE
held low.

¾ Step 3
¾ For the minimum of 200 us after stable power and
stable clock (CK, CK#), then apply NOP or
Deselect & take CKE HIGH.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 138

¾ Step 4
¾ Wait minimum of 400 ns then issue precharge all
command. NOP or Deselect applied during 400 ns
period.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 139

¾ Step 5
¾ Issue MRS command to EMR2.

¾ Step 6
¾ Issue MRS command to EMR3.

¾ Let’s see what’s in the mode registers and also see


how we learn what values to program into the mode
registers.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 140

¾ Mode Register (MR) & Extended Mode Register (EMR)


¾ The mode registers are used to define the specific mode of
operation of the SDRAM
¾ The default value for these registers is not defined.
¾ Mode registers will retain all information until rewritten or power is
removed from the device except bit A8 (DLL reset) which is self
clearing
¾ Reprogramming the mode registers during operation will not alter
the information held in the DRAM
¾ The Bank Address bits are used to determine which mode register
is being written to.
¾ The Address bus is used to write to each of the registers.
¾ Why is the Address bus used to program Mode Registers??

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization MR0 141

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization EMR1 142

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization EMR2 143

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization EMR3 144

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 145

The information required from each DIMM is


held in the SPD.

[Photo: DIMM substrate with the DRAMs and the small SPD EEPROM.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 146

¾ Prior to JEDEC initialization, the BIOS/Firmware has


read all data from SPD and worst-case timings have
been calculated. Example:
¾ If two DIMMs are installed where one has a CAS latency of 2
and one with a CAS latency of 2.5, then CAS latency 2.5 will
be used for both DIMMs.
¾ A DLL reset will be performed during this step. Anytime a DLL reset
occurs, at least 200 clocks must be allowed for it to relock before a
Read command is issued.
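A minimal sketch of the worst-case selection described above; the spd_cas_mask() accessor and its half-clock bit encoding are assumptions for illustration (the real SPD CAS-latency byte is encoded differently).

    /* Hedged sketch: choose the slowest "fastest CL" among all DIMMs. */
    #include <stdint.h>

    /* Hypothetical: bitmask of CAS latencies each DIMM reports, in half
     * clocks (e.g. bit 4 = CL 2.0 supported, bit 5 = CL 2.5 supported). */
    extern uint8_t spd_cas_mask(int dimm);

    static int worst_case_cl_halves(int ndimms)
    {
        int worst = 0;
        for (int i = 0; i < ndimms; i++) {
            uint8_t mask = spd_cas_mask(i);
            int fastest = 0;                 /* lowest set bit = fastest CL */
            while (fastest < 8 && !(mask & (1 << fastest)))
                fastest++;
            if (fastest > worst)
                worst = fastest;             /* keep the slowest fastest CL */
        }
        /* e.g. one DIMM at CL2 (4 halves) and one at CL2.5 (5 halves)
         * gives 5 halves, so both DIMMs are run at CL2.5. */
        return worst;
    }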

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
SPD Data 147

¾ This is an example of the data stored in the SPD.
¾ The first 128 bytes are defined by the memory manufacturer.
¾ The last 128 bytes can be used by the customer.
¾ Each SPD has a unique SMBus address hardwired on the motherboard, typically on the lower 3 address bits of the SMBus.
¾ The SMBus can attach to the MCH or the ICH.

BYTE  TITLE                                                 VALUE
0     NUMBER OF BYTES USED BY MICRON                        80
1     TOTAL NUMBER OF SPD MEMORY BYTES                      08
2     MEMORY TYPE                                           07
3     NUMBER OF ROW ADDRESSES                               0D
4     NUMBER OF COLUMN ADDRESSES                            0C
5     NUMBER OF MODULE RANKS                                02
6     MODULE DATA WIDTH                                     48
7     MODULE DATA WIDTH (CONTINUED)                         00
8     MODULE VOLTAGE INTERFACE LEVELS                       04
9     DDR SDRAM CYCLE TIME (CAS LATENCY = 2.5)              75
10    DDR SDRAM ACCESS FROM CLOCK (CAS LATENCY = 2.5)       75
11    MODULE ERROR CORRECTION CONFIGURATION TYPE            02
12    MODULE REFRESH RATE AND TYPE                          82
13    SDRAM DEVICE WIDTH                                    04
14    ERROR CHECKING WIDTH                                  04
15    MIN CLOCK DELAY FOR BACK-TO-BACK RANDOM COLUMN ADDR   01
16    BURST LENGTHS SUPPORTED                               0E
17    NUMBER OF BANKS INTERNAL TO DISCRETE SDRAM DEVICES    04
18    CAS LATENCIES SUPPORTED                               0C
19    CS LATENCY                                            01
20    WE LATENCY                                            02
21    SDRAM MODULE ATTRIBUTES                               26
22    SDRAM DEVICE ATTRIBUTE: GENERAL                       C0
23    DDR SDRAM CYCLE TIME (TCK) AT CL = 2                  A0
24    DDR SDRAM ACCESS FROM CLOCK (TAC) AT CL = 2           75
25    DDR SDRAM CYCLE TIME (TCK) AT CL = 1                  00
26    DDR SDRAM ACCESS TIME FROM CLOCK (TAC) AT CL = 1      00
27    SDRAM: MINIMUM ROW PRECHARGE TIME (TRP)               50
28    MINIMUM ROW ACTIVE TO ROW ACTIVE                      3C
29    MINIMUM RAS TO CAS DELAY                              50

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
SPD Data cont. 148

BYTE TITLE VALUE


30 MINIMUM RAS PULSE WIDTH 2D
31 MODULE RANK DENSITY 01
32 COMMAND/ADDRESS SETUP A0
33 COMMAND/ADDRESS HOLD A0
34 DATA SIGNAL INPUT SETUP 50
35 DATA SIGNAL INPUT HOLD 50
36-40 RESERVED 5 0000000000
41 DEVICE MINIMUM ACTIVE/AUTO-REFRESH TIME (TRC) 41
42 DEVICE MINIMUM AUTO-REFRESH TO ACTIVE/AUTO-REFRESH 4B
43 DEVICE MAXIMUM DEVICE CYCLE TIME (TCK MAX) 34
44 DEVICE DQS-DQ SKEW FOR DQS AND ASSOCIATED DQ SIGNAL 32
45 DEVICE READ DATA HOLD SKEW FACTOR (TQHS) 75
46 RESERVED (BYTE 46) 00
47 DIMM HEIGHT 01
48-61 RESERVED BYTES 48-61 0000…00
62 SPD REVISION 10
63 CHECKSUM FOR BYTES 0 THRU 62 EB
64 MANUFACTURER’S JEDEC ID CODE 2C
65-71 MANUFACTURER’S JEDEC ID CODE (CONTINUED) FFFFFFFFFFFFFF
72 MANUFACTURING LOCATION 00
73-90 MODULE PART NUMBER 36VDDF25672G265C2
91 PCB IDENTIFICATION CODE 02
92 PCB IDENTIFICATION CODE (CONTINUED) 00
93 YEAR OF MANUFACTURE 00
94 YEAR OF MANUFACTURE 00
95-98 MODULE SERIAL NUMBER 00000000
99-127 MANUFACTURER SPECIFIC DATA (RSVD) 99-127 0000…00
128-191 UNUSED FFFF…FF
192-255 UNUSED2 FFFF…FF
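For illustration, a small program that decodes a couple of the SPD bytes listed above; the nibble encoding of byte 9 (whole nanoseconds in the upper nibble, tenths in the lower) follows the convention used in this example table, and the sample values are taken straight from it.

    /* Hedged sketch: decode SPD bytes 3, 4 and 9 from the table above. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint8_t b9 = 0x75;   /* cycle time at the highest CL: 7.5 ns        */
        uint8_t b3 = 0x0D;   /* number of row address bits                  */
        uint8_t b4 = 0x0C;   /* number of column address bits               */

        unsigned tck_tenths = (b9 >> 4) * 10 + (b9 & 0x0F); /* 0x75 -> 75   */
        unsigned data_rate  = 2 * 10000 / tck_tenths;       /* ~MT/s        */

        printf("tCK = %u.%u ns (~DDR-%u)\n",
               tck_tenths / 10, tck_tenths % 10, data_rate);
        printf("row bits = %u, column bits = %u\n", b3, b4);
        return 0;
    }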

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 149

¾ Step 7
¾ Issue MRS command to EMR1 to enable DLL.

¾ Step 8
¾ Issue MRS command to MR0 to reset DLL.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 150

¾ Step 9
¾ Issue a Precharge All command.

¾ Step 10
¾ Issue 2 or more Refresh commands.

¾ Step 11
¾ Issue a MRS command to MR0 with LOW to A8 to
program the desired device operation without
resetting the DLL.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR2 Initialization cont. 151

¾ Step 12
¾ At least 200 clocks after resetting the DLL,
execute OCD Calibration (Off Chip Driver
impedance adjustment). If OCD calibration is not
used, issue a MRS command to EMR1 to set OCD
Calibration Default followed by issuing a MRS
command to EMR1 to exit OCD Calibration Mode
while also setting other operating parameters of
EMR1.

¾ Done!
¾ The standard says the DRAM is now ready for
normal operation, but the controller still needs to
train the timing.
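The twelve steps above can be summarized as firmware pseudocode. This is a hedged sketch only: every helper (issue_mrs, issue_precharge_all, the delays, and so on) is hypothetical, and the mode-register values would come from the SPD-derived timings.

    /* Hedged sketch of the DDR2 JEDEC initialization sequence (steps 1-12). */
    #include <stdint.h>

    extern void power_up_with_cke_low(void);
    extern void start_clock(void);
    extern void set_cke(int level);
    extern void issue_nop(void);
    extern void issue_precharge_all(void);
    extern void issue_refresh(void);
    extern void issue_mrs(int reg, uint16_t value);  /* reg 0=MR0, 1..3=EMRx */
    extern void delay_us(uint32_t us);
    extern void delay_clocks(uint32_t clocks);

    static void ddr2_jedec_init(uint16_t mr0, uint16_t emr1, uint16_t emr2,
                                uint16_t emr3)
    {
        power_up_with_cke_low();          /* 1: power, CKE < 0.2 x VDDQ      */
        start_clock();                    /* 2: stable CK/CK#, CKE still low */
        delay_us(200);                    /* 3: 200 us, then NOP, CKE high   */
        issue_nop();
        set_cke(1);
        delay_us(1);                      /* 4: >= 400 ns of NOP/Deselect    */
        issue_precharge_all();
        issue_mrs(2, emr2);               /* 5: EMR2                         */
        issue_mrs(3, emr3);               /* 6: EMR3                         */
        issue_mrs(1, emr1);               /* 7: EMR1, DLL enable             */
        issue_mrs(0, mr0 | (1 << 8));     /* 8: MR0 with A8=1, DLL reset     */
        issue_precharge_all();            /* 9                               */
        issue_refresh();                  /* 10: two or more refreshes       */
        issue_refresh();
        issue_mrs(0, mr0 & ~(1 << 8));    /* 11: MR0, A8 low, no DLL reset   */
        delay_clocks(200);                /* 12: 200 clocks after DLL reset  */
        /* OCD calibration default / exit would be set via EMR1 here.        */
    }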
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Step by Step DDR3 Initialization 152

¾ Step 1
¾ Apply power. Assert RESET# for at least 200 us with stable power.
RESET# is recommended to be less than 0.2 x VDD. All other inputs may
be undefined. Pull CKE Low at least 10 ns anytime before deasserting
RESET#. The VDD ramp time between 300 mv to VDDmin must be no
more than 200 ms. During the ramp, VDD > VDDQ and (VDD - VDDQ) <
0.3 volts.
¾ The voltage levels on all pins other than VDD, VDDQ, VSS, VSSQ must
be less than or equal to VDDQ and VDD on one side and must be larger
than or equal to VSSQ and VSS on the other side.

¾ VDD and VDDQ are driven from a single power converter output, AND
¾ VTT is limited to 0.95 V max once power ramp is finished, AND
¾ Vref tracks VDDQ/2.
OR
¾ Apply VDD, without any slope reversal, with or before VDDQ.
¾ Apply VDDQ, without any slope reversal, with or before VTT & Vref.
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Step by Step DDR3 Initialization cont. 153

¾ Step 2
¾ After RESET# is de-asserted, wait 500 us until CKE
becomes active. During this time, the DRAM will start
internal state initialization independently of external
clocks.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR3 Initialization cont. 154

¾ Step 3
¾ Start and stabilize CK and CK# for at least 10 ns or 5
tCK (whichever is larger) before CKE goes active.
Since CKE is a synchronous signal, the
corresponding set up time to clock (tIS) must be met.
Also, a NOP or Deselect command must be
registered (with tIS set up time to clock) before CKE
goes active. Once the CKE is registered “High” after
RESET#, CKE needs to be continuously registered
“High” until the initialization sequence is finished,
including expiration of tDLLK and tZQinit.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR3 Initialization cont. 155

¾ Step 4
¾ The DDR3 SDRAM keeps its on-die termination
(ODT) in high-impedance state as long as RESET# is
asserted. Further, the SDRAM keeps its ODT in high
impedance state after RESET# deassertion until CKE
is registered “High”. The ODT input signal may be in
undefined state until tIS before CKE is registered
“High”. When CKE is registered “High”, the ODT input
signal may be statically held at either “Low” or “High”.
If RTT_NOM is to be enabled in MR1, the ODT input
signal must be statically held “Low”. In all cases, the
ODT input signal remains static until the power up
initialization sequence is finished, including the
expiration of tDLLK and tZQinit.
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Step by Step DDR3 Initialization cont. 156

¾ Step 5
¾ After CKE is registered “High”, wait a minimum of the Reset CKE Exit
time, tXPR, before issuing the first MRS command to load a mode
register. (tXPR = max(tXS, 5 x tCK))

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Mode Registers 157

¾ DDR3 Mode Registers are numbered


consistently to the addressing:
¾ MR0: Mode Register 0
¾ MR1: Mode Register 1
¾ MR2: Mode Register 2
¾ MR3: Mode Register 3

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Mode Registers 158

Mode Register 0

New/changed features
compared to DDR2 are
marked in red.
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Mode Registers 159

Mode Register 1

New/changed features Output Driver strength removed


compared to DDR2 are
marked in red.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Mode Registers 160

Mode Register 2

4 Bank Partial Array was removed


Duty Cycle Control was removed

New/changed features
compared to DDR2 are
marked in red.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Mode Register 2 Changes 161

¾ Self refresh temperature range – lets the controller configure the
DRAM for the extended temperature range, where the refresh interval
must be shortened.
¾ Auto Self Refresh – a DDR3 option allowing the DRAMs to choose their
own self-refresh rate based on their measured temperature.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Mode Registers 162

Mode Register 3

New/changed features
compared to DDR2 are
marked in red.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Step by Step DDR3 Initialization cont. 163

¾ Step 6
¾ Issue MRS Command to load MR2 with all
application settings.
¾ Step 7
¾ Issue MRS Command to load MR3 with all
application settings.
¾ Step 8
¾ Issue MRS Command to load MR1 with all
application settings and DLL enabled.
¾ Step 9
¾ Issue MRS Command to load MR0 with all
application settings and “DLL reset”.
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Step by Step DDR3 Initialization cont. 164

¾ Step 10
¾ Issue ZQCL command to start ZQ calibration.
¾ Step 11
¾ Wait until both tDLLK and tZQinit complete.

¾ Done!
¾ The standard says the DRAM is now ready for normal
operation, but the controller still needs to train the
timing. This is covered in the Read Calibration and
Write Leveling sections later.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
SMBus Overview

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
SMBus Overview 166

SMBus is a 2-wire, multi-master, multi-slave device interface. The
specification describes it in layers loosely modeled on the OSI
reference model used for network protocols.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
SMBus Features 167

¾ Bus masters initiate transactions to slave


devices and provide the clock.
¾ 7-bit addresses (DRAM usually uses the
lower 3 bits for its hardwired DIMM SPD
addresses in a PC. The upper 4 bits are
usually Ah, 1010b)
¾ Min speed is 10KHz and Max is 100KHz
¾ Devices may be powered by 3 or 5 volts.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
SMBus Architecture 168

¾ The three layers pertinent to reading the SPD:


¾ Network Layer
¾ Data Link Layer
¾ Physical Layer
¾ To read the SPD through an operating system an
application will be needed to access the Bus Master
that is attached to the SPDs.
¾ This Bus Master may reside in the MCH or ICH and
there may be multiple Masters, such as for multiple
memory channels.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
System SMBus Layout 169

[Block diagram: two Pentium 4 processors on the PSB, MCH 955 root complex with x16 PCI Express graphics and DDR2 SDRAM channels, and DMI/x4 PCI Express down to the ICH7 (Serial ATA, IDE, USB 2.0, PCI, x1 PCI Express, LPC/SPI boot ROM, AC'97/Intel HD Audio, LAN, GPIOs, COM ports, clocks and power management). The SMBus runs from the chipset to the SPD on each DIMM; the SA[2:0] straps are typically wired at the connectors to addresses 0, 2, 4 and 6.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Physical Layer 170

¾ The Physical layer consists of the Drive and Receive


buffer for the clock and data.
¾ When the bus is idle, SMBCLK is not toggling and both SMBCLK and SMBDAT are pulled high.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Data Link Layer – Data Valid 171

¾ The SMBus uses level-sensitive logical high and low signaling.


¾ SMBDAT must be stable when SMBCLK is high.
¾ Transitions only occur when SMBCLK is low.
¾ Data is latched only when SMBCLK is high.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Data Link Layer – Start and Stop 172

¾ Definition of a start and stop condition:


¾ Start (S) is indicated by a high-to-low transition on SMBDAT while
SMBCLK is high. Once the Start condition has occurred the bus is
in the Busy state.
¾ Stop (P) is indicated by a low-to-high transition on SMBDAT while the
clock is high. Start and Stop are only initiated by a Bus Master.
After the Stop condition the bus is in the Idle state.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Data Link Layer – Packet 173

¾ A generic packet is made up of Start and


Stop bits, Slave Address, Data Byte,
Command bit and Ack/Nack bits.

1 7 1 1 8 1 1

S SLAVE ADDRESS CMD A DATA BYTE A P

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Data Link Layer – Ack/Nack 174

¾ Ack/Nack overview:
¾ An Ack or Nack must be generated for every byte
that is transferred.
¾ The transmitter must release the SMBDAT during
the Ack/Nack clock period.
¾ An Ack will be signaled by asserting the SMBDAT
line low. If SMBDAT is not actively driven low the
external pull-ups will keep it high signaling a Nack.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Data Link Layer – Ack/Nack 175

¾ Ack/Nack overview:
¾ An Ack merely indicates that a byte was received
within the correct timing window. This does not
mean there was any error checking done like
Parity, ECC, or CRC.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Network Layer 176

¾ Usage Model Overview


¾ There are 3 types of devices
1. Master devices initiate transfers as well as drive
SMBCLK.
2. Slave devices act as receivers and respond to
commands.
3. Host devices are a Master and Slave
combination that act as the interface to the
systems CPU.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Packet Overview 177

1 7 1 1 8 1 1

S SLAVE ADDRESS CMD A DATA BYTE A P

S Start Condition
Sr Repeat Start Condition
Rd Read (bit value of 1)
Wr Write (bit value of 0)
A Acknowledge (this bit position may be 0 for Ack
and 1 for Nack)
P Stop Condition
Slave to Master
Master to Slave

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Example of Block Read 178

¾ A Block Read starts with a write to the slave.
¾ The data payload of that write is the address offset into the SPD.
¾ Once the address pointer has been loaded, a Repeated Start is used to
turn the bus around and continue with the read.
¾ A Nack can be signaled before the Stop to indicate the end of the
transfer.
¾ No shade indicates the Master is driving. Shaded fields are driven by
the slave device.

1 7 1 1 8 1 1 7 1 1
S SLAVE ADDRESS WR A ADDRESS OFFSET A Sr SLAVE ADDRESS RD A …

8 8 1 8 1 1
DATA BYTE 1 A DATA BYTE 2 A …. DATA BYTE X N P
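A hedged sketch of this Block Read sequence in C; the bit-level helpers (smb_start, smb_write_byte, smb_read_byte, smb_stop) are hypothetical and stand in for whatever SMBus host controller interface the platform provides.

    /* Hedged sketch: read n SPD bytes starting at 'offset' via Block Read. */
    #include <stdint.h>
    #include <stddef.h>

    extern void    smb_start(void);                 /* Start / Repeated Start */
    extern void    smb_stop(void);
    extern int     smb_write_byte(uint8_t b);       /* returns 1 on Ack       */
    extern uint8_t smb_read_byte(int send_ack);     /* Ack=1, Nack=0          */

    /* 7-bit SPD address: 1010b upper nibble + SA[2:0] strap for the slot. */
    #define SPD_ADDR(slot)  (0x50 | ((slot) & 0x7))

    static int spd_block_read(int slot, uint8_t offset, uint8_t *buf, size_t n)
    {
        smb_start();
        if (!smb_write_byte(SPD_ADDR(slot) << 1 | 0))   /* address + Wr       */
            goto fail;
        if (!smb_write_byte(offset))                    /* load addr pointer  */
            goto fail;
        smb_start();                                    /* Repeated Start     */
        if (!smb_write_byte(SPD_ADDR(slot) << 1 | 1))   /* address + Rd       */
            goto fail;
        for (size_t i = 0; i < n; i++)
            buf[i] = smb_read_byte(i + 1 < n);          /* Nack the last byte */
        smb_stop();
        return 0;
    fail:
        smb_stop();
        return -1;
    }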

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Example of Write Byte 179

¾ The Master asserts the Slave address followed by


the write bit. The device acknowledges and the
Master sends the command code. The slave
acknowledges again and the Master transfers the
Byte/Word Low byte first to the device.
¾ No shade indicates from Master. Shaded indicates
from the slave device.

1 7 1 1 8 1 8 1 1
S SLAVE ADDRESS WR A COMMAND BYTE A DATA BYTE A P

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Other Commands 180

¾ Quick Command
¾ Send Byte
¾ Receive Byte
¾ Read Byte/Word
¾ Write Byte/word
¾ Block Read
¾ Block Write
¾ Process call
¾ Host Notify
¾ All of these can support the PEC packet (Packet
Error Checking)

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Electrical Specifications

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
DDR Electrical Characteristics 182

¾ DDR is based on reference comparator input


logic. This logic is defined by the Stub Series
Terminated Logic (SSTL) spec for each of the
DDR voltage operational modes.
¾ For DDR2, JESD8-15A is the spec.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Electrical Overview 183

¾ Voltage requirements overview


¾ DDR1 is SSTL_2. (Stub Series Terminated Logic)
¾ Main power is VDDQ 2.5 Volts
¾ Ground is VSS
¾ VREF is VDDQ/2
¾ VTT and VREF are 1.25 Volts tracking with VDDQ

¾ DDR2 is SSTL_18. (Stub Series Terminated Logic)


¾ Main power is VDDQ 1.8 Volts
¾ Ground is VSS
¾ VREF is VDDQ/2
¾ VTT and VREF are 0.9 Volts tracking with VDDQ

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Comparator Logic 184

[Diagram: SSTL receive comparator. The data input is compared against VREF (VDDQ/2) by the receive-logic comparator; the signal must swing above VIH(AC)/VIH(DC) and below VIL(DC)/VIL(AC) around VREF. The clock or strobe crossing voltage, VSWING and the rise/fall deltas (Delta TR, Delta TF) are specified between VSS and VDDQ.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Current Parameters 185

¾ JEDEC specifies all of the states that vendors


must use to define the min and max current
drawn by their parts.
¾ Many of the vendors provide tools to help
specify the Total Design Power or TDP.
¾ The following is a list of all the parameters
that must be included.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
IDD Specifications 186

Symbol Conditions
IDD0 Operating one bank active-precharge current;
CKE is high, CS is high between valid commands;
Address bus inputs are switching;
Data bus inputs are switching
IDD1 Operating one bank active-read-precharge current;
BL (Burst Length)=4;
CKE is high, CS is high between valid commands;
Address bus inputs are switching;
Data bus inputs are switching
IDD2P Precharge power down current;
All banks idle;
CKE is low;
Address and control bus inputs are stable;
Data inputs are floating
IDD2Q Precharge quiet standby current;
All banks idle;
CKE is high, CS is high;
Address and control bus inputs are stable;
Data inputs are floating
IDD2N Precharge standby current;
All banks idle;
CKE is high, CS is high;
Address and control bus inputs are switching;
Data inputs are switching

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
IDD Specifications 187

Symbol Conditions
IDD3P Active power down current; (typically broken into fast or slow power down)
All banks open (active);
CKE is low;
Address and control bus inputs are stable;
Data inputs are floating
IDD3N Active standby current;
All banks open (active);
CKE is high, CS is high between valid commands;
Address and control bus inputs are switching;
Data inputs are switching
IDD4W Operating burst write current;
BL (Burst Length) =4
All banks open (active), Continuous burst writes;
CKE is high, CS is high between valid commands;
Address and control bus inputs are switching;
Data inputs are switching
IDD4R Operating burst read current;
BL (Burst Length) =4
All banks open (active), Continuous burst reads;
CKE is high, CS is high between valid commands;
Address and control bus inputs are switching;
Data inputs are switching

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
IDD Specifications 188

Symbol Conditions
IDD5B Burst refresh current;
Refresh command at every tRFC interval;
All banks open (active), Continuous burst reads;
CKE is high, CS is high between valid commands;
Address and control bus inputs are switching;
Data inputs are switching

IDD6 Self refresh current;


CK and CK# are at 0Volts;
CKE is less than or equal to 0.2 volts;
Address and control bus inputs are floating;
Data inputs are floating
IDD7 Operating bank interleave read current;
All banks interleaving reads;
BL (Burst Length)=4
CKE is high, CS is high between valid commands;
Address and control bus inputs are stable during deselects;
Data inputs are switching

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR1 and DDR2 Routing

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Design Challenges for DDR2 190

¾ There are many things to consider when


designing a DDR2 system.
¾ Source synchronous clocking schemes and
how to route them.
¾ ODT as well as Command and Address
termination
¾ Input delay to center DQS to data

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Basic Routing Rules 191

¾ Always refer to the chipset design guide for


lead-in length and DIMM-to-DIMM spacing.
¾ All Command and Address signals must be
end terminated as well as length-matched
within some tolerance. They are all in the
same time domain and are latched with clock.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Basic Routing of Data 192

¾ Data and strobe will be On-Die Terminated


¾ If the system supports x4 DRAMs, then each
group of 4 data lines and their associated
strobe must be length matched. These
signals are source synchronous.
¾ If system does not support x4 DRAMs, then
every byte lane needs to be length matched
to its associated strobe.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR1 and DDR2 Routing 193

¾ CMD and ADDR are terminated on the motherboard.
¾ The Data Bus is terminated on die.
¾ Byte lanes and their associated DQS must be length matched. For x4 chips, nibbles must be length matched with their DQS.
¾ Command and address are also length matched to ensure that all commands get to each DRAM at the same time.

[Diagram: CPU on the FSB to the memory controller, DMI down to the IO controller; two DIMMs with command/address termination on the motherboard. Key: red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Read Overview 194

¾ Traditional DDR1 and DDR2 routing.
¾ The Data Bus is flight-time matched for all data groups.
¾ Command and address are also length matched to ensure that all commands get to each DRAM at exactly the same time.

[Diagram: memory controller and two DIMMs; red is the data bus, blue is the address and command bus.]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Read Overview 195

¾ Read Cycle in DDR2
¾ DRAM controller sends a read command to the second DIMM.
¾ The command propagates to all DRAMs with the same timing due to the length-matched routing.

[Diagram: memory controller and two DIMMs; red is the data bus, blue is the address and command bus.]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Read Overview 196

¾ Read Cycle in DDR2
¾ DRAM controller sends a read command to the second DIMM.
¾ The command propagates to all DRAMs with the same timing due to the length-matched routing.
¾ After RL, the DRAMs drive the requested data back to the controller.

[Diagram: memory controller and two DIMMs; red is the data bus, blue is the address and command bus.]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Read Overview 197

¾ Read Cycle in DDR2
¾ DRAM controller sends a read command to the second DIMM.
¾ The command propagates to all DRAMs with the same timing due to the length-matched routing.
¾ After RL, the DRAMs drive the requested data back to the controller.
¾ The read data arrives at the controller at the same time.

[Diagram: memory controller and two DIMMs; red is the data bus, blue is the address and command bus.]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Write Overview 198

¾ How does the memory controller know


¾ when to launch the write data to the clock
¾ and ensure tDSH, tDSS and tDQSS are satisfied when the write data
arrives at the DRAM?

¾ One way is to rely on the flight-time matching between the clock


routing and the data bus routing on the mother board and on the
DIMM.
¾ In other words, if the write data is launched correctly at the
controller, it hopefully will arrive correctly at the DRAM

¾ This works for DDR2 but not for DDR3

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR2 Write Overview 199

¾ Write Cycle in DDR2
¾ DRAM controller sends a Write command to the second DIMM.
¾ The command propagates to all DRAMs with the same timing due to the length-matched routing.

[Diagram: memory controller and two DIMMs; red is the data bus, blue is the address and command bus.]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Write Overview 200

¾ Write Cycle in DDR2
¾ DRAM controller sends a Write command to the second DIMM.
¾ The command propagates to all DRAMs with the same timing due to the length-matched routing.
¾ The write data follows the command by one clock, synchronously.

[Diagram: memory controller and two DIMMs; red is the data bus, blue is the address and command bus.]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR2 Write Overview 201

¾ Write Cycle in DDR2
¾ DRAM controller sends a Write command to the second DIMM.
¾ The command propagates to all DRAMs with the same timing due to the length-matched routing.
¾ The write data follows the command by one clock, synchronously.
¾ The write data reaches all of the DRAMs at the same time.

[Diagram: memory controller and two DIMMs; red is the data bus, blue is the address and command bus.]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
On-Die Termination

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
On-Die Termination 203

• ODT Example: ODT reduces reflections from the stubs of DIMMs not being
addressed during write cycles. ODT is also used by most DRAM controllers
during read cycles. Settings are not intuitive when more than two DIMMs are
used. There are three settings: 50, 75 and 150 Ohms.
• One ODT signal per Rank.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Signal Integrity with ODT 204

[Waveform comparison with and without ODT: more noise margin, approximately 400 mV of gain.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Off Chip Driver Calibration 205

¾ Off Chip Driver Calibration was implemented on


the DIMMs when DDR2 was introduced. The drive
strength granularity was set too large on most
designs so it was not recommended that the
chipset run the calibration sequence. Resurrected
in DDR3.
¾ This feature is not used in DDR2.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Errors

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Errors and Error Handling 207

Two basic types of error:


¾ Hard Errors – These exist in the cell array and are
thought to be caused by a bad cell, damage to the
DRAM etc. These errors account for less than
10% of the errors in a DDR subsystem.
¾ Soft Errors – Occur when electrical anomalies are
present on the transmission line. These could be
caused by poor design and are expected in higher
speed systems.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Error Checking and Correction (ECC) 208

¾ Error Checking and Correction is used in


systems that require better reliability.
¾ Many servers use ECC DIMMs to guarantee
data integrity.
¾ This is done by adding an extra 8 bits for
every 64 bits. These 8 bits are referred to as
the check bits.
¾ This feature is referred to as SEC (Single
Error Correction) and DED (Double Error
Detection)

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ECC 209

¾ ECC is totally system dependent. What a


system does when a double or multi-bit error
is detected is up to the system designers.
¾ Since most errors are soft errors, a retry may
fix the problem.
¾ Many controllers will just allow the error to
propagate and cause a “blue screen of
death”.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Dual-Channel ECC 210

¾ Many servers use two 64-bit channels with 16


bits of ECC.
¾ This enables Chip Kill (IBM) or S4EC and
D4ED (Intel)
¾ The feature is able to correct 4 consecutive, nibble-aligned bits: for
example, the full output of one of the x4 DRAMs that most servers use.
¾ If one of the DRAMs were to go bad, the
controller could correct the errors on the fly
and alert the administrator that errors were
happening.
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
ECC Initialization 211

¾ This is a mandatory step for ECC-based


systems, especially those with automatic
scrubbing that reads memory.
¾ All of memory is written with correct ECC
before being read.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ECC and Scrubbing 212

¾ Scrubbing can be done while the system is


running to correct single-bit errors (which are
correctable).
¾ This reduces the chances of multi-bit errors
(which might not be correctable).
¾ The entire DRAM subsystem is periodically
scrubbed while making sure not to interfere
with system performance.
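As an illustration of the idea, here is a hedged sketch of a software scrub loop; real scrub engines usually live in the memory controller hardware, and the read/write hooks and pacing here are assumptions.

    /* Hedged sketch of a background scrub loop as described above. */
    #include <stdint.h>
    #include <stdbool.h>

    #define CACHE_LINE  64u

    /* Hypothetical hooks: read returns true if a correctable (single-bit)
     * error was detected and corrected on the way in. */
    extern bool read_line_with_ecc(uint64_t addr, uint8_t line[CACHE_LINE]);
    extern void write_line(uint64_t addr, const uint8_t line[CACHE_LINE]);
    extern void throttle(void);   /* yield so scrubbing does not hurt perf   */

    static void scrub_region(uint64_t base, uint64_t size)
    {
        uint8_t line[CACHE_LINE];

        for (uint64_t addr = base; addr < base + size; addr += CACHE_LINE) {
            if (read_line_with_ecc(addr, line))
                write_line(addr, line);   /* write back the corrected data   */
            throttle();                   /* pace the scrub                  */
        }
    }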

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DIMM Parity 213

¾ Parity is for Registered DIMMs only.


¾ This feature is added to the register on DDR2
DIMMs at speeds of 533MT/s and higher.
¾ When a command is sent to the DIMM, the
register performs a parity check on the
Command and Address bus.
¾ If a parity error is detected the QERR# out pin
is asserted low.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Additional DDR3 Topics

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Timing and Electrical Differences

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Timing and Electrical Differences 216

¾ tCCD, the CAS-to-CAS delay, is no longer variable. In DDR2 it was half
the burst length; in DDR3 it is fixed at 4 clocks (BL = 8).
¾ The transmitter impedance went up to approximately 34 ohms (DDR2 was
approximately 18 ohms), which equates to less power consumption.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Fly-by Routing Read Example

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
DDR3 Read Overview 218

¾ DDR3 looks completely different?


¾ How different?
¾ The following slides show an example of the
read data arrival time at the memory
controller in a DDR3 UDIMM system.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Read Overview 219

¾ New fly-by routing
¾ The data bus is still length matched within each data group to its associated strobe. There does not appear to be any implied length matching across the data groups.

[Diagram: memory controller and a DDR3 DIMM with the command/address/clock bus routed fly-by from DRAM to DRAM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Read Overview 220

¾ Read Cycle
¾ The Read command is sent to the DIMM over the address and command bus.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Read Overview 221

¾ Read Cycle
¾ The Read command is sent to the DIMM over the address and command bus.
¾ The command gets to the first DRAM on the new fly-by routing chain.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Read Overview 222

¾ Read Cycle
¾ The Read command is sent to the DIMM over the address and command bus.
¾ The command gets to the first DRAM on the new fly-by routing chain.
¾ The command propagates to the remaining DRAMs.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Read Overview 223

[Diagram: the fly-by command/address bus is terminated on the DIMM at the end of the chain; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Read Overview 224

¾ Read Cycle cont.
¾ After RL, data begins streaming back from the first DRAM.
¾ Data propagates from the remaining DRAMs in the order that the command was received.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Read Overview 225

¾ Read Cycle cont.
¾ Data propagates from the remaining DRAMs in the order that the command was received.
¾ Read data from the first byte lane arrives.
¾ The remaining data lanes arrive one after the other.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Read Overview 226

¾ Read Cycle cont.
¾ Data propagates from the remaining DRAMs in the order that the command was received.
¾ Read data from the first byte lane arrives.
¾ The remaining data lanes arrive one after the other.
¾ What is the problem with this?

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Read Calibration

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Read Calibration 228

¾ To overcome this skewing of data the


memory controller must calibrate the read
data.
¾ Most of the Read calibration will be done by
the memory controller.
¾ There are predetermined patterns built into
the DRAM’s Multi-Purpose Register (MPR) to
facilitate the calibration. The read de-skewing
must be done on all byte lanes of every rank
every time the system reboots.
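A hedged sketch of what per-byte-lane read training against the MPR pattern might look like inside the controller; the delay-line resolution and the capture hooks are assumptions, and the expected 1-0-1-0 pattern matches the example on the next slide.

    /* Hedged sketch: per-byte-lane read training against the MPR pattern. */
    #include <stdint.h>
    #include <stdbool.h>

    #define DELAY_STEPS 64          /* assumed resolution of the DQS delay  */

    extern void    set_read_dqs_delay(int lane, int step);
    extern uint8_t capture_mpr_beat(int lane);   /* one captured data beat  */

    /* Returns the chosen delay step for one byte lane, or -1 on failure. */
    static int train_read_lane(int lane)
    {
        int first_good = -1, last_good = -1;

        for (int step = 0; step < DELAY_STEPS; step++) {
            set_read_dqs_delay(lane, step);
            bool good = true;
            for (int beat = 0; beat < 8; beat++)     /* expect 1,0,1,0,...  */
                if (capture_mpr_beat(lane) != (beat & 1 ? 0x00 : 0xFF))
                    good = false;
            if (good) {
                if (first_good < 0)
                    first_good = step;
                last_good = step;
            }
        }
        if (first_good < 0)
            return -1;
        /* Center the capture point in the passing window. */
        int center = (first_good + last_good) / 2;
        set_read_dqs_delay(lane, center);
        return center;
    }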

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Read Calibration 229

During read calibration the controller initiates a predetermined pattern via


MR3. The pattern can continue as long as the controller needs to calibrate.

[Timing diagram: with MR3 set for MPR access, a Read at T1 returns the predetermined 1-0-1-0 pattern on DQ after the DQS preamble, for as long as the controller needs to calibrate.]

MPR is the new multi-purpose register


Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Fly-by Routing Write Example

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
DDR3 Write Overview 231

¾ As in DDR1 and DDR2, DDR3 SDRAMs require a


certain phase relation between the strobe signals
and the clock signals during writes.
¾ This results in a certain write data launch time requirement
at the memory controller.
¾ In DDR1 and DDR2, the write data launch time is
equal for all byte lanes of a DIMM.
¾ This is achieved via flight time length matching on the
mother board and on the DIMM.
¾ Depending on the channel routing, it may even be equal
among 2 DIMMs.
¾ In DDR3, the write data launch time will be different
¾ across the byte lanes for a single DIMM
¾ from one DIMM to another DIMM

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Write Overview 232

¾ Write Cycle
¾ The Write command is sent to the DIMM over the address and command bus.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Write Overview 233

¾ Write Cycle
¾ The Write command is sent to the DIMM over the address and command bus.
¾ The command reaches the first DRAM due to the fly-by routing.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Write Overview 234

¾ Write Cycle
¾ The Write command is sent to the DIMM over the address and command bus.
¾ The command reaches the first DRAM due to the fly-by routing.
¾ The command propagates to the remaining DRAMs.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Write Overview 235

¾ Write Cycle
¾ The Write command is sent to the DIMM over the address and command bus.
¾ The command reaches the first DRAM due to the fly-by routing.
¾ The command propagates to the remaining DRAMs.
¾ After WL, the controller starts driving the first data lane.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Write Overview 236

¾ Write Cycle
¾ The Write command is sent to the DIMM over the address and command bus.
¾ The command reaches the first DRAM due to the fly-by routing.
¾ The command propagates to the remaining DRAMs.
¾ After WL, the controller starts driving the first data lane.
¾ The other data lanes follow in order.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Write Overview 237

¾ Write Cycle
¾ The Write command is sent to the DIMM over the address and command bus.
¾ The command reaches the first DRAM due to the fly-by routing.
¾ The command propagates to the remaining DRAMs.
¾ After WL, the controller starts driving the first data lane.
¾ The other data lanes follow in order.
¾ Data does not arrive at all of the DRAMs at the same time.

[Diagram: memory controller and DDR3 DIMM; red is the data bus, blue is the address and command bus.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
DDR3 Write Leveling 238

¾ The write data launch time at the controller varies


¾ from byte lane to byte lane
¾ from DIMM to DIMM
¾ from system to system.
¾ Therefore, it must be calibrated at the memory
controller.

¾ How can this calibration be done in a DDR3 System?


¾ How does the DDR3 SDRAM support this
calibration?

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Write Leveling

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
DDR3 Write Leveling 240

If each data lane is being launched at different times how can the controller know when to launch them?

By calibrating each byte lane on every DIMM utilizing a phase detector available inside the DDR3 SDRAM during "Write Leveling Mode".

[Diagram: the controller drives CK to the DIMM and a separate DQS/DQ group to each DRAM; each DRAM contains the phase detector used for write leveling.]
Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Write Leveling 241

¾ The phase detector on the DDR3 SDRAM can be understood as a flip-flop:


¾ DRAM samples the clock status with rising edge of the strobe
¾ and provides the sample result on the DQ pins after an asynchronous delay of tWLO.
¾ No strobe information is driven out with the DQ information since the strobes are used as
trigger input for the flip-flop.
¾ Like every flip-flop, the phase detector flip-flop on the DDR3 SDRAMs has a requirement for
minimum setup and hold times.
¾ Besides the pure flip-flop timing requirements, the imperfect delay matching between clock pads
to flip-flop and strobe pads to flip-flop imposes additional timing constraints.
¾ As a result, the clock signals must be stable.

[Diagram: inside the DRAM, DQS clocks a flip-flop whose D input samples CK; the sampled value (Q) is driven back to the controller on DQ.]
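A hedged sketch of the resulting write-leveling loop for one byte lane; the delay-line, strobe-pulse and feedback hooks are assumptions, not a real controller API.

    /* Hedged sketch: write leveling for one byte lane using the DRAM's
     * phase-detector feedback described above. */
    #include <stdint.h>

    #define WL_STEPS 16            /* e.g. 1/16 tCK resolution per the text */

    extern void set_dqs_launch_delay(int lane, int step);
    extern void pulse_dqs(int lane);            /* drive a DQS rising edge  */
    extern int  read_wl_feedback(int lane);     /* sampled CK level on DQ   */

    /* Sweep the DQS launch delay until the sampled clock flips from 0 to 1;
     * that edge marks DQS aligned with the CK rising edge at the DRAM. */
    static int level_write_lane(int lane)
    {
        int prev = -1;
        for (int step = 0; step < WL_STEPS; step++) {
            set_dqs_launch_delay(lane, step);
            pulse_dqs(lane);
            int sample = read_wl_feedback(lane);
            if (prev == 0 && sample == 1)
                return step;        /* 0 -> 1 transition: alignment found   */
            prev = sample;
        }
        return -1;                  /* no transition found within one tCK   */
    }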

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 Write Leveling 242

¾ In DDR2, the write data launch time is equal for all byte lanes of a DIMM,
sometimes even among two DIMMs within a channel
¾ This is achieved through flight-time length matching on the mother board and on the
DIMM.
¾ In DDR3, this is completely different due to the fly-by
command/address/control/clock bus topology on the DIMM:
¾ The write data launch time is different across the byte/nibble lanes of a DIMM.
¾ The write data launch time is different from one DIMM to another DIMM.

¾ Therefore, a DDR3 memory controller must be able to calibrate the launch


time for every byte/nibble lane for every DIMM
¾ The recommended timing resolution at the memory controller is 1/16 tCK
¾ The launch times may be spread over the boundaries of a clock cycle at the
memory controller.
¾ A DDR3 memory controller
¾ must store different launch time settings per byte/nibble lane and per slot
¾ must dynamically switch between slot settings!

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR3 On-Die Termination

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
DDR3 ODT 244

¾ ODT is designed to improve signal integrity of the memory


channel by allowing the memory controller to independently turn
on/off termination resistance at the DQ bus interface for any or all
SDRAM devices via the ODT pin.
¾ ODT is implemented on the DDR3 SDRAM as selectable, center-
tapped termination resistance on the following DQ bus pins:
¾ x16 DRAMs: DQU, DQL, DQSU, DQSU#, DQSL, DQSL#,
DMU, DML
¾ x4 and x8 DRAMs: DQ, DQS, DQS#, DM
– and TDQS, TDQS# for x8 DRAMs, when enabled via
A11=1 in MR1

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ODT Circuitry 245

¾ Conceptual Circuit Diagram


¾ ODT can be compensated using an external resistor
¾ It is not clearly stated if actual silicon resistors are used or it is the Ron
of the Pull Up and Pull Down transistors.

[Conceptual circuit: banks of RTT pull-up legs to VDDQ and matching RTT pull-down legs to VSSQ, switched by PU/PD FET controls from the ODT control circuitry, form a center-tapped termination at the DQ pad against a 240-ohm reference.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ODT Changes 246

¾ The following ODT modes are available:


¾ Synchronous ODT
¾ Selected whenever the DLL is turned on and locked:
– Active mode
– Idle mode with CKE high
– Active power down mode (regardless of MR0 bit A12)
– Precharge power down mode if DLL is enabled during
precharge power down by MR0 bit A12.
¾ Asynchronous ODT
¾ Selected when DRAM runs in DLL on mode, but DLL is
temporarily disabled.
– Precharge power-down with slow exit (selected by MR0 bit
A12).

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ODT Write Example Synchronous Mode 247

¾ Write to furthest slot from Controller


¾ RTT_Nom 20 Ω
¾ RTT_WR 120 Ω

[Diagram: the DRAM controller writes to the receiving (far) DIMM, which terminates the bus with RTT_WR = 120 ohms; the deselected DIMM terminates with RTT_Nom = 20 ohms to VTT.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ODT Changes 248

¾ In dual slot DDR3 systems, the slot which is not accessed


during a read or write transaction needs to terminate the bus
with low impedance (e.g. 20 or 30 Ohms).
¾ When writing to the slot which is configured to terminate with
low impedance, a higher impedance is required to achieve open
data eye.
¾ Thus, not only two termination options are needed (RTT on/off),
but three options must be available without MRS interaction:
¾ RTT turned off
¾ RTT = Low impedance during reads/writes from/to the deselected
slot
¾ RTT = High impedance during writes to the active slot

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ODT Register Settings 249

¾ The values 1-3 can only be used for RTT_Nom on the deselected Ranks
during writes.
¾ Values 4 and 5 can be used on the deselected Ranks during reads
(although not specifically stated by JEDEC).

Mode Register 1 (MR1)
A9 A6 A2   RTT_Nom (nominal)             RTT_Nom if RZQ = 240 Ω
0  0  0    disable (also dyn. off ODT)   -
0  0  1    RZQ / 4                       60
0  1  0    RZQ / 2                       120
0  1  1    RZQ / 6                       40
1  0  0    RZQ / 12                      20
1  0  1    RZQ / 8                       30
1  1  0    RFU                           RFU
1  1  1    RFU                           RFU
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
ODT Register Settings 250

¾ The non-zero values (RZQ/4 and RZQ/2) are used for the selected Rank
during writes only.

Mode Register 2 (MR2)
A10 A9   RTT_WR (during WR)                              RTT_WR if RZQ = 240 Ω
0   0    Dynamic ODT off: a Write does not affect RTT    -
0   1    RZQ / 4                                         60
1   0    RZQ / 2                                         120
1   1    RFU                                             RFU
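For illustration, here is a hedged sketch of packing these encodings into the MR1 (A9/A6/A2) and MR2 (A10/A9) bit positions; the other mode-register fields are left alone, and the 3-bit / 2-bit codes follow the table order above.

    /* Hedged sketch: set RTT_Nom and RTT_WR fields in MR1/MR2 values. */
    #include <stdint.h>

    /* 3-bit RTT_Nom code (0=off, 1=RZQ/4, 2=RZQ/2, 3=RZQ/6, 4=RZQ/12, ...) */
    static uint16_t mr1_with_rtt_nom(uint16_t mr1, unsigned code)
    {
        mr1 &= ~((1u << 9) | (1u << 6) | (1u << 2));   /* clear A9, A6, A2 */
        if (code & 0x4) mr1 |= 1u << 9;
        if (code & 0x2) mr1 |= 1u << 6;
        if (code & 0x1) mr1 |= 1u << 2;
        return mr1;
    }

    /* 2-bit RTT_WR code (0=dynamic ODT off, 1=RZQ/4, 2=RZQ/2). */
    static uint16_t mr2_with_rtt_wr(uint16_t mr2, unsigned code)
    {
        mr2 &= ~((1u << 10) | (1u << 9));              /* clear A10, A9    */
        mr2 |= (uint16_t)((code & 0x3) << 9);
        return mr2;
    }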

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ZQ Calibration (OCD)

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
ZQ Calibration 252

¾ The Off-Chip Driver (OCD) calibration protocol


for DDR2 was not used because the granularity
of the drive strength was too large.
¾ Both OCD and ODT are calibrated and adjusted
against an external reference resistor connected
to the ZQ pin of DDR3 chips.
¾ Calibration is triggered by a calibration command
from the memory controller.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
ZQ Calibration 253

ZQ calibration is done after reset and is repeated periodically to compensate for voltage and temperature fluctuations. DRAM vendors recommend that calibration be done every 128 ms. Below is an example of an 8-bit Rcomp binary-legged circuit.

[Circuit: binary-weighted PMOS legs from VDDQ and matching NMOS legs to VSSQ (weights 1, 2, 4, ... 128, 256), selected by Rcomp[7:0], drive the output pad when the transmit signal is active.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Reset

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Asynchronous Reset 255

¾ DDR3 DRAM implements a RESET# pin which is an asynchronous external control


signal of the DDR3 SDRAM.
¾ This RESET# signal forces DRAM from operational or non-operational conditions into
a defined state.
¾ RESET# can be applied asynchronously to an ongoing DRAM operation and can be
asserted at any time.
¾ After RESET# is asserted, data in the DRAM may be lost (though RESET# is not
designed to destroy the data) and the DRAM must be re-initialized, which includes
(but is not limited to) reloading the mode registers and a DLL reset.
¾ Note for Application: In order to maintain data inside the DDR3 SDRAMs, the RESET# signal must
not be applied upon EXIT from S3 (suspend to RAM).
¾ PCI RST# should not be used.
¾ It is mandatory for system to hold RESET# asserted at beginning of power ramp until
power supplies reach stability.
¾ Minimum pulse width of RESET# at DRAM input:
¾ At power ramp up: min. 200 µs
¾ After power ramp up, during stable supply conditions: min. 100 ns
¾ Since there may be many DRAMs connected to the RESET# signal, system needs to generate a
pulse with a larger width to ensure minimum pulse width is achieved at DRAM RESET# input.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
On-DIMM Address Mirroring

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
On-DIMM Address Mirroring 257

¾ What is the motivation?


[Diagram: front and top views of a DIMM showing pin 1 and the SPD location; DRAMs are mounted on both the front and the back of the module.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
On-DIMM Address Mirroring 258

Problem: a traditional (un-mirrored) DIMM has longer stubs and more complex routing because the top and bottom DRAMs use the same, un-mirrored ball-out.

Solution: a mirrored DIMM swaps selected address and bank pins between the top and bottom DRAMs, which simplifies the routing and decreases the stub length.

[Diagram: top and bottom DRAMs on the DIMM; in the un-mirrored case signals such as A0 and the A5/A6 pair must cross between the two ball-outs, while the mirrored case routes straight through.]

Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
On-DIMM Address Mirroring 259

¾ Only these pins may be mirrored:
¾ BA0 and BA1
¾ A3 and A4
¾ A5 and A6
¾ A7 and A8
¾ The controller must be told whether a DRAM is mirrored so that it can
access the Mode Registers correctly. The SPD contains this information.
The DDR3 SPD spec was released in June 2008.

[Ball-out diagram: x4 DDR3 SDRAM, looking through the package; the address, bank, command, clock, power and ground balls are laid out in a 9-column grid.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DRAM Controller Basics

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Controller Basics 261

¾ Controller design is completely


implementation-specific. JEDEC specifies AC
timings and DC levels as well as implied DQS
receive delay.
¾ There are some common building blocks that
a controller must have and some blocks that
are optional. The focus in the chapter will be
on PC-based DIMM controllers.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Required Blocks 262

¾ Address and Control Mux


¾ Refresh Timer
¾ PLL
¾ Timing Generator
¾ Control Register
¾ Read and Write Buffer
¾ IO Buffer Pads

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
DDR Controller Block Diagram 263

[Block diagram: requests enter the Address/Control Mux (with row compare) and the Read and Write Buffer; the Refresh Timer, PLL (differential CK out), Timing/State Machine and Control Registers feed the Address/Control pads and the Data/DQS pads.]

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Address and Control Mux 264

¾ This unit uses the physical address to create


the Bank address, Row address and Column
address.
¾ It also drives the correct CS# and generates
the command sequence.
¾ This unit also does Row compare to see what
Rows need to be open (active).

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
PLL 265

¾ The PLL receives an input clock from an


external source and delivers multiple outputs
to the functional units of the controller.
¾ The PLL makes multiple copies to send to the
DIMM.
¾ 3 copies for each Unbuffered DIMM
¾ One copy for each Registered DIMM

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Timing/State Generator 266

¾ The Timing Generator connects directly to the


IO Pads.
¾ This unit holds all of the timing parameters
collected during initialization.
¾ Typically this unit would work with a
calibration unit to optimize timing and turning
the IO Pads on and off.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Control Register 267

¾ These are PCI configuration registers as well


as MMIO registers that most of the units
access to gain information about the memory
that is attached.
¾ BIOS programs these registers during
initialization.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Read and Write Buffer 268

¾ The read and write queues hold data going


into and out of memory.
¾ Typically this is a combining queue that can
merge the changed bits with the existing bits
and write them back to memory when
bandwidth is available.
¾ Each buffer entry is a cache line in size.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
IO Pads 269

¾ The IO Pads are the analog circuits used to


drive the buses.
¾ Rcomp (resistance compensation) and Scomp (slew-rate compensation) are found in these circuits.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Performance Enhancement Blocks 270

¾ Additional units that are system dependent:


¾ Arbitration/Gearing for multiple request inputs and
different clock domains
¾ ECC generation and checking
¾ Address and Command Parity checking usually
accompanied by a retry buffer
¾ Pad Calibration Unit

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Addresses

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
What is the Address? 272

Beware that the address that the


programmer uses (the offset) is not the
physical address on the FSB nor on cHT, and
the physical address is not the address on
the memory bus.

Beware the three “interleaved addresses”:


• Intel’s toggle-mode addressing
• Dual channels ganged
• NUMA interleaving

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
NUMA Addresses 273

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Dual-channel Address Schemes 274

Dual channel can be done several different ways.


¾ Interleave and Lock Step (Ganged)
¾ If like pairs of DIMMs are populated across each channel.
¾ The MCH will interleave cache lines (Interleave) or partial
cache lines (Lock Step) across each channel.
¾ Asymmetric
¾ If the DIMMs are not exact pairs, the MCH maps the addresses
sequentially through memory, from the furthest DIMM to the nearest
in one channel and then the other.
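A hedged sketch of the ganged, cache-line-interleaved case; the 64-byte granule and the simple bit-slice decode are assumptions, since real MCHs use programmable interleave rules.

    /* Hedged sketch: cache-line interleave across two ganged channels. */
    #include <stdint.h>

    #define LINE_SHIFT 6   /* 64-byte cache line */

    struct chan_addr {
        unsigned channel;  /* 0 or 1                                 */
        uint64_t offset;   /* address presented to that channel      */
    };

    static struct chan_addr interleave_decode(uint64_t phys)
    {
        struct chan_addr out;
        /* Alternate whole cache lines between the two channels. */
        out.channel = (phys >> LINE_SHIFT) & 1;
        /* Remove the channel-select bit so each channel sees a dense space. */
        uint64_t line = phys >> (LINE_SHIFT + 1);
        out.offset    = (line << LINE_SHIFT) | (phys & ((1u << LINE_SHIFT) - 1));
        return out;
    }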

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Example of Address Translation 275

Processor's physical address -> memory address and control signals:

A47:A32  ->  Address Decoder  ->  CS0# / CS1# / CS2# / CS3#
A31:A17  ->  Row Address
A16:A14  ->  Bank Address
A13:A3   ->  Column Address

(Memory address bit 15 is not used in this example.)

Example for 4GB, made up of 2Gb x4 DRAM chips, on one 64-bit channel, as suggested in AMD's BKDG.

See handout showing 8GB DDR2 with Intel 955 MCH.
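A hedged sketch of the address split shown above; the chip-select decode is a simplified placeholder for the real base/limit registers in the controller.

    /* Hedged sketch: physical address -> CS / row / bank / column. */
    #include <stdint.h>

    struct dram_addr {
        unsigned cs;      /* chip select 0..3       */
        unsigned row;     /* A31:A17, 15 row bits   */
        unsigned bank;    /* A16:A14, 3 bank bits   */
        unsigned col;     /* A13:A3, 11 column bits */
    };

    static struct dram_addr decode(uint64_t phys)
    {
        struct dram_addr d;
        /* Placeholder CS decode: assume four equal 4GB ranges. */
        d.cs   = (unsigned)((phys >> 32) & 0x3);
        d.row  = (unsigned)((phys >> 17) & 0x7FFF);   /* A31:A17 */
        d.bank = (unsigned)((phys >> 14) & 0x7);      /* A16:A14 */
        d.col  = (unsigned)((phys >> 3)  & 0x7FF);    /* A13:A3  */
        return d;
    }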


Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
Alternative DRAM Solutions

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
Fully-Buffered DIMM 277

¾ Allows more DIMMs (more memory) per channel while maintaining signal integrity
¾ Existing RDIMM solutions only support two DIMMs per channel. FBDIMMs allow more.
¾ Address, Control and Data buffered
¾ Data buffers isolate the DRAM voltage and data stubs
¾ All interface signalling is differential
¾ Performance
¾ Faster processors require higher memory throughput
¾ Simultaneous Read/Writes to two Fully-Buffered DIMMs
¾ Up to 36 devices behind each buffer
¾ 256, 512, 1 & 2 Gbit DRAM support
¾ The input clock is ½ the DRAM base clock; the bit rate per lane is 6x the DRAM data rate
(e.g. 533 MT/s DRAM gives 3.2 Gbit/s per lane).
¾ Cost Sensitive Market
¾ No DRAM changes, uses commodity DRAM chips
¾ DDR2 DIMM form factor/connector with industry available reference design and Gerbers

[Diagram: a narrow point-to-point interface between the MCH and the FBDIMM buffers: 24 unidirectional differential pairs, 10 pairs southbound and 14 pairs northbound.]
Min Huang(min.huang@ lecroy.com)
Do Not Distribute .com © 2009
GDDR 278

¾ Used in high-end graphics cards.


¾ GDDR1 and GDDR2 (backed by NVIDIA)
¾ GDDR3 (backed by ATI)
¾ GDDR4 first samples out by Samsung
¾ Don’t be confused: GDDR3 is not JEDEC DDR3.
¾ GDDR uses faster clocks and lower voltages using specially
binned production DRAM parts as well as changing internal
timing parameters.
¾ Interfaces are proprietary from GDDR to the GPU.
¾ One architecture routes 2 sets of address and command
lines, one to each rank, with a shared data bus.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
RL DRAM 279

¾ Reduced Latency DRAM is a Micron product


aimed at cache-like memory subsystems for
communications and graphics.
¾ RL DRAM is a spin off of DDR2.
¾ The key difference is that the time from the request to the data
transfer is much shorter, and RL DRAM is more cost-effective than
other high-speed solutions.

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
XDR 280

¾ XDR and XDR2 are follow-ons to Rambus RDRAM.


¾ Used in high-end, compact systems such as game
stations and some graphics cards.
¾ Bidirectional (16 lane) differential Rambus signaling
levels (200mV)
¾ Programmable ODT
¾ Adaptive impedance compensation
¾ Up to 4 bank interleaved transactions at once.
¾ Early read-after-write capability for higher bandwidth.
¾ Zero overhead for refresh

Min Huang(min.huang@ lecroy.com)


Do Not Distribute .com © 2009
Thank you!
Please send questions to eLearning@mindshare.com

Min Huang(min.huang@ lecroy.com)


Do Not Distribute mindshare.com © 2009
