Introduction To FPGA HSIO


Introduction to FPGA

HIGH SPEED IO
“Microsoft has had clear competitors in the past. It’s a good thing we have
museums to document that.” ~ Bill Gates

2
[Courtesy] https://percepticon.wordpress.com/material/open-data/internet-diffusion-and-usage-statistics/
Objective of this Seminar
High Speed I/O Problem Statement
Vocabulary
Techniques
Design Flow with Intel FPGAs
Hands on Lab

3
The World is Going Serial – PC back panel
Old School (90s): Parallel Interfaces
Newer School (2010s): Serial Interfaces (Ethernet, USB, HDMI, SATA)

4
Why not Parallel interfaces?
[Figure: Chip A on PCB A sends data and a clock through connectors (the “channel”) to Chip B on PCB B, with a clock source driving the link. The timing inset marks tsetup, thold, and the data valid window, and contrasts a fast timing transition with a slow one.]

Slow timing transitions and different wire lengths make meeting the data valid window difficult:
Bit 1
Bit 2: misses setup time
Bit n: misses hold time

Differences in setup/hold window due to differences in board trace lengths
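
As a rough illustration of the point above, the sketch below estimates how trace length mismatch eats into the timing budget of a parallel bus. The propagation delay per millimetre, the trace lengths, and the setup/hold figures are all assumed values chosen for illustration, not numbers from this seminar.

```python
# Assumed propagation delay for a PCB trace; real boards vary with stackup.
PS_PER_MM = 6.7  # hypothetical figure, roughly 170 ps per inch

def arrival_skew_ps(trace_lengths_mm):
    """Spread in arrival time between the shortest and longest trace."""
    delays = [length * PS_PER_MM for length in trace_lengths_mm]
    return max(delays) - min(delays)

def timing_margin_ps(bit_period_ps, skew_ps, setup_ps, hold_ps):
    """Margin left in one bit period after skew, setup, and hold are removed."""
    return bit_period_ps - skew_ps - setup_ps - hold_ps

lengths_mm = [50.0, 62.0, 80.0]       # hypothetical lengths for bit 1, bit 2, bit n
skew = arrival_skew_ps(lengths_mm)    # 30 mm of mismatch

margin_slow = timing_margin_ps(10_000, skew, 500, 500)  # 100 MHz parallel bus
margin_fast = timing_margin_ps(1_000, skew, 500, 500)   # 1 GHz parallel bus

print(f"skew = {skew:.0f} ps")
print(f"margin at 100 MHz = {margin_slow:.0f} ps")
print(f"margin at 1 GHz   = {margin_fast:.0f} ps")
```

With these assumed numbers the same board that works comfortably at 100 MHz has negative margin at 1 GHz, which is the problem the slide describes.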

5
Dynamic Phase Alignment (DPA) vs Clock and Data Recovery (CDR)
SERDES with DPA

[Figure: TX (parallel to serial) sends data plus a separate TX clock to RX (serial to parallel).]

Sending this clock with the data is problematic! This technique caps out at roughly 3 Gbps.

CDR: “Transceiver” in FPGA Terminology

[Figure: TX (parallel to serial) sends data with an embedded clock to RX (serial to parallel).]

This technique is currently working up to 58 Gbps in Intel Stratix 10.

Serializer/Deserializer (SERDES) is commonly used to describe both techniques. Intel PSG calls the CDR-based version a “transceiver”.

6
Stages of Data transfer

7. Application
6. Presentation
5. Session
4. Transport
3. Network
2. Datalink
1. Physical

7
LVDS Dynamic Phase Aligner

This technique is used in many Intel PSG families and caps out at roughly 3 Gbps.
LVDS = Low-Voltage Differential Signaling, a special type of I/O cell that uses a small differential voltage swing and can run at high data rates.

8
Centering the clock with DPA circuitry – 8 tap PLL
[Figure: data bits D0 and D1 sampled by eight PLL clock phases.]
Degree shift: 0, 45, 90, 135, 180, 225, 270, 315
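
The idea of centering the clock with eight PLL taps can be sketched as picking the tap closest to the middle of the data eye. This is only a model: it assumes the eye center is already known as a phase in degrees, whereas real DPA circuitry has to find it by sampling the incoming data. The function names are hypothetical.

```python
PHASES_DEG = [0, 45, 90, 135, 180, 225, 270, 315]  # the eight PLL taps

def circular_distance_deg(a, b):
    """Shortest angular distance between two phases on a 360 degree circle."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def pick_dpa_phase(eye_center_deg):
    """Return the PLL tap closest to the center of the data eye."""
    return min(PHASES_DEG, key=lambda p: circular_distance_deg(p, eye_center_deg))

for center in (100, 200, 350):
    print(center, "->", pick_dpa_phase(center))
```

With taps every 45 degrees, the chosen phase is never more than 22.5 degrees away from the ideal sampling point.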

9
Rates beyond DPA? CDR – high precision clock phase shift!
[Figure: self synchronous link showing the TX buffer, RX buffer, and deserializer.]

Six main blocks to a self synchronous interface:


1. Serializer (Parallel-to-Serial Conversion)
2. TX Buffer
3. Channel/Transmission Lines
4. RX Buffer
5. Deserializer (Serial-to-Parallel Conversion)
6. CDR: Clock Data Recovery

Question: Hey where’s the clock? How can you get the data to align across the channel?

Answer: CDR (Clock and Data Recovery)
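
Blocks 1 and 5 from the list above can be modeled in a few lines. This is a minimal sketch with hypothetical function names; the LSB-first bit order and the 8-bit word width are assumptions made for the example.

```python
def serialize(words, width=8):
    """Parallel-to-serial: emit each word one bit at a time, LSB first (assumed order)."""
    bits = []
    for word in words:
        for i in range(width):
            bits.append((word >> i) & 1)
    return bits

def deserialize(bits, width=8):
    """Serial-to-parallel: regroup the bit stream into words, dropping any partial word."""
    words = []
    for start in range(0, len(bits) - width + 1, width):
        word = 0
        for i in range(width):
            word |= bits[start + i] << i
        words.append(word)
    return words

stream = serialize([0xA5, 0x3C])
print(deserialize(stream))        # aligned receiver recovers the original words
print(deserialize(stream[1:]))    # receiver that starts one bit late recovers garbage
```

The second call shows why the receiver needs both a recovered clock and correct word alignment: the bits are all there, but grouped on the wrong boundary.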


10
Clock Data Recovery
What is it?
• Recovers a clock signal from incoming serial data. The CDR locks to the data and produces a stable recovered clock signal.
Why is it important?
• Removes the need for dedicated clock transmission lines.
• Eliminates bit errors due to over/under sampling.
How does it work?
• Explanation needs more space …

11
Clock Data Recovery Circuit
• PFD (phase frequency detector): measures phase/frequency differences between the reference clock and the divided output.
• PD (phase detector): measures phase differences between the serial data input and the divided output.
• Charge Pump & LF (loop filter): translates between the PD/PFD output and the VCO control voltages.
• VCO (voltage-controlled oscillator): electronic oscillator whose oscillation is controlled by a voltage source.

12
Clock Data Recovery Sequence

1. Lock to reference
2. Lock to data

13
Phase Detector Circuit
Alexander Phase Detector
• Samples the serial data on three consecutive clock edges.
• The PD determines if the clock leads or lags the data.
• LEADS: reference clock edge is early with respect to the data edge.
• LAGS: reference clock edge is late with respect to the data edge.
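
The lead/lag decision can be sketched as a small truth table over the three samples. The sketch assumes the usual arrangement in which the middle sample is taken near the expected data transition; the function name and the -1/0/+1 return convention are hypothetical choices for this example.

```python
def alexander_pd(prev_bit, edge_sample, next_bit):
    """Early/late decision from three consecutive samples.

    Returns -1 if the clock leads (is early), +1 if it lags (is late),
    and 0 when there was no data transition to measure against.
    """
    if prev_bit == next_bit:
        return 0    # no transition between the two data samples: no phase information
    if edge_sample == prev_bit:
        return -1   # edge sample still shows the old bit: clock edge came early (leads)
    return +1       # edge sample already shows the new bit: clock edge came late (lags)

print(alexander_pd(0, 0, 1))  # clock leads
print(alexander_pd(0, 1, 1))  # clock lags
print(alexander_pd(1, 1, 1))  # no transition
```

In a CDR loop these decisions would be accumulated and used to nudge the VCO phase, which is why long runs without transitions starve the loop of information.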

14
Voltage-Controlled Oscillator

Ring Oscillator VCO

• A ring oscillator is a chain containing an odd number of inverters in which the output is connected to the input as feedback.
• The oscillation frequency of the ring VCO can be determined by estimating the delay time τ of each inverter stage.
• The frequency of oscillation is set by the voltage provided by the charge pump.

[Figure: ring of inverters; label: inverter supply voltage]
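
A common first-order estimate for a ring of N inverters with stage delay τ is f = 1 / (2·N·τ), since a transition must travel around the ring twice per period. The sketch below applies that estimate; the function name and the example values (5 stages, 20 ps per stage) are assumptions for illustration only.

```python
def ring_oscillator_freq_hz(num_inverters, stage_delay_s):
    """First-order estimate f = 1 / (2 * N * tau) for an N-stage ring oscillator."""
    if num_inverters % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverters")
    return 1.0 / (2 * num_inverters * stage_delay_s)

# Hypothetical example: 5 stages at 20 ps each
print(f"{ring_oscillator_freq_hz(5, 20e-12) / 1e9:.2f} GHz")

# Raising the control voltage shortens the stage delay, which raises the frequency
print(f"{ring_oscillator_freq_hz(5, 10e-12) / 1e9:.2f} GHz")
```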

15
Challenges in High Speed I/O
Standards of Data Exchange
• Commonality in understanding the data
• Compatibility to operate between different interfaces
• Protocols define the method of exchanging data

Integrity of Signals
• Signal distortions occur through the medium
• A digital signal with fixed high and low levels looks more like an analog one
• Recovery and interpretation of this data poses challenges
• Think analog when dealing with the signals at the physical layer

16
Physical Coding Sublayer (PCS) to Physical Media Attach (PMA)
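[Figure: Fabric → PCS → PMA datapath, drawn twice.
Top: 0.5 Gbps on each link; the PMA output is labeled “Higher voltage”.
Bottom: x10 parallel links at 0.5 Gbps each, serialized to a single 5 Gbps link at the PMA output.
Clocking: the high speed (serial) clock is divided (÷) to produce the lower speed (parallel) clock.]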
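
The x10 relationship in the figure (0.5 per parallel bit, 5 on the serial side) is plain multiplication, and the parallel clock is the serial clock divided by the same factor. The sketch below assumes a full-rate serial clock for simplicity; the function names are hypothetical.

```python
def serial_rate_gbps(parallel_width_bits, per_bit_rate_gbps):
    """Serial line rate produced by serializing a parallel bus."""
    return parallel_width_bits * per_bit_rate_gbps

def parallel_clock_mhz(serial_clock_mhz, parallel_width_bits):
    """Lower speed (parallel) clock obtained by dividing the high speed (serial) clock."""
    return serial_clock_mhz / parallel_width_bits

print(serial_rate_gbps(10, 0.5))      # x10 bus at 0.5 Gbps per bit
print(parallel_clock_mhz(5000, 10))   # assumed full-rate 5000 MHz serial clock
```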

17
Native PHY L/H Tile UG Pages: 231 – 295 : clocking section
Transmitter Physical Coding Sublayer (PCS)
Transmitter PCS consists of:
• Phase Compensation FIFO
– Regulates the availability of data between two clock domains

• Byte Serializer
– Converts wide parallel data into byte-size (narrow) parallel data
– e.g. 16-bit wide into 8-bit wide
– Uses a fast clock and a slow clock (half the speed of the fast clock)

• Encoder
– Converts information from one format to another for DC balancing
– Schemes like 8b/10b, 64b/66b

[Figure: transmit and receive waveforms without encoding, showing DC imbalance]
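
The byte serializer step (16-bit wide into 8-bit wide) can be sketched as splitting each wide word into two narrow ones, so the output side has to run twice as fast as the input side. The function name is hypothetical, and sending the low byte first is an assumption made for this example.

```python
def byte_serialize(words16, low_byte_first=True):
    """Split each 16-bit word into two 8-bit words (byte order is an assumption)."""
    out = []
    for word in words16:
        low, high = word & 0xFF, (word >> 8) & 0xFF
        out.extend([low, high] if low_byte_first else [high, low])
    return out

narrow = byte_serialize([0xABCD, 0x1234])
print([hex(b) for b in narrow])
# Two output words per input word: the narrow side needs the fast clock,
# the wide side runs on the slow clock at half that speed.
```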

18
How to handle long sequences of zeros and ones in a
row? Answer: Physical Coding Sublayer (PCS)
8-bit word: 0 0 1 1 1 1 1 1
6 ones and 2 zeros (1 transition, unbalanced ones and zeros)

Encoded 10-bit word: 0 0 1 1 1 1 1 0 0 0
5 ones and 5 zeros (2 transitions, balanced ones and zeros)

Control character for ‘Beacon’ (K28.7)
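
The counts quoted for the two words above can be checked directly. This small sketch uses a hypothetical helper, with the bit strings written in the order they appear on the slide.

```python
def bit_stats(bits):
    """Return (ones, zeros, transitions) for a string of '0' and '1' characters."""
    ones = bits.count("1")
    zeros = bits.count("0")
    transitions = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return ones, zeros, transitions

raw_word = "00111111"        # 8-bit word before encoding
encoded_word = "0011111000"  # 10-bit word after encoding

print(bit_stats(raw_word))
print(bit_stats(encoded_word))
```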

19
Coding: 8B10B
• 20% overhead to add transitions: 2 of every 10 line bits (Manchester sends 2 line bits per data bit, a 100% increase)
• Maps an 8-bit symbol to a 10-bit symbol (combines 5b/6b and 3b/4b)
• DC free: the long-term ratio of ones and zeros is exactly 50%
• If an unequal number of ones or zeros builds up, running disparity selects the inverted form of the next unbalanced symbol
• Special framing “K” characters delimit the data stream
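
Running disparity can be illustrated with the K28.5 comma character, which has two 10-bit forms: 0011111010 (sent when running disparity is negative) and 1100000101 (sent when it is positive). The sketch below covers only that one symbol rather than the full 8b/10b tables, and the function names are hypothetical.

```python
# Encoding to transmit for each current running disparity (RD): -1 or +1
K28_5 = {-1: "0011111010", +1: "1100000101"}

def disparity(symbol):
    """Ones minus zeros in a 10-bit symbol: -2, 0, or +2 for valid 8b/10b codes."""
    return symbol.count("1") - symbol.count("0")

def send(symbol_table, running_disparity):
    """Pick the form that matches the current RD and return it with the updated RD."""
    symbol = symbol_table[running_disparity]
    return symbol, running_disparity + disparity(symbol)

def encode_stream(symbol_table, count, running_disparity=-1):
    """Send the same character repeatedly, tracking RD from an assumed start of -1."""
    sent = []
    for _ in range(count):
        symbol, running_disparity = send(symbol_table, running_disparity)
        sent.append(symbol)
    return sent, running_disparity

symbols, final_rd = encode_stream(K28_5, 4)
print(symbols, final_rd)
```

Each form undoes the imbalance left by the previous one, so the number of ones and zeros on the line stays equal over time.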

20
Physical Media Attach (PMA) – The Analog world

TX Transmission medium RX

PCS PMA PMA PCS

Clean edges +
Boost signal
Recover clock

21
Native PHY L/H Tile UG Pages: 320 - 333
Transmission medium reaches and applications

22
Sources: Intel PAM4 App note and http://www.ethernetalliance.org/wp-content/uploads/2014/10/41014-DRAFT-TEF-56Gbs.pdf
TX – Getting the signal across
Transmission medium

Pre/De Emphasis
Pre-Tap
Post-Taps
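
Pre/de-emphasis is commonly modeled as a short FIR filter on the transmitted symbols. The sketch below uses one pre-tap, a main tap, and a single post-tap; the tap weights are made-up values for illustration, and real transceivers expose their own tap ranges and may offer more post-taps.

```python
def tx_ffe(symbols, pre=-0.1, main=0.7, post=-0.2):
    """3-tap FIR: y[n] = pre*x[n+1] + main*x[n] + post*x[n-1].

    symbols are +1/-1 levels; the stream is extended at both ends by
    repeating the first and last symbol (an assumption for this sketch).
    """
    out = []
    for n, current in enumerate(symbols):
        following = symbols[n + 1] if n + 1 < len(symbols) else current
        previous = symbols[n - 1] if n > 0 else current
        out.append(pre * following + main * current + post * previous)
    return out

levels = tx_ffe([1, 1, 1, 1, -1, -1, -1, -1])
print([round(v, 2) for v in levels])
```

The first symbol after a transition is sent at a larger amplitude than the repeated symbols that follow it, which compensates for the channel attenuating fast edges more than slow ones.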

23
RX – Signal recovery
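RX Equalization – Calibration
[Figure: RX signal path from Serial Data In through the VCM, CTLE, VGA, and DFE blocks, with the CDR producing the serial data and serial clock. Insets: CTLE schematic (R1, R2, C1, GND) and its boost versus frequency response.]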

24
Native PHY L/H Tile UG Pages: 402 – 411 : Calibration section
Some common High Speed IO Protocols
• PCIe – Serial Computer Expansion bus
• Ethernet - networking
• Interlaken – chip to chip
• CPRI – Common Public Radio Interface – Tower to wireless basestation
• USB – Universal Serial Bus – computer to peripheral connectivity (and so much more)
• HDMI – High Definition Multimedia Interface

+ dozens more!
*supporting so many transceiver protocols makes the design challenging!

25
Coding Standards by Protocol
Standard                          Line Code
Ethernet 1-10 Mbps                Manchester
Ethernet 100 Mbps                 4b/5b
Ethernet 1 Gbps                   8b/10b
Ethernet 10 Gbps                  64b/66b
Ethernet 40 Gbps (4x10)           64b/66b
Ethernet 100 Gbps (10x10, 4x25)   64b/66b
PCIe Gen 1 (2.5 Gbps)             8b/10b
PCIe Gen 2 (5 Gbps)               8b/10b
PCIe Gen 3 (8 Gbps)               128b/130b
PCIe Gen 4 (16 Gbps)              128b/130b
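
The line code determines how much of the line rate is left for payload. The sketch below works that out for the PCIe rows of the table, treating the rates in parentheses as per-lane line rates; the dictionary and function names are hypothetical.

```python
# (payload bits, line bits) for each line code in the table
LINE_CODES = {
    "8b/10b": (8, 10),
    "64b/66b": (64, 66),
    "128b/130b": (128, 130),
}

def payload_gbps(line_rate_gbps, code):
    """Payload rate left after line-code overhead, per lane."""
    payload_bits, line_bits = LINE_CODES[code]
    return line_rate_gbps * payload_bits / line_bits

pcie = [("Gen 1", 2.5, "8b/10b"), ("Gen 2", 5.0, "8b/10b"),
        ("Gen 3", 8.0, "128b/130b"), ("Gen 4", 16.0, "128b/130b")]

for name, rate, code in pcie:
    print(f"PCIe {name}: {rate} Gbps line rate -> {payload_gbps(rate, code):.2f} Gbps payload")
```

This shows why the move from 8b/10b to 128b/130b mattered: Gen 3 raised the line rate by only 60% over Gen 2 but nearly doubled the payload rate.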

26
Eye diagram

27
History of Intel PSG Transceivers
[Figure: two charts, “Data Rate vs Year” and “Data Rate vs Process Node”, plotting data rate (Gbps, axis 0 to 60) for the entries in the Stratix History table.]

Stratix History

Family       Year  Node (nm)  Data Rate (Gbps)  Transceivers  KLEs   18x18   Memory (Mbit)  PLL
Stratix      2002  130        3.1875            20            79     176     8              12
Stratix II   2004  90         6.375             20            179    768     9              12
Stratix III  2006  60                                         338    896     16             12
Stratix IV   2008  40         11.1              48            813    1,280   23             12
Stratix V    2010  28         28                66            952    3,926   52             28
Arria 10     2014  20         28                144           3,008  10,560  152            48
Stratix 10   2015  14         56                144
             2017  10         56

Process (nm)  Year  High End          Mid Range         Low Cost
130           2003  Stratix GX        -                 -
90            2006  Stratix II GX     -                 -
65                  -                 -                 -
40            2009  Stratix IV GX/GT  Arria II GX/GZ    -
28            2012  Stratix V GX/GT   Arria V GX/GT/GZ  Cyclone V GX/GT
20                  -                 Arria 10 GX/GT    -
14                  Stratix 10 GX/GT  -                 -

28
Transceiver Design Flow

[Figure: transceiver design flow diagram; one step is marked optional.]

29
