2020 Mayank Aggarwal Masc

Low power analog front end design for 112 Gbps PAM-4
SERDES receiver
by
Mayank Aggarwal
A thesis submitted in conformity with the requirements
for the degree of Master of Applied Sciences
Graduate Department of Electrical and Computer Engineering
University of Toronto
Copyright © 2020 by Mayank Aggarwal

Abstract
Low power analog front end design for 112 Gbps PAM-4 SERDES receiver
Mayank Aggarwal
Master of Applied Sciences
Graduate Department of Electrical and Computer Engineering
University of Toronto
2020
This thesis proposes an analog front end (AFE) design for a 112 Gbps PAM-4 SERDES
receiver in 16 nm FinFET technology. It consists of a front-end termination block
and a CMOS inverter based continuous time linear equalizer (CTLE). The high
transconductance of the CMOS inverter as an amplifier provides a low power equalization
solution. The front-end termination network is exploited to achieve some passive
equalization. The common-mode feedback (CMFB) loop improves the common-mode rejection
ratio (CMRR) of the CTLE and is a low power biasing solution in comparison to a
self-biased diode-connected inverter load. The tunability of the CMOS tristate inverter
cell enables power scalability in the design and also helps to track PVT variations.
Using the CTLE as the only means of active equalization for a long reach (backplane)
channel with 30 dB attenuation at the 28 GHz Nyquist frequency, this AFE design achieves
17 dB of equalization while consuming 10 mW from a 0.8 V supply in post-layout
extracted results.
To my beloved mother, brother and bhabhi ♥
Acknowledgements
Live as if you were to die tomorrow.
Learn as if you were to live forever.
Mahatma Gandhi
First of all, I want to thank my supervisor Prof. David Johns, who has been very friendly
from the day I came in contact with him. His guidance led to a lot of improvement in my
personality and technical knowledge. Without his support, this thesis work would not
have been possible. It is not possible to express my gratitude towards him in words.
I want to thank Prof. Tony Chan Carusone, Prof. Sorin Voinigescu and Prof. Andreas
Moshovos for serving as my thesis defense committee members. Sincere thanks to Dr.
Hossein Shakiba and Dr. Dustin Dunwell for their continuous support and mentorship.
I would like to thank Kunal Yadav for being there with me always in the hour of need.
Many thanks to Sameer Sharma for getting this thesis compiled in proper format and I
will miss table-tennis matches with you. I would like to thank Dhruv Patel and Xunjun
for sharing their experiences. I also want to thank Durand, Milad and Alireza Akbarpur
for accompanying me during the tapeout struggles. In addition, I want to thank Foad,
Danial, Paul, Saba, Rudraneil, Jack, Pooya, Miad, Constantine, Richard, Danny, Ali,
Jeremy, Mohammed, Suyash, Vikram and Jose for making this journey enjoyable. I hope
we continue to cross paths professionally and non-professionally for years to come.
I want to thank my mom and dad for making me strong enough to easily cross hurdles
in life. I want to thank my brother, Nitin, who inspired me to go for higher education
in Canada. I hope I will meet the standards you expect from me sometime in the future.
Finally, I am indebted to God for providing me with everything I wish for. I am not
capable enough to go places, but you are the driving force behind everything. Thanks a
lot for believing in me. ♥
Contents
1 Introduction
  1.1 Motivation
  1.2 Typical SERDES system
  1.3 Thesis organization
2 Background
  2.1 Data modulation techniques
  2.2 Need for equalization
  2.3 Recent trend in the development of LR 112 Gbps SERDES
  2.4 State of the art: Previous CTLE architectures
  2.5 Thesis scope
3 Receiver Analog Front-end Design
  3.1 Long-reach channel
  3.2 Front-end termination network
  3.3 CTLE design
    3.3.1 Basic design elements used in CTLE
      3.3.1.1 CMOS tristate inverter as a basic amplifier unit
      3.3.1.2 Inverter as tunable active resistor: diode-connected load
      3.3.1.3 Inverter as tunable active inductor
    3.3.2 High-frequency boost stage (HF-CTLE)
    3.3.3 Mid-band gain stage
      3.3.3.1 Need for MF-CTLE
      3.3.3.2 MF-CTLE design
    3.3.4 Buffer stage
    3.3.5 CMFB (Common Mode Feedback Loop)
    3.3.6 Complete CTLE architecture
    3.3.7 Output buffer and back-end passive network
    3.3.8 Top-level layout
4 Extracted Simulation Results
  4.1 Magnitude response
  4.2 Pulse response
  4.3 Thermal noise
  4.4 Eye diagram for NRZ (PAM-2) data signaling
  4.5 Eye diagram for PAM-4 data signaling
  4.6 Tunability to track PVT variations
  4.7 Common-mode bias stability
  4.8 Common-mode frequency response and PSRR
  4.9 Thermal noise effect on the eye diagram
  4.10 Power breakdown
  4.11 Comparison of inverter based CTLE with other recent works
  4.12 Conventional CTLE design results
5 Conclusion
  5.1 Summary
  5.2 Future work
Appendix
  Tunability information in detail
  Impact of MF-CTLE on the eye-diagram
References
List of Tables
2.1 Equalization component PPA analysis with data rates ≤ 56 Gbps [9].
2.2 Equalization component PPA analysis with data rates ≥ 56 Gbps [9].
3.1 CTLE model parameters in MATLAB.
3.2 CMRR analysis for each stage in the CTLE.
4.1 PVT corners list with details.
4.2 Comparison table for inverter based CTLE.
4.3 Comparison between the inverter based CTLE and the conventional CTLE.
5.1 Tunability knobs information.
List of Figures
1.1 Global data traffic over years [2].
1.2 OIF CEI-112G Development Application Space [4].
1.3 Typical SERDES transceiver system.
2.1 Backplane channel and CTLE magnitude response.
2.2 Pulse response at the channel and the CTLE output (with partial equalization).
2.3 Eye diagram: (a) at channel output; (b) at CTLE output with partial equalization.
2.4 Conventional equalization architecture for data rates ≤ 56 Gbps (with equalizers highlighted in pink colour).
2.5 Next generation equalization architecture for higher data rates (100+ Gbps) (with equalizers highlighted in pink colour).
2.6 CML based conventional CTLE architecture.
2.7 CML based conventional CTLE architecture with negative capacitance implementation [17].
2.8 CMOS inverter based CTLE.
2.9 CMOS inverter based CTLE for PAM-2 application [19].
2.10 CML and CMOS inverter based CTLE [20].
3.1 LR channel characterization with ideal 50 Ω termination: (a) Magnitude response shows 30 dB loss at 28 GHz Nyquist frequency; (b) Pulse response for pulse width corresponding to 56 Gbps data-rate; showing the pre, post and main cursors.
3.2 Front-end termination network.
3.3 Magnitude response from channel input to CTLE input due to various termination circuits. Proposed architecture behaves closest to the case when the channel is terminated ideally.
3.4 Magnitude response of front termination network from the channel output to the CTLE input with variation in $L_t$ for the proposed architecture.
3.5 Magnitude response from channel input to the CTLE input with variation in $L_t$ for the proposed architecture.
3.6 Analog front end block diagram showing three stage CTLE.
3.7 Analysis of CMOS tristate inverter cell as an amplifier.
3.8 Analysis of CML based differential amplifier.
3.9 CMOS tristate inverter configurations as an amplifier.
3.10 CMOS tristate inverter architecture with switches away from ends.
3.11 Tristate inverter symbolic representation.
3.12 Tunable active resistor.
3.13 Tunable active inductor.
3.14 Inverter amplifier drives active inductor load along with capacitive load $C_L$.
3.15 High frequency boost stage.
3.16 Large $L_{zhf}$ value helps to have better filter gain at high frequencies.
3.17 Mid-band CTLE improves equalization in terms of magnitude response.
3.18 Mid-band CTLE has less post-cursor ISI with reference to the main-cursor.
3.19 Signal to post-cursor ISI ratio for CTLE pulse response.
3.20 Eye diagram at the output of the CTLE without mid-band stage. Green-box highlights more ISI due to the long-tail in the pulse response.
3.21 Eye diagram at the output of the CTLE with mid-band stage. Green-box highlights lesser ISI due to the long-tail in the pulse response.
3.22 Center eye opening at the output of the CTLE without mid-band stage. (Vertical eye opening = 26 mVpp and horizontal eye opening = 42 % UI)
3.23 Center eye opening at the output of the CTLE with mid-band stage. (Vertical eye opening = 60 mVpp and horizontal eye opening = 57 % UI)
3.24 Mid-band gain stage.
3.25 Small-signal gain circuit for half-circuit of mid-band gain stage.
3.26 Buffer stage.
3.27 CTLE architecture highlighting common-mode feedback loop.
3.28 Complete CTLE core architecture.
3.29 CML based output buffer to drive the back-end passive network and the outside world.
3.30 Top-level layout.
4.1 CTLE magnitude response.
4.2 CTLE magnitude response for linear frequency scale.
4.3 CTLE pulse response.
4.4 Thermal noise spectral density for CTLE.
4.5 Thermal noise spectral density for CTLE for linear frequency scale.
4.6 Eye diagram at the channel output for 40 Gbps NRZ data rate (Pulse PMR = 3.6).
4.7 Eye diagram at the CTLE output for 56 Gbps NRZ data rate (Pulse PMR = 3.14).
4.8 Eye diagram at the CTLE output for 50 Gbps NRZ data rate (Pulse PMR = 2.85).
4.9 Eye diagram at the CTLE output for 40 Gbps NRZ data rate (Pulse PMR = 2.4).
4.10 Eye diagram at the CTLE output for 112 Gbps PAM-4 data rate (Pulse PMR = 3.14).
4.11 Eye diagram at the CTLE output for 64 Gbps PAM-4 data rate (Pulse PMR = 2.18).
4.12 Eye diagram at the CTLE output for 40 Gbps PAM-4 data rate for default settings (Pulse PMR = 2.05). Equalizer needs to be optimized for this data rate.
4.13 Magnitude response coverage due to overall sweep of all tunability knobs (highlighted red one is the default setting).
4.14 CTLE magnitude response across PVT corners for: (a) un-optimized circuit (b) optimally-tuned circuit.
4.15 CTLE pulse response across PVT corners for: (a) un-optimized circuit (b) optimally-tuned circuit.
4.16 Eye diagram at the CTLE output for Strong corner for 50 Gbps NRZ data rate for: (a) un-optimized circuit (Pulse PMR = 2.7) (b) optimally-tuned circuit (Pulse PMR = 2.82).
4.17 Eye diagram at the CTLE output for Weak corner for 50 Gbps NRZ data rate for: (a) un-optimized circuit (Pulse PMR = 3.21) (b) optimally-tuned circuit (Pulse PMR = 2.97).
4.18 Eye diagram at the CTLE output for 50 Gbps NRZ data rate for: (a) SF corner (Pulse PMR = 2.75) (b) FS corner (Pulse PMR = 3).
4.19 Common mode voltage at each stage of CTLE for 112 Gbps PAM-4 PRBS data transitions.
4.20 Common-mode frequency response of the analog front-end for 0 dB input common-mode AC signal at channel output.
4.21 Transfer function of supply noise from supply to output common-mode voltage at each CTLE stage.
4.22 Transfer function of supply noise from supply to output differential voltage at each CTLE stage.
4.23 CTLE eye diagram for 50 Gbps NRZ data rate (without thermal noise).
4.24 CTLE eye diagram for 50 Gbps NRZ data rate (with thermal noise).
4.25 CTLE power breakdown.
4.26 Conventional CTLE architecture.
4.27 Magnitude response of the conventional CTLE.
4.28 Magnitude response of the conventional CTLE for linear frequency scale.
4.29 Pulse response of the conventional CTLE.
4.30 Thermal noise spectral density.
4.31 Total harmonic distortion.
4.32 Eye diagram at the output of the conventional CTLE for 64 Gbps PAM-4 data rate.
4.33 Eye diagram at the output of the conventional CTLE for 40 Gbps PAM-4 data rate.
5.1 CTLE architecture with tunability knobs information.
5.2 $Trim_{hf-filter}$ effect on the CTLE transfer function.
5.3 $Trim_{hf}$ effect on the CTLE transfer function.
5.4 $Trim_{bias-hf}$ effect on the CTLE transfer function.
5.5 $Trim_{res-hf}$ effect on the CTLE transfer function.
5.6 $Trim_{dc}$ effect on the CTLE transfer function.
5.7 $Trim_{buf1}$ effect on the CTLE transfer function.
5.8 $Trim_{bias-buf1}$ effect on the CTLE transfer function.
5.9 $Trim_{res-buf1}$ effect on the CTLE transfer function.
5.10 $Trim_{mf}$ effect on the CTLE transfer function.
5.11 $Trim_{bias-mf}$ effect on the CTLE transfer function.
5.12 $Trim_{mfcap}$ effect on the CTLE transfer function.
5.13 $Trim_{buf}$ effect on the CTLE transfer function.
5.14 Eye diagram at the output of HF-CTLE (inverter based) for 40 Gbps PAM-4 data rate.
5.15 Eye diagram at the output of MF-CTLE (inverter based) for 40 Gbps PAM-4 data rate.
5.16 Eye diagram at the output of complete CTLE (inverter based) for 40 Gbps PAM-4 data rate.
List of Acronyms
ADC Analog to Digital Converter
ASIC Application Specific Integrated Circuit
BER Bit-Error Rate
BEOL Back-End-Of-Line
BGA Ball Grid Array
C4 Controlled Collapse Chip Connection
DCD Duty-Cycle Distortion
CDR Clock and Data Recovery
CML Current-Mode Logic
CRU Clock Recovery Unit
CTE Coefficient of Thermal Expansion
CTLE Continuous-Time Linear Equalizer
DFE Decision Feedback Equalizer
DNL Differential Non-Linearity
EMI Electromagnetic Interference
FEXT Far-end Crosstalk
FFE Feed-Forward Equalizer
HBM High-Bandwidth Memory
HMC Hybrid Memory Cube
IIR Infinite Impulse Response
ILO Injection Locked Oscillator
INL Integral Non-Linearity
ISI Inter-Symbol Interference
LDO Low-Dropout
LSB Least Significant Bit
MCM Multi-Chip Module
MSB Most Significant Bit
NEXT Near-end Crosstalk
NRZ Non-Return to Zero
PAM Pulse Amplitude Modulation
PCB Printed Circuit Board
PD Phase Detector
PDN Power Distribution Network
PI Phase Interpolator
PRBS Pseudo-Random Binary Sequence
PSIJ Power Supply Induced Jitter
RDL ReDistribution Layer
RF Radio Frequency
SEM Scanning Electron Microscopy
SSO Simultaneous Switching Output
SST Source-Series-Termination
ToF Time-of-Flight
TSV Through Silicon Via
UBM Under Bump Metallization
UI Unit Interval
VCO Voltage Controlled Oscillator
VCSEL Vertical-Cavity Surface-Emitting Laser
USR Ultra-Short Reach
VSR Very-Short Reach
1 Introduction
1.1. Motivation
Global demand for data has grown continuously over the last several years. Figure 1.1
shows how global data traffic has increased over the years. This demand comprises
consumer and business applications such as multimedia streaming, audio-video
conferencing, internet surfing, cloud storage and other digital enterprise applications.
According to the Cisco 2020 report, there will be nearly 5.3 billion internet users
(66 percent of the global population) by 2023, up from 3.9 billion (51 percent of the
global population) in 2018 [1]. There will be a ∼150 % increase in networked devices per
capita by the year 2023 [1]. As a result, the need for data centers will rise
exponentially. In particular, with the ongoing coronavirus (COVID-19) pandemic,
businesses have become more dependent on cloud services. Foreseeing this situation,
several big tech companies are investing heavily in worldwide digital transformation.
This increasing need for data centers and cloud resources has led to the development of
large-scale public cloud data centers called hyperscale data centers [1]. It is expected
that by 2021 there will be ∼628 hyperscale data centers globally, compared to 338 in
2016 [1].
Figure 1.1: Global data traffic over years [2]. (Plot of global data traffic in zettabytes per year against the years 2013 to 2021.)

The power consumption of data centres has remained at nearly 1 % of global electricity
consumption since 2012, despite the continuous increase in demand for data [3]. This
has been achieved through improvements in power efficiency brought by transistor
scaling in newer technologies. However, recent trends indicate that transistor scaling
no longer delivers the speed that data demand requires, nor can it offset the further
increase in power consumption by data centres. It is predicted that if data centres
scale up to meet the data demand, they are likely to consume about 8 % of global
electricity by 2030. This increased power consumption will also result in increased
undesirable carbon emissions [3]. Hence, the present research focuses on developing
high-speed, low-power and low-cost devices with good signal integrity for data centres.
Data centres comprise routers, switches, storage, servers, firewalls and networking
gear. These components sit in racks, are connected to each other through wireline
links, and need SERDES (Serializer/De-serializer) transceivers for inter-communication.
SERDES is a preferred choice because it enables efficient data communication with fewer
cables, lower power, fewer ground isolators and terminations, and lower connector
costs. Fewer cables also improve air-flow among components, which reduces the cooling
requirement for the servers. The wireline link can be treated as a channel with limited
bandwidth, reflections and crosstalk. In addition, the performance of a SERDES
transceiver is limited by noise, linearity, jitter, clock and data skew, and bandwidth.
At present, 112 Gbps links have been developed by several companies (Intel, Xilinx,
Rambus, etc.), and research is still ongoing to find more power-efficient wireline
links for 100+ Gbps data rates. The OIF (Optical Internetworking Forum) has
characterized different channels based on their length or the attenuation they exhibit.
These channel types are commonly categorized as MCM (multi-chip module), XSR (extra
short reach), VSR (very short reach), MR (medium reach) and LR (long reach) [4]. The
applications of these channels are shown in Figure 1.2. This research primarily focuses
on the development of SERDES for LR channels. The LR channel considered in this
research project is a backplane (it may actually be a cable) with a length of ∼1 m and
a channel attenuation of ∼30 dB (excluding package loss) at the 28 GHz Nyquist
frequency.
Figure 1.2: OIF CEI-112G Development Application Space [4]. The figure lists the following application classes:
• CEI-112G-MCM (3D stack, 2.5D chip-to-chiplet): CNRZ-5, up to 25 mm package substrate; no equalization/FEC; minimize power (pJ/bit).
• CEI-112G-XSR (2.5D chip-to-chip, chip to nearby optics engine): PAM4, up to 50 mm package substrate; 6-10 dB at 28 GHz; lite FEC, Rx CTLE.
• CEI-112G-VSR (chip to pluggable module): PAM4, 12-16 dB at 28 GHz; FEC to relax BER to 1e-6; multi-tap Tx FIR and Rx CTLE + multi-tap FFE or DFE.
• CEI-112G-MR (chip-to-chip and midplane applications): PAM4, 20 dB at 28 GHz; FEC to relax BER to 1e-6; multi-tap Tx FIR and Rx CTLE + multi-tap FFE or DFE.
• CEI-112G-LR (backplane or passive copper cable): PAM4, 28-30 dB at 28 GHz; FEC to relax BER to 1e-4; multi-tap Tx FIR and Rx CTLE + multi-tap FFE or DFE.

1.2. Typical SERDES system
Figure 1.3: Typical SERDES transceiver system. (Block diagram: TX data, serializer, TX with PLL and reference clock, channel with terminations Rt at both ends, RX with CDR, de-serializer, RX data.)
The block diagram of a typical SERDES transceiver system is shown in Figure 1.3.
A parallel data stream goes into the serializer, which feeds the serialized data into
the transmitter block. A PLL (Phase Locked Loop) on the transmitter (TX) side
multiplies the low-frequency external reference clock up to the clock frequency needed
by the circuit. Several data encoding techniques are used to modulate the data in the
digital domain prior to the TX. The driver sends the signal out over the channel. The
received analog signal is attenuated by the channel and corrupted by noise and
interference. Both the TX and the receiver (RX) should be terminated with the
characteristic impedance of the channel transmission line to reduce reflections.
The receiver sub-system includes the RX core, the CDR (Clock and Data Recovery) block
and the de-serializer. The RX consists of equalization circuits (discussed in detail in
Chapter 2) and a DSP (Digital Signal Processing) unit to compensate for the signal
attenuation. The CDR block at the receiver end recovers the clock from the received
data, which eliminates the need to forward the clock signal through an additional
channel from TX to RX. This is cost-effective, since no additional channel is needed,
and it avoids the problems of a forwarded clock, which would experience noise and
attenuation like the received data stream and could also get skewed over the channel.
Once the receiver produces the digital data, the information is retrieved by decoding
and demodulation. Finally, the de-serializer sends out the parallel data [5].
1.3. Thesis organization
The remainder of this thesis is organized as follows:
• Chapter 2 - Background: This chapter introduces data modulation schemes and the
need for equalization. It describes the latest trends in SERDES transceiver
development with an emphasis on state-of-the-art Continuous Time Linear Equalizers
(CTLEs). It then specifies the scope of this thesis.
• Chapter 3 - Receiver Analog Front-end Design: This chapter discusses the design
considerations of the analog front end in detail, focusing on low-power operation
while maintaining the key performance parameters.
• Chapter 4 - Extracted Simulation Results: This chapter presents the simulation
results for the extracted netlist of the top-level layout. It then compares the
performance of the inverter based CTLE with other state-of-the-art CTLEs and the
conventional CML based CTLE.
• Chapter 5 - Conclusion: This chapter summarizes the thesis and discusses future
directions for improving the analog front end design.

2 Background
2.1. Data modulation techniques
The binary data is generally encoded with one of three modulation schemes: PAM-2,
PAM-4 or Duobinary. PAM-2 (Pulse Amplitude Modulation with 2 levels), also known as
NRZ (Non-Return-to-Zero), maps each data bit to one of two symbols, the voltage levels
$+V_p$ and $-V_p$. The PAM-4 modulation scheme combines two consecutive bits into one
symbol, resulting in four possible symbols or voltage levels (i.e. $+V_p$,
$+\frac{V_p}{3}$, $-\frac{V_p}{3}$ and $-V_p$). PAM-4 therefore transfers double the
data rate at the same symbol rate as PAM-2. However, PAM-4 has certain disadvantages,
mainly a lower SNR (signal-to-noise ratio) and stricter linearity requirements (because
of the tighter spacing between voltage levels in PAM-4 as compared to PAM-2).
In Duobinary modulation, the modulator output is w[n] = x[n] + x[n−1], where x[n] is
the present TX data bit and x[n−1] is the previous data bit. In other words, the
modulator output is the sum of the present and previous data bits. It therefore has
some controlled ISI (Inter-Symbol Interference), which lowers the bandwidth requirement
during equalization. The trade-off is that the design complexity of such modulators is
comparatively high, and additional circuitry is required for detection/decoding of the
received signals [6].
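The three mappings described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, the normalised peak level `VP = 1.0` and the Gray-coded assignment of bit pairs to PAM-4 levels are choices made here and are not taken from the thesis.

```python
VP = 1.0  # normalised peak level Vp (illustrative value, not from the thesis)


def pam2_encode(bits):
    """PAM-2 / NRZ: map bit 1 to +Vp and bit 0 to -Vp."""
    return [VP if b else -VP for b in bits]


# Bit pair -> one of four levels. The Gray-coded ordering is an assumption.
PAM4_LEVELS = {(0, 0): -VP, (0, 1): -VP / 3, (1, 1): +VP / 3, (1, 0): +VP}


def pam4_encode(bits):
    """PAM-4: combine two consecutive bits into one four-level symbol."""
    if len(bits) % 2:
        raise ValueError("PAM-4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def duobinary_encode(bits, previous=0):
    """Duobinary: w[n] = x[n] + x[n-1], with x[-1] given by `previous`."""
    out = []
    for b in bits:
        out.append(b + previous)
        previous = b
    return out


bits = [1, 0, 1, 1, 0, 0, 0, 1]
print(pam2_encode(bits))       # eight two-level symbols
print(pam4_encode(bits))       # four four-level symbols for the same eight bits
print(duobinary_encode(bits))  # three-level output with controlled ISI
```

For the same eight bits, PAM-4 produces half as many symbols as PAM-2, which is the doubling of data rate at a fixed symbol rate noted above.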
2.2. Need for equalization
All wireline links act as low-pass filters by nature and exhibit significant
attenuation at higher frequencies. As a result, higher frequency signal components are
attenuated and delayed more than lower frequency components, which leads to ISI. ISI is
the interference experienced by the current symbol that is caused by neighbouring
transmitted symbols. The task of an equalizer is to invert the channel's frequency
response and provide a flat frequency response up to the Nyquist frequency
($f = \frac{\text{symbol rate}}{2}$). Most of the frequency components in the input
signal then see the same delay and gain/attenuation, leading to low ISI in the time
domain.
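As a worked example of the Nyquist relation above, the sketch below computes the Nyquist frequency from the data rate and the number of PAM levels, using the fact from Section 2.1 that PAM-4 carries two bits per symbol. The function name and argument names are hypothetical.

```python
import math


def nyquist_frequency_hz(data_rate_bps, pam_levels):
    """Nyquist frequency = symbol rate / 2, with log2(levels) bits per symbol."""
    bits_per_symbol = math.log2(pam_levels)
    symbol_rate = data_rate_bps / bits_per_symbol
    return symbol_rate / 2


for rate, levels in [(56e9, 2), (112e9, 2), (112e9, 4)]:
    f_nyq = nyquist_frequency_hz(rate, levels)
    print(f"{rate / 1e9:.0f} Gbps PAM-{levels}: {f_nyq / 1e9:.0f} GHz")
```

A 56 Gbps PAM-2 stream and a 112 Gbps PAM-4 stream share the same 28 GHz Nyquist frequency, whereas 112 Gbps PAM-2 would need equalization up to 56 GHz.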
As an example, an LR channel model (IEEE channel [7]) is analysed with a 56 Gbps PAM-2
PRBS (Pseudo-Random Bit Sequence) input that uses a PN-32 data sequence. Figure 2.1
shows that the channel has an attenuation of ∼30 dB at the 28 GHz Nyquist frequency. An
approximate, idealized model of the CTLE is realized which provides ∼18 dB of
equalization against the requirement of 30 dB. This amount of linear equalization is
practically realizable and beneficial (explained in the next section). The remaining
equalization is expected from subsequent equalization stages.
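To make the idea of high-frequency boost concrete, the sketch below evaluates a generic one-zero, two-pole peaking transfer function. This is a textbook form chosen here for illustration, not the CTLE model used in the thesis, and every parameter value (DC gain, zero and pole frequencies) is hypothetical. The last lines use only the gains annotated in Figure 2.1 to show that the gains of cascaded stages add in dB.

```python
import math


def ctle_gain_db(f_hz, dc_gain, f_zero_hz, f_pole1_hz, f_pole2_hz):
    """|H(f)| in dB for H = A (1 + jf/fz) / ((1 + jf/fp1)(1 + jf/fp2))."""
    jf = 1j * f_hz  # jw/wz equals jf/fz, so the 2*pi factors cancel
    h = dc_gain * (1 + jf / f_zero_hz) / ((1 + jf / f_pole1_hz) * (1 + jf / f_pole2_hz))
    return 20 * math.log10(abs(h))


def cascade_db(*stage_gains_db):
    """Gains of cascaded stages multiply, so their dB values add."""
    return sum(stage_gains_db)


# Hypothetical parameters: zero at 3 GHz, poles at 28 GHz and 40 GHz.
params = dict(dc_gain=0.3, f_zero_hz=3e9, f_pole1_hz=28e9, f_pole2_hz=40e9)
low = ctle_gain_db(0.0, **params)
nyq = ctle_gain_db(28e9, **params)
print(f"low-frequency gain {low:.2f} dB, gain at 28 GHz {nyq:.2f} dB, boost {nyq - low:.2f} dB")

# Annotated values from Figure 2.1: channel and CTLE gain at 28 GHz.
print(f"channel+CTLE at 28 GHz: {cascade_db(-29.85, 7.75):.2f} dB")
```

The amount of equalization is the boost at Nyquist relative to the low-frequency gain, so a CTLE can provide more equalization than its absolute gain at 28 GHz suggests.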
Figure 2.1: Backplane channel and CTLE magnitude response. (Plot of |H(f)| in dB against frequency in Hz for the channel, the CTLE and the channel+CTLE cascade. Annotated gains at 28 GHz: channel = −29.85 dB, CTLE = 7.75 dB, channel+CTLE = −22.10 dB.)

The time-domain impact of this equalizer can be seen in Figure 2.2 through the
pulse response at the channel and CTLE outputs. The channel's pulse response for a 56
Gbps bit width shows many pre- and post-cursors, which lead to ISI and are thus
undesirable. Ideally, we expect the pulse response to be a Dirac delta function. The
equalizer model considered previously works fairly well in removing several pre- and
post-cursors. Theoretically, the input signal to a channel convolves with the channel's
impulse response to produce the time-domain output. The best way to visualize the
medium's time-domain response is through an 'eye diagram'. An eye diagram is formed by
clipping the received signal into segments one unit interval long and superimposing the
segments on each other. In the absence of ISI, we expect the eye to be wide open, which
corresponds to a low bit-error rate (BER). Figure 2.3 shows the effect of the equalizer
in opening the eye as compared to the entirely closed eye at the channel's output.
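The two steps described above, superposing pulse responses and folding the waveform into unit intervals, can be sketched as follows. The pulse responses, the four samples per unit interval and the symbol pattern are hypothetical values chosen so that the result is easy to check by hand; they are not taken from the thesis.

```python
def channel_output(symbols, pulse, spu):
    """Superpose one scaled, shifted pulse response per symbol (discrete convolution)."""
    out = [0.0] * (len(symbols) * spu + len(pulse))
    for k, amplitude in enumerate(symbols):
        for i, p in enumerate(pulse):
            out[k * spu + i] += amplitude * p
    return out


def fold_eye(waveform, spu):
    """Clip the waveform into one-UI segments; overlaying them forms the eye."""
    return [waveform[i:i + spu] for i in range(0, len(waveform) - spu + 1, spu)]


def vertical_opening(symbols, traces, phase):
    """Worst-case NRZ eye height at one sampling phase (trace k holds symbol k)."""
    ones = [traces[k][phase] for k, a in enumerate(symbols) if a > 0]
    zeros = [traces[k][phase] for k, a in enumerate(symbols) if a < 0]
    return min(ones) - max(zeros)


SPU = 4  # samples per unit interval (illustrative)
symbols = [1, -1, -1, 1, 1, -1, 1, -1]  # NRZ levels, normalised to +/-1

# Hypothetical pulse responses: main cursor at phase 1 of the first UI.
clean_pulse = [0.0, 1.0, 0.0, 0.0]
isi_pulse = [0.0, 1.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0]  # adds a 0.3 post-cursor

for name, pulse in [("no ISI", clean_pulse), ("with post-cursor", isi_pulse)]:
    traces = fold_eye(channel_output(symbols, pulse, SPU), SPU)
    print(name, "eye height:", round(vertical_opening(symbols, traces, 1), 3))
```

With the single 0.3 post-cursor the worst-case eye height shrinks from 2.0 to 1.4 in these normalised units, which is the eye-closing effect of ISI.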
Equalization can be achieved in either continuous time or discrete time. It
can also be categorised as linear or non-linear. The equalizer model
explained earlier is an implementation of a linear equalizer in continuous time. The
issue with such an equalizer (CTLE) is that it not only boosts high frequency signal content
but also amplifies the noise. This limits the amount of equalization that
can be achieved through linear analog equalization. A discrete-time equalizer needs a
clock signal and is jitter sensitive, so these equalizers need at least a minimal
eye opening at their input in order to function properly. Typical examples of
discrete-time equalizers are the DFE (Decision Feedback Equalizer) and the FFE (Feed-Forward
Equalizer). The advantage of these equalizers is that they are easily programmable.
[Plot: pulse response, amplitude (in volts, x10^-3) versus time (in sec, x10^-9), for CHANNEL OUTPUT and CHANNEL+CTLE OUTPUT. Annotated cursors:
Pre[1] = 19.57 %, Pre[2] = -2.32 %, Pre[3] = -3.60 %, Pre[4] = -0.08 %, Pre[5] = -1.21 %, Pre[6] = 1.61 %, Pre[7] = -0.77 %, Pre[8] = 1.37 %, Pre[9] = -0.34 %, Pre[10] = 1.15 %;
Post[1] = 21.36 %, Post[2] = -2.51 %, Post[3] = -3.35 %, Post[4] = -0.45 %, Post[5] = -1.00 %, Post[6] = 1.58 %, Post[7] = -0.86 %, Post[8] = 1.42 %, Post[9] = -0.32 %, Post[10] = 1.01 %, Post[11] = -0.14 %, Post[12] = 0.60 %.]
Figure 2.2: Pulse response at the channel and the CTLE output (with partial equalization).
The DFE, in particular, is a non-linear discrete-time equalizer that does not amplify
the noise, whereas linear equalizers do.
2.3. Recent trend in the development of LR 112 Gbps SERDES
The choice of data modulation is extremely important for LR channels. Traditionally,
PAM-2 modulation was the preferred choice for data rates below 56
Gbps, since it is more immune to noise and has lenient linearity requirements. But for
higher data rates (112 Gbps) and for channels with higher attenuation, PAM-4 modulation
is preferred [8]. This is because the Nyquist frequency of a PAM-4
data stream is a quarter of the data rate, while for PAM-2 modulation it is
half of the data rate. If we look at the magnitude response of the LR channel in Figure
Figure 2.3: Eye diagram: (a) at channel output; (b) at CTLE output with partial equalization.
2.1, the equalizer needs to equalize up to 56 GHz in the case of PAM-2 modulation, whereas in
the case of PAM-4 it is 28 GHz. Since the channel roll-off is steep at higher frequencies,
PAM-4 is the preferred choice. However, there are certain drawbacks associated with it,
which have been discussed in Section 2.1.
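The Nyquist frequencies quoted above follow directly from the number of bits carried per symbol. The short sketch below reproduces the arithmetic; the function name `nyquist_frequency_hz` is introduced here only for illustration.

```python
import math


def nyquist_frequency_hz(bit_rate_bps, levels):
    """Nyquist frequency = half the symbol rate of a PAM-N stream."""
    bits_per_symbol = math.log2(levels)
    symbol_rate = bit_rate_bps / bits_per_symbol
    return symbol_rate / 2.0


BIT_RATE = 112e9  # 112 Gbps, the data rate targeted in this work

pam2_nyquist = nyquist_frequency_hz(BIT_RATE, levels=2)
pam4_nyquist = nyquist_frequency_hz(BIT_RATE, levels=4)

print(f"PAM-2 Nyquist: {pam2_nyquist / 1e9:.0f} GHz")
print(f"PAM-4 Nyquist: {pam4_nyquist / 1e9:.0f} GHz")
```

For 112 Gbps this gives 56 GHz for PAM-2 and 28 GHz for PAM-4, the two frequencies compared against the channel response of Figure 2.1.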
For an LR channel, a single equalization stage is not capable of fully equalizing the
channel. Hence, equalization is achieved in multiple stages. An important question
to pose here is how much equalization should be achieved through each stage in order
to have minimum area and power consumption while meeting adequate performance.
Several circuit blocks have been reported in the literature in this regard [9]. Generally,
some fraction of the equalization is carried out on the TX side while the remaining equalization
is done on the RX side. Table 2.1 shows that most of the equalization is done on the
analog side for lower data rates [9]. However, with higher data rates (of 100+ Gbps),
designers are moving towards an ADC based receiver, which digitizes the analog signal
with an ADC after the CTLE stage and carries out digital signal processing to achieve
the remaining equalization, as shown in the PPA (Power, Performance and Area) analysis
in Table 2.2 [9]. This has been enabled by the low power consumption of
digital circuits achieved through transistor scaling. Figure 2.4 shows the conventional SERDES
transceiver architecture for lower data rates, while Figure 2.5 shows a typical modern
SERDES transceiver architecture for higher data rates (100+ Gbps).
Table 2.1: Equalization component PPA analysis with data rates ≤ 56 Gbps [9].
Component         Power   Performance   Area
TX FIR            ✓       ✓             ◦
RX CTLE / VGA     ✓       ✓             ✓
RX Analog DFE     ◦       ✓             ◦
RX Analog FFE     ✗       ✗             ✗
RX ADC            ✗       ◦             ✗
RX DSP FFE/DFE    ✗       ✓             ✗
• Legend: ✓: good, ✗: poor, ◦: fair/medium
• ADC consumes power and area at larger process nodes.
Table 2.2: Equalization component PPA analysis with data rates ≥ 56 Gbps [9].
Component         Power   Performance   Area
TX FIR            ✓       ✓             ◦
RX CTLE / VGA     ✓       ✓             ✓
RX ADC            ◦       ◦             ✗
RX DSP FFE        ✓       ✓             ✓
RX DSP FFE        ✗       ✓             ✗
• Legend: ✓: good, ✗: poor, ◦: fair/medium

[Block diagram: transmitter (serializer, TX-FFE, TX driver), channel, and receiver block (termination Rt, CTLE, analog DFE, de-serializer), with reference clock, PLL and CDR.]
Figure 2.4: Conventional equalization architecture for data rates ≤ 56 Gbps (with equalizers
highlighted in pink colour).
[Block diagram: transmitter (serializer, TX-FFE, TX driver), channel, and receiver block (termination Rt, CTLE, ADC, DSP FFE/DFE, de-serializer), with reference clock, PLL and CDR.]
Figure 2.5: Next generation equalization architecture for higher data rates (100+ Gbps) (with
equalizers highlighted in pink colour).
2.4. State of the art: Previous CTLE architectures
Pre-ADC equalization becomes very important in the modern SERDES architecture.
Its function is not to fully equalize the channel but to relax the ADC and DSP design
requirements [10]. The PMR (Peak to main signal ratio) of a channel is defined as the ratio of
the sum of all cursors to the main cursor in the pulse response [11]. Frans et al. reported
that a 6 dB improvement in PMR helps to save one bit of ADC resolution, resulting in better
power performance [12]. It also reduces the RX-FFE noise boosting due to the reduced
FFE coefficients [10].
The conventional CML (current mode logic) based CTLE architecture is shown in
Figure 2.6. The pole and zero locations of the CTLE are controlled by the source degeneration
impedance. This architecture has been widely adopted over the past years [13–16].
But with the scaling of transistors and reduced supply voltages, the CML based
architecture does not deliver a power-efficient solution. Moreover, this architecture utilizes
passive inductors to push the bandwidth, which results in higher area consumption.
Figure 2.6: CML based conventional CTLE architecture.
In order to save some inductor area and extend the bandwidth, a negative capacitance
circuit (NCC) has been used to reduce the load capacitance, as shown in Figure 2.7 [17].
But this solution is again power hungry, and it is challenging to realize negative
capacitance at high frequencies.
Figure 2.7: CML based conventional CTLE architecture with negative capacitance implementation [17].
However, the CMOS inverter based architecture (shown in Figure 2.8) does in fact take
advantage of technology scaling and can offer power-efficient solutions (explained
in the next chapter). It generally includes a cascade of a passive equalizer and an active
amplifier. It has shown potential benefits in terms of area and power reduction [18], [19].
A typical inverter based CTLE architecture [19] is shown in Figure 2.9.
[Block diagram: IN, passive equalizer, active amplifier, OUT.]
Figure 2.8: CMOS inverter based CTLE.
[Schematic with transconductors gm1, gm2 and loads gml, coupling capacitor C; annotations: DC gain ~ gm1/gml, Nyquist gain ~ (gm1 + gm2)/(2 gml).]
Figure 2.9: CMOS inverter based CTLE for PAM-2 application [19].
There is another CTLE architecture proposed in [20] which makes use of a combination
of both CML and CMOS inverter based designs (shown in Figure 2.10). This design
has good CMRR (Common mode rejection ratio) and low power consumption. The front
high pass filter eliminates low frequency content and sets the low cutoff frequency for
the input data stream.
Figure 2.10: CML and CMOS inverter based CTLE [20].
2.5. Thesis scope
Driven by current research trends in high speed SERDES development, this thesis
focuses on developing a new CMOS inverter based architecture for the analog front end
of a long-reach 112 Gbps PAM-4 SERDES receiver. It should provide an energy-efficient
equalization solution while maintaining satisfactory noise performance, linearity, power
supply noise rejection and programmability options. The design will be implemented in
16 nm FinFET CMOS technology, and will land on a flip-chip BGA (Ball Grid Array)
package substrate.
Chapter 3. Receiver Analog Front-end Design
3.1. Long-reach channel
In this project, the RX is designed for a 1 m long backplane channel, which
provides ∼ 30 dB attenuation at the Nyquist frequency of 28 GHz. The frequency and pulse
response characteristics of this long reach channel model (provided by IEEE 802.3ck) for
an ideal termination of 50 Ω are shown in Figure 3.1. However, when the channel is
connected to the input of the receiver, it no longer remains ideally terminated and
exhibits an attenuation of ∼ 36 dB due to the package and on-chip parasitics. To reduce
this additional attenuation, a front-end termination
network is used so that there is minimal additional attenuation at high frequencies.
3.2. Front-end termination network
The front-end termination block is the portion of the receiver from the channel output to
the CTLE input. The receiver input can be DC or AC coupled to the TX through the
channel, depending on the application. In our case (AC coupling), there is an ≈ 1 µF off-chip
coupling capacitor, so that only signals below about 3.2 kHz are blocked, as per Equation
3.1. However, this large capacitor has parasitic inductance and hence its own self-resonance
frequency. The latter should be much greater than the Nyquist frequency. If it is
not, we can decrease the value of the off-chip capacitor, provided it does not hurt the
[Plots: (a) channel magnitude response, |H(f)| in dB versus frequency in Hz (10^7 to 10^10), with the 28 GHz Nyquist frequency marked and channel gain at 28 GHz = -29.85 dB; (b) pulse response at the channel output versus time in sec (x10^-9).]
Figure 3.1: LR channel characterization with ideal 50 Ω termination: (a) Magnitude response
shows 30 dB loss at 28 GHz Nyquist frequency; (b) Pulse response for pulse width corresponding
to 56 Gbps data-rate; showing the pre, post and main cursors.
lower cut-off frequency intended for our application.
\[ f_z \approx \frac{1}{2\pi R_t C_{\text{off-chip}}} \tag{3.1} \]
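As a quick numerical check of Equation 3.1, the sketch below evaluates the AC-coupling corner for the values quoted in the text (Rt = 50 Ω, off-chip capacitor ≈ 1 µF). The 100 nF case is a hypothetical smaller capacitor, included only to show how the corner moves; the function name is illustrative.

```python
import math


def ac_coupling_corner_hz(r_term_ohm, c_couple_farad):
    """High-pass corner of the termination resistor and coupling capacitor (Eq. 3.1)."""
    return 1.0 / (2.0 * math.pi * r_term_ohm * c_couple_farad)


R_T = 50.0  # termination resistance from the text, in ohms

f_z_1uF = ac_coupling_corner_hz(R_T, 1e-6)      # capacitor value from the text
f_z_100nF = ac_coupling_corner_hz(R_T, 100e-9)  # hypothetical smaller capacitor

print(f"1 uF   -> corner at {f_z_1uF / 1e3:.2f} kHz")
print(f"100 nF -> corner at {f_z_100nF / 1e3:.2f} kHz")
```

The first value comes out at about 3.18 kHz, consistent with the 3.2 kHz figure above; reducing the capacitor by ten raises the corner by the same factor.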
In high speed SERDES, the frequency content of the input data stream does not go that low
because of 8b/10b or 64b/66b coding, which is required to avoid baseline wander due to
long runs of 1s or 0s. AC coupling enables the designer to set the input common-mode
voltage level of the CTLE. The front-end termination block is shown in Figure 3.2.
[Schematic, identical for IN+ and IN-: ≈ 1 µF off-chip capacitor, Cpad = 100 fF, T-coil with 180 pH and 73 pH sections (k = 0.6), Cesd = 70 fF, Rt = 50 Ω, Lt = 500 pH, and the CM bias from the CTLE; the network drives the CTLE inputs.]
Figure 3.2: Front-end termination network.
The signal enters the CMOS die through C4 solder bump pads and then passes
through the T-coil, the ESD cell and other passive circuitry that provide proper termination
to the channel’s characteristic impedance (≈ 50 Ω). The various
components in this network are explained below.
Parasitic capacitors: The bump pad area is ≈ 80 µm x 80 µm. After extraction
from the layout and with some safety margin, the pad capacitance is nearly 100 fF. Additionally,
there is always some level of HBM (Human Body Model), CDM (Charged Device Model)
and MM (Machine Model) ESD protection requirement. Since the input is a high
speed path, there is only a primary ESD diode for protection. The ESD capacitance
is ≈ 70 fF upon extraction. In addition to these parasitic capacitances, there is
additional loading by the CTLE input. These parasitics further degrade the transmission
(S21) response of the channel. However, with the help of the T-coil and another inductor Lt,
we have boosted the signal and provided passive equalization.
Bandwidth enhancement with T-coil: The T-coil is an old technique, developed in the
1920s, for extending the bandwidth of a circuit. The design and usage of the T-coil is well explained
in [21]. Since ESD and bump pads are unavoidable, T-coils have proved to be very
beneficial in extending the bandwidth of the circuit in [22–25]. The circuit diagram
in Figure 3.2 shows the parameters of the asymmetric T-coil chosen for this design.
The use of a T-coil is advantageous because the capacitance associated with the middle node
can be nullified to a certain extent by the mutual coupling between the split inductors. The
functioning of the T-coil network is explained in [23]. The parameters of the T-coil are chosen
such that it provides an adequate magnitude response at the CTLE input in addition to
meeting the termination requirements.
Inductor load, Lt for passive equalization: Research papers generally mention only a T-coil
for bandwidth extension. For data rates exceeding 100
Gbps, we have to provide additional boost by using a passive inductor Lt. The need for
the T-coil as well as the additional inductor Lt can be understood from the waveforms shown in
Figure 3.3. If the channel is terminated with only an ideal 50 Ω resistor
(which matches the characteristic impedance of the channel), then the channel’s
magnitude response shows ≈ 30 dB attenuation at the Nyquist frequency. If the channel is
terminated with a network similar to that shown in Figure 3.2 but without the T-coil and Lt,
then the magnitude response at the CTLE input drops to ≈ -35 dB. Including the T-coil
in that network improves the magnitude response by about 2 dB, and finally with the proposed
architecture (with T-coil and Lt included), the magnitude response improves further by
about 3 dB and becomes almost the same as when the channel is terminated ideally.
[Plot: magnitude response at CTLE input, |H(f)| in dB versus frequency in Hz (10^7 to 10^10). Gain at 28 GHz: channel terminated ideally with Rt = 50 ohms only = -29.67 dB; no T-coil and no Lt but ESD, PAD and RX inputs connected = -35.22 dB; no Lt but T-coil, ESD, PAD and RX inputs connected = -33.29 dB; proposed termination network = -30.63 dB.]
Figure 3.3: Magnitude response from channel input to CTLE input due to various termination
circuits. Proposed architecture behaves closest to the case when the channel is terminated
ideally.
The transfer function of the proposed network is shown in Equation 3.2, where the Lt
terms in the numerator help to provide sufficient gain at high frequencies. The derived
transfer function ignores the parasitic capacitance of the elements in the proposed network.
In addition, the CTLE input impedance is assumed to be much higher than Rt.
The approximated transfer function provides a good understanding of the network. The
magnitude response of this transfer function for different Lt values is shown in Figure 3.4.
The overall impact of the different values of Lt on the channel and the front-end termination
network is shown in Figure 3.5. Finally, a value of Lt = 500 pH is chosen, which provides
the required passive equalization. Care should be taken that the self-resonance frequency of this
inductor is sufficiently higher than the Nyquist frequency. The inductors and T-coils in this
project are laid out in M10-M11, and their extracted models are generated using the
EMX simulation tool.
\[ H(s) = \frac{-C_{esd} L_t M s^3 - C_{esd} M R_t s^2 + L_t s + R_t}{\left(-C_{esd} M^2 + C_{esd} L_1 L_2 + C_{esd} L_1 L_t\right) s^3 + C_{esd} L_1 R_t s^2 + \left(L_1 + L_2 + L_t + 2M\right) s + R_t} \tag{3.2} \]
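Equation 3.2 can be evaluated numerically to see the effect of Lt. The sketch below is only an illustration under stated assumptions: it takes L1 = 180 pH, L2 = 73 pH, k = 0.6, Cesd = 70 fF and Rt = 50 Ω from Figure 3.2, assumes M = k·sqrt(L1·L2) with positive sign, and assumes which T-coil section is L1 and which is L2. Because the equation ignores the pad and CTLE input parasitics, the numbers are not expected to reproduce the simulated curves of Figure 3.4, only the trend.

```python
import math


def front_end_gain_db(freq_hz, l_t, l1=180e-12, l2=73e-12, k=0.6,
                      c_esd=70e-15, r_t=50.0):
    """Magnitude of Equation 3.2 in dB at one frequency.

    The assignment of l1/l2 and the sign of the mutual inductance are
    assumptions made for this sketch.
    """
    s = 1j * 2.0 * math.pi * freq_hz
    m = k * math.sqrt(l1 * l2)  # mutual inductance of the T-coil (assumed form)
    num = -c_esd * l_t * m * s**3 - c_esd * m * r_t * s**2 + l_t * s + r_t
    den = ((-c_esd * m**2 + c_esd * l1 * l2 + c_esd * l1 * l_t) * s**3
           + c_esd * l1 * r_t * s**2
           + (l1 + l2 + l_t + 2.0 * m) * s
           + r_t)
    return 20.0 * math.log10(abs(num / den))


F_NYQUIST = 28e9

# Sweep Lt over the same values as the text (0 to 600 pH).
sweep = {lt_ph: front_end_gain_db(F_NYQUIST, lt_ph * 1e-12)
         for lt_ph in range(0, 700, 100)}

for lt_ph, gain_db in sweep.items():
    print(f"Lt = {lt_ph:3d} pH -> gain at 28 GHz = {gain_db:6.2f} dB")
```

At DC the expression reduces to Rt/Rt, i.e. 0 dB, and the gain at 28 GHz rises as Lt is increased, which is the qualitative behaviour exploited for passive equalization.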
[Plot: AC gain of front-end termination network, |H(f)| in dB versus frequency in Hz (10^9 to 10^10). Gain at 28 GHz: Lt = 0 pH: -4.90 dB; Lt = 100 pH: -4.25 dB; Lt = 200 pH: -3.13 dB; Lt = 300 pH: -1.92 dB; Lt = 400 pH: -0.85 dB; Lt = 500 pH: 0.04 dB; Lt = 600 pH: 0.77 dB.]
Figure 3.4: Magnitude response of front termination network from the channel output to the
CTLE input with variation in Lt for the proposed architecture.
[Plot: magnitude response at CTLE input, |H(f)| in dB versus frequency in Hz (10^7 to 10^10). Gain at 28 GHz: Lt = 0 pH: -33.29 dB; Lt = 100 pH: -33.06 dB; Lt = 200 pH: -32.43 dB; Lt = 300 pH: -31.73 dB; Lt = 400 pH: -31.12 dB; Lt = 500 pH: -30.63 dB; Lt = 600 pH: -30.24 dB.]
Figure 3.5: Magnitude response from channel input to the CTLE input with variation in Lt
for the proposed architecture.
3.3. CTLE design
The analog front end block diagram is shown in Figure 3.6. It includes the front passive
termination block (already discussed), the CTLE core, an output buffer to take the high
speed signals off-chip, and finally a back-end passive network to extend the bandwidth
using a T-coil. The CTLE core consists of three stages, namely a high frequency boost stage,
a mid-band boost stage and a final buffer stage to drive a 100 fF capacitive load. This
capacitive load is the expected load of the front-end sampler of a 64-way time-interleaved
ADC. In a modern SERDES architecture for LR channels an ADC follows the CTLE,
but we have limited our research to the CTLE design only.
The CTLE design is implemented using CMOS tristate inverters as the driver and the active
load in certain stages. The tristate nature of the inverter allows better tunability options,
which are discussed later in this chapter. The CMRR of a CMOS inverter amplifier
(generally CMRR ≤ 0 dB) is worse than that of a current source based differential

[Block diagram: IN+/IN-, analog front-end termination, high-frequency CTLE, mid-band CTLE, buffer stage with load capacitor, output buffer, back-end passives, OUT+/OUT-. Common-mode feedback amplifiers compare the sensed common-mode voltage with the forced common-mode voltage.]
Figure 3.6: Analog front end block diagram showing three stage CTLE.
amplifier (CMRR >> 0 dB, and ideally CMRR = ∞ for an ideal current source). In addition,
this CTLE architecture needs a proper common-mode bias voltage to ensure
adequate biasing for amplification. A feedback loop provides the common-mode
bias voltage and improves the CMRR and PSRR (Power Supply Rejection Ratio).
The benefits and limitations of the feedback loop are discussed in Section 3.3.5.
3.3.1. Basic design elements used in CTLE
The CTLE architecture uses certain basic elements multiple times, which include the CMOS
tristate inverter as an amplifier, a diode-connected load and an active inductor. Inverters
are used extensively in digital circuits; however, they can also be good analog building
blocks provided they are biased properly. The strength of these cells is made tunable
using enable and disable switches.
3.3.1.1. CMOS tristate inverter as a basic amplifier unit
This section discusses the reasons for choosing the CMOS tristate inverter cell
(shown in Figure 3.7) as the basic unit instead of the conventional current source based
differential amplifier (shown in Figure 3.8). The primary advantage of a differential
amplifier is that it has better CMRR than the CMOS inverter operating as
an amplifier. But it also has certain disadvantages, which are discussed in detail later
in this section.
The small-signal model of the circuit shown in Figure 3.8 can be analysed to find the
small-signal DC gain described in Equation 3.3.
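\[ A_v = \frac{V_{out}^{+} - V_{out}^{-}}{V_{in}^{+} - V_{in}^{-}} = \frac{V_{out} - (-V_{out})}{V_{in} - (-V_{in})} = \frac{V_{out}}{V_{in}} = g_{mn} \cdot \left( \frac{Z}{2} \parallel r_{ds,n} \right) \tag{3.3} \]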
Further, since in CMOS FinFET technology the mobility, strength and other parameters
of the PMOS and NMOS are almost the same, it is fair to assume that gmn ≈
gmp = gm and rdsn ≈ rdsp = rds for the NMOS and PMOS respectively. Also, with
the assumption that Z << rds, the overall DC gain of the differential amplifier can be
evaluated using Equation 3.4.
\[ A_v(\text{differential amplifier}) \approx g_m \cdot \frac{Z}{2} \tag{3.4} \]
However, the DC gain for the small-signal model of the CMOS inverter amplifier (shown in
Figure 3.7) is given in Equation 3.5.
\[ A_v(\text{inverter amplifier}) \approx g_m \cdot Z \tag{3.5} \]
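The factor of two between Equations 3.4 and 3.5 can be checked with a small numerical sketch. The values below (gm = 10 mS, Z = 100 Ω, rds = 15/gm) are hypothetical and chosen only for illustration; the inverter expression is an assumed half-circuit in which the NMOS and PMOS transconductances add and both output resistances load the node, following the small-signal model of Figure 3.7.

```python
def parallel(*resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def diff_amp_gain(gm, z, rds):
    """Equation 3.3: one NMOS drives Z/2 in parallel with its own rds."""
    return gm * parallel(z / 2.0, rds)


def inverter_gain(gm, z, rds):
    """Assumed half-circuit: NMOS and PMOS both drive Z/2, loaded by both rds."""
    return 2.0 * gm * parallel(z / 2.0, rds, rds)


GM = 10e-3       # hypothetical transconductance per transistor, in siemens
Z = 100.0        # hypothetical differential load, in ohms
RDS = 15.0 / GM  # output resistance for an assumed self-gain of 15

ratio = inverter_gain(GM, Z, RDS) / diff_amp_gain(GM, Z, RDS)
ideal_ratio = inverter_gain(GM, Z, 1e12) / diff_amp_gain(GM, Z, 1e12)

print(f"gain ratio with finite rds: {ratio:.3f}")
print(f"gain ratio for Z << rds:    {ideal_ratio:.3f}")
```

With finite rds the ratio is slightly below two (about 1.94 for these values) and it approaches exactly two in the limit Z << rds assumed by Equations 3.4 and 3.5.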
Advantages of CMOS inverter as an amplifier over differential amplifier:
It is evident from the DC gain equations that, for the same bias current, the CMOS inverter
cell provides twice the gain of the differential amplifier. This higher gain
results from the transconductance, which is doubled in the CMOS inverter because both
the NMOS and the PMOS amplify the input. Another
limitation of the differential amplifier is that it requires a higher supply voltage to keep its
three stacked transistors in saturation with some margin. The
inverter cell, on the other hand, is a 2-stack transistor architecture and needs less supply voltage for adequate
biasing. With transistor scaling, supply voltages are becoming less than 1 V and
[Schematic and small-signal model of the CMOS tristate inverter: EN-controlled PMOS and NMOS switches (Ron ≈ 0 Ω) in series with the CMOS inverter, inputs +Vin/-Vin, load Z between -Vout and +Vout; equivalent source (gm_n + gm_p)*Vin driving Z, assuming Z << rds_n ≈ rds_p.]
Figure 3.7: Analysis of CMOS tristate inverter cell as an amplifier.
[Schematic and small-signal model of the differential amplifier: inputs +Vin/-Vin, loads Z/2 at -Vout and +Vout; equivalent source gm_n*Vin driving Z, assuming Z << rds_n.]
Figure 3.8: Analysis of CML based differential amplifier.

high-stack architectures are becoming difficult to design and obsolete. These two
are the primary reasons for choosing the inverter cell as a low-power amplifier. In addition,
the enable/disable feature of the tristate CMOS inverter unit provides a tunability knob
which allows a power-scalable design. This has been used extensively throughout
the whole design.
However, the CMOS inverter has some disadvantages in terms of PSRR and CMRR.
Another disadvantage is that the common-mode level at the inverter inputs and outputs
is mid-rail, which is incompatible with many other analog amplifier stages
under low supply voltages, such as the source follower, the common-source stage and most
high-speed comparators.
[Schematics: CMOS tristate inverter-1, with separate EN-controlled PMOS and NMOS switches in each half of the differential cell; CMOS tristate inverter-2, with a single shared PMOS switch and a single shared NMOS switch. Both have inputs +Vin/-Vin and load Z between -Vout and +Vout.]
Figure 3.9: CMOS tristate inverter configurations as an amplifier.
Further, an inverter cell used as an amplifier can itself have different architectures.
Two of them are shown in Figure 3.9. The CMOS tristate inverter-1 architecture provides
lower transconductance gm due to the source degeneration caused by the resistance of
the switch in ON mode. On the other hand, the CMOS tristate inverter-2 architecture
provides higher transconductance gm for the same bias current, as there is no source
degeneration because the common source node is a virtual ground. Hence, the CMOS
tristate inverter-2 architecture has been adopted as the basic unit in this work. The use of the
Inverter-2 architecture is one of the primary differences between this design and that
in [10], where the Inverter-1 architecture was used.
There can also be a third type of amplifier, which is shown in Figure 3.10. In this
configuration, the main driving transistors and the switch transistors are swapped. But this
architecture requires large switch sizes and has more parasitics. Thus, it is not suitable
for our intended application.
[Schematic: CMOS tristate inverter-3, with inputs +Vin/-Vin, outputs -Vout/+Vout and EN-controlled PMOS and NMOS switches swapped in position with the driving transistors.]
Figure 3.10: CMOS tristate inverter architecture with switches away from ends.
Since the inverter driver will be used several times in the design, its symbolic
representation is shown along with its circuit diagram in Figure 3.11 for better understanding.
[Symbol (strength m, inputs +Vin/-Vin, outputs -Vout/+Vout) and circuit diagram: driving transistors with fingers = m, fins = 4, len = 16nm; EN-controlled PMOS and NMOS switches with fingers = 4*m, fins = 8, len = 16nm.]
Figure 3.11: Tristate inverter symbolic representation.
3.3.1.2. Inverter as tunable active resistor: diode-connected load
The diode-connected load acts as an active resistor (shown in Figure 3.12) with an effec-
tive input resistance as given in Equation 3.6.
\[ R_{in}(\text{diode connected load}) = \frac{1}{2(g_m + g_{ds})} \tag{3.6} \]
Generally, the output conductance gds is very small compared to the transconductance
gm. Since the self-gain gm/gds is fairly high (≈ 15) for FinFET technology, the
gds term can be ignored and the final input resistance is given by Equation 3.7.
\[ R_{in}(\text{diode connected load}) \approx \frac{1}{2 g_m} \tag{3.7} \]
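To gauge how good the approximation of Equation 3.7 is, the sketch below compares it with Equation 3.6 for the self-gain of about 15 quoted above. The transconductance value of 10 mS is hypothetical and used only for illustration.

```python
def diode_load_resistance(gm, gds=0.0):
    """Input resistance of the diode-connected inverter load (Eq. 3.6).

    With gds = 0 this reduces to the approximation of Eq. 3.7.
    """
    return 1.0 / (2.0 * (gm + gds))


GM = 10e-3        # hypothetical transconductance per transistor, in siemens
SELF_GAIN = 15.0  # gm/gds quoted in the text for FinFET technology
GDS = GM / SELF_GAIN

r_exact = diode_load_resistance(GM, GDS)  # Equation 3.6
r_approx = diode_load_resistance(GM)      # Equation 3.7

error_percent = 100.0 * (r_approx - r_exact) / r_exact
print(f"exact: {r_exact:.2f} ohm, approximate: {r_approx:.2f} ohm, "
      f"error: {error_percent:.1f} %")
```

For a self-gain of 15 the approximation overestimates the resistance by 1/15, i.e. roughly 6.7 %, which is small enough for first-order sizing.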
This resistor can be tuned by changing the strength of the transistors through series
enable/disable switches. Not only is this resistance tunable, it also acts as a very good
[Symbol (strength m, terminal Vin), circuit diagram and small-signal model of the diode-connected inverter: driving transistors with fingers = m, fins = 4, len = 16nm; EN-controlled switches with fingers = 2*m, fins = 8, len = 16nm; small-signal model with two gm*Vin sources, assuming rds >> 0 and ideal switches.]
Figure 3.12: Tunable active resistor.
biasing circuit. Because of the negative feedback due to the shorted gate and drain of the
transistors, the input node is biased at nearly half of the supply voltage. However, the
bias voltage may differ slightly from mid-rail if the PMOS and
NMOS have different strengths, which is unlikely in the case of FinFETs. This self-bias
feature of the diode-connected load can be exploited to bias the drain node of the inverter
amplifier.
The CMOS inverter (which is a transconductance amplifier) with a self-biased
diode-connected load works as a voltage amplifier. This combination will be used multiple times
throughout the CTLE design. Another benefit of this configuration is that the DC
gain (given in Equation 3.8) is a ratio of transconductances and is therefore PVT insensitive. But these benefits come at the cost of
power. In particular, a small-valued active resistor is very power hungry as it requires a
high gm value.
\[ \text{DC Gain (inverter with diode connected load)} = \frac{g_{m,driver}}{g_{m,\text{diode-connected-load}}} \tag{3.8} \]
3.3.1.3. Inverter as tunable active inductor
An inverter cell can be used as an inductor to reduce the area overhead caused by passive
coil-based inductors. The active inductor architecture used in this design is shown in
Figure 3.13. Ignoring the output conductance term and solving the small-signal
model, the input impedance of the active inductor can be evaluated from Equation 3.9.
[Symbol (strength m, Rfb, terminal Vin), circuit diagram and small-signal model of the active inductor: the diode connection is made through feedback resistors Rfb to the gate nodes Vx; driving transistors with fingers = m, fins = 4, len = 16nm; EN-controlled switches with fingers = 2*m, fins = 8, len = 16nm; small-signal model with Cgs at each gate and gm*Vx sources, assuming rds >> 0 and ideal switches.]
Figure 3.13: Tunable active inductor.
\[ Z_{in}(s) = \frac{1}{2} \cdot \frac{1 + s R_{fb} C_{gs}}{g_m + s C_{gs}} \tag{3.9} \]
The Rfb term, together with Cgs, low-pass filters the gate voltage and creates a zero in the impedance, which gives an inductive
boost. But there also exists a parasitic pole, which limits the usable bandwidth of this
inductor. Moreover, with the inclusion of other transistor parasitics, the bandwidth
extension provided by this inductor decreases even further. If we assume that the sCgs term in the
denominator is negligible compared to gm, the input impedance simplifies to Equation 3.10, which can be
written as Equation 3.11 with ωt = gm/Cgs.
\[ Z_{in}(s) \approx \frac{1}{2} \cdot \frac{1 + s R_{fb} C_{gs}}{g_m} \tag{3.10} \]
\[ Z_{in}(s) \approx \frac{1}{2 g_m} + s \frac{R_{fb}}{2 \omega_t} \tag{3.11} \]
Limitation on using large Rfb value: Looking at Equation 3.11, one could argue
that by increasing Rfb indefinitely a very large inductance can be obtained, but
there is a limit to that. It can be explained through a case study in which
a CMOS inverter acting as an amplifier drives an active inductor together with a capacitive load
CL, as shown in Figure 3.14.
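[Circuit diagram and small-signal model: inverter driver (gm_driver) from Vin to Vout, loaded by the active inductor (gm) and CL; in the small-signal model the current source gm_driver*Vin drives Z = (1 + s*Rfb*Cgs) / (2*(gm + s*Cgs)) in parallel with CL.]
Figure 3.14: Inverter amplifier drives active inductor load along with capacitive load CL.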
Solving the small-signal model of the circuit given in the figure, the transfer function
of the inverter driving an active inductor can be evaluated using Equation 3.12. This
equation also has a term dependent on Rfb in the denominator, which indicates that
a large value of Rfb can reduce the gain at high frequencies. The Rfb value also has
implications for thermal noise [19]. The value of Rfb is a trade-off between noise and
bandwidth requirements, and thus it needs to be optimized under stringent requirements
[19].
\[ H(s) = \frac{g_{m,driver}}{2 g_m} \cdot \frac{1 + s R_{fb} C_{gs}}{1 + \dfrac{s (C_L + 2 C_{gs})}{2 g_m} + \dfrac{s^2 R_{fb} C_{gs} C_L}{2 g_m}} \tag{3.12} \]
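The interplay of Rfb, Cgs and CL in Equations 3.11 and 3.12 can be explored numerically. The following sketch uses entirely hypothetical element values (gm_driver = 20 mS, gm = 5 mS, Cgs = 10 fF, CL = 50 fF) and illustrative function names; it is meant for building intuition, not as the sizing used in this design.

```python
import math


def effective_inductance(gm, r_fb, c_gs):
    """Inductive part of Eq. 3.11: L = Rfb / (2*wt) with wt = gm / Cgs."""
    return r_fb * c_gs / (2.0 * gm)


def stage_gain(freq_hz, gm_driver, gm, r_fb, c_gs, c_l):
    """Complex gain of the inverter driving the active inductor and CL (Eq. 3.12)."""
    s = 1j * 2.0 * math.pi * freq_hz
    num = 1.0 + s * r_fb * c_gs
    den = (1.0
           + s * (c_l + 2.0 * c_gs) / (2.0 * gm)
           + s**2 * r_fb * c_gs * c_l / (2.0 * gm))
    return gm_driver / (2.0 * gm) * num / den


# Hypothetical element values, for illustration only.
GM_DRIVER, GM, C_GS, C_L = 20e-3, 5e-3, 10e-15, 50e-15

for r_fb in (0.0, 500.0, 2000.0):
    l_eff = effective_inductance(GM, r_fb, C_GS)
    gain_28g = abs(stage_gain(28e9, GM_DRIVER, GM, r_fb, C_GS, C_L))
    print(f"Rfb = {r_fb:6.0f} ohm: L_eff = {l_eff * 1e12:6.0f} pH, "
          f"|H| at 28 GHz = {20.0 * math.log10(gain_28g):5.2f} dB")
```

Sweeping `r_fb` in this way shows how the zero and the second-order denominator term move together, which is the trade-off discussed above; the DC gain stays at gm_driver/(2 gm) regardless of Rfb.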
In this work, in addition to tunable transistors, Rf b is also tunable through series tran-
sistor switches, which provides flexibility to tune the inductor value for a given bandwidth
requirement. However, the increased circuitry (due to tunability) has an adverse effect
on the circuit bandwidth due to the additional parasitics.
3.3.2. High-frequency boost stage (HF-CTLE)
This is the first stage of the CTLE and it is responsible for the major high frequency peak-
ing. The architecture consists of a combination of a high-pass filter and inverter based
active circuit for amplification as shown in Figure 3.15. The input signal is amplified in
two different frequency ranges. The high-pass filter path is responsible for amplification
in high frequency range only, whereas the DC coupled path is responsible for DC as well
as high-frequency gain. This is followed by an active inductor which provides bandwidth
extension. All the active elements are tunable due to the inverter’s tristate nature.
[Schematic, identical for IN+ and IN-: a DC-coupled inverter with m = 4 (gm_dc) and, through the high pass filter (Cz_hf = 85 fF, Rz_hf = 65 Ω, Lz_hf = 450 pH), an inverter with m = 30 (gm_hf); each output (OUT-, OUT+) is loaded by a cell with m = 8 (gm_loadhf). The common-mode bias of the filter comes from the feedback loop.]
Figure 3.15: High frequency boost stage.
The front high-pass filter uses a MOM capacitor, which is more linear and has less parasitics
than a MOS capacitor. A tunable resistance is realized by using series
transistor switches. The switches are large to ensure a small ON resistance, which adds
parasitics. As a remedy, it is better to place these switches close to the common-mode
node, so that there are less parasitics in the main differential signal path.
A series-peaking inductor is also used to reduce the attenuation due to the parasitic
capacitors at the output of this filter. The impact of this passive filter is clearly
visible in the magnitude response of the transfer function of this circuit, shown in
Figure 3.16. Larger Lzhf values provide better gain at high frequency but also lead to a sharp
roll-off in the magnitude response, which can cause undershoots in the pulse response
and eventually degrades the horizontal margins in the eye diagram. So, an optimum inductor
value of 450 pH is chosen to meet the design requirement.
[Plot: HF filter AC gain, |H(f)| in dB versus frequency in Hz (10^9 to 10^10), for Lz_hf = 0 pH, 150 pH, 300 pH, 450 pH and 600 pH.]
Figure 3.16: Large Lzhf value helps to have better filter gain at high frequencies.
Upon ignoring the capacitive load and the inductor effect, the transfer function of this
CTLE stage can be given by Equation 3.13. Passive inductor and parasitic capacitors
are not considered while deriving this equation in order to have better visibility of the
circuit’s poles and zeros. The zero location of this CTLE stage is given by Equation 3.14
and the pole location is given by Equation 3.15.
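\[ H(s) = \frac{1}{g_{m,loadhf}} \cdot \frac{g_{m,dc} + s\,(g_{m,dc} + g_{m,hf})\,R_{z,hf} C_{z,hf}}{1 + s\,R_{z,hf} C_{z,hf}} \tag{3.13} \]
\[ \omega_z = \frac{1}{\left(1 + \frac{g_{m,hf}}{g_{m,dc}}\right) R_{z,hf} C_{z,hf}} \tag{3.14} \]
\[ \omega_p = \frac{1}{R_{z,hf} C_{z,hf}} \tag{3.15} \]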
Looking at these equations, the pole and zero locations, as well as the DC gain of the
high-frequency CTLE stage very much depend on the transconductance of transistors. So,
most of the tunability requirement is achieved by tuning the strength of the transistors.
However, it also changes either the DC or the high-frequency gain along with it. On the
other side, the tunability in the resistor will just change the pole and the zero location
without affecting the gain levels. The input bias of the inverter amplifiers is provided
by the common-mode node of the high-pass filter, which is controlled by a feedback loop
and will be discussed later in this chapter.
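As a rough numerical check of Equations 3.14 and 3.15, the sketch below evaluates the pole and zero frequencies with the component values labelled in Figure 3.15. The transconductance ratio is not stated in the text; the sketch assumes that gm scales with the inverter multiplier m, so gm_hf/gm_dc is approximated by 30/4. The function and constant names are illustrative only.

```python
import math

# Component values labelled in Figure 3.15.
R_Z_HF = 65.0      # ohms
C_Z_HF = 85e-15    # farads

# ASSUMPTION (not stated in the text): gm scales with the inverter
# multiplier m, so gm_hf / gm_dc is approximated by m_hf / m_dc = 30 / 4.
GM_HF_OVER_GM_DC = 30 / 4


def hf_stage_pole_hz(r_ohm, c_farad):
    """Pole frequency of Equation 3.15, converted from rad/s to Hz."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def hf_stage_zero_hz(r_ohm, c_farad, gm_hf_over_gm_dc):
    """Zero frequency of Equation 3.14, converted from rad/s to Hz."""
    return hf_stage_pole_hz(r_ohm, c_farad) / (1.0 + gm_hf_over_gm_dc)


f_pole = hf_stage_pole_hz(R_Z_HF, C_Z_HF)
f_zero = hf_stage_zero_hz(R_Z_HF, C_Z_HF, GM_HF_OVER_GM_DC)
# Ideal boost between DC and high frequency equals the pole/zero ratio.
boost_db = 20.0 * math.log10(f_pole / f_zero)

print(f"pole  ~ {f_pole / 1e9:.1f} GHz")
print(f"zero  ~ {f_zero / 1e9:.1f} GHz")
print(f"boost ~ {boost_db:.1f} dB (ideal, parasitics and inductor ignored)")
```

With these values the pole lands close to the 28 GHz Nyquist frequency. Changing only `R_Z_HF` moves the pole and the zero together and leaves the ratio between them untouched, which is the behaviour described above for resistor tuning.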
3.3.3. Mid-band gain stage
This is the second stage in the CTLE and is responsible for equalization in the low to mid-frequency range. Before diving into its circuit, it is better to first explain the need for MF-CTLE (mid-band CTLE).
3.3.3.1. Need for MF-CTLE
Traditionally, the CTLE contained only one high-frequency boost stage. As the Nyquist frequency of operation increases, however, the high-frequency CTLE stage alone is not sufficient for equalization in the low to mid-frequency range [26]. This is largely because the channel’s magnitude response is not steep in the low to mid-frequency range (due to skin loss), whereas the high-frequency equalizer is designed to have a +20 dB/decade slope. So, there is a need for an equalizer that better matches the gentle slope of the channel’s frequency response in the mid-frequency region.
MF-CTLE is also called the long-tail ISI equalizer because it eliminates several post-
cursors in the CTLE pulse-response [27] and thus removes the long-tail ISI. It is reported
that to accomplish the same task delivered by MF-CTLE for cancelling the post-cursors, a
multi-dozen-tap DFE/FFE would be required [27]. The inclusion of this block reduces the
complexity for the remaining equalization blocks. In order to have a better understanding
regarding the need for MF-CTLE, a MATLAB based model of CTLE (with and without
mid-band stage) is created for the same LR channel. The model transfer function is
shown in Equation 3.16. The CTLE model parameters are tabulated in Table 3.1.
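\[ A_v(\text{CTLE model}) = k \cdot \left(\frac{1 + \frac{s}{f_{z,hf}}}{1 + \frac{s}{f_{p,hf}}}\right) \cdot \left(\frac{1 + \frac{s}{f_{z,mf}}}{1 + \frac{s}{f_{p,mf}}}\right) \cdot \frac{1}{\left(1 + \frac{s}{f_{p,parasitics}}\right)^{3}} \tag{3.16} \]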
Table 3.1: CTLE model parameters in MATLAB.
Stage                        fz (zero)   fp (pole)   fp,parasitics (pole)   fz (zero due to inductor)
High-frequency boost stage   3 GHz       28 GHz      42 GHz                 28 GHz
Mid-frequency boost stage    150 MHz     200 MHz     42 GHz                 -
Buffer stage                 -           -           42 GHz                 -
• CTLE with mid-band stage has cascade of HF boost stage, MF-boost stage and buffer
stage with overall DC gain = -10 dB.
• CTLE without mid-band stage has cascade of HF boost stage and buffer stage with
overall DC gain = -10 dB.
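The two model variants above can be sketched in a few lines of Python. This is not the MATLAB model used in the thesis; it is a minimal re-implementation of Equation 3.16 with the Table 3.1 values, under two stated assumptions: each s/f term is evaluated as j·f/f_x, and the variant without the mid-band stage has two parasitic poles (HF stage and buffer) instead of three. The inductor zero listed in Table 3.1 does not appear in Equation 3.16 and is left out.

```python
import math

# Parameters from Table 3.1 (Hz).
F_Z_HF, F_P_HF = 3e9, 28e9
F_Z_MF, F_P_MF = 150e6, 200e6
F_P_PARASITIC = 42e9
DC_GAIN_DB = -10.0


def ctle_gain_db(freq_hz, with_midband=True):
    """Magnitude of the Equation 3.16 model in dB at freq_hz.

    ASSUMPTIONS: s / f_x is evaluated as j * freq_hz / f_x, and the
    variant without the mid-band stage has two parasitic poles.
    """
    def term(f_corner):
        return complex(1.0, freq_hz / f_corner)

    k = 10.0 ** (DC_GAIN_DB / 20.0)
    h = k * term(F_Z_HF) / term(F_P_HF)
    n_stages = 2
    if with_midband:
        h *= term(F_Z_MF) / term(F_P_MF)
        n_stages = 3
    h /= term(F_P_PARASITIC) ** n_stages
    return 20.0 * math.log10(abs(h))


for f in (0.0, 100e6, 1e9, 28e9):
    print(f"{f:.3g} Hz: with mid-band {ctle_gain_db(f, True):6.2f} dB, "
          f"without {ctle_gain_db(f, False):6.2f} dB")
```

Both variants start from the same -10 dB DC gain; the mid-band zero/pole pair adds a small boost (bounded by the 200 MHz / 150 MHz ratio) from a few hundred MHz upward.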
The goal of equalization is a flat magnitude response up to the Nyquist frequency. Figure 3.17 illustrates how a CTLE with a mid-band stage equalizes better, up to a higher frequency, than one without the mid-band stage. Generally, the mid-band stage provides a very small amount of boost in the low frequency range, which is achieved by a circuit with closely spaced zero and pole locations, as shown in the CTLE model parameters table. It can be inferred that without the mid-band CTLE, the equalized channel magnitude response is not completely flat. This model provides equalization only to a certain extent, and the remaining equalization is expected to be achieved by RX-FFE and RX-DFE.
[Figure 3.17 plot: magnitude response of channel and CTLE, |H(f)| in dB versus frequency in Hz (log scale, 10^8 to 10^10). Curves: Channel, Channel+CTLE with mid-band stage, Mid-band stage CTLE, Channel+CTLE without mid-band stage.]
Figure 3.17: Mid-band CTLE improves equalization in terms of magnitude response.
In order to understand long-tail cancellation effect, there is a need to examine the
CTLE’s pulse response and eventually the eye diagram. The pulse response (for pulse
width corresponding to 56 Gbps NRZ data rate) for the channel and the CTLE (with
and without MF-CTLE) is shown in Figure 3.18.
[Figure 3.18 plot: pulse response versus time in seconds (6.7 to 7.1 × 10^-9). Curves: Channel, CTLE with mid-band stage, CTLE without mid-band stage. PMR annotations on the plot: 4.74, 2.06, 2.20.]
Figure 3.18: Mid-band CTLE has less post-cursor ISI with reference to the main-cursor.
Main cursor corresponds to the main signal while the pre and post cursors are residues
in other bits. Cursors except the main cursor are responsible for ISI and are thus unde-
sirable. MF-CTLE especially provides a better attenuation for the case of post-cursors.
If the main cursor is treated as the main signal and other cursors as ISI, we can charac-
terize the pulse response in terms of Signal to ISI ratio (signal to noise ratio) given by
Equations 3.18 and 3.20.
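\[ \text{Signal to ISI ratio}_{\text{pre-cursors}} = \frac{\text{Power of main-cursor}}{\sum \text{Power of pre-cursors}} \tag{3.17} \]
\[ \text{Signal to ISI ratio}_{\text{pre-cursors}} = \frac{a_{main}^{2}}{\sum a_{pre-cursors}^{2}} \tag{3.18} \]
\[ \text{Signal to ISI ratio}_{\text{post-cursors}} = \frac{\text{Power of main-cursor}}{\sum \text{Power of post-cursors}} \tag{3.19} \]
\[ \text{Signal to ISI ratio}_{\text{post-cursors}} = \frac{a_{main}^{2}}{\sum a_{post-cursors}^{2}} \tag{3.20} \]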
The Signal to post-cursor ISI ratio of the pulse response for an ideal n-tap DFE is shown in Figure 3.19. An n-tap DFE (or "n" skipped cursors) eliminates the first "n" post-cursors. The plot shows that the Signal to ISI ratio improves as the number of taps increases, i.e. as the number of unwanted cursors decreases. The key point of this plot is that the CTLE with the mid-band stage has a significantly higher Signal to ISI ratio for the post-cursors; it does not have much impact on the pre-cursors (this can be verified but is not shown in the plot). This indicates better long-tail cancellation in the case of MF-CTLE. The plot can be used to find the number of DFE taps required to meet a target Signal to post-cursor ISI ratio and hence BER. It can be inferred from the plot that with MF-CTLE, fewer taps are needed to achieve the same BER target.
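The ideal n-tap DFE analysis can be sketched as follows. The cursor amplitudes in this example are hypothetical and are not taken from the simulated pulse responses; the function names are illustrative. The ratio is expressed in dB as 10·log10 of the power ratio, which is an assumption about how the dB axis of Figure 3.19 is defined.

```python
import math


def signal_to_isi_db(main_cursor, isi_cursors):
    """Main-cursor power over summed ISI power (Equations 3.18 / 3.20), in dB."""
    isi_power = sum(a * a for a in isi_cursors)
    if isi_power == 0.0:
        return math.inf
    return 10.0 * math.log10(main_cursor ** 2 / isi_power)


def post_cursor_ratio_with_dfe(main_cursor, post_cursors, n_taps):
    """Ideal n-tap DFE: the first n_taps post-cursors are removed entirely."""
    return signal_to_isi_db(main_cursor, post_cursors[n_taps:])


# HYPOTHETICAL cursor amplitudes in volts, for illustration only.
main = 0.10
post = [0.030, 0.015, 0.010, 0.006, 0.004, 0.002]

for n in range(len(post)):
    ratio = post_cursor_ratio_with_dfe(main, post, n)
    print(f"{n}-tap DFE: signal to post-cursor ISI ratio = {ratio:.1f} dB")
```

Sweeping `n_taps` until the ratio crosses a target value gives the tap count needed for that target, which is how the plot is read in the text.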
Finally, the eye diagram has been created for 50k bits at 56 Gbps NRZ data rate, as shown in Figures 3.20, 3.21, 3.22 and 3.23. In this simulation, the channel input receives a 1 Vpp differential signal swing and noise is not considered. With MF-CTLE, the eye diagram has better-defined DC voltage levels, as highlighted by the green boxes. Also, the horizontal and vertical eye openings are better for the center eye, as shown in Figures 3.22 and 3.23.
[Figure 3.19 plot: Signal to ISI ratio analysis for post-cursors in the pulse response. Signal to ISI ratio in dB (15 to 35) versus ideal n-tap DFE, i.e. number of skipped post-cursors (0 to 16). Curves: With midband CTLE, Without midband CTLE.]
Figure 3.19: Signal to post-cursor ISI ratio for CTLE pulse response.
Figure 3.20: Eye diagram at the output of the CTLE without mid-band stage. Green-box
highlights more ISI due to the long-tail in the pulse response.
Figure 3.21: Eye diagram at the output of the CTLE with mid-band stage. Green-box
highlights lesser ISI due to the long-tail in the pulse response.
Figure 3.22: Center eye opening at the output of the CTLE without mid-band stage. (Vertical
eye opening = 26 mVpp and horizontal eye opening = 42 % UI )
Figure 3.23: Center eye opening at the output of the CTLE with mid-band stage. (Vertical
eye opening = 60 mVpp and horizontal eye opening = 57 % UI )
3.3.3.2. MF-CTLE design
The architecture for MF-CTLE is shown in Figure 3.24. It has a high pass filter after the inverter amplifiers, unlike HF-CTLE, where the filter precedes the amplifier. Similar to HF-CTLE, MF-CTLE amplifies in two different frequency regions: the low frequency region and the mid-to-high frequency region. The top half in Figure 3.24 is responsible for wide-band amplification, while the bottom half provides gain only in the mid-to-high frequency range. The capacitor in the filter is tunable through a series transistor switch. The combination of Rd_mf and Cz mainly decides the zero location for this CTLE. There are active inductors in the wide-band gain path for bandwidth enhancement.
[Figure 3.24 schematic. Labelled values: Cz = 400 fF, 660 Ω; inverter multipliers m = 10 (gm_dc), m = 5 (gm_mf), m = 11 (Rd_dc), m = 2 (Rd_mf).]
Figure 3.24: Mid-band gain stage.
Upon ignoring the capacitive load and the effect of the active inductor, the transfer function for this gain stage is given in Equation 3.21 and the design parameters are given in Equations 3.22, 3.23 and 3.24.
[Figure 3.25 schematic: current sources gm_dc*Vin and gm_mf*Vin driving Vout, with Cz, Rd_dc, Rd_mf and Ld_dc.]
Figure 3.25: Small-signal gain circuit for half-circuit of mid-band gain stage.
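\[ H(s) = -\frac{g_{m,dc} R_{d,dc} \left[1 + s\,C_z R_{d,mf} \left(1 + \frac{g_{m,mf}}{g_{m,dc}}\right)\right]}{1 + s\,C_z (R_{d,mf} + R_{d,dc})} \tag{3.21} \]
\[ \text{DC gain} = g_{m,dc} R_{d,dc} \tag{3.22} \]
\[ \omega_z = \frac{1}{C_z R_{d,mf} \left(1 + \frac{g_{m,mf}}{g_{m,dc}}\right)} \tag{3.23} \]
\[ \omega_p = \frac{1}{C_z (R_{d,mf} + R_{d,dc})} \tag{3.24} \]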
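A short worked limit makes the role of the two gain paths in Equation 3.21 easier to see. The only assumption is that Equation 3.21 holds as written (capacitive load and active inductor ignored); the symbol H(∞) is introduced here purely for illustration.

```latex
% High-frequency limit of Equation 3.21 (s -> infinity):
\[ |H(\infty)| = g_{m,dc} R_{d,dc} \cdot \frac{R_{d,mf}\left(1 + \frac{g_{m,mf}}{g_{m,dc}}\right)}{R_{d,mf} + R_{d,dc}}
             = (g_{m,dc} + g_{m,mf}) \cdot \frac{R_{d,mf} R_{d,dc}}{R_{d,mf} + R_{d,dc}} \]
% Ratio of high-frequency gain to DC gain (Equation 3.22), which equals
% the pole-to-zero ratio from Equations 3.23 and 3.24:
\[ \frac{|H(\infty)|}{|H(0)|} = \frac{\omega_p}{\omega_z}
   = \frac{R_{d,mf}\left(1 + \frac{g_{m,mf}}{g_{m,dc}}\right)}{R_{d,mf} + R_{d,dc}} \]
```

In this simplified picture, both transconductors drive the parallel combination of the two loads once Cz acts as a short, and the stage provides boost only when the ratio above exceeds one, i.e. when R_{d,mf} g_{m,mf} > R_{d,dc} g_{m,dc}.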
3.3.4. Buffer stage
This stage is a wide-band amplifier and does not provide any equalization, rather its
purpose is to drive the 100 fF capacitive load of the CTLE. This stage reduces the
capacitive load for MF-CTLE by providing sufficient isolation from the main load. The
architecture of the buffer stage is shown in Figure 3.26.
[Figure 3.26 schematic. Labelled values: 45 Ω load resistors, 135 pH inductors, inverter multiplier m = 33 (gm_buf), CM_SENSE node (for feedback loop).]
Figure 3.26: Buffer stage.
The drivers in this stage are strong and require a resistive load of ∼ 40 Ω. Unlike the other stages, this stage does not use an active load, since realizing an active resistance of 40 Ω would be power hungry. The load also includes a passive inductor for bandwidth enhancement. Ignoring the capacitive load and the effect of the inductor, a simplified form of the transfer function of this stage is given in Equation 3.25.
\[ H(s) = g_{m,buf} R_L \tag{3.25} \]
3.3.5. CMFB (Common Mode Feedback Loop)
The CMRR analysis for this CTLE is shown in Table 3.2. This information signifies
how a low frequency differential and common mode signal will be amplified through each
CTLE stage. In an ideal scenario, it is not desired that the common mode signal is
amplified. A low CMRR value implies that a common-mode noise can have significant
impact on the differential signals.
Table 3.2: CMRR analysis for each stage in the CTLE.
Stage                               Differential DC gain (in dB)   Common-mode DC gain (in dB)   CMRR = Differential gain / Common-mode gain (in dB)
HF stage (after high pass filter)   9.3                            9.3                           0
MF stage                            -1.5                           -1.5                          0
Buffer stage                        0.5                            23                            -22.5
CTLE (post HPF to end)              ∼ 8.3                          ∼ 30.8                        ∼ -22.5
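The last row of Table 3.2 follows from the per-stage rows, since gains of cascaded stages add in dB and a ratio becomes a difference. A minimal sketch of that arithmetic, using only the tabulated values (the dictionary layout and names are illustrative):

```python
# Low-frequency gains per stage from Table 3.2, in dB.
stages = {
    "HF stage (after high pass filter)": {"diff_db": 9.3, "cm_db": 9.3},
    "MF stage": {"diff_db": -1.5, "cm_db": -1.5},
    "Buffer stage": {"diff_db": 0.5, "cm_db": 23.0},
}


def cascade_db(key):
    """Gains of cascaded stages add when expressed in dB."""
    return sum(stage[key] for stage in stages.values())


total_diff_db = cascade_db("diff_db")
total_cm_db = cascade_db("cm_db")
# CMRR is a ratio of gains, so in dB it is a difference.
cmrr_db = total_diff_db - total_cm_db

print(f"differential gain: {total_diff_db:.1f} dB")
print(f"common-mode gain:  {total_cm_db:.1f} dB")
print(f"CMRR:              {cmrr_db:.1f} dB")
```

The buffer stage alone accounts for the whole CMRR degradation, because the first two stages amplify differential and common-mode signals equally.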
In addition, if there is a small DC offset at the input common-mode node, that offset is amplified and disturbs the biasing of the following stages. So, there is a need to regulate the common-mode voltage. The third stage of the CTLE has a much lower CMRR than the other stages. The analysis shows that this is because its common-mode gain is equal to the intrinsic gain of the transistor. This is caused by the passive load, which is not the case with the first two stages. To overcome these issues, a CMFB loop architecture is implemented, as shown in Figure 3.27.
This architecture ensures a better common-mode bias (nearly half of the supply voltage) and better PSRR (power supply rejection ratio) for supply noise up to ∼ 100 MHz, a limit set by the finite loop bandwidth. The negative feedback loop consists of five negative-polarity inverter stages: three are part of the CTLE core and the remaining two are in the feedback path. The open-loop DC gain is ≈ 60 dB with a phase margin of ≈ 80°, which is sufficient for the stability of the loop. The stability is achieved by using a 4.7 pF compensation capacitor in the feedback path, as shown in Figure 3.27. This loop has a gain cross-over frequency of ≈ 600 MHz.

[Figure 3.27 block diagram. Signal path from IN+/IN- to OUT+/OUT-: front-end termination block, HF-CTLE, MF-CTLE, buffer, load capacitor. Common-mode feedback amplifiers (m = 4, m = 2, m = 1) with a 4.7 pF capacitor connect the sensed common-mode voltage to the forced common-mode voltage.]
Figure 3.27: CTLE architecture highlighting common-mode feedback loop.
3.3.6. Complete CTLE architecture
The complete CTLE architecture containing the front-end termination block, HF-CTLE,
MF-CTLE, buffer stage and feedback loop is shown in Figure 3.28.

[Figure 3.28 schematic. Labelled values include: front-end termination with Cpad = 100 fF, Cesd = 70 fF, 180 pH and 73 pH inductors (k = 0.6), 500 pH, 50 Ω; HF-CTLE with 85 fF, 65 Ω, 450 pH and multipliers m = 30, m = 4, m = 8; MF-CTLE with 400 fF, 660 Ω and multipliers m = 10, m = 5, m = 11, m = 2; buffer with m = 33, 45 Ω, 135 pH; CMFB with m = 4, m = 2, m = 1 and 4.7 pF.]
Figure 3.28: Complete CTLE core architecture.
3.3.7. Output buffer and back-end passive network
For the testability of the CTLE, an output buffer stage is required to carry the CTLE output signals to the outside world (including external test equipment such as an oscilloscope and a network analyser) without any degradation. The output stage has to be co-designed with the oscilloscope’s input impedance (≈ 50 Ω, with parasitic capacitance that limits its bandwidth for characterization). The overall architecture for this stage is shown in Figure 3.29. The buffer uses a high supply voltage of 1.2 V to ensure a good Vds saturation margin while providing a large gm (and hence gain) from a single-stage amplifier. It consists of a passive level shifter circuit, a CML based differential amplifier as buffer, a T-coil, an ESD cell and bump pads. Generally, the output bumps are located far away from the output buffer and require transmission-line modelling for the routing. The CML based architecture is chosen here for better CMRR.
[Figure 3.29 schematic. Labelled values: Vdd = 1.2 V; 65 Ω load resistors with 300 pH inductors; input transistors m = 64, fins = 8, l = 16 nm; current-source transistors m = 160 and m = 2, fins = 8, len = 36 nm, fed by EXT_BIAS_CURRENT; level shifter with 1.2 pF series capacitors and 513 kΩ resistors to VCM_BIAS; long route transmission line; back-end passive circuit with 213 pH and 63 pH inductors (k = 0.53), Cesd = 70 fF, Cpad = 100 fF; oscilloscope model with 50 Ω and ~ 40 fF.]
Figure 3.29: CML based output buffer to drive the back-end passive network and the outside world.
Level Shifter: The common-mode voltage of the CTLE output is ≈ 0.4 V, i.e. half of the supply voltage. This input bias voltage is suitable for neither NMOS nor PMOS based input transistors when the biasing, linearity and gain requirements have to be met. So, a common-mode voltage of 600 mV is provided through this passive high pass filter. The cut-off frequency of the filter is ≈ 250 kHz, which is low enough to pass all the high frequency signals required for the high data rate operation. The lowest frequency present in the input data stream is defined by 8b/10b or 64b/66b encoding for 56 Gbps symbol rate. However, the parasitics of this capacitor can significantly impact the bandwidth of the preceding CTLE. So, it is realized using a MOM capacitor, as it has less parasitic capacitance to the substrate. The output buffer runs from a 1.2 V power supply, which is different from that of the CTLE.
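The quoted cut-off frequency can be cross-checked against the level-shifter component values labelled in Figure 3.29. The sketch treats the level shifter as an ideal first-order RC high pass filter, which is an assumption: the gate capacitance of the buffer input and the CTLE output impedance are ignored. The function name is illustrative.

```python
import math

# Level-shifter values labelled in Figure 3.29.
C_SERIES = 1.2e-12   # farads
R_BIAS = 513e3       # ohms


def rc_highpass_cutoff_hz(r_ohm, c_farad):
    """-3 dB corner of an ideal first-order RC high pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


f_cutoff = rc_highpass_cutoff_hz(R_BIAS, C_SERIES)
print(f"level-shifter cut-off ~ {f_cutoff / 1e3:.0f} kHz")
```

The result is within a few percent of the ≈ 250 kHz figure quoted in the text.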
CML buffer: This block is responsible for driving the off-chip load. A CML based design approach has been adopted for the buffer instead of an inverter based amplifier. The CML amplifier has inherently better CMRR, which provides good immunity against common-mode noise, including supply/ground noise. This is not the case for inverter based amplifiers. Also, power consumption is not a significant parameter for this block because, in a typical SERDES system, there is no output buffer but rather an ADC after the CTLE stage. The CML based buffer introduces fewer non-idealities to the main differential signals.
Regarding the design of this differential amplifier, the tail current-source transistors have a larger length to ensure a sufficiently high value of rds. This gives better CMRR as well as more accurate mirroring of the current source from the diode-connected load. The load resistance is chosen to be ≈ 65 Ω, so that the effective output impedance of this amplifier matches the ≈ 50 Ω characteristic impedance of the on-chip transmission line to the output pins. Additionally, there is shunt-peaking to extend the bandwidth. The bias current source and the bias voltage of the level shifter are controlled externally.
Backend passive network: It consists of ESD cells, bump pads and T-coils. This T-coil is again asymmetric in nature and serves the same purpose as the front-end T-coil. Finally, there is an oscilloscope probe at the output for the off-chip measurements.
3.3.8. Top-level layout
The layout for the top-level design including the CTLE and the output buffer stage with
back-end termination is shown in Figure 3.30. The CTLE design (including the front-end
termination block) takes an area of ≈ 140 µm x 140 µm. The output buffer stage along
with the back-end termination takes an area of ≈ 90 µm x 140 µm. The majority of
the area is consumed by the T-coils, passive inductors, ESD cells and the compensation
capacitor in the feedback loop. The implementation of the active inductors in MF-CTLE
and HF-CTLE allows for miniaturization in area. There are certain things including
metal-filling, instantiating power clamps, ESD cells, and certain low speed tunability
circuit, which remain pending due to time-constraints.

[Figure 3.30 layout. Labelled blocks: front-end termination block, HF-CTLE, MF-CTLE, buffer, CMFB, output buffer, back-end termination.]
Figure 3.30: Top-level layout.

4 Extracted Simulation Results
This chapter presents the extracted layout results of the analog front end of the receiver, consisting of the front-end termination block, HF-CTLE, MF-CTLE, buffer stage and the output buffer stage. The CTLE system was designed in 16 nm FinFET technology and tested for certain PVT conditions. Unless explicitly specified, the results shown here are for the typical process corner at 80 °C temperature and 0.8 V power supply voltage for the CTLE core.
4.1. Magnitude response
Figure 4.1 shows the magnitude response of the unequalized channel, different stages of
the CTLE and the overall equalized channel. The CTLE has ∼ -9 dB DC gain and a
peak gain of ∼ 8 dB at the Nyquist frequency. The dashed plots show the breakdown of
inner stages of the CTLE. The CTLE achieves peak gain at ∼ 25 GHz.
Figure 4.2 shows the magnitude response of CTLE for linear frequency scale. It
provides better insight about the system in high frequency region of interest.
[Figure 4.1 plot: magnitude response, |H(f)| in dB versus frequency in Hz (log scale, 10^7 to 10^10). Curves: HF-CTLE gain, MF-CTLE gain, Buffer gain, Output buffer gain, CTLE gain, AC response at channel output, Equalized AC response at CTLE output.]
Figure 4.1: CTLE magnitude response.
[Figure 4.2 plot: magnitude response, |H(f)| in dB versus frequency in Hz (linear scale, 0.5 to 5 × 10^10). Curves: HF-CTLE gain, MF-CTLE gain, Buffer gain, Output buffer gain, CTLE gain, AC response at channel output.]
Figure 4.2: CTLE magnitude response for linear frequency scale.

4.2. Pulse response
[Figure 4.3 plot: pulse response, voltage in V (-0.02 to 0.14) versus time in seconds (7.15 to 7.35 × 10^-9). Curves: Channel output, HF-CTLE output, MF-CTLE output, Buffer output, Final CTLE output.]
Figure 4.3: CTLE Pulse response.
Figure 4.3 shows the pulse response of the unequalized channel, the different stages of the CTLE and the overall equalized channel. The equalized channel response has significantly less post-cursor ISI, although, due to the steep roll-off in the magnitude response, some undesirable undershoot (∼ 12 %) degrades the horizontal eye opening margin.
To overcome the undershoot issue, I have added tunability options to the active inductors in the circuit; however, using them adversely impacts the high frequency gain. It can be noticed that the MF-CTLE output has reduced post-cursors relative to the main cursor as compared to the HF-CTLE output. This helps in long-tail ISI cancellation, and its impact on the eye diagram is presented in the Appendix.

4.3. Thermal noise
Figure 4.4 shows the output and input referred noise voltage spectral density for the ana-
log front end (including front termination, HF-CTLE, MF-CTLE and the buffer stage).
The integrated output thermal noise is 2 mVrms . Integrating the input referred noise
voltage spectrum till ∼ 50 GHz (system’s bandwidth) provides integrated input referred
noise voltage of 1.4 mVrms . The thermal noise impacts the SNR and hence impacts BER.
Simulation results yield that the major noise contributors are transistors in HF-CTLE.
Figure 4.5 shows that major noise contribution comes from thermal noise in high
frequency region as area under the curve will be larger in that region.
[Figure 4.4 plot: output noise and input referred noise spectral density versus frequency in Hz (log scale, 10^7 to 10^10). Annotations: integrated output thermal noise = 2.02 mVrms; integrated input referred thermal noise = 1.37 mVrms.]
Figure 4.4: Thermal noise spectral density for CTLE.
[Figure 4.5 plot: output noise and input referred noise spectral density versus frequency in Hz (linear scale, 0.5 to 5 × 10^10). Annotations: integrated output thermal noise = 2.02 mVrms; integrated input referred thermal noise = 1.37 mVrms.]
Figure 4.5: Thermal noise spectral density for CTLE for linear frequency scale.
4.4. Eye diagram for NRZ (PAM-2) data signaling
The eye opening of a PAM-2 signal is larger than that of a PAM-4 signal because of the larger margin between symbol voltage levels. In PAM-4 signaling, adjacent symbols have 1/3 of the margin available in PAM-2 signaling. So, the eye diagrams for both PAM-2 and PAM-4 data modulations are presented in this work to give deeper insight.
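The 1/3 margin can be expressed as a short calculation. The sketch assumes equally spaced symbol levels and the same peak-to-peak swing for both modulations; the function name and the dB penalty figure are illustrative additions, not results from the simulations.

```python
import math


def level_spacing(swing_vpp, n_levels):
    """Spacing between adjacent levels for equally spaced PAM-N symbols."""
    return swing_vpp / (n_levels - 1)


swing = 1.0  # V peak-to-peak differential, as used for the eye diagrams
pam2_spacing = level_spacing(swing, 2)
pam4_spacing = level_spacing(swing, 4)
# Loss of vertical margin when going from PAM-2 to PAM-4, in dB.
penalty_db = 20.0 * math.log10(pam2_spacing / pam4_spacing)

print(f"PAM-2 spacing: {pam2_spacing:.3f} V")
print(f"PAM-4 spacing: {pam4_spacing:.3f} V")
print(f"margin penalty: {penalty_db:.2f} dB")
```

The penalty is 20·log10(3), about 9.5 dB, before any ISI or noise is taken into account.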
The eye diagrams are generated for a pattern of 10k PRBS data symbols from the TX end with an output signal swing of 1 Vpp differential. The eye at the channel output is fully closed even at a lower data rate (40 Gbps NRZ), as shown in Figure 4.6.
Figure 4.6: Eye diagram at the channel output for 40 Gbps NRZ data rate (Pulse PMR = 3.6).
The eye diagram at the final CTLE output for 56, 50 and 40 Gbps NRZ data signaling is shown in Figures 4.7, 4.8 and 4.9 respectively. The eye has a small vertical opening of ≈ 15 mVpp with a horizontal opening of 33 % UI for 56 Gbps data rate, but it improves for 50 Gbps data rate (with a vertical eye opening of ≈ 75 mVpp and a horizontal eye opening of ≈ 58 % UI) because this CTLE provides maximum high frequency gain up to 25 GHz, as shown in Figure 4.1. The eye is wide open for lower data rates because of the lower attenuation offered by the channel at lower frequencies.

Figure 4.7: Eye diagram at the CTLE output for 56 Gbps NRZ data rate (Pulse PMR =
3.14).
Figure 4.8: Eye diagram at the CTLE output for 50 Gbps NRZ data rate (Pulse PMR=2.85).
Figure 4.9: Eye diagram at the CTLE output for 40 Gbps NRZ data rate (Pulse PMR=2.4).
4.5. Eye diagram for PAM-4 data signaling
This section shows eye diagrams at the CTLE output for 20k PRBS PAM-4 data symbols
with an output signal swing of 1 Vpp differential. The eye for 112 Gbps PAM-4 data rate
is completely closed as shown in Figure 4.10. However, there is certain eye opening for
lower data rates (64 and 40 Gbps PAM-4) as shown in Figures 4.11 and 4.12. The eye-
diagram for 40 Gbps data rate shows asymmetric eye opening for middle and edge eyes
because the equalizer needs to be optimized for this data rate.
Figure 4.10: Eye diagram at the CTLE output for 112 Gbps PAM-4 data rate (Pulse
PMR=3.14).
Figure 4.11: Eye diagram at the CTLE output for 64 Gbps PAM-4 data rate (Pulse
PMR=2.18).
Figure 4.12: Eye diagram at the CTLE output for 40 Gbps PAM-4 data rate for default
settings (Pulse PMR=2.05). Equalizer needs to be optimized for this data rate.
4.6. Tunability to track PVT variations
Programmability has been provided in all stages of the CTLE to track PVT variations and to provide a single solution for different channel profiles. Figure 4.13 shows the magnitude response of the CTLE for different extreme tunability options. It may be difficult to infer the role of each tunability knob from this plot, so the impact of each tunability option is demonstrated in the Appendix.
Figure 4.13: Magnitude response coverage due to overall sweep of all tunability knobs (high-
lighted red one is the default setting)
The system has been tested across extreme PVT conditions, as listed in Table 4.1. Figures 4.14-4.18 demonstrate the PVT tracking ability of the CTLE for the chosen PVT conditions. The default tunability setting (the setting for the nominal process) does not provide optimum performance across different PVT conditions. The tunability options provided in the design are able to match the nominal corner performance across different PVT conditions up to ∼ 10-15 GHz. Beyond this frequency, the circuit bandwidth is limited by the process and may require more transistor fingers in the case of the SS process. Adding fingers would introduce a lot of parasitic capacitance and was hence not attempted.
Table 4.1: PVT corners list with details.
Corner           Process          Supply voltage (Volts)   Temperature (°C)
Nominal corner   typical          0.8                      80
Strong corner    FF (fast-fast)   0.84                     0
Weak corner      SS (slow-slow)   0.76                     125
SF corner        SF (slow-fast)   0.8                      80
FS corner        FS (fast-slow)   0.8                      80
Figure 4.14 shows the CTLE magnitude response for the PVT optimized and un-optimized cases. In the un-optimized case, the default tunability setting (the setting for the nominal process) is used for the other PVT corners. With an optimally tuned configuration for each PVT corner, however, the magnitude response matches the desired response (the nominal process response) up to ∼ 10 GHz. Beyond this frequency, the circuit bandwidth is limited by the process.
In the SF and FS corners, the model file includes slow-fast variations for the transistors, whereas the behaviour of the passive components is the same as in the typical process. As shown in Figures 4.14, 4.15, 4.18 and 4.8, the SF and FS corners show results similar to the nominal corner. This signifies that the biasing of the circuit is proper even for skewed NMOS and PMOS strengths.
Figure 4.15 shows the CTLE pulse response for the PVT optimized and un-optimized cases. In the Weak corner, the long-tail ISI is reduced by the optimal tunability setting. In the Strong corner, the amount of ringing is reduced by PVT optimization.
Figures 4.16 and 4.17 show the improvement in the eye opening due to optimized circuit settings for the Strong and Weak corners respectively. Figure 4.18 shows the eye diagrams for the SF and FS corners, which are similar to the one in the nominal corner shown in Figure 4.8.
[Figure 4.14 plots: (a) CTLE magnitude response for unoptimized circuit, (b) CTLE magnitude response for optimally tuned circuit. |H(f)| in dB (-10 to 10) versus frequency in Hz (log scale, 10^7 to 10^10). Curves in each panel: Nominal corner, Strong corner, Weak corner, SF corner, FS corner.]
Figure 4.14: CTLE magnitude response across PVT corners for: (a) un-optimized circuit (b) optimally-tuned circuit.
[Figure 4.15 plots: (a) CTLE pulse response for unoptimized circuit, (b) CTLE pulse response for optimally tuned circuit. Voltage in V (-0.02 to 0.14) versus time in seconds (7.15 to 7.3 × 10^-9). Curves in each panel: Nominal corner, Strong corner, Weak corner, SF corner, FS corner.]
Figure 4.15: CTLE pulse response across PVT corners for: (a) un-optimized circuit (b) optimally-tuned circuit.
Figure 4.16: Eye diagram at the CTLE output for Strong corner for 50 Gbps NRZ data rate for: (a) un-optimized circuit (Pulse PMR=2.7) (b) optimally-tuned circuit (Pulse PMR=2.82).
Figure 4.17: Eye diagram at the CTLE output for Weak corner for 50 Gbps NRZ data rate for: (a) un-optimized circuit (Pulse PMR=3.21) (b) optimally-tuned circuit (Pulse PMR=2.97).
Figure 4.18: Eye diagram at the CTLE output for 50 Gbps NRZ data rate for: (a) SF corner (Pulse PMR=2.75) (b) FS corner (Pulse PMR=3).
4.7. Common-mode bias stability
The feedback loop in the CTLE architecture ensures that the common-mode voltage at each node is biased to ∼ half of the supply voltage for the FinFET process. This loop is stable with a minimum phase margin of ≈ 65° and a gain margin of ∼ 25 dB across PVT. The transient common-mode voltage at each CTLE stage for a 112 Gbps PAM-4 PRBS data pattern is shown in Figure 4.19. The variation in the common-mode voltage at the HF-CTLE and MF-CTLE stages is very small. The buffer-stage output common-mode voltage is also stable but varies by a tolerable amount.
[Figure 4.19 plot: common-mode voltage in mV (390 to 414) versus time in us (0 to 1.5). Curves: HF-CTLE output common mode voltage, MF-CTLE output common mode voltage, Buffer stage output common mode voltage.]
Figure 4.19: Common mode voltage at each stage of CTLE for 112 Gbps PAM-4 PRBS data transitions
4.8. Common-mode frequency response and PSRR
Figure 4.20 shows the common-mode frequency response of the analog front end. The plot shows that the closed loop system is able to reject common-mode noise up to a frequency of ∼ 1 MHz. Figure 4.21 shows the impact of supply noise on the common-mode output of each CTLE stage. The closed loop system is able to reject supply noise up to ∼ 100 MHz. Figure 4.22 shows the impact of supply noise on the differential output of each CTLE stage. When mismatch is not considered, the system rejects supply noise to a large extent even in the GHz frequency range.
[Figure 4.20 plot: common-mode frequency response, |H| in dB (-70 to 30) versus frequency in Hz (log scale, 10^2 to 10^10). Curves: Channel output CM, CTLE input CM, HF-CTLE output CM, MF-CTLE output CM, Buffer stage output CM.]
Figure 4.20: Common-mode frequency response of the analog front-end for 0 dB input common-mode AC signal at channel output.
[Figure 4.21 plot: PSRR for output common-mode voltage, |H| in dB (-40 to 20) versus frequency in Hz (log scale, 10^6 to 10^10). Curves: HF-CTLE output CM, MF-CTLE output CM, Buffer stage output CM.]
Figure 4.21: Transfer function of supply noise from supply to output common-mode voltage at each CTLE stage
[Figure 4.22 plot: PSRR for output differential voltage, |H| in dB (-80 to -10) versus frequency in Hz (log scale, 10^6 to 10^10). Curves: HF-CTLE output Vdiff., MF-CTLE output Vdiff., Buffer stage output Vdiff.]
Figure 4.22: Transfer function of supply noise from supply to output differential voltage at each CTLE stage
4.9. Thermal noise effect on the eye diagram
Figure 4.23 shows the eye diagram at the CTLE output for 50 Gbps NRZ data rate for 2k PRBS (PN-32) data symbols. Figure 4.24 shows very small degradation in the eye when thermal noise is enabled during transient simulation. However, the AC noise simulation prediction of 2 mV integrated output thermal noise seems a bit pessimistic.
Figure 4.23: CTLE eye diagram for 50 Gbps NRZ data rate (without thermal noise)
Figure 4.24: CTLE eye diagram for 50 Gbps NRZ data rate (with thermal noise)
4.10. Power breakdown
The inverter based CTLE consumes a total of 10 mW power. Figure 4.25 shows the
percentage of power consumption for each block in the CTLE.
[Figure 4.25 pie chart. Blocks: HF-CTLE, MF-CTLE, BUFFER, CMFB. Slice values: 40.2%, 29.8%, 26.6%, 3.4%.]
Figure 4.25: CTLE power breakdown
4.11. Comparison of inverter based CTLE with other recent works
Table 4.2 compares this work with other state of the art CTLEs published in 16 nm FinFET technology. Due to the lack of information about circuit specifications (especially load capacitance and signal swings) in the publications, it is challenging to make a fair comparison with other state of the art designs.
The CTLE presented in this work provides an overall gain of 17 dB, which is ∼ 40 % higher than the design presented in [19]. The capacitive load considered in this design is 100 fF, which is ∼ 3.2 times the load considered in [19]. The power consumed by this CTLE is 10 mW, which is higher than that in [19] but lower than the design presented in [34].
Table 4.2: Comparison table for inverter based CTLE
Parameter                       This work      [19]                [28]           [29]            [30]
Technology                      16 nm FinFET   16 nm FinFET        16 nm FinFET   16 nm FinFET    16 nm FinFET
Supply voltage                  0.8 V          1.2 V + Ground LDO  1.2 V          1.2 V           1.2 V
CTLE type                       Inverter       Inverter            Inverter       CML             CML
Modulation                      PAM-4          PAM-2               PAM-4          PAM-4           PAM-2
Nyquist frequency               28 GHz         28 GHz              14 GHz         14 GHz          28 GHz
DC gain / Nyquist gain          -9 dB / 8 dB   -6 dB / 6 dB        -              0 dB / 7 dB     -6 dB / 6 dB
Channel loss at Nyquist         30 dB *        8 dB                35 dB *        31 dB *         8 dB
Load capacitance                100 fF         30 fF               -              -               -
Power                           10 mW          6 mW                34 mW          8.4 mW          6 mW
Power / (Frequency × Load-cap)  3.57 V²        7.14 V²             -              -               -
Area                            90 µm x 80 µm  20 µm x 15 µm       50 µm x 85 µm  125 µm x 40 µm  80 µm x 50 µm
* CTLE provides fraction of total equalization requirement
- Information not reported
The power consumption of a CTLE depends strongly on the Nyquist frequency of operation
and the load capacitance. So, to make a fair comparison, the parameter
Power / (Frequency × Load-cap) is used as a figure of merit; a lower value implies a more
power efficient architecture. On the basis of this parameter, this CTLE outperforms the
design in [19] in terms of power efficiency.
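As a quick check of the figure of merit, the sketch below recomputes
Power / (Frequency × Load-cap) for the two designs in Table 4.2 that report a load
capacitance. The function name `figure_of_merit` and the dictionary layout are
illustrative choices, not part of the thesis; the input numbers are the Table 4.2 entries
for this work and [19].

```python
def figure_of_merit(power_w, nyquist_hz, load_cap_f):
    """Power / (Nyquist frequency x load capacitance); W/(Hz*F) reduces to V^2."""
    return power_w / (nyquist_hz * load_cap_f)


# Values copied from Table 4.2 (only these two designs report a load capacitance).
designs = {
    "This work": {"power_w": 10e-3, "nyquist_hz": 28e9, "load_cap_f": 100e-15},
    "[19]": {"power_w": 6e-3, "nyquist_hz": 28e9, "load_cap_f": 30e-15},
}

fom = {name: figure_of_merit(**params) for name, params in designs.items()}

for name, value in fom.items():
    print(f"{name}: {value:.2f} V^2")
```

With these inputs the result is about 3.57 V² for this work and 7.14 V² for [19],
matching the values listed in Table 4.2; the lower value indicates the more power
efficient design under this metric.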
Moreover, this CTLE uses a high pass filter before the CMOS amplifier in the HF-CTLE
stage, which attenuates low-frequency signals and thus reduces the signal swing before
amplification. The architecture in [19], in contrast, uses a high pass filter after the
amplifier. My understanding is that this increases the signal swing and makes the
architecture in [19] difficult to use for PAM-4 applications or to meet linearity
requirements, and so a higher supply voltage of 1.2 V is used. The same first author
proposes a separate architecture for PAM-4 applications [28] to meet linearity
requirements, but with a different load (not quantified in fF). The power consumption
increases considerably in [28] because of its power hungry architecture. This work
supports a 1 Vpp differential input signal swing, while [19] supports a 600 mVpp
differential input signal swing. However, linearity is not quantified in the state of the
art designs, so a fair comparison is not possible.
[29] and [30] use CML based CTLE architectures with passive inductors and thus consume a
larger area than [19] and [28]. Their load capacitance is not reported, so a comparison
is again difficult.
This design is able to operate at a low supply voltage of 0.8 V, driving a load
capacitance of 100 fF with PVT tracking ability.
This work uses the lowest supply voltage and consumes very low power for a load
capacitance of 100 fF. Moreover, it provides relatively high AC gain. The CTLE core in
this work does not have a compact layout and uses 4 passive inductors, and the reported
area also includes the CMFB with its large compensation capacitor. Driving the larger
capacitive load of 100 fF makes it very challenging to meet the bandwidth requirements.
The numbers reported for this work will improve considerably for a smaller load
capacitor, as in [19].
4.12. Conventional CTLE design results
A CML based conventional CTLE was also designed to drive the same capacitive load in
order to compare its power consumption against the inverter based CTLE. The CTLE
architecture (shown in Figure 4.26) consists of three stages, similar to the inverter
based CTLE. The results shown for the conventional CTLE are based on a layout level
extracted netlist at block level; the individual layout blocks have yet to be connected
at top level. This CTLE design uses a higher supply voltage of 1.05 V to ensure adequate
Vds margin to keep the transistors in saturation.
Figure 4.26: Conventional CTLE architecture. (Schematic: three CML stages, HF-CTLE,
MF-CTLE and BUFFER, each with Vdd = 1.05 V; load inductors of 400 pH, 300 pH and 135 pH
and load resistors of 130 Ω, 94 Ω and 40 Ω respectively; input pairs with m = 40, 28 and
64, fins = 4, len = 16nm; current-source devices with fins = 8, len = 36nm; other
annotated values: 1 KΩ, 60 Ω, 100 fF, 6 pF and 250 uA.)
The magnitude response of the conventional CTLE is shown in Figure 4.27. The response is
similar to that of the inverter based CTLE, except that the bandwidth of the conventional
CTLE is better, primarily due to the use of passive inductors. Moreover, this CTLE does
not have any tunability circuitry, which can degrade the circuit bandwidth significantly.
The pulse response of this CTLE is shown in Figure 4.29. The eye diagrams for 64 and
40 Gbps PAM-4 data rates are shown in Figures 4.32 and 4.33 respectively. These eye
diagrams are better than those of the inverter based CTLE, primarily because of the
higher bandwidth. Figure 4.30 shows that the conventional CTLE has less thermal noise
than the inverter based CTLE. Figure 4.31 shows the total harmonic distortion (THD) for
both CTLEs when a pure sinusoidal differential signal is applied at the input of the
analog front-end. The THD is calculated for an input signal frequency of 1 MHz because
the CTLE suffers maximum non-linearity at low frequency (the channel does not attenuate
low frequency signals).
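To illustrate how a THD figure of this kind is obtained, the following sketch applies a
pure sinusoid to a toy soft-limiting stage and computes THD from the harmonic amplitudes
of the output. The tanh model, its gain and saturation values, and all function names are
assumptions made for illustration; they do not represent the extracted CTLE netlists, and
the numbers produced are not the thesis results.

```python
import math


def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic; `samples` must span exactly one fundamental period."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def thd_percent(samples, n_harmonics=9):
    """THD = sqrt(sum of squared harmonic amplitudes 2..N) / fundamental, in percent."""
    fundamental = harmonic_amplitude(samples, 1)
    harmonics = math.sqrt(
        sum(harmonic_amplitude(samples, k) ** 2 for k in range(2, n_harmonics + 1))
    )
    return 100.0 * harmonics / fundamental


def toy_amplifier(v_in, gain=0.35, v_sat=0.4):
    """Hypothetical memoryless soft limiter (tanh); NOT the thesis CTLE."""
    return v_sat * math.tanh(gain * v_in / v_sat)


def thd_for_amplitude(vpp_diff, n=1024):
    """THD of the toy stage for a sinusoid of the given peak-to-peak differential amplitude."""
    peak = vpp_diff / 2.0
    output = [toy_amplifier(peak * math.sin(2 * math.pi * i / n)) for i in range(n)]
    return thd_percent(output)


for vpp in (0.2, 0.6, 1.0):
    print(f"{vpp:.1f} Vpp differential -> THD = {thd_for_amplitude(vpp):.3f} %")
```

For a memoryless model like this the result does not depend on the test frequency; in
the circuit simulation the 1 MHz tone matters because the channel passes it without
attenuation. The sketch only reproduces the qualitative trend that THD grows with input
amplitude.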
Figure 4.27: Magnitude response of the conventional CTLE. (|H(f)| (dB) versus frequency
(Hz), 10^7 to 10^10; curves: HF-CTLE gain, MF-CTLE gain, Buffer gain, Output buffer gain,
CTLE gain.)
Figure 4.28: Magnitude response of the conventional CTLE for linear frequency scale.
(|H(f)| (dB) versus frequency (Hz), 0.5 x 10^10 to 5 x 10^10; same curves as Figure 4.27.)
Figure 4.29: Pulse response of the conventional CTLE. (Voltage (V) versus time (sec);
curves: Channel output, HF-CTLE output, MF-CTLE output, Buffer output, Final CTLE output.)
Figure 4.30: Thermal noise spectral density. (Output and input referred noise for both
CTLEs versus frequency (Hz), 10^7 to 10^10. Inverter based CTLE: total output thermal
noise = 2.02 mVrms, total input referred thermal noise = 1.37 mVrms. Conventional CTLE:
total output thermal noise = 1.64 mVrms, total input referred thermal noise = 0.53 mVrms.)
Figure 4.31: Total harmonic distortion. (THD (%) versus input signal amplitude (Vpp
differential); curves for the HF-CTLE, MF-CTLE and buffer stage outputs of the inverter
based and the conventional CTLE.)
Figure 4.32: Eye diagram at the output of the conventional CTLE for 64 Gbps PAM-4 data
rate.
Figure 4.33: Eye diagram at the output of the conventional CTLE for 40 Gbps PAM-4 data
rate.
Comparison between the inverter based CTLE and the conventional CTLE:
Table 4.3 compares the two CTLEs. To summarise, the inverter based CTLE is better in
terms of supply voltage, power consumption, area, PVT insensitivity and tunability.
However, the conventional CTLE offers better bandwidth and linearity. The bandwidth of
the inverter based CTLE can improve considerably if the load capacitance is decreased or
if the extent of the tunability is reduced so that there are fewer additional parasitics.

Table 4.3: Comparison between the inverter based CTLE and the conventional CTLE
Parameter                Inverter based CTLE                      Conventional CTLE
Supply voltage           0.8 V                                    1.05 V
DC gain / Nyquist gain   -9 dB / 8 dB                             -8 dB / 9 dB
Power                    10 mW                                    17 mW
Area                     Less (4 passive inductors)               More (6 passive inductors)
PVT sensitivity          less (gain levels of HF-CTLE and         more
                         MF-CTLE are PVT invariant.
                         Tunability can track remaining
                         PVT variations)
Tunability               power-scalable, mainly due to            may have some tunability
                         tristate inverters
Linearity                less (as inferred from PAM-4             better (as inferred from PAM-4
                         eye-diagrams)                            eye-diagrams because it uses higher
                                                                  supply voltage too)
Bandwidth                less (Active inductors are difficult     more (because of passive inductors)
                         to realize at high data rate
                         especially when driving high
                         capacitive load. Also additional
                         parasitics due to tunability degrade
                         the bandwidth a lot)
Integrated output        2 mV                                     1.4 mV
thermal noise
5 Conclusion
5.1. Summary
This thesis presents the design of a low power analog front end for a 112 Gbps PAM-4 data
rate for LR SERDES applications. It is implemented in 16 nm CMOS FinFET technology, and
the results shown are based on a layout level extracted netlist. The design uses passive
equalization techniques in the front-end termination network to overcome the signal
attenuation caused by the channel, ESD cells, bump pads and other parasitics. The CTLE
design exploits the power advantage of a CMOS inverter based amplifier over the
conventional CML based differential amplifier. With the CMOS inverter as its basic design
element, this CTLE provides 17 dB of equalization while consuming 10 mW of power from a
0.8 V supply voltage, which is significantly lower than that of previous state of the art
CTLE designs. Inverter based active inductors are used instead of coil based passive
inductors in some stages to reduce the area. A common-mode feedback loop ensures robust,
low-power biasing for the amplifiers. The tunability offered in this design makes it
power-scalable and also helps to track the PVT variations.
5.2. Future work
This design will be fabricated in 16 nm FinFET technology to test its performance. The
CTLE designed in this work is one part of the SERDES transceiver chain. To fully equalize
the channel and observe the improvement in the overall system efficiency, the remaining
equalization blocks need to be included, which is possible by having an ADC and DSP after
the CTLE.
Another inverter based CTLE architecture can be explored in which the MF-CTLE stage acts
as a parallel path instead of a cascade after the HF-CTLE, to see if that provides better
bandwidth and power efficiency. My expectation is that it will have a higher loop
bandwidth for the CMFB, which is better for PSRR. At one point during the design of this
CTLE, there was an attempt to use varactors to tune the zero location of the CTLE, but
they could not be used because of a discrepancy in the model file. I believe varactors
can be a better choice than a capacitor with a series transistor switch for tuning the
circuit, as they do not add the switch parasitics.
Driving a large capacitance of 100 fF limits the performance of the CTLE and poses
several design challenges in this work. If the input load capacitance of the following
ADC block can be reduced, it is possible to significantly improve the performance of the
proposed CTLE design (eliminating the need for some passive inductors, enabling better
active inductor implementations and giving a more power-efficient design with better
bandwidth), without the need to look for other CTLE architectures.

Appendix
Tunability information in detail
Figure 5.1: CTLE architecture with tunability knobs information. (Schematic: front-end
termination followed by the HF-CTLE, MF-CTLE and BUFFER stages, with the CMFB; annotated
knobs: trim_hf_filter, trim_hf, trim_bias_hf, trim_res_hf, trim_dc, trim_mf, trim_bias_mf,
trim_mfcap, trim_buf1, trim_bias_buf1, trim_res_buf1 and trim_buf; annotated component
values include 65 Ω resistors and 400 fF capacitors.)
Table 5.1: Tunability knobs information
Knob             No. of bits   Comments
Trim_hf-filter   5             controls zero location of HF-CTLE
Trim_hf          4             controls gm of the amplifier
Trim_bias-hf     3             controls active load resistance
Trim_res-hf      3             tunes active inductance
Trim_dc          3             controls gm of the amplifier
Trim_buf1        3             controls gm of the amplifier
Trim_bias-buf1   3             controls active load resistance
Trim_res-buf1    3             tunes active inductance
Trim_mf          3             controls gm of the amplifier
Trim_bias-mf     5             primarily controls zero location of MF-CTLE
Trim_mf-cap      3             controls zero location of MF-CTLE
Trim_buf         4             controls gm of the amplifier
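The knob widths in Table 5.1 can be collected into a small configuration sketch. The bit
widths below are taken from Table 5.1 and the knob names follow the labels in Figure 5.1;
the packing of all knobs into a single control word, its bit ordering, and the function
names are hypothetical, since the thesis does not describe the digital control interface.

```python
# Bit widths from Table 5.1; names follow the Figure 5.1 labels.
TRIM_BITS = {
    "trim_hf_filter": 5,
    "trim_hf": 4,
    "trim_bias_hf": 3,
    "trim_res_hf": 3,
    "trim_dc": 3,
    "trim_buf1": 3,
    "trim_bias_buf1": 3,
    "trim_res_buf1": 3,
    "trim_mf": 3,
    "trim_bias_mf": 5,
    "trim_mfcap": 3,
    "trim_buf": 4,
}


def pack_trim_word(settings):
    """Pack per-knob codes into one integer, first knob in the LSBs (hypothetical layout)."""
    word, shift = 0, 0
    for name, bits in TRIM_BITS.items():
        code = settings.get(name, 0)  # knobs not listed default to code 0
        if not 0 <= code < (1 << bits):
            raise ValueError(f"{name}: code {code} does not fit in {bits} bits")
        word |= code << shift
        shift += bits
    return word


def unpack_trim_word(word):
    """Inverse of pack_trim_word: recover the per-knob codes from the packed integer."""
    settings, shift = {}, 0
    for name, bits in TRIM_BITS.items():
        settings[name] = (word >> shift) & ((1 << bits) - 1)
        shift += bits
    return settings


total_bits = sum(TRIM_BITS.values())
print(f"{len(TRIM_BITS)} knobs, {total_bits} control bits in total")
```

Each n-bit knob offers 2^n settings, so the 3-bit, 4-bit and 5-bit knobs give 8, 16 and
32 codes respectively.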
Figure 5.2: Trim_hf-filter effect on the CTLE transfer function. (|H(f)| (dB) versus
frequency (Hz), 10^7 to 10^10.)
Figure 5.3: Trim_hf effect on the CTLE transfer function. (|H(f)| (dB) versus frequency
(Hz); curves for m = 3 to 33 in steps of 2.)
Figure 5.4: Trim_bias-hf effect on the CTLE transfer function. (|H(f)| (dB) versus
frequency (Hz); curves for m = 4 to 11.)
Figure 5.5: Trim_res-hf effect on the CTLE transfer function. (|H(f)| (dB) versus
frequency (Hz), 10^7 to 10^10.)
Figure 5.6: Trim_dc effect on the CTLE transfer function. (|H(f)| (dB) versus frequency
(Hz); curves for m = 1 to 7.)
Figure 5.7: Trim_buf1 effect on the CTLE transfer function. (|H(f)| (dB) versus frequency
(Hz); curves for m = 2 to 14 in steps of 2.)
Figure 5.8: Trim_bias-buf1 effect on the CTLE transfer function. (|H(f)| (dB) versus
frequency (Hz); curves for m = 4 to 11.)
Figure 5.9: Trim_res-buf1 effect on the CTLE transfer function. (|H(f)| (dB) versus
frequency (Hz), 10^7 to 10^10.)
Figure 5.10: Trim_mf effect on the CTLE transfer function. (|H(f)| (dB) versus frequency
(Hz); curves for m = 0 to 7.)
Figure 5.11: Trim_bias-mf effect on the CTLE transfer function. (|H(f)| (dB) versus
frequency (Hz), 10^7 to 10^10.)
Figure 5.12: Trim_mf-cap effect on the CTLE transfer function. (|H(f)| (dB) versus
frequency (Hz); curves include Cz = 200 fF and Cz = 400 fF.)
Figure 5.13: Trim_buf effect on the CTLE transfer function. (|H(f)| (dB) versus frequency
(Hz); curves for m = 3 to 33 in steps of 2.)

Impact of MF-CTLE on the eye-diagram
Figure 5.14: Eye diagram at the output of HF-CTLE (inverter based) for 40 Gbps PAM-4
data rate.
Figure 5.15: Eye diagram at the output of MF-CTLE (inverter based) for 40 Gbps PAM-4
data rate.
Figure 5.16: Eye diagram at the output of complete CTLE (inverter based) for 40 Gbps
PAM-4 data rate.
References
[1] Cisco. Cisco annual internet report (2018–2023). White Paper, 2020.
[2] Anne Holst. Forecast global data center ip traffic 2013-2021. Statistica, 2020.
[3] N Jones. How to stop data centres from gobbling up the world’s electricity. Nature,
2018.
[4] OIF. 112 gbps electrical interfaces, an oif update on cei-112g. Panel Session, 2020.
[5] D. C. Daly, L. C. Fujino, and K. C. Smith. Through the looking glass-2020 edi-
tion: Trends in solid-state circuits from isscc. IEEE Solid-State Circuits Magazine,
12(1):8–24, 2020.
[6] B. Min and S. Palermo. A 20gb/s triple-mode (pam-2, pam-4, and duobinary)
transmitter. In 2011 IEEE 54th International Midwest Symposium on Circuits and
Systems (MWSCAS), pages 1–4, 2011.
[7] Intel. http://www.ieee802.org/3/ck/public/tools/index.html. Backplane model file,
2020.
[8] Y. Krupnik, Y. Perelman, I. Levin, Y. Sanhedrai, R. Eitan, A. Khairi, Y. Shifman,
Y. Landau, U. Virobnik, N. Dolev, A. Meisler, and A. Cohen. 112-gb/s pam4 adc-
based serdes receiver with resonant afe for long-reach channels. IEEE Journal of
Solid-State Circuits, 55(4):1077–1085, 2020.
[9] H. Wu, M. Shimanouchi, and M. PengLi. Effective link equalizations for serial links
at 112 gbps and beyond. In 2018 IEEE 27th Conference on Electrical Performance
of Electronic Packaging and Systems (EPEPS), pages 25–27, 2018.
[10] Kevin Zheng. System driven circuit design for adc based wireline data links. PhD
Thesis, 2018.
[11] D. Cui, H. Zhang, N. Huang, A. Nazemi, B. Catli, H. G. Rhew, B. Zhang, A. Mom-
taz, and J. Cao. 3.2 a 320mw 32gb/s 8b adc-based pam-4 analog front-end with
programmable gain control and analog peaking in 28nm cmos. In 2016 IEEE Inter-
national Solid-State Circuits Conference (ISSCC), pages 58–59, 2016.
[12] Yohan Frans. Adc-based wireline transceiver. IEEE Custom Integrated Circuits
Conference (CICC), 2019.
[13] P. Upadhyaya, A. Bekele, D. T. Melek, Haibing Zhao, J. Im, Junho Cho, Kee Hian
Tan, S. McLeod, S. Chen, Wenfeng Zhang, Y. Frans, and K. Chang. A fully-adaptive
wideband 0.5–32.75gb/s fpga transceiver in 16nm finfet cmos technology. In 2016
IEEE Symposium on VLSI Circuits (VLSI-Circuits), pages 1–2, 2016.
[14] L. Sun, Q. Pan, K. Wang, and C. P. Yue. A 26–28-gb/s full-rate clock and data
recovery circuit with embedded equalizer in 65-nm cmos. IEEE Transactions on
Circuits and Systems I: Regular Papers, 61(7):2139–2149, 2014.
[15] E. Depaoli, H. Zhang, M. Mazzini, W. Audoglio, A. A. Rossi, G. Albasini, M. Poz-
zoni, S. Erba, E. Temporiti, and A. Mazzanti. A 64 gb/s low-power transceiver for
short-reach pam-4 electrical links in 28-nm fdsoi cmos. IEEE Journal of Solid-State
Circuits, 54(1):6–17, 2019.
[16] J. E. Proesel and T. O. Dickson. A 20-gb/s, 0.66-pj/bit serial receiver with 2-stage
continuous-time linear equalizer and 1-tap decision feedback equalizer in 45nm soi
cmos. In 2011 Symposium on VLSI Circuits - Digest of Technical Papers, pages
206–207, 2011.
[17] Q. Pan, Y. Wang, Y. Lu, and C. P. Yue. An 18-gb/s fully integrated optical receiver
with adaptive cascaded equalizer. IEEE Journal of Selected Topics in Quantum
Electronics, 22(6):361–369, 2016.
[18] O. E. Mattia, M. Sawaby, K. Zheng, A. Arbabian, and B. Murmann. A 10 gbps
continuous-time linear equalizer for mm-wave dielectric waveguide communication.
IEEE Solid-State Circuits Letters, pages 1–1, 2020.
[19] K. Zheng, Y. Frans, K. Chang, and B. Murmann. A 56 gb/s 6 mw 300 um2 inverter-
based ctle for short-reach pam2 applications in 16 nm cmos. In 2018 IEEE Custom
Integrated Circuits Conference (CICC), pages 1–4, 2018.
[20] M. Pisati, F. De Bernardinis, P. Pascale, C. Nani, N. Ghittori, E. Pozzati, M. Sosio,
M. Garampazzi, A. Milani, A. Minuti, G. Bollati, F. Giunco, R. G. Massolini, and
G. Cesura. A 243-mw 1.25–56-gb/s continuous range pam-4 42.5-db il adc/dac-based
transceiver in 7-nm finfet. IEEE Journal of Solid-State Circuits, 55(1):6–18, 2020.
[21] Thomas H. Lee. The design of cmos radio-frequency integrated circuits. Textbook.
[22] P. W. de Abreu Farias Neto, K. Hearne, I. Chlis, D. Carey, R. Casey, B. Grif-
fin, F. S. F. Ngankem Ngankem, J. Hudner, K. Geary, M. Erett, A. Laraba,
H. Eachempatti, J. W. Kim, H. Zhang, S. Asuncion, and Y. Frans. A 112–134-
gb/s pam4 receiver using a 36-way dual-comparator ti-sar adc in 7-nm finfet. IEEE
Solid-State Circuits Letters, 3:138–141, 2020.
[23] B. Razavi. The bridged t-coil [a circuit for all seasons]. IEEE Solid-State Circuits
Magazine, 7(4):9–13, 2015.

[24] J. Paramesh and D. J. Allstot. Analysis of the bridged t-coil circuit using the extra-
element theorem. IEEE Transactions on Circuits and Systems II: Express Briefs,
53(12):1408–1412, 2006.
[25] Naiwen Zhou, Linghan Wu, Ziqiang Wang, Xuqiang Zheng, Weidong Cao, Chun
Zhang, Fule Li, and Zhihua Wang. A 28-gb/s transmitter with 3-tap ffe and t-coil
enhanced terminal in 65-nm cmos technology. In 2016 14th IEEE International New
Circuits and Systems Conference (NEWCAS), pages 1–4, 2016.
[26] J. Savoj, K. Hsieh, P. Upadhyaya, F. An, J. Im, X. Jiang, J. Kamali, K. W. Lai,
D. Wu, E. Alon, and K. Chang. Design of high-speed wireline transceivers for
backplane communications in 28nm cmos. In Proceedings of the IEEE 2012 Custom
Integrated Circuits Conference, pages 1–4, 2012.
[27] Xilinx. 25g long reach cable link system equalization optimization. DesignCon, 2016.
[28] K. Zheng, Y. Frans, S. L. Ambatipudi, S. Asuncion, H. T. Reddy, K. Chang, and
B. Murmann. An inverter-based analog front end for a 56 gb/s pam4 wireline
transceiver in 16nm cmos. In 2018 IEEE Symposium on VLSI Circuits, pages 269–270,
2018.
[29] Y. Frans, M. Elzeftawi, H. Hedayati, J. Im, V. Kireev, T. Pham, J. Shin, P. Upad-
hyaya, Lei Zhou, S. Asuncion, C. Borrelli, G. Zhang, Hongtao Zhang, and K. Chang.
A 56gb/s pam4 wireline transceiver using a 32-way time-interleaved sar adc in 16nm
finfet. In 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), pages 1–2, 2016.
[30] M. Erett, D. Carey, J. Hudner, R. Casey, K. Geary, P. Neto, M. Raj, S. McLeod,
H. Zhang, A. Roldan, H. Zhao, P. Chiang, H. Zhao, K. Tan, Y. Frans, and K. Chang.
A 126mw 56gb/s nrz wireline transceiver for synchronous short-reach applications
in 16nm finfet. In 2018 IEEE International Solid - State Circuits Conference -
(ISSCC), pages 274–276, 2018.
