2020irds MM
ROADMAP
FOR
DEVICES AND SYSTEMS™
2020 UPDATE
MORE MOORE
THE IRDS IS DEVISED AND INTENDED FOR TECHNOLOGY ASSESSMENT ONLY AND IS WITHOUT REGARD TO ANY
COMMERCIAL CONSIDERATIONS PERTAINING TO INDIVIDUAL PRODUCTS OR EQUIPMENT.
Table of Contents
Acknowledgments
1. Introduction
1.1. Current State of Technology
1.2. Drivers and Technology Targets
2. Summary and Key Points
3. Challenges
3.1. Near-term Challenges
3.2. Long-term Challenges
4. Technology Requirements—Logic Technologies
4.1. Ground Rules Scaling
4.2. Performance Boosters
4.3. Performance-Power-Area (PPA) Scaling
4.4. System-On-Chip (SoC) PPA Metrics
4.5. Interconnect Technology Requirements
4.6. Device Reliability
4.7. 3D Heterogeneous Integration
4.8. Defectivity Requirements
5. Technology Requirements—Memory Technologies
5.1. DRAM
5.2. NVM—Flash
5.3. NVM—Emerging
6. Potential Solutions
7. Cross Teams
8. Conclusions and Recommendations
9. References
List of Figures
Figure MM-1 Big data and instant data
Figure MM-2 Projected scaling of key ground rules
Figure MM-3 Scaling of standard cell height and width through fin depopulation and device stacking
Figure MM-4 Planar to GAA transition [11]
Figure MM-5 Evolution of device architectures in the IRDS More Moore roadmap
Figure MM-6 Scaling trend of device S/D access resistance (Rsd) and k-value of device spacer [4]
Figure MM-7 NAND2-equivalent standard cell count (left) and 111-bitcell (right) scaling in an 80mm2 die
Figure MM-8 Number of CPU and GPU cores in an 80mm2 die
Figure MM-9 CPU clock frequency and power@iso-frequency (ref: 2020) scaling
Figure MM-10 Scaling projection of computation throughput of CPU cores at the maximum clock frequency and at thermally-constrained average frequency
Figure MM-11 Degradation paths in low-κ damascene structure
Figure MM-12 (left) A 3D NAND array based on a vertical channel architecture. (right) BiCS (bit cost scalable) – a 3D NAND structure using a punch and plug process [41]
Figure MM-13 Schematic view of (a) 3D cross-point architecture using a vertical RRAM cell and (b) a vertical MOSFET transistor as the bit-line selector to enable the random access capability of individual cells in the array [52]
List of Tables
Table MM-1 More Moore—Logic Core Device Technology Roadmap
Table MM-2 More Moore—DRAM Technology Roadmap
Table MM-3 More Moore—Flash Technology Roadmap
Table MM-4 More Moore—NVM Technology Roadmap
Table MM-5 Difficult Challenges—Near-term
Table MM-6 Difficult Challenges—Long-term
Table MM-7 Device, PPA, and Ground Rules Roadmap for Logic Devices
Table MM-8 Device Roadmap and Technology Anchors for More Moore Scaling
Table MM-9 Projected Electrical Specifications of Logic Core Device
Table MM-10 Projected Performance-Power-Area (PPA) Metrics
Table MM-11 Power and Performance Scaling of SoC
Table MM-12 Interconnect Difficult Challenges
Table MM-13 Interconnect Roadmap for Scaling
Table MM-14 Device Reliability Difficult Challenges
Table MM-15 Defectivity (D0) Requirements of an 80mm2 Die
Table MM-16 Potential Solutions—Near-term
Table MM-17 Potential Solutions—Long-term
ACKNOWLEDGMENTS
MORE MOORE TEAM
U.S.A.
Anshul A. Vyas, Applied Materials
Arvind Kumar, IBM
Bhagawan Sahu, Global Foundries
Charles Kin P. Cheung, NIST
Chorng-Ping Chang, AMAT
Christopher Henderson, Semitracks
Gennadi Bersuker, Aerospace Corporation
Gerhard Klimeck, Purdue Univ.
Huiming Bu, IBM
James Stathis, IBM
Jim Fonseca, Purdue Univ.
Joe Brewer, Univ. Florida
Joel Barnett, TEL
Kirk Prall, Micron
Kwok Ng, SRC
Lars Liebmann, TEL
Masako Kodera, Mosos Lake Industries
Mehdi Salmani, Boston Consulting Group
Philip Wong, Stanford Univ., TSMC
Prasad Sarangapani, Purdue Univ.
Qi Xiang, Xilinx
Rich Liu, Macronix
SangBum Kim, IBM
Saumitra Mehrotra, NXP
Saurabh Sinha, ARM
Sergei Drizlikh, Samsung
Siddharth Potbhare, NIST
SungGeun Kim, Microsoft
Takeshi Nogami, IBM
Wilman Tsai, Stanford Univ.
Witek Maszara, V-tek Consulting
Yanzhong Xu, Microsoft
ASIA
Atsushi Hori, Tokyo Inst of Technology
Digh Hisamoto, Hitachi
Hajime Nakabasyashi, TEL
Hitoshi Wakabayashi, Tokyo Inst of Technology
Jiro Ida, Kanazawa IT
Kazuyuki Tomida, Sony
Kunihiko Iwamoro, ROHM
Kuniyuki Kakushima, Tokyo Inst of Technology
Masahiko Ikeno, Hitachi High-Tech
Masami Hane, Renesas
Shinichi Ogawa, AIST
Shinichi Takagi, University of Tokyo
Takashi Matsukawa, AIST
Tesuo Endo, Tohoku University
Tetsu Tanaka, Tohoku University
Toshiro Hiramoto, University of Tokyo
Yasuo Kunii, Analysis Atelier
Yasushi Akasaka, TEL
Yoshihiro Hayashi, Keio University
Yuzo Fukuzaki, Tech Insights
Jongwoo Park, Samsung
Moon-Young Jeong, Samsung
Cheng-tzung Tsai, UMC
Geoffrey Yeap, TSMC
EUROPE
Christiane Le Tiec, MKS Instruments
Francis Balestra, IMEP Grenoble
Fred Kuper, NXP
Gerben Doornbos, TSMC
Herve Jaouen, ST
Jurgen Lorenz, Fraunhofer IISB
Kristin DeMeyer, IMEC
Laurent Le-Pailleur, ST
Malgorzata Jurczak, LAM Research
Mark van Dal, TSMC
Matthias Passlack, TSMC
Mustafa Badaroglu (chair), Huawei Technologies
Olivier Faynot, LETI
Paul Mertens, IMEC
Peter Ramm, Fraunhofer EMFT
Robert Lander, NXP
Thierry Poiroux, LETI
Yannick Le Tiec, LETI
MORE MOORE
1. INTRODUCTION
System scaling enabled by Moore’s scaling is increasingly challenged by the scarcity of resources such as power and
interconnect bandwidth. The challenge has grown under the requirement of seamless interaction between big
data and instant data (Figure MM-1). Instant data generation requires ultra-low-power devices with an “always-on” feature
alongside high-performance devices that can generate the data instantly. Big data requires abundant computing,
communication bandwidth, and memory resources to generate the service and information that clients need.
The More Moore International Focus Team (IFT) of the International Roadmap for Devices and Systems (IRDS) provides
physical, electrical, and reliability requirements for logic and memory technologies to sustain power, performance, area,
cost (PPAC) scaling for big data, mobility, and cloud (e.g., Internet-of-Things (IoT) and server) applications. This is done
over a time horizon of 15 years for mainstream/high-volume manufacturing (HVM).
• 3D integration
• Memory technologies
• DRAM technologies
• Flash technologies
• Emerging non-volatile-memory (NVM) technologies
More Moore targets bringing PPAC value for node scaling every 2−3 years [2]:
• (P)erformance: >15% more operating frequency at scaled supply voltage
• (P)ower: >30% less energy per switching at a given performance
• (A)rea: >30% less chip area footprint
• (C)ost: <30% more wafer cost – 15% less die cost for scaled die.
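The relation between the wafer-cost, area, and die-cost targets above can be checked with a small arithmetic sketch. It assumes, purely for illustration, that yield is unchanged and that the number of dies per wafer is inversely proportional to die area; neither assumption is stated in the roadmap, and the function names are hypothetical.

```python
def die_cost_ratio(wafer_cost_ratio: float, area_ratio: float) -> float:
    """Relative die cost after one node step.

    Assumes (for illustration only) equal yield and a die count per wafer
    inversely proportional to die area, so cost per die scales as
    wafer cost times die area.
    """
    return wafer_cost_ratio * area_ratio


def required_area_ratio(target_die_cost_ratio: float, wafer_cost_ratio: float) -> float:
    """Area ratio needed to hit a die-cost target under the same assumptions."""
    return target_die_cost_ratio / wafer_cost_ratio


# Boundary values of the PPAC targets: 30% more wafer cost, 30% less area.
boundary = die_cost_ratio(1.30, 0.70)
# Area ratio needed for 15% less die cost if wafer cost rises by the full 30%.
needed = required_area_ratio(0.85, 1.30)
print(f"die cost at boundary: {boundary:.2f}x, area ratio needed: {needed:.3f}x")
```

Under these simplified assumptions, a 30% wafer-cost increase combined with exactly 30% area reduction yields only about 9% lower die cost, so reaching 15% requires either more area scaling or a smaller wafer-cost increase.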
These scaling targets have driven the industry toward a number of major technological innovations, including material and
process changes such as high-κ gate dielectrics and strain enhancement and, in the near future, new structures such as
gate-all-around (GAA), alternate high-mobility channel materials, and new 3D integration schemes allowing heterogeneous
stacking/integration. These innovations will be introduced at a rapid pace, so timely understanding, modeling, and
implementation into manufacturing are crucial for the industry.
It is important to note that both the cost metric (15% less die cost) and a market cadence that demands new products every
year are becoming more important targets in the mobile industry. Because these applications strictly require that all figures
of merit (FoMs) be met concurrently, it is necessary to advance an effective list of process technologies that sustain certain
device architectures to their limits, such as pushing the finFET architecture for the next five years. This approach will also
help contain cost at reduced risk while moving from one logic generation to the next. Doing so becomes more difficult
as wafer processing grows more expensive with the increased number of steps required by multiple-patterning
lithography. Nevertheless, the cost must be reduced by more than 15% for the same number of transistors,
which can only be achieved through pitch scaling enabled by new advancements in channel material, device architecture,
contact engineering, and device isolation. Increased process complexity must also be taken into account for the overall die yield.
To compensate for the cost of complexity, an acceleration in design efficiency is needed to further scale the area and reach
the die-cost scaling targets. These design-induced scaling factors were also observed in the earlier work of the System
Drivers Technology Workgroup of ITRS, where they were used as calibration factors to match the area scaling trends of the
industry [2]. The design scaling factor is now considered one of the key elements in this edition of the More Moore
technology roadmap.
solutions. In addition to resistance scalability, time-dependent dielectric breakdown (TDDB) limits the minimum space
between adjacent lines for a given low-κ dielectric, forcing a slow-down in permittivity (κ-value) scaling.
• Performance across six nodes spanning from 2020 to 2034 is forecast to degrade by 1% node-to-node on average
for wire-loaded datapaths. For the nodes before 2025, it is forecast to improve by 6% node-to-node on average.
• System-on-chip (SoC) level area across six nodes spanning from 2020 to 2034 is forecast to improve by 28%
node-to-node on average. For the nodes before 2025, it is forecast to improve by 20% node-to-node on average.
• Clocking frequency at nominal supply voltage is forecast to improve from 3.1 GHz in 2020 to 3.5 GHz in
2025, and then to fall to 2.9 GHz at the end of this roadmap edition’s timeframe (in 2034). This limited scaling is
because of increasing parasitics, particularly interconnect resistance, and limited gate drive (Vgs-Vt) as a result of
supply voltage scaling. Power density poses a significant challenge for scaling, particularly as a result of 3D integration
after 2031. If the same chip is operated at constant power density and at the target supply voltage across nodes,
the average clock frequency will stall at 0.8 GHz in 2034. Therefore, thermal considerations must be factored into
devices and architectures.
• Energy per switching reduction is expected to be limited, to about 11% on a node-to-node basis on average.
This is a critical challenge of scaling because of a slow-down in capacitance and supply voltage reduction.
• DRAM must maintain sufficient storage capacitance and adequate cell transistor performance to preserve its
retention time characteristic in the future. If the efficiency of cost scaling becomes poor in comparison with
introducing a new technology, DRAM scaling will stop and a 3D cell stacking structure will be adopted.
Alternatively, a new DRAM concept could be adopted.
• 2D Flash memory density cannot be increased indefinitely by continued scaling of charge-based devices because
of controllability limits of threshold voltage distribution. Flash density increase will continue by stacking memory
layers vertically, leading to adoption of 3D Flash technology. Decrease in array efficiency due to increased
interconnection and yield loss from complex processing are challenges for further reducing the cost-per-bit benefit.
Currently, 96 layers are already in volume production, and there is optimism that 128 layers are achievable, with
192 and 256 layers possible.
• Ferroelectric RAM (FeRAM) is a fast, low power, and low voltage non-volatile memory (NVM) alternative and
thus is suitable for radio frequency identification (RFID), smart card, ID card, and other embedded applications.
Processing difficulty limits its wider adoption. Recently, HfO2-based ferroelectric field-effect transistor (FET), for
which the ferroelectricity serves to change the threshold voltage (Vt) of the FET and thus can form a 1T cell similar
to Flash, has been proposed. If developed to maturity, this may serve as a low-power and very fast, Flash-like
memory.
• The prospect of spin-transfer-torque magnetic RAM (STT-MRAM) replacing stand-alone NAND Flash seems remote.
However, its SRAM-like performance and much smaller footprint than the conventional 6T-SRAM have attracted
much interest as an SRAM alternative, especially in mobile devices that do not require high cycling endurance.
Therefore, STT-MRAM is now mostly considered not as a standalone memory but as an embedded memory.
STT-MRAM would also be a potential solution for embedded Flash (NOR) replacement. This may be particularly
interesting for low-power IoT applications. On the other hand, for other embedded systems applications using
higher memory density, NOR Flash is expected to continue to dominate, since it is still substantially more
cost-effective and is well established as able to endure the printed circuit board (PCB) soldering process (at
~250°C) without losing its preloaded code.
• 3D crosspoint memory has been demonstrated for storage class memory (SCM) to improve I/O throughput and
reduce power and cost. Since the memory, including the selector device, is completely fabricated in the
back-end-of-line (BEOL) process, it is relatively inexpensive to stack multiple layers to reduce bit cost.
• High-density resistive RAM (ReRAM) development has been limited by the lack of a good selector device,
since simple diodes have limited operation ranges. Recent advances in 3D crosspoint memory, however, seem to
have solved this bottleneck, and ReRAM could make rapid progress if other technical issues, such as erratic bits,
are solved.
The links to the tables of technology roadmaps for Logic Core Device, DRAM, Flash, and NVM are below:
Table MM-1 More Moore—Logic Core Device Technology Roadmap
Table MM-2 More Moore—DRAM Technology Roadmap
3. CHALLENGES
The goal of the semiconductor industry is to continue scaling the technology in overall performance at reduced
power and cost. The performance of the components and the final chip can be measured in many different ways: higher
speed, higher density, lower power, more functionality, etc. Traditionally, dimensional scaling was adequate to deliver
these performance merits, but this is no longer the case. Processing modules, tools, material properties,
etc., present difficult challenges to continued scaling. We have identified these difficult challenges and summarized them
in Table MM-5 and Table MM-6. The challenges are divided into near-term, 2020-2025 (Table MM-5), and long-term,
2026-2034 (Table MM-6).
3.1. NEAR-TERM CHALLENGES
Table MM-5 Difficult Challenges—Near-term
Near-Term Challenges: 2020-2025
Challenge: Power scaling
Description: Voltage and capacitance scaling slow down and lack of solutions for power reduction. Introduction of gate-all-around (GAA) devices is a remedy to reduce the supply voltage, but not in a sustained manner that allows continuous scaling. Power scaling is also limited because of slow-down of loading capacitance scaling. This loading capacitance is becoming increasingly impacted by the parasitic components of the device with continuous scaling of ground rules. Therefore, an introduction of low-κ materials, design-technology-co-optimization (DTCO) introducing new contact access schemes, as well as local interconnect schemes that allow lower parasitics, is needed.
Challenge: Integration enablement for SRAM-cache applications
Description: Bitcell scaling is slowing down because of the slow-down of the device (e.g., fin) pitch and gate pitch (i.e., contacted poly pitch (CPP)). New device schemes such as P-over-N stacked device or vertical devices bring an opportunity to significantly reduce the SRAM area. This is enabled because of optimized layouts that eliminate the critical design rules impacting the area. Option of embedded NVM in high-performance logic. Being able to integrate most of emerging memories (e.g., MRAM) at the interconnect stack also bring an opportunity for high-density memories. However, the stack as well as the materials should be compatible with BEOL.
there is a need for new barrier materials and Cu alternative solutions. In addition to resistance
scalability, TDDB is putting a limit on the minimum space between the adjacent lines for a given
low-κ dielectric.
Challenge: Power scaling
Description: Power scaling: no solutions are left besides steep-subthreshold (SS) devices to enable complementary SoC functions bringing power reduction but replacing mainstream CMOS. However, most of steep-SS device candidates do not bring an adequate performance comparable to CMOS at nominal supply voltages. In order to maximize the performance of steep-SS device, new architectures are necessary to attain the performance through parallelization.
Challenge: Use cases of vertical device structures
Description: Performance scaling and functional diversification with vertical devices and new architectures. Using vertical devices at conventional logic and architectures will raise routing congestion and increased parasitics. There is a need for new logic schemes and architectures that maximize the advantage of 3D capability.
Challenge: Thermal issue due to increased power density
Description: Thermal challenges (e.g., power density and dark silicon) of 3D stacking. Gate-all-around (GAA) devices have limited heat conductance due to confined architecture. Increased pin density due to aggressive standard cell height scaling and increased drive by stacked devices put a significant pressure on the power density.
Challenge: Integration of non-Cu metallization to replace Cu
Description: Adoption of non-Cu interconnects for low-resistance, meeting EM/TDDB, and temperature budget compatibility with devices used in 3D integration.
remedy to pattern-tight ground rules in fewer process steps. The projected roadmap of ground rules as well as device
architectures is shown in Table MM-7. The evolution of ground rules is shown in Figure MM-2. There is not yet a consensus
on node naming across different foundries and integrated device manufacturers (IDMs); however, the projected rules give
an indication of technology capabilities in line with the PPAC requirements. Key parameters in the ground rules are the
gate pitch, metal pitch, fin pitch, and gate length, which are important factors in core logic area scaling.
Table MM-7 Device, PPA, and Ground Rules Roadmap for Logic Devices.
Note: GxxMxxTx notation refers to Gxx: contacted gate pitch, Mxx: tightest metal pitch in nm, Tx: number of tiers. This notation illustrates the
technology pitch scaling capability. On top of pitch scaling there are other elements such as cell height, fin depopulation, DTCO constructs, 3D
integration, etc. that define the target area scaling (gates/mm2).
Acronyms used in the table (in order of appearance): LGAA—lateral gate-all-around-device (GAA), 3DVLSI—fine-pitch 3D logic sequential
integration.
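As an illustration of the GxxMxxTx notation described in the note above, the following sketch splits such a label into its three fields. The class and function names are hypothetical, the example label is made up rather than taken from Table MM-7, and the gate pitch is assumed to be in nm like the metal pitch.

```python
import re
from typing import NamedTuple


class NodeLabel(NamedTuple):
    gate_pitch_nm: int   # Gxx: contacted gate pitch (unit assumed to be nm)
    metal_pitch_nm: int  # Mxx: tightest metal pitch in nm
    tiers: int           # Tx: number of tiers


_LABEL = re.compile(r"G(\d+)M(\d+)T(\d+)")


def parse_node_label(label: str) -> NodeLabel:
    """Split a GxxMxxTx string into its three numeric fields."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a GxxMxxTx label: {label!r}")
    gate, metal, tiers = (int(group) for group in match.groups())
    return NodeLabel(gate, metal, tiers)


# "G48M36T1" is a made-up label used only to exercise the parser.
example = parse_node_label("G48M36T1")
print(example)
```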
Ground rule scaling alone is not adequate to scale the cell height. It is necessary to bring the design scaling factor into
practice [2][3]. For example, standard cell height will be further reduced by scaling the number/width of active devices in
the standard cell as well as scaling the secondary rules such as tip-to-tip, extension, P-N separation, and minimum area
rules. Similarly, the standard cell width can be reduced by focusing on critical design rules such as fin termination at the
edge fin, etc., and enabling structures such as contact-over-active [4][5][6]. Also, the contact structure needs to be carefully
selected to reduce the risk of increased current density at the junctions. It is expected that in 2028 P and N devices could be
stacked on top of each other allowing a further reduction. This trend in standard cell scaling is shown in Figure MM-3.
[Figure: standard cell architecture evolution across the periods <2015, <2018, 2018-2025, and >2028. Labels in the figure: Fin, Gate, M0, Trench Contact, V-1, Taller Fin, Single Diffusion Break, Contacts Over Active, Optimized Cell, GAA Device, Stacked P-over-N.]
Figure MM-3 Scaling of standard cell height and width through fin depopulation and device stacking
After 2031 there is no room left for 2D geometry scaling, and 3D very large scale integration (VLSI) of circuits and systems
using sequential/stacked integration approaches will be necessary. This is because there is no room for contact
placement and because performance worsens as a result of gate pitch and metal pitch scaling. It is projected that
physical channel length will saturate around 12nm due to worsening electrostatics, while gate pitch will saturate at
38nm to reserve sufficient width (~14nm) for the device contact, providing acceptable parasitics. 3D VLSI is expected to
bring PPAC gains for the target node as well as to pave the way for heterogeneous and/or hybrid integration. The challenge
of such 3D integration is how to partition the system for better utilization of devices, interconnects, and sub-systems
such as memory, analog, and I/O. That is why functional scaling and/or significant architectural changes are required
after 2031. This would potentially be the time when Beyond CMOS and specialty technology devices/components
advance system scaling toward high system performance at unit power density and at unit cube volume.
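The saturation figures quoted above (38nm gate pitch, about 12nm channel length, about 14nm contact width) can be related with a simple budget. The sketch assumes that the gate pitch decomposes into gate length plus contact width plus two spacers, and that gate length equals the physical channel length; both are illustrative assumptions that the text does not state.

```python
def implied_spacer_width_nm(gate_pitch_nm: float,
                            gate_length_nm: float,
                            contact_width_nm: float) -> float:
    """Spacer width left over once gate and contact are placed in one pitch.

    Assumed decomposition (not stated in the text):
        gate pitch = gate length + contact width + 2 * spacer width
    """
    remainder = gate_pitch_nm - gate_length_nm - contact_width_nm
    if remainder < 0:
        raise ValueError("gate and contact do not fit within the gate pitch")
    return remainder / 2


# Values quoted in the text: 38 nm pitch, ~12 nm channel length, ~14 nm contact.
spacer = implied_spacer_width_nm(38, 12, 14)
print(f"implied spacer width: {spacer} nm per side")
```

Under this assumed decomposition, the three quoted numbers leave about 6nm for each spacer, which shows how little room remains for further pitch reduction.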
FinFET still remains the key device architecture that could sustain scaling until 2025 [4][6]. Electrostatics and fin
depopulation (i.e., increasing fin height while reducing number of fins at unit footprint area) remain as the two effective
solutions to improve performance. Parasitics improvement is expected to stay as a major knob for performance improvement
as a result of tightening design rules. It is forecasted that the parasitics will remain as a dominant term in the performance
of critical paths. For reduced supply voltage, a transition to GAA structures such as lateral nanosheets would be necessary
to sustain the gate drive by improved electrostatics [8]. Lateral GAA structure would eventually evolve in hybrid form with
the vertical GAA structure to gain back the performance loss due to increasing parasitics at tighter pitches as well as for
specialized SoC functions such as memory selector. Sequential integration would allow stacking of devices on top of each
other with the adoption of monolithic 3D (M3D) integration [9]. Scaling focus will shift from single-thread performance
gain to power reduction and then evolve onto highly-parallel 3D architectures allowing low Vdd operation and more
functions embedded at unit cube volume.
While device architectures are seeing changes, subsequent modules are expected to also evolve. These may include: 1)
starting substrates such as Si to silicon-on-insulator (SOI) and strain-relaxation-buffer (SRB); 2) channel material evolving
from Si to SiGe, Ge, III-V; 3) contact module evolving from silicides to novel materials providing lower Schottky barrier
height (SBH) and to wrap-around contact integration schemes to increase the contact surface area. Below is a list of these
schemes.
4.2.1. Transition to new device architectures
As mentioned earlier, finFET is likely to be sustained until 2025. Beyond 2022 a transition to lateral GAA devices is expected
to start and potentially include vertical GAA devices in hybrid form with the lateral GAA, potentially for 3D hybrid
memory-on-logic applications. This situation would be due to the limits of fin-width scaling (saturating the Lgate scaling to
sustain the electrostatics control) and contact width. Parasitic capacitance penalty, effective drive width (Weff), and
replacement metal gate (RMG) integration pose challenges in GAA adoption. One compromise solution could be the electrically
GAA (EGAA) architecture with much reduced parasitic capacitance and increased effective width for better short channel
control and stronger drive [10]. Projected evolution of device architectures is shown in Figure MM-4 and Figure MM-5.
Figure MM-5 Evolution of device architectures in the IRDS More Moore roadmap
exponential decay of the metal induced gap states (MIGS) inducing charge density accumulation in the bandgap of the
dielectric.
4.2.6. Reducing parasitic device capacitance
Parasitic capacitance between gate and source/drain terminal of the device is expected to increase with technology scaling.
In fact, this component is getting more important than channel-capacitance-related loading whenever the standard cell
context is considered and even more elevated in the GAA structures as a result of unused space between stacked devices.
There is a need to focus on low-κ spacer materials and even air spacer. Those still need to provide good reliability and etch
selectivity for S/D contact formation [21][22]. Also, there are significant limits in increasing finFET or lateral GAA device
AC performance by increasing the height of the device (fin/nanosheet stack). Energy per switch vs. delay relationship seems
to quickly saturate and then decline with increasing device height. The scaling trend of key parasitic improvements is shown
in Figure MM-6.
Figure MM-6 Scaling trend of device S/D access resistance (Rsd) and k-value of device spacer.[4]
Note: Rsd is the total parasitic series resistance (source plus drain) per micron of MOSFET width. These values include
all components such as accumulation layer, spreading resistance, sheet resistance, and contacts. A 15% improvement
per node cycle (every 2 or 3 years) is assumed.
Tdel=0.69*Rdr*Cint + (0.69*Rdr+0.38*Rw)*Cw+0.69*(Rdr+Rw)*Cout
where Rdr is the resistance of driver, Cint is the capacitance seen at the output of driver, Rw is the wire resistance, Cw is
the wire capacitance, and Cout is the load capacitance due to the gates connected to the load. For logic technologies beyond
10nm the dominant term is typically found to be Rw*Cout [2]. This means that increasing the driver strength does not help
if there is no improvement in the parasitic resistance of interconnect and/or a reduction in the parasitic loading of standard
cell.
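To see how the individual R*C products in the Tdel expression compare, the expression can be expanded term by term. The sketch below does this in Python; the function name and all numeric values are hypothetical and chosen only to illustrate a wire-dominated case, not taken from the roadmap tables.

```python
def delay_terms(r_dr: float, c_int: float, r_w: float,
                c_w: float, c_out: float) -> dict:
    """Expand Tdel into its five R*C products (ohms and farads in, seconds out).

    Tdel = 0.69*Rdr*Cint + (0.69*Rdr + 0.38*Rw)*Cw + 0.69*(Rdr + Rw)*Cout
    """
    return {
        "Rdr*Cint": 0.69 * r_dr * c_int,
        "Rdr*Cw": 0.69 * r_dr * c_w,
        "Rw*Cw": 0.38 * r_w * c_w,
        "Rdr*Cout": 0.69 * r_dr * c_out,
        "Rw*Cout": 0.69 * r_w * c_out,
    }


# Hypothetical values chosen only to show a wire-dominated case.
terms = delay_terms(r_dr=1e3, c_int=1e-15, r_w=5e3, c_w=2e-15, c_out=3e-15)
t_del = sum(terms.values())
dominant = max(terms, key=terms.get)

for name, value in terms.items():
    print(f"{name:9s} {value * 1e12:6.2f} ps")
print(f"Tdel      {t_del * 1e12:6.2f} ps, dominant term: {dominant}")
```

With these illustrative inputs the Rw*Cout product is the largest contribution, so lowering Rdr alone changes the total only modestly, which mirrors the observation about driver strength above.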
It is also possible to extract circuit-level parameters such as delay and power per stage with the use of targeted compact
models, e.g., the virtual source model (VSM), which is distributed as open source by MIT [28]. Details of this modeling and
how the interconnect is coupled with the device in the standard-cell context are explained in [2].
Projected scaling of PPA metrics as well as the standard cell and bitcell layout characteristics (e.g., number of active devices,
Weff, etc.) are shown in Table MM-10.
Table MM-10 Projected Performance-Power-Area (PPA) Metrics.
Performance across six nodes spanning from 2020 to 2034 is projected to degrade by 1% node-to-node
for datapaths with wireload because of the negative impact of wire resistance on performance, particularly
after 2025. For the nodes before 2025, a 6% node-to-node performance improvement is forecast. We also take
into account the wirelength reduction as a function of area scaling, which translates into a reduction of wire-related loading
capacitance and resistance. Wirelength is expected to reduce further as a result of 3DVLSI after 2031.
Energy per switching reduction is forecast to become limited, to about 11% on a node-to-node basis on
average. This is mostly achieved by fin/device depopulation, which also enables the cell height reduction that scales
wire- and cell-related capacitances. We also consider that DTCO constructs such as contact-over-active, single diffusion
break, etc., as described in [4][10], will further reduce the standard cell width in 2020 to ×0.9 relative to the 2018 reference,
on top of conventional gate pitch scaling. Routed gate density improves by around ×1.3 on a node-to-node basis until
2025. After 2031 it is expected that 3D scaling by sequential/stacked integration (full-scale 3DVLSI) would further maintain
the scaling of the number of functions per unit cube.
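The node-to-node averages quoted in this section compound over several nodes. The sketch below shows the arithmetic; the function name is hypothetical, and treating "six nodes" as six node-to-node transitions is an assumption, since the text does not say how the transitions are counted.

```python
def compound(per_node_factor: float, transitions: int) -> float:
    """Cumulative factor after repeating the same node-to-node factor."""
    if transitions < 0:
        raise ValueError("transitions must be non-negative")
    return per_node_factor ** transitions


# An 11% average reduction per node corresponds to a factor of 0.89.
# Using six transitions is an assumption; the text says only "six nodes".
energy_after_six = compound(1.0 - 0.11, 6)
# Routed gate density improving by about x1.3 per node, over two transitions.
density_after_two = compound(1.3, 2)
print(f"energy: {energy_after_six:.3f}x, density: {density_after_two:.2f}x")
```

Under this reading, an 11% reduction per node roughly halves the energy per switching over six transitions, while two transitions at ×1.3 give about ×1.69 in routed gate density.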
Figure MM-7 NAND2-equivalent standard cell count (left) and 111-bitcell (right) scaling in an 80 mm² die
Projected power and performance scaling of SoC is given in Table MM-11. Clock frequency is projected to improve only mildly,
to 3.5 GHz (Figure MM-9) in 2025, because of increasing parasitics and limited gate drive (Vgs-Vt) as a function of scaling.
After 2028 CPU clock frequency worsens due to increased parasitics in standard cell and wiring, despite the fact that 3D-
VLSI helps in scaling the wirelengths due to the area reduction of digital block through the split of cells in 3D. Also, thermal
(increasing power density) constraints reduce the average frequency down to 0.8 GHz at the end of the roadmap edition
timeframe. Basically, if nothing is done for the mitigation of thermal issues, the CPU needs to be throttled more frequently
to maintain the same power density. The rate of power reduction tends to flatten because of the slow-down in supply voltage
(Vdd) scaling and in capacitance scaling towards the end of the roadmap (Figure MM-9). Potential solutions of thermal
challenges raise an opportunity to maintain an overall computational throughput scaling of ×14 over six node generations
until 2034 instead of ×3.8 if the system is fully thermal-constrained (Figure MM-10). This view on power-constrained CPU
throughput scaling was also discussed by the ITRS System Drivers Technology Workgroup [29].
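To put the two throughput totals on a per-generation footing, the sketch below converts each total into a geometric-mean gain per step. Treating "six node generations" as six equal compounding steps is an assumption made here for illustration, and `per_node_factor` is a hypothetical helper name.

```python
def per_node_factor(total_gain: float, transitions: int) -> float:
    """Geometric-mean gain per transition that compounds to total_gain."""
    if total_gain <= 0 or transitions <= 0:
        raise ValueError("total_gain and transitions must be positive")
    return total_gain ** (1.0 / transitions)


# Totals quoted in the text; the step count is an assumption.
TRANSITIONS = 6
mitigated = per_node_factor(14.0, TRANSITIONS)
constrained = per_node_factor(3.8, TRANSITIONS)

print(f"thermal issues mitigated:  x{mitigated:.2f} per generation")
print(f"fully thermal-constrained: x{constrained:.2f} per generation")
```

Under this assumption the gap is roughly ×1.55 versus ×1.25 per generation, which is the difference that thermal mitigation would have to deliver at every step.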
Table MM-11 Power and Performance Scaling of SoC
Figure MM-9 CPU clock frequency and power@iso-frequency (ref: 2020) scaling
Figure MM-10 Scaling projection of computation throughput of CPU cores at the maximum clock frequency and at
thermally-constrained average frequency
Metrology: Three-dimensional control of interconnect features (with its associated metrology) will be required
• Line edge roughness, trench depth and profile, via shape, etch bias, thinning due to cleaning, CMP effects. The multiplicity of levels, combined with new materials, reduced feature size and pattern dependent processes, use of alternative memories, optical and RF interconnect, continues to challenge.
Process: Patterning, cleaning, and filling at nano-dimensions
• As features shrink, etching, cleaning, and filling high aspect ratio structures will be challenging, especially for low-κ dual damascene metal structures and DRAM at nano-dimensions.
Complexity in Integration: Integration of new processes and structures, including interconnects for emerging devices
• Combinations of materials and processes used to fabricate new structures create integration complexity. The increased number of levels exacerbate thermomechanical effects. Novel/active devices may be incorporated into the interconnect.
4.5.1. Conductor
Copper (Cu) is expected to remain the preferred solution for the interconnect metal at least until 2025, while non-Cu
solutions (e.g., Co and Ru) are expected to be used for the local interconnect (M0). On the other hand, due to the limits of
electromigration (EM), the local interconnect (middle-of-line (MOL)), M1, and Mx levels will embed non-Cu solutions such as
cobalt (Co), particularly for the via, due to its better integration window for filling narrow trenches, its EM
performance, and its lower resistance compared to Cu at scaled dimensions. Although a resistivity increase due to
electron scattering in Cu or higher bulk resistivity in non-Cu solutions (e.g., Co) is already apparent, a hierarchical wiring
approach, such as scaling of line length along with that of the width, can still overcome the problem.
4.5.2. Barrier Metal
Cu wiring barrier materials must prevent Cu diffusion into the adjacent dielectric but also must form a suitable, high quality
interface with Cu to limit vacancy diffusion and achieve acceptable electromigration lifetimes. Ta(N) is a well-known
industry solution. Although the scaling of Ta(N) deposited by physical vapor deposition (PVD) is limited, other nitrides such
as Mn(N) that can be deposited by chemical vapor deposition (CVD) or atomic layer deposition (ALD) have recently
attracted attention. As for emerging materials, self-assembled monolayers (SAMs) are being researched as candidates for
future generations.
massive technology changes. However, there are also niche markets that require reliability levels to improve. Applications
that require higher reliability levels, harsher environments, and/or longer lifetimes are more difficult than the mainstream
office and mobile applications. Note that a constant overall chip reliability level requires a continuous improvement in the
reliability per transistor because of scaling. Meeting reliability specifications is a critical customer requirement and failure
to meet reliability requirements can be catastrophic.
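The link between a constant chip-level target and the per-transistor requirement can be illustrated with a simple failure-rate budget. This is a minimal sketch that assumes a series system of identical, independent transistors, so that failure rates add. The chip target and transistor counts below are hypothetical values chosen only for illustration.

```python
def per_transistor_fit(chip_fit: float, n_transistors: float) -> float:
    """Failure-rate budget per transistor (in FIT) for a chip modeled as a
    series system of identical, independent transistors."""
    if chip_fit <= 0 or n_transistors <= 0:
        raise ValueError("chip_fit and n_transistors must be positive")
    return chip_fit / n_transistors


CHIP_TARGET_FIT = 100.0  # hypothetical constant chip-level target

# Hypothetical transistor counts that double from one node to the next.
for count in (1e9, 2e9, 4e9, 8e9):
    budget = per_transistor_fit(CHIP_TARGET_FIT, count)
    print(f"{count:.0e} transistors -> {budget:.2e} FIT per transistor")
```

In this simplified picture, every doubling of the transistor count halves the failure rate that each transistor is allowed to contribute.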
4.6.1. Device reliability difficult challenges
Table MM-14 indicates the top near-term reliability challenges. The first near-term reliability challenge concerns failure
mechanisms associated with the MOS transistor. The failure could be caused by either breakdown of the gate dielectric or
threshold voltage change beyond the acceptable limits. The time to a first breakdown event is decreasing with scaling. This
first event is often a “soft” breakdown. Depending on the circuit it may take more than one soft breakdown to produce an
IC failure, or the circuit may function for a longer time until the initial “soft” breakdown spot has progressed to a “hard”
failure. Threshold voltage-related failure is primarily associated with the negative bias temperature instability observed in
p-channel transistors in the inversion state and the analogous positive bias temperature instability in n-channel transistors.
Burn-in options to enhance reliability of end-products may be impacted, as it may accelerate negative bias temperature
instability (NBTI) shifts.
ICs are used in a variety of different applications. There are some special applications for which reliability is especially
challenging. First, there are the applications in which the environment subjects the ICs to stresses much greater than found
in typical consumer or office applications. For example, automotive, military, and aerospace applications subject ICs to
extremes in temperature and shock. In addition, aviation and space-based applications also have a more severe radiation
environment. Furthermore, applications like base stations require ICs to be continuously on for tens of years at elevated
temperatures, which makes accelerated testing of limited use. Second, there are important applications (e.g., implantable
electronics, safety systems) for which the consequences of an IC failure are much greater than in mainstream IC
applications. In general, scaled-down ICs are less “robust” and this makes it harder to meet the reliability requirements of
these special applications.
At the heart of reliability engineering is the fact that there is a distribution of lifetimes for each failure mechanism. With
low failure rate requirements, we are interested in the early-time range of the failure time distributions. There has been an
increase in process variability with scaling (e.g., distribution of dopant atoms, chemical mechanical polishing (CMP)
variations, and line-edge roughness). At the same time the size of a critical defect decreases with scaling. These trends will
translate into an increased time spread of the failure distributions and, thus, a decreasing time to first failure. We need to
develop reliability engineering software tools (e.g., screens, qualification, and reliability-aware design) that can handle the
increase in variability of the device physical properties, and to implement rigorous statistical data analysis to quantify the
uncertainties in reliability projections. The use of Weibull and log-normal statistics for analysis of breakdown reliability
data is well established; however, the shrinking reliability margins require more careful attention to statistical confidence
bounds in order to quantify risk. This is complicated by the fact that new failure physics may lead to significant and
important deviations from the traditional statistical distributions, making error analysis non-straightforward. Statistical
analysis of other reliability data such as bias temperature instability (BTI) and hot carrier degradation is not currently
standardized in practice but may be needed for accurate modeling of circuit failure rate.
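The effect of a wider failure-time spread on the early-time range can be sketched with the two-parameter Weibull distribution mentioned above, F(t) = 1 - exp(-(t/η)^β). The scale parameter, the shape values, and the 1 ppm failure fraction used below are hypothetical, chosen only to show the trend.

```python
import math


def weibull_cdf(t: float, eta: float, beta: float) -> float:
    """Cumulative failure fraction at time t for scale eta and shape beta."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def time_to_fraction(fraction: float, eta: float, beta: float) -> float:
    """Time by which the given fraction of the population has failed
    (inverse of the Weibull CDF)."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return eta * (-math.log1p(-fraction)) ** (1.0 / beta)


ETA_YEARS = 10.0   # hypothetical characteristic lifetime
EARLY = 1e-6       # hypothetical early-failure fraction of interest (1 ppm)

# A smaller shape parameter means a wider spread of failure times.
for beta in (3.0, 2.0, 1.0):
    t = time_to_fraction(EARLY, ETA_YEARS, beta)
    print(f"beta={beta}: 1 ppm of parts have failed by {t:.2e} years")
```

With the characteristic lifetime held fixed, lowering the shape parameter pulls the early-failure time in by orders of magnitude, which is the sense in which an increased spread translates into a decreasing time to first failure.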
Reliability due to material, process, and structural changes, and novel applications.
• TDDB, negative BTI (NBTI), positive BTI (PBTI), hot carrier injection (HCI), random telegraphic noise (RTN) in scaled non-planar devices
• Gate to contact breakdown
• Increasing statistical variation of intrinsic failure mechanisms in scaled non-planar devices
• 3D device structure reliability challenges
• Reduced reliability margins drive need for improved understanding of reliability at circuit level
• Reliability of embedded electronics in extreme or critical environments (medical, automotive, grid...)
Long-Term 2026-2034: Summary of issues
Reliability of novel devices, structures, and materials.
• Understand and control the failure mechanisms associated with new materials and device structures
• Shift to system level reliability perspective with unreliable devices
• Muon induced soft error rate
The single long-term reliability difficult challenge concerns novel, disruptive changes in devices, structures, materials, and
applications. For such disruptive solutions there is at this moment little, if any, reliability knowledge (at least as far as their
application in ICs is concerned). This will require significant efforts to investigate, model (both a statistical model of
lifetime distributions and a physical model of how lifetime depends on stress, geometries, and materials), and apply the
acquired knowledge (new building-in reliability, designing-in reliability, screens, and tests). It also seems likely that there
will be less-than-historic amounts of time and money to develop these new reliability capabilities. Disruptive material or
devices therefore lead to disruption in reliability capabilities and it will take considerable resources to develop those
capabilities.
4.6.2. Device reliability potential solutions
The most effective way to meet requirements is to have complete built-in-reliability and design-for-reliability solutions
available at the start of the development of each new technology generation. This would enable finding the optimum
reliability/performance/power choice and would enable designing a manufacturing process that can consistently have
adequate reliability. Unfortunately, there are serious gaps in these capabilities today and these gaps are likely to grow even
larger in the future. The penalty will be an increasing risk of reliability problems and a reduced ability to push performance,
cost and time-to-market.
It is commonly thought that the ultimate nanoscale device will have a high degree of variation and high percentage of non-
functional devices right from the start. This is viewed as an intrinsic nature of devices at the nanoscale. As a result, it will
not be possible any longer for designers to take into account a ‘worst case’ design window, because this would jeopardize
the performance of the circuits too much. To deal with it, a complete paradigm change in circuit and system design will
therefore be needed. While we are not there yet, the increase in variability is clearly already a reliability problem that is
taxing the ability of most manufacturers. This is because variability degrades the accuracy of lifetime projection, forcing a
dramatic increase in the number of devices tested. The coupling between variability and reliability is squeezing out the
benefit of scaling. At some point, perhaps before the end of the roadmap, the cost of ensuring each and every one of the
transistors in a large integrated circuit to function within specification may become too high to be practical. As a result, the
fundamental philosophy of how to achieve product reliability may need to be changed. This concept is known as resilience,
the ability to cope with stress and catastrophe. One potential solution would be to integrate so-called solutions and monitors
in the circuits that are sensing circuit parts that are running out of performance and then during runtime can change the
biasing of the circuits. Such solutions need to be further explored and developed. Ultimately, circuits that can dynamically
reconfigure themselves to avoid failing and failed devices (or to change/improve functionality) will be needed.
The growing complexity of reliability assessment, due to the proliferation of new materials, gate stack compositions tuned to
a variety of specific applications, and shorter cycles for process development, may be alleviated to some degree by
greater use of physics-based microscopic reliability models, which are linked to material structure simulations and
consider degradation processes at the atomic level. Such models, the need for which is slowly gaining wider recognition, will
reduce our reliance on the statistical approach, which is both expensive and time consuming, as discussed above. These models
can provide an additional advantage in that they can be incorporated into compact modeling tools with relative ease
and require only limited calibration before being applied to a specific product.
Some small changes may already be underway quietly. A first step may be simply to fine-tune the reliability requirements
to trim out the excess margin, perhaps even having product-specific reliability specifications. More sophisticated
approaches involve fault-tolerant design, fault-tolerant architecture, and fault-tolerant systems. Research in this direction
has increased substantially. However, the gap between device reliability and system reliability is very large. There is a
strong need for device reliability investigation to address the impact on circuits. The recent increase in the use of circuits such as
SRAM and ring oscillators to examine many of the known device reliability issues is a good sign, as it addresses both
circuit sensitivity and variability. More device reliability research is needed to address the circuit and perhaps
system aspects. For example, most of the device reliability studies are based on quasi-DC measurements. There is no
substantial research on the impact of degradation on devices at circuit operation speed. This gap in measurement speed
makes modeling the impact of device degradation on circuit performance difficult and risky.
In the meantime, we must meet the conventional reliability requirements. That means an in-depth understanding of the
physics of each failure mechanism and the development of powerful and practical reliability engineering tools. Historically,
it has taken many years (typically a decade) before the start of production for a new technology generation to develop the
needed capabilities (R&D is conducted on characterizing failure modes, deriving validated, predictive models, and
developing design-for-reliability and reliability TCAD tools). The ability to qualify technologies has improved, but there
still are significant gaps.
For the reliability capabilities to catch up requires a substantial increase in reliability research-development-application and
cleverness in acquiring the needed capabilities in much less than the historic time scales. Work is needed on rapid
characterization techniques, validated models, and design tools for each failure mechanism. The impact of new materials
like alternate channel material needs particular attention. Breakthroughs may be needed to develop design for reliability
tools that can provide a high-fidelity simulation of a large fraction of an IC in a reasonable time. As mentioned above,
increased reliability resources also will be needed to handle the introduction of a large number of major technology changes
in a brief period of time.
The needs are clearly many, but a specific one is the optimal reliability evaluation methodology, which would deliver
relevant long-term degradation assessment while avoiding excessive accelerated testing that may produce misleading
results. This need is driven by the decreasing process margin and increasing variability, which greatly degrades the accuracy
of lifetime projection from a standard sample size. The ability to stress a large number of devices simultaneously is highly
desirable, particularly for long term reliability characterization. Doing it at manageable cost is a challenge that is very
difficult to meet and becoming more so as we migrate to more advanced technology nodes. A breakthrough in testing
technology is badly needed to address this problem.
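One way to see why the number of stressed devices grows so quickly is the standard zero-failure (success-run) binomial bound, used here as an illustration and not as the roadmap's own methodology. It gives the number of devices that must all survive a test in order to claim, at a chosen confidence, that the failing fraction is below a limit. The limits and confidence level below are hypothetical.

```python
import math


def zero_failure_sample_size(max_fail_fraction: float, confidence: float) -> int:
    """Smallest n such that observing zero failures among n devices supports
    'failing fraction <= max_fail_fraction' at the given confidence.
    Derived from (1 - p)^n <= 1 - C."""
    if not 0.0 < max_fail_fraction < 1.0:
        raise ValueError("max_fail_fraction must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_fail_fraction))


CONFIDENCE = 0.90  # hypothetical confidence level

# Hypothetical failing-fraction limits, tightening by 10x each time.
for limit in (1e-1, 1e-2, 1e-3, 1e-4):
    n = zero_failure_sample_size(limit, CONFIDENCE)
    print(f"limit {limit:.0e}: stress at least {n} devices with zero failures")
```

The required sample size grows roughly in inverse proportion to the failure-fraction limit, so tighter margins quickly push testing beyond a standard sample size.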
4.7. 3D HETEROGENEOUS INTEGRATION
Every logic generation needs to add new functions in each node to keep unit price constant (to preserve profit margins).
This is getting more difficult due to the following challenges:
• Fewer functions left on board/system to co-integrate
• Heterogeneous cores specialized per function—specialized performance improvement requirements needed per
each dedicated core
• Off-package memories—costly to co-integrate with logic, technology not compatible with baseline CMOS (where
wafer/die-level stacking might be needed)
Die cost reduction has been enabled so far by concurrent scaling of gate pitch, metal pitch, and cell height. This is
expected to continue until 2028. Cell height scaling would likely be pursued by 3D devices (e.g., finFET and lateral GAA)
and DTCO constructs in cell and physical design. However, this scaling route is expected to be more challenged by
diminishing electrical/system benefits and also by diminishing area-reduction/$ at SoC level. Therefore, it is necessary to
pursue 3D integration routes such as device-over-device stacking, fine-pitch layer transfer, and/or monolithic 3D (or
sequential integration). These pursuits will maintain system performance and power gains while potentially maintaining the
cost advantages such as treating expensive non-scaled components somewhere else and using the best technology fit per
tier functionality.
3DVLSI can be partitioned at either the gate or the transistor level. 3DVLSI offers the possibility to stack tiers enabling high-density
contacts at the tier level (up to several million vias per mm²). The partitioning at the gate level allows IC performance gain
due to wire length reduction, while partitioning at the transistor level by stacking nFET over pFET (or the opposite) enables
the independent optimization of both types of transistors (customized implementation of channel material/substrate
orientation/channel and raised source/drain strain, etc. [8][29]) while reducing process complexity compared to a
planar co-integration, for instance the stacking of III-V nFETs above SiGe pFETs [27][30]. These high mobility transistors
are well suited for 3DVLSI because their process temperatures are intrinsically low. 3DVLSI, with its high contact density,
can also enable applications requiring heterogeneous co-integration with high-density 3D vias, such as NEMS with CMOS
for gas sensing [31][32] or highly miniaturized imagers [33]. There is significant momentum behind device-on-device
stacking (e.g., P device over N) to decouple the channel engineering (e.g., Ge channel for PMOS) for better
performance [34].
In order to address the transition from 2D to 3DVLSI, the following generations are projected in the roadmap:
• Die-to-wafer and wafer-to-wafer stacking
o Approach: Fine-pitch dielectric/hybrid bonding and/or flip-chip assembly
o Opportunities: Reducing bill-of-materials on the system, heterogeneous integration, high-bandwidth, and
low latency memory on logic
o Challenges: Design/architecture partitioning
• Device-on-device (e.g., P-over-N stacking)
o Approach: Sequential integration
o Opportunities: Reducing 2D footprint of standard cell and/or bit cell
o Challenges: Minimizing the interconnect overhead between N and P devices is key to enabling low cost
• Adding logic 3D SRAM and/or MRAM stack (embedded/stacked)
o Approach: Sequential integration and/or wafer transfer
o Opportunities: 2D area gain, better connection between logic and memory enabling system latency gains.
o Challenges: Solving the thermal budget of interconnect at the lower tier if stacking approach is used,
Revisiting the cache hierarchy and application requirements, power, and clock distribution
• Adding Analog and I/O
o Approach: Sequential integration and/or wafer transfer
o Opportunities: Giving more freedom to the designer and allowing integration of high-mobility channels,
pushing non-scaling components to another tier, IP re-use, scalability, IO voltage enablement in advanced
nodes
o Challenges: Thermal budget, reliability requirements, power and clock distribution
• True-3D VLSI: Clustered functional stacks
o Approach: Sequential integration and/or wafer transfer
o Opportunities: Complementary functions other than CMOS replacement such as neuromorphic, high-
bandwidth memory or pure logic applications incorporating new data-flow schemes favoring 3D
connecting. Application examples include image recognition in neuromorphic fabric, wide-IO sensor
interfacing (e.g., DNA sequencing, molecular analysis), and highly-parallel logic-in-memory
computations.
o Challenges: Architecting the application where low energy at low frequency and highly-parallel interfaces
could be utilized, mapping applications to non-Von Neumann architectures.
of hundreds of Gb in a chip. Nonvolatile memory may be divided into two large categories—Flash memories (NAND Flash
and NOR Flash), and non-charge-based-storage memories. Nonvolatile memories are essentially ubiquitous, and a lot of
applications use embedded memories that typically do not require leading edge technology nodes. The More Moore
nonvolatile memory tables only track memory challenges and potential solutions for leading edge standalone parts.
Flash memories are based on simple one transistor (1T) cells, where a transistor serves both as the access (or cell selection)
device and the storage node. At this time Flash memory serves more than 99% of applications.
When the number of stored electrons reaches statistical limits, even if devices can be further scaled and smaller cells
achieved, the threshold voltage distribution of all devices in the memory array becomes uncontrollable and logic states
unpredictable. Thus memory density cannot be increased indefinitely by continued scaling of charge-based devices.
However, effective density increase may continue by stacking memory layers vertically.
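A rough feel for the statistical limit can be obtained by assuming that the number of stored electrons follows Poisson statistics, so that the relative fluctuation of the stored charge scales as 1/sqrt(N). This is an illustrative assumption and not a model taken from the roadmap, and the electron counts below are hypothetical.

```python
import math


def relative_fluctuation(mean_electrons: float) -> float:
    """Relative standard deviation of the stored charge, assuming the
    electron count is Poisson distributed (sigma = sqrt(N))."""
    if mean_electrons <= 0:
        raise ValueError("mean_electrons must be positive")
    return 1.0 / math.sqrt(mean_electrons)


# Hypothetical mean electron counts per stored state.
for n in (1000, 100, 10):
    pct = 100.0 * relative_fluctuation(n)
    print(f"{n:5d} electrons -> {pct:.1f}% relative charge fluctuation")
```

Under this assumption, a hundredfold drop in stored electrons raises the relative fluctuation tenfold, which is consistent with the threshold voltage distribution widening as cells shrink.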
The economy of stacking by completing one device layer then another and so forth is questionable. As depicted in Figure
MM-12 [41], the cost per bit starts to rise after stacking several layers of devices. Furthermore, the decrease in array
efficiency due to increased interconnection and yield loss from complex processing may further reduce the cost-per-bit
benefit of this type of 3D stacking. In 2007, a ‘punch and plug’ approach was proposed to fabricate the bit line string
vertically to simplify the processing steps dramatically [41]. This approach makes 3D stacked devices in a few steps and
not through repetitive processing, and thus promised a new low-cost scaling path for NAND flash. Figure MM-12 illustrates one
such approach. Originally coined bit-cost-scalable, or BiCS, this architecture turns the NAND string by 90 degrees from a
horizontal position to vertical. The word line (WL) remains in the horizontal planes. As depicted in Figure MM-12, this
type of 3D approach is much more economical than the stacking of complete devices, and the cost benefit does not saturate
up to quite high number of layers.
Figure MM-12 (left) A 3D NAND array based on a vertical channel architecture. (right) BiCS (bit cost scalable) –
a 3D NAND structure using a punch and plug process [41].
A number of architectures based on the BiCS concept have been proposed since 2007 and several, including some that use
floating gate instead of charge trapping layer for storage, have gone into volume production in the last 2−3 years. In general,
all 3D NAND approaches have adopted a strategy of using much larger areal footprints than the conventional 2D NAND.
The x- and y- dimensions (equivalent to cell size in 2D) of 3D NAND are in the range of 100nm and higher compared to
~15nm for the smallest 2D NAND. The much larger “cell size” is made up by stacking a large number of memory layers
to achieve competitive packing density.
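The number of layers needed just to break even on areal density follows from the footprint ratio. The sketch below assumes square cell footprints with the x-y dimensions quoted above and the same number of bits per cell in both cases. These simplifications are made here for illustration only.

```python
import math


def layers_to_match(pitch_3d_nm: float, pitch_2d_nm: float) -> int:
    """Minimum number of stacked layers for a 3D cell of the given x-y
    dimension to match the areal bit density of a 2D cell, assuming square
    footprints and equal bits per cell."""
    if pitch_3d_nm <= 0 or pitch_2d_nm <= 0:
        raise ValueError("pitches must be positive")
    return math.ceil((pitch_3d_nm / pitch_2d_nm) ** 2)


# Dimensions quoted in the text: ~100 nm for 3D NAND, ~15 nm for 2D NAND.
print(layers_to_match(100.0, 15.0))
```

Under these assumptions a few tens of layers are needed before 3D NAND matches the smallest 2D NAND, and every layer beyond that is a net density gain.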
The economics of 3D NAND is further confounded by its complex and unique manufacturing needs. Although the larger
cell size seems to relax the requirement for fine line lithography, to achieve high data rate it is desirable to use large page
size and this in turn translates to fine pitched bit lines and metal lines. Therefore, even though the cell size is large, metal
lines still require ~20nm half-pitch that is only achievable by 193i lithography with double patterning. Etching of deep
holes is difficult and slow, and the etching throughput is generally very low. Depositing many layers of dielectric and/or
polysilicon, as well as metrology for multilayer films and deep holes, are all unfamiliar territory. These all translate
to large investments in new equipment and floor space and new challenges for wafer flow and yield.
The ultimate unknown is how many layers can be stacked. There seems to be no hard physics limit on the stacking of layers.
Beyond a certain aspect ratio (100:1 perhaps?) the etch-stop phenomenon, when ions in the reactive ion etching process are
bent by electrostatic charge on the sidewall and cannot travel further down, may limit how many layers can be etched in
one operation. However, this may be bypassed by stacking fewer layers, etching, and stacking more layers (at higher cost).
Stacking many layers may produce high stress that bends the wafer and although this needs to be carefully engineered it
does not seem to be an unsolvable physics limit. Even at 200 layers (at ~50nm for each layer) the total stack height is about
10µm, which is still in the same range as 10−15 metal layers for logic ICs. This kind of layer thickness does not
significantly affect bare die thickness (thinnest is about 40µm so far) yet. However, at 1000 layers the total layer thickness
may cause thick dies that do not conform to the form factor for stacking multiple dies (e.g., 16 or 32) in a thin package. At
this time, 96 layers are in volume production and there is optimism that 128 layers are achievable and even 192 and 256
layers are possible.
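The stack-height figures above follow from simple arithmetic, sketched below. The ~50nm layer thickness and the ~40µm thinnest die come from the text. The hole diameter used for the aspect-ratio estimate is a hypothetical value chosen for illustration.

```python
def stack_height_um(layers: int, layer_thickness_nm: float = 50.0) -> float:
    """Total height of the memory stack in micrometres."""
    return layers * layer_thickness_nm / 1000.0


def hole_aspect_ratio(height_um: float, hole_diameter_nm: float) -> float:
    """Depth-to-diameter ratio of a hole etched through the whole stack."""
    return height_um * 1000.0 / hole_diameter_nm


THINNEST_DIE_UM = 40.0    # from the text
HOLE_DIAMETER_NM = 100.0  # hypothetical, for illustration only

for layers in (96, 200, 1000):
    height = stack_height_um(layers)
    ratio = hole_aspect_ratio(height, HOLE_DIAMETER_NM)
    exceeds = height > THINNEST_DIE_UM
    print(f"{layers:4d} layers: {height:5.1f} um stack, "
          f"aspect ratio {ratio:.0f}:1, exceeds thinnest die: {exceeds}")
```

With these numbers a 200-layer stack is about 10µm tall, while a 1000-layer stack would by itself be thicker than the thinnest dies produced so far.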
Renewed shrinking of the areal x-y footprint may eventually start when stacking more layers proves to be too difficult.
However, such a trend is not guaranteed. If the hole aspect ratio is the limitation, shrinking the footprint would not reduce
the ratio and would thus not be helpful. Furthermore, the larger cell size seems to at least partially contribute to the better
performance of 3D NAND (speed and cycling reliability) compared to tight-pitch 2D NAND. Whether x-y scaling can still
deliver such performance is not clear. Probably new innovation or a more powerful emerging memory will be needed to
further reduce bit cost.
5.3. NVM—EMERGING
Since 2D NAND Flash scaling is limited by statistical fluctuation due to too few stored charges, several non-conventional
non-volatile memories that are not based on charge storage (ferroelectric or FeRAM, magnetic or MRAM, phase-change
or PCRAM, and resistive or ReRAM) are being developed and form the category often called “emerging” memories. Even
though 2D NAND is being replaced by 3D NAND (which is no longer subject to the drawback of too few electrons), some
characteristics of non-charge based emerging memories, such as low voltage operation or random access, are attractive for
various applications, and these memories thus continue to be developed. These emerging memories usually have a two-terminal structure
(e.g., resistor or capacitor) and thus cannot easily also serve as the cell-selection device. The memory cell generally combines
the storage element with a separate access device in the form of 1T-1C, 1T-1R, or 1D-1R.
5.3.1. FeRAM
FeRAM devices achieve non-volatility by switching and sensing the polarization state of a ferroelectric capacitor. To read
the memory state the hysteresis loop of the ferroelectric capacitor must be traced and the stored datum is destroyed and
must be written back after reading (destructive read, like DRAM). Because of this ‘destructive read’ it is a challenge to find
ferroelectric and electrode materials that provide both adequate change in polarization and the necessary stability over
extended operating cycles. Many ferroelectric materials are foreign to the normal complement of CMOS fabrication
materials, and can be degraded by conventional CMOS processing conditions. FeRAM is fast, low power, and low voltage
and thus is suitable for RFID, smart card, ID card, and other embedded applications. Processing difficulty limits its wider
adoption. Recently, HfO2 based ferroelectric FET, for which the ferroelectricity serves to change the Vt of the FET and
thus can form a 1T cell similar to Flash memory, has been proposed. If developed to maturity this new memory may serve
as a low power and very fast Flash-like memory.
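The destructive read and write-back sequence described above can be sketched as a toy behavioral model. The class and method names are hypothetical, and the model ignores all device physics (hysteresis, fatigue, sensing margins). It only shows why every read must be followed by a restore.

```python
class FeRamCell:
    """Toy behavioral model of a 1T-1C FeRAM cell with destructive read."""

    def __init__(self, bit: int = 0) -> None:
        self._polarization = bit  # stored datum: 0 or 1

    def write(self, bit: int) -> None:
        """Set the polarization state of the ferroelectric capacitor."""
        if bit not in (0, 1):
            raise ValueError("bit must be 0 or 1")
        self._polarization = bit

    def _sense(self) -> int:
        """Sense the state by driving the capacitor to a known polarization.
        The stored datum is lost in the process."""
        value = self._polarization
        self._polarization = 0  # cell is left in the reference state
        return value

    def read(self) -> int:
        """Destructive read followed by the mandatory write-back."""
        value = self._sense()
        self.write(value)
        return value


cell = FeRamCell(1)
print(cell.read(), cell.read())  # write-back keeps the datum across reads
```

Each read therefore costs a full switching cycle of the capacitor, which is why endurance over extended operating cycles matters as much for reads as for writes.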
5.3.2. MRAM
Magnetic RAM (MRAM) devices employ a magnetic tunnel junction (MTJ) as the memory element. An MTJ cell consists
of two ferromagnetic materials separated by a thin insulating layer that acts as a tunnel barrier. When the magnetic moment
of one layer is switched to align with the other layer (or to oppose the direction of the other layer) the effective resistance
to current flow through the MTJ changes. The magnitude of the tunneling current can be read to indicate whether a ONE
or a ZERO is stored. Field switching MRAM probably is the closest to an ideal “universal memory” since it is non-volatile
and fast and can be cycled indefinitely. Thus, it may be used as NVM as well as SRAM and DRAM. However, producing
a magnetic field in an IC is both difficult and inefficient. Nevertheless, field switching MTJ MRAM has successfully
been made into products. The required magnetic field for switching, however, increases when the storage element scales
while electromigration limits the current density that can be used to produce higher H field. Therefore, it is expected that
field switching MTJ MRAM is unlikely to scale beyond the 65nm node. Recent advances in the “spin-transfer torque (STT)” approach,
in which a spin-polarized current transfers its angular momentum to the free magnetic layer and thus reverses its polarity
without resorting to an external magnetic field, offer a new potential solution. During the spin transfer process, substantial
current passes through the MTJ tunnel layer and this stress may reduce the writing endurance. Upon further scaling the
stability of the storage element is subject to thermal noise, thus perpendicular magnetization materials are projected to be
needed at 32nm and below. Perpendicular magnetization has been recently demonstrated.
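The resistance-based read of an MTJ can be sketched as a comparison against a reference placed between the two resistance states. The resistance values and the mapping of the high-resistance (anti-parallel) state to a ONE are hypothetical choices made here for illustration, and the function names are not from any real library.

```python
def tmr_ratio(r_parallel: float, r_antiparallel: float) -> float:
    """Tunnel magnetoresistance ratio, (R_AP - R_P) / R_P."""
    if r_parallel <= 0 or r_antiparallel <= 0:
        raise ValueError("resistances must be positive")
    return (r_antiparallel - r_parallel) / r_parallel


def read_bit(r_measured: float, r_parallel: float, r_antiparallel: float) -> int:
    """Decide the stored bit by comparing the measured resistance with a
    midpoint reference. Assumed convention: anti-parallel (high R) = 1."""
    reference = 0.5 * (r_parallel + r_antiparallel)
    return 1 if r_measured > reference else 0


R_P_OHM = 5_000.0    # hypothetical parallel-state resistance
R_AP_OHM = 10_000.0  # hypothetical anti-parallel-state resistance

print(f"TMR = {100.0 * tmr_ratio(R_P_OHM, R_AP_OHM):.0f}%")
print(read_bit(5_200.0, R_P_OHM, R_AP_OHM), read_bit(9_700.0, R_P_OHM, R_AP_OHM))
```

A larger separation between the two states leaves more room for the reference, so cell-to-cell resistance variation eats directly into the read margin.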
With the rapid progress of NAND Flash and the recent introduction of 3D NAND that promises to continue the equivalent
scaling, the prospect of STT-MRAM replacing NAND seems remote. However, its SRAM-like performance and much smaller
footprint than the conventional 6T-SRAM have attracted much interest in that application, especially in mobile devices, which
do not require the high cycling endurance needed in computation. Therefore, STT-MRAM is now mostly considered not as a
standalone memory but as an embedded memory [42][43], and is not tracked in the standalone NVM table. STT-MRAM
would be a potential solution not only for embedded SRAM replacement but also for embedded Flash (NOR) replacement.
This may be particularly interesting for IoT applications, where low power is the most important requirement. On the other hand, for other
embedded systems applications using higher memory density, NOR Flash is expected to continue to dominate since it is
still substantially more cost effective. Furthermore, Flash memory is well established as being able to endure the PCB
soldering process (at ~250°C) without losing its preloaded code, which many emerging memories have not been able
to demonstrate yet.
5.3.3. PCRAM and Crosspoint Memory
PCRAM devices use the resistivity difference between the amorphous and the crystalline states of chalcogenide glass (the
most commonly used compound is Ge2Sb2Te5, or GST) to store the logic levels. The device consists of a top electrode, the
chalcogenide phase change layer, and a bottom electrode. The leakage path is cut off by an access transistor (or diode) in
series with the phase change element. The phase change write operation consists of: (1) RESET, in which the chalcogenide
glass is momentarily melted by a short electric pulse and then quickly quenched into an amorphous solid with high resistivity,
and (2) SET, in which a lower amplitude but longer pulse (usually >100ns) anneals the amorphous phase into a low-resistance
crystalline state. The 1T-1R (or 1D-1R) cell is larger or smaller than NOR Flash, depending on whether a MOSFET or a BJT
(or diode) is used. The device may be programmed to any final state without erasing the previous state, thus providing
substantially faster programming throughput. The simple resistor structure and the low voltage operation also make
PCRAM attractive for embedded NVM applications. The major challenges for PCRAM are the high current (a fraction of
a mA) required to reset the phase change element, the relatively long set time, and the high temperature tolerance needed to retain
the preloaded code during solder reflow (at ~250°C). Thermal disturb is a potential challenge for the scalability of PCRAM.
However, the thermal disturb effect is non-cumulative (unlike Flash memory, in which the program and read disturbs that cause
charge injection are cumulative) and the higher temperature RESET pulse is short (10ns). Interaction of the phase change
material with the electrodes may pose long-term reliability issues and limit the cycling endurance, which is a major challenge for
DRAM-like applications. Like DRAM, PCRAM is a true random access, bit alterable memory.
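The RESET/SET write scheme described above, including the ability to overwrite a cell without a prior erase, can be sketched as a small state model. This is a simplified illustration under stated assumptions: the normalized amplitude thresholds, the 100ns minimum SET duration used as a hard cutoff, the assumption that every melting pulse is followed by a fast quench, and the mapping of bit 0 to the amorphous state are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

# Assumed, normalized illustration values (not device data).
MELT_AMPLITUDE = 1.0         # pulse amplitude that melts the material (RESET)
CRYSTALLIZE_AMPLITUDE = 0.5  # minimum amplitude that anneals the material (SET)
MIN_SET_NS = 100.0           # SET pulses are long, usually >100ns

@dataclass
class PcramCell:
    state: str = "crystalline"  # low-resistance state

    def apply_pulse(self, amplitude, duration_ns):
        """Update the phase according to pulse amplitude and duration."""
        if amplitude >= MELT_AMPLITUDE:
            # Melted, then assumed to quench quickly into the amorphous phase.
            self.state = "amorphous"
        elif amplitude >= CRYSTALLIZE_AMPLITUDE and duration_ns >= MIN_SET_NS:
            # Lower-amplitude, longer pulse anneals into the crystalline phase.
            self.state = "crystalline"
        # Otherwise the pulse is too weak or too short to change the phase.
        return self.state

    def read(self):
        """Assumed mapping: amorphous (high R) -> 0, crystalline (low R) -> 1."""
        return 0 if self.state == "amorphous" else 1

def write_bit(cell, bit):
    """Program the target state directly; no erase of the old state is needed."""
    if bit == 0:
        cell.apply_pulse(amplitude=1.2, duration_ns=10.0)   # RESET: short, high
    else:
        cell.apply_pulse(amplitude=0.7, duration_ns=150.0)  # SET: long, lower

cell = PcramCell()
for bit in (0, 1, 1, 0):
    write_bit(cell, bit)
    print(bit, cell.state, cell.read())
```

The model makes the asymmetry visible: the RESET pulse is short but needs the highest amplitude (the high reset current noted above), while the SET pulse is lower in amplitude but dominates the write time.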
The scalability of PCRAM devices to < 5nm has been demonstrated using carbon nanotubes as electrodes [44], and the reset
current followed the extrapolation line from larger devices. In at least one case, cycling endurance of 1E11 was
demonstrated [45]. Phase change memory has been used in feature phones to replace NOR Flash since 2011, and has been
in volume production at the ~45nm node since 2012, but no new product has been introduced since then. In recent years, PCM has
also been targeted as a potential candidate for eFlash replacement in embedded applications [46][47]. In
these works, alloying phase change materials of different classes yielded memory that withstands solder reflow;
however, such high temperature stability has come at the expense of slower write speed.
Recently, a 3D cross point memory (3D XP) has been reported [48]. Details are still lacking, but it is speculated that the
ovonic threshold switching (OTS) property of chalcogenide-based phase change material constitutes
the core of the selector device responsible for the cross point cell, which was first reported in 2009 [49]. This is the first
commercial realization of the widely published storage class memory (SCM) [50][51]. Computer systems badly
need improved I/O throughput and reduced power and cost, and SCM is a promising candidate to change the entire memory
hierarchy, not only for high-end computation but for mobile systems as well. In addition, since the memory, including the
selector device, is completely fabricated in the BEOL process, it is relatively inexpensive to stack multiple layers to reduce
bit cost.
3D cross point memory (3D XP) consists of a selector element made of ovonic threshold switching (OTS) (or an equivalent
device) in series with a storage element. The selector device has a high ON/OFF ratio and is at OFF state at all times except
when briefly turned on during writing or reading. The storage element is programmed to various logic states. Since the
selector is normally off and highly resistive, the memory array has no leakage issue even if all storage elements are in the
low-resistance state. During a write or read operation the selector is temporarily turned on (by applying a voltage higher than its
threshold voltage) and the OTS characteristic suddenly reduces its resistance to a very low value, allowing the reading (or
programming) current to be dominated by the resistance of the storage element. The storage element may be a phase-change
material and in that case the memory cell is a phase-change RAM (PCRAM) switched by OTS. The storage element may
also be a resistive memory material. Although bipolar operation makes the circuitry and operation more complicated, the
array structure is very similar to that using PCRAM.
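The selector behavior described above can be sketched numerically. The example below is a simplified illustration, not a circuit simulation: it assumes a V/2 biasing scheme (not stated in the text), an ideal selector that switches abruptly between assumed ON and OFF resistances at an assumed threshold voltage, and a worst case in which every storage element is in the low-resistance state. All names and values are hypothetical.

```python
def cell_current(v_cell, r_storage, v_th=1.0, r_on=1e3, r_off=1e9):
    """Current through one selector + storage element stack.

    The selector is modeled as r_on above the threshold voltage and
    r_off otherwise (assumed ideal threshold switching).
    """
    r_selector = r_on if abs(v_cell) > v_th else r_off
    return v_cell / (r_selector + r_storage)

def read_currents(n, v_read=1.5, r_lrs=1e4, v_th=1.0):
    """Return (selected-cell current, total half-selected leakage) for an
    n x n array under an assumed V/2 scheme, all cells in low resistance."""
    selected = cell_current(v_read, r_lrs, v_th=v_th)
    # 2*(n-1) cells share the selected word line or bit line and see V/2.
    half_selected = 2 * (n - 1) * cell_current(v_read / 2, r_lrs, v_th=v_th)
    return selected, half_selected

sel, leak = read_currents(1024)
print(f"with selector:    selected = {sel:.3e} A, leakage = {leak:.3e} A")

# Setting v_th to 0 makes the selector conduct at any bias, which mimics
# an array without an effective selector.
sel0, leak0 = read_currents(1024, v_th=0.0)
print(f"without selector: selected = {sel0:.3e} A, leakage = {leak0:.3e} A")
```

Under these assumptions the half-selected cells stay below the selector threshold, so their combined leakage remains well below the selected-cell current; without the threshold behavior the leakage of the half-selected cells swamps the signal.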
PCRAM has the advantages of being unipolar in operation, being more product proven, and having high cycling endurance.
ReRAM, on the other hand, promises higher temperature operation and, in some cases, faster switching. At this time,
high-density ReRAM is still in the development stage. Once it is developed, there seems to be little barrier preventing it from being built into a
3D XP structure.
Figure MM-13 Schematic view of (a) 3D cross-point architecture using a vertical RRAM cell and (b) a vertical
MOSFET transistor as the bit-line selector to enable the random access capability of individual cells in the array [52].
6. POTENTIAL SOLUTIONS
Below are the potential solutions to address the scaling challenges described in section 3 towards the targets
described in section 1.2. Near-term (2020-2025) potential solutions are listed in Table MM-16 while long-term (2026-2034)
potential solutions are listed in Table MM-17.
Table MM-16 Potential Solutions—Near-term
Near-Term Potential
Solutions: 2020-2025    Description
Performance             • Increasing fin height to match performance
                        • Reduce interface contact resistance through new materials and wrap-around contact
                        • Introduce low-κ device spacer
                        • Reduce interconnect resistance through barrier and liner scaling
Power                   • Introduce GAA architectures
                        • Reduce device parasitics
Area and Cost           • Adoption of EUV for single and double patterning
                        • DTCO enhancement
                        • Introduction of high-density emerging memory as cache applications
These potential solutions mostly target improvement of the PPAC value of logic technologies. It should be noted
that the emergence of application drivers such as 5G brings new potential solutions for analog and RF enablement with the
use of those technology platforms. Examples include co-integration of III-V technologies with Si logic through layer
transfer and/or selective growth to enable versatile radios in a small form factor. Si technologies, developed on
low-loss SOI substrates, are expected to push the envelope of mm-wave communications, where high transition frequency
(fT) and low insertion loss will be traded against a relatively lower output power in comparison to non-Si counterparts.
Si photonics is gaining momentum in short-to-medium distance connectivity applications such as chip-to-chip
communications in data server racks and the back-haul network of radio access cells. Those solutions require a highly integrated
interposer incorporating optical modulators, laser sources, photodiodes, photonic waveguides, wavelength-division multiplexers,
and assembly interfaces coupling fiber to the waveguide. The requirements, challenges, and potential solutions are described
in the Outside System Connectivity roadmap report.
Another growing trend is the miniaturization of personalized healthcare products through the co-integration of heterogeneous
technologies. Those products are expected to co-integrate sensors, battery, high-endurance/high-speed non-volatile
memory, RF connectivity components, and ultra-low-power processing augmented with machine learning capability in the
same package. More Moore technologies help in this context by reducing the power consumption of those devices and
by bringing the new memories (e.g., MRAM, FeRAM) required for these applications.
7. CROSS TEAMS
Through cross-functional team interaction with other IFTs, the More Moore team incorporated valuable inputs into its
roadmap in terms of both requirements and technology capability limits:
• Systems and Architectures (SA) IFT—computational datapath/fabric such as number of CPU and GPU cores per
a given footprint as well as latency/bandwidth for data access
• Application Benchmarking (AB) IFT—performance and energy scaling targets, chip-level power (active, static,
sleep), thermal envelope
• Lithography IFT—Pitch limits of 193i and EUV lithography, CDU/LER capability, timeline of EUV in HVM
adoption
• Yield IFT—unit-step related defect impact on material quality, infrastructural constraints such as CD and defect
density on filtration and detection.
• Metrology IFT—Extendibility of metrology of 3D devices such as lateral-GAA and vertical-GAA
• Outside System Connectivity (OSC) IFT—I/O and integration requirements for 5G and high-speed memory for
data server
• Packaging Integration IFT—form factor and hetero-technology needs for mobile, 5G, and automotive
• Beyond CMOS (BC) IFT—3D memories such as RRAM and PCM, memristor for neuromorphic applications
9. REFERENCES
[1] J.-A. Carballo et al., “ITRS 2.0: towards a re-framing of the semiconductor technology roadmap”, Proc. ICCD, October 2014.
[2] W.-T. J. Chan, A. Kahng, S. Nath, and I. Yamamoto, “The ITRS MPU and SoC system drivers: calibration and implications for
design-based equivalent scaling in the roadmap,” Proc. IEEE Int. Computer Design (ICCD), pp. 153-160, October 2014.
[3] M. Badaroglu and J. Xu, “Interconnect-aware device targeting from PPA perspective”, ICCAD, November 2016.
[4] C. Auth et al., “A 10nm high performance and low-power CMOS technology featuring 3rd-generation finFET transistors, self-
aligned quad patterning, contact over active gate and Cobalt local interconnects,” IEDM, Session 2.9, December 2017.
[5] X. Wang et al., “Design-technology co-optimization of standard cell libraries on Intel 10nm process”, IEDM, Session 28.2,
December 2018.
[6] G. Yeap et al., “5nm CMOS production technology platform featuring full-fledged EUV, and high mobility channel FinFETs with
densest 0.021 um2 SRAM cells for mobile SoC and high performance computing applications,” IEDM, Section 36.7, December
2019.
[7] S.-W. Wu, “A 7nm CMOS platform technology featuring 4th generation finFET transistors with a 0.027um2 high density 6-T
SRAM cell for mobile SoC applications”, IEDM, Session 2.6, December 2016.
[8] G. Bae et al., “3nm GAA technology featuring multi-bridge-channel FET for low power and high performance applications”,
IEDM, Session 28.7, December 2018.
[9] P. Batude et al., “Advances in 3D CMOS sequential integration”, IEDM, Section 14.1, p. 1-4, December 2009.
[10] M. Badaroglu et al., “PPAC scaling enablement for 5nm mobile SoC technology,” ESSDERC, September 2017.
[11] A. Veloso et al., “Challenges and opportunities of vertical FET devices using 3D circuit design layouts”, IEEE SOI-3D-
Subthreshold Microelectronics Technology Unified Conference (S3S), 2016.
[12] T. P. Ma, “Beyond Si: opportunities and challenges for CMOS technology based on high-mobility channel materials”, Sematech
Symposium Taiwan, September 2012.
[13] T. Skotnicki and F. Boeuf, “How can high mobility channel materials boost or degrade performance in advanced CMOS”, VLSI,
pp. 153-154, June 2010.
[14] K. Kuhn et al., “Past, present and future: SiGe and CMOS transistor scaling”, Electrochemical Society Trans., Vol. 33, No. 6, pp.
13-17, 2010.
[15] G. Eneman et al., “Stress simulations for optimal mobility group IV p- and nMOS finFETs for the 14nm node and beyond”, IEDM,
pp. 6.5.1-6.5.4, December 2012.
[16] R. Xie, “A 7nm finFET technology featuring EUV patterning and dual strained high mobility channels”, IEDM, Section 2.7,
December 2016.
[17] R. Berthelon et al., “A novel dual isolation scheme for stress and back bias maximum efficiency in FDSOI technology”, IEDM,
Section 17.7, December 2016.
[18] R. Carter et al., “22nm FDSOI technology for emerging mobile, internet-of-things, and RF applications”, IEDM, Section 2.2,
December 2016.
[19] K.-W. Ang et al., “Effective Schottky barrier height modulation using dielectric dipoles for source/drain specific contact resistivity
improvement”, IEDM, pp. 18.6.1-18.6.4, December 2012.
[20] O. Gluschenkov et al., “FinFET performance with Si:P and Ge:group-III-metal metastable contact trench alloys”, IEDM, December
2016.
[21] S.C. Song et al., “Holistic technology optimization and key enablers for 7nm mobile SoC,” VLSI, pp. T198-T199, June 2015.
[22] K. Cheng et al., “Air spacer for 10nm finFET CMOS and beyond,” IEDM, December 2016.
[23] A. Keshavarzi et al., “Architecting advanced technologies for 14nm and beyond with 3D FinFET transistors for the future SoC
applications”, IEDM, pp. 4.1.1-4.1.4, December 2011.
[24] J. Mitard et al., “15nm-wfin high-performance low-defectivity strained-germanium pFinFETs with low temperature STI-last
process”, VLSI, pp. 1-2, June 2014.
[25] R. Xie et al., “A 7nm finFET technology featuring EUV patterning and dual strained high mobility channels”, IEDM, December
2016.
[26] G. Eneman et al., “Quantum barriers and ground-plane isolation: a path for scaling bulk-finFET technologies to the 7nm node and
beyond”, IEDM, pp. 12.3.1-12.3.4, December 2013.
[27] M.-G. Bardon et al., “Extreme scaling enabled by 5 tracks cells: Holistic design-device co-optimization for finFETs and lateral
nanowires”, IEDM, December 2016.
[28] A. Khakifirooz and D. A. Antoniadis, “Transistor performance scaling: The role of virtual source velocity and its mobility
dependence,” IEDM, pp. 667–670, December 2006.
[29] K. Jeong and A. Kahng, “A power-constrained MPU roadmap for the International Technology Roadmap for Semiconductors
(ITRS),” Proc. Int. SoC Design Conf. (ISOCC), pp. 49-52, March 2010.
[30] P. Batude et al., “GeOI and SOI 3D monolithic cell integrations for high density applications”, VLSI, A9-1, p.166-167, June 2009.
[31] I. Ouerghi et al., “High performance polysilicon nanowire NEMS for CMOS embedded nanosensors”, IEDM, Section 22.4,
p. 1-4, December 2014.
[32] P. Batude et al., “3-D sequential integration: a key enabling technology for heterogeneous co-integration of new function with
CMOS”, Journal on Emerging and Selected Topics in Circuits and Systems 2, p. 714-722, 2012.
[33] P. Coudrain et al., “Setting up 3D sequential integration for back-illuminated CMOS image sensors with highly miniaturized pixels
with low temperature fully depleted SOI transistors”, IEDM, December 2008.
[34] W. Rachmady et al., “300mm heterogeneous 3D integration of record performance layer transfer germanium PMOS with silicon
NMOS for low power high performance logic applications”, IEDM, Section 29.7, December 2019.
[35] J. Y. Kim et al., "The breakthrough in data retention time of DRAM using recess-channel-array transistor (RCAT) for 88 nm feature
size and beyond", VLSI, p.11, June 2003.
[36] J. Y. Kim et al., "S-RCAT (sphere-shaped-recess-channel-array transistor) technology for 70nm DRAM feature size and beyond",
VLSI, p.34, June 2005.
[37] S.-W. Chung et al., "Highly scalable saddle-Fin (S-Fin) transistor for sub-50 nm DRAM technology", VLSI, p.32, June 2006.
[38] T. Schloesser et al., "6F2 buried wordline DRAM cell for 40 nm and beyond", IEDM, p. 809, December 2008.
[39] D.-S. Kil et al., “Development of new TiN/ZrO2/Al2O3/ZrO2/TiN capacitors extendable to 45nm generation DRAMs replacing
HfO2 based dielectrics”, VLSI, p.38, June 2006.
[40] M. Sung et al., “Gate-first high-k/metal gate DRAM technology for low power and high-performance products”, IEDM, December
2015.
[41] H. Tanaka et al., "Bit cost scalable technology with punch and plug process for ultra high-density flash memory", VLSI, pp. 14-
15, June 2007.
[42] Y. Lu et al., “Fully functional perpendicular STT-MRAM macro embedded in 40 nm logic for energy-efficient IoT applications”,
IEDM, pp. 660-663, December 2015.
[43] O. Golonzka, “MRAM as embedded non-volatile memory solution for 22FFL FinFET technology”, IEDM, Session 18.1,
December 2018.
[44] J. Liang et al., "A 1.4uA reset current phase change memory cell with integrated carbon nanotube electrodes for cross-point memory
application", VLSI, 5B-4, June 2011.
[45] I.S. Kim et al., "High-performance PRAM cell scalable to sub-20nm technology with below 4F2 cell Size, extendable to DRAM
applications", VLSI, 19-3, June 2010.
[46] V. Sousa et al., “Operation fundamentals in 12Mb phase change memory based on innovative Ge-rich GST materials featuring
high reliability performance”, VLSI, June 2015.
[47] W.-C. Chien et al., “Reliability study of a 128Mb phase change memory chip implemented with doped Ga-Sb-Ge with
extraordinary thermal stability”, IEDM, S21.1, December 2016.
[48] H. Castro, "Accessing memory cells in parallel in a cross-point array", Publication 2015/0074326 A1 US Patent Office, March 12,
2015.
[49] DC Kau et al., "A stackable cross point phase change memory", IEDM, pp. 617-620, December 2009.
[50] R. Freitas and W. Wilcke, "Storage class memory, the next storage system technology", 52(4/5), 439, IBM Journal of Research
and Development, 2008.
[51] G.W. Burr et al., "An overview of candidate device technologies for storage class memory", 52(4/5), 449, IBM Journal of Research
and Development, 2008.
[52] H.Y. Chen et al., "HfOx based vertical resistive random-access memory for cost-effective 3D cross-point architecture without cell
selector", IEDM, pp. 497-500, (20.7.1-20.7.4), December 2012.