General Description: UltraScale Architecture and Product Data Sheet: Overview
General Description
Xilinx® UltraScale™ architecture comprises high-performance FPGA, MPSoC, and RFSoC families that address a vast spectrum of
system requirements with a focus on lowering total power consumption through numerous innovative technological
advancements.
Kintex® UltraScale FPGAs: High-performance FPGAs with a focus on price/performance, using both monolithic and
next-generation stacked silicon interconnect (SSI) technology. High DSP and block RAM-to-logic ratios and next-generation
transceivers, combined with low-cost packaging, enable an optimum blend of capability and cost.
Kintex UltraScale+™ FPGAs: Increased performance and on-chip UltraRAM memory to reduce BOM cost. The ideal mix of
high-performance peripherals and cost-effective system implementation. Kintex UltraScale+ FPGAs have numerous power
options that deliver the optimal balance between the required system performance and the smallest power envelope.
Virtex® UltraScale FPGAs: High-capacity, high-performance FPGAs enabled using both monolithic and next-generation SSI
technology. Virtex UltraScale devices achieve the highest system capacity, bandwidth, and performance to address key market and
application requirements through integration of various system-level functions.
Virtex UltraScale+ FPGAs: The highest transceiver bandwidth, highest DSP count, and highest on-chip and in-package memory
available in the UltraScale architecture. Virtex UltraScale+ FPGAs also provide numerous power options that deliver the optimal
balance between the required system performance and the smallest power envelope.
Zynq® UltraScale+ MPSoCs: Combine the Arm® v8-based Cortex®-A53 high-performance energy-efficient 64-bit application
processor with the Arm Cortex-R5F real-time processor and the UltraScale architecture to create the industry's first
programmable MPSoCs. Provide unprecedented power savings, heterogeneous processing, and programmable acceleration.
Zynq® UltraScale+ RFSoCs: Combine RF data converter subsystem and forward error correction with industry-leading
programmable logic and heterogeneous processing capability. Integrated RF-ADCs, RF-DACs, and soft-decision FECs (SD-FEC)
provide the key subsystems for multiband, multi-mode cellular radios and cable infrastructure.
Family Comparisons
Table 1: Device Resources
                                            Kintex       Kintex       Virtex       Virtex       Zynq         Zynq
                                            UltraScale   UltraScale+  UltraScale   UltraScale+  UltraScale+  UltraScale+
                                            FPGA         FPGA         FPGA         FPGA         MPSoC        RFSoC
MPSoC Processing System                     -            -            -            -            ✓            ✓
RF-ADC/DAC                                  -            -            -            -            -            ✓
SD-FEC                                      -            -            -            -            -            ✓
System Logic Cells (K)                      318–1,451    356–1,143    783–5,541    862–8,938    103–1,143    678–930
Max. Transceiver Speed (Gb/s)               16.3         32.75        30.5         58.0         32.75        32.75
Max. Serial Bandwidth (full duplex) (Gb/s)  2,086        3,268        5,616        8,384        3,268        1,048
Memory Interface Performance (Mb/s)         2,400        2,666        2,400        2,666        2,666        2,666
© Copyright 2013–2019 Xilinx, Inc. Xilinx, the Xilinx logo, Alveo, Artix, Kintex, Spartan, UltraScale, Versal, Virtex, Vivado, Zynq, and other designated brands included herein
are trademarks of Xilinx in the United States and other countries. AMBA, AMBA Designer, Arm, Arm1176JZ-S, CoreSight, Cortex, and PrimeCell are trademarks of Arm in
the EU and other countries. PCI, PCIe, and PCI Express are trademarks of PCI-SIG and used under license. All other trademarks are the property of their respective owners.
Summary of Features
RF Data Converter Subsystem Overview
Most Zynq UltraScale+ RFSoCs include an RF data converter subsystem, which contains multiple radio
frequency analog-to-digital converters (RF-ADCs) and multiple radio frequency digital-to-analog
converters (RF-DACs). The high-precision, high-speed, power-efficient RF-ADCs and RF-DACs can be
individually configured for real data or can be configured in pairs for real and imaginary I/Q data.
To support the processors' functionality, a number of peripherals with dedicated functions are included in
the PS. For interfacing to external memories for data or configuration storage, the PS includes a
multi-protocol dynamic memory controller, a DMA controller, a NAND controller, an SD/eMMC controller
and a Quad SPI controller. In addition to interfacing to external memories, the APU also includes a Level-1
(L1) and Level-2 (L2) cache hierarchy; the RPU includes an L1 cache and Tightly Coupled memory
subsystem. Each has access to a 256KB on-chip memory.
For high-speed interfacing, the PS includes 4 channels of transmit (TX) and receive (RX) pairs of
transceivers, called PS-GTR transceivers, supporting data rates of up to 6.0Gb/s. These transceivers can
interface to the high-speed peripheral blocks that support PCIe at 5.0GT/s (Gen 2) as a root complex or
endpoint in x1, x2, or x4 configurations; Serial-ATA (SATA) at 1.5Gb/s, 3.0Gb/s, or 6.0Gb/s data rates; and
up to two lanes of DisplayPort at 1.62Gb/s, 2.7Gb/s, or 5.4Gb/s data rates. The PS-GTR transceivers can
also interface to components over USB 3.0 and Serial Gigabit Media Independent Interface (SGMII).
For general connectivity, the PS includes: a pair of USB 2.0 controllers, which can be configured as host,
device, or On-The-Go (OTG); an I2C controller; a UART; and a CAN2.0B controller that conforms to
ISO11898-1. There are also four triple-speed Ethernet MACs and 128 bits of GPIO, of which 78 bits are
available through the MIO and 96 through the EMIO.
High-bandwidth connectivity based on the Arm AMBA® AXI4 protocol connects the processing units with
the peripherals and provides an interface between the PS and the programmable logic (PL).
Migrating Devices
UltraScale and UltraScale+ families provide footprint compatibility to enable users to migrate designs
from one device or family to another. Any two packages with the same footprint identifier code are
footprint compatible. For example, Kintex UltraScale devices in the A1156 packages are footprint
compatible with Kintex UltraScale+ devices in the A1156 packages. Likewise, Virtex UltraScale devices in
the B2104 packages are compatible with Virtex UltraScale+ devices and Kintex UltraScale devices in the
B2104 packages. All valid device/package combinations are provided in the Device-Package Combinations
and Maximum I/Os tables in this document. Refer to UG583, UltraScale Architecture PCB Design User Guide
for more detail on migrating between UltraScale and UltraScale+ devices and packages.
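The footprint rule above lends itself to a mechanical check. The following is a minimal sketch, assuming that the footprint identifier is simply the trailing letter-plus-digits sequence of the package name (for example, FFVE1156 gives E1156). The function names are illustrative only and are not part of any Xilinx tool; UG583 remains the reference for actual migration constraints.

```python
import re


def footprint_id(package: str) -> str:
    """Return the trailing letter+digits sequence, e.g. 'FFVE1156' -> 'E1156'.

    Assumption: the identifier is the last letter followed by the final
    run of digits in the package name.
    """
    match = re.search(r"([A-Z]\d+)$", package.upper())
    if match is None:
        raise ValueError(f"no footprint identifier found in {package!r}")
    return match.group(1)


def footprint_compatible(pkg_a: str, pkg_b: str) -> bool:
    """Two packages are treated as compatible when their identifiers match."""
    return footprint_id(pkg_a) == footprint_id(pkg_b)


if __name__ == "__main__":
    print(footprint_id("FFVA1156"))                      # A1156
    print(footprint_compatible("FFVE1156", "FSVE1156"))  # True
    print(footprint_compatible("FFVA1156", "FFVE1156"))  # False
```

A matching identifier only indicates the same PCB ball footprint; I/O counts, transceiver counts, and supply requirements can still differ between devices.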
Notes:
1. Certain advanced configuration features are not supported in the KU025. Refer to the Configuring FPGAs section for details.
2. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
3. HR = High-range I/O with support for I/O voltage from 1.2V to 3.3V.
4. GTH transceivers in SF/FB packages support data rates up to 12.5Gb/s. See Table 4.
5. GTY transceivers in Kintex UltraScale devices support data rates up to 16.3Gb/s. See Table 4.
Notes:
1. Go to Ordering Information for package designation details.
2. FB/FF/FL packages have 1.0mm ball pitch. SF packages have 0.8mm ball pitch.
3. Packages with the same last letter and number sequence, e.g., A2104, are footprint compatible with all other UltraScale
architecture-based devices with the same sequence. The footprint compatible devices within this family are outlined. See the
UltraScale Architecture Product Selection Guide for details on inter-family migration.
4. GTY transceivers in Kintex UltraScale devices support data rates up to 16.3Gb/s.
5. GTH transceivers in SF/FB packages support data rates up to 12.5Gb/s.
Notes:
1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.
3. GTY transceiver line rates are package limited: SFVB784 to 12.5Gb/s; FFVA676, FFVD900, and FFVA1156 to 16.3Gb/s. See
Table 6.
Notes:
1. Go to Ordering Information for package designation details.
2. FF packages have 1.0mm ball pitch. SF packages have 0.8mm ball pitch.
3. GTY transceiver line rates are package limited: SFVB784 to 12.5Gb/s; FFVA676, FFVD900, and FFVA1156 to 16.3Gb/s.
4. Packages with the same last letter and number sequence, e.g., A676, are footprint compatible with all other UltraScale
architecture-based devices with the same sequence. The footprint compatible devices within this family are outlined. See
the UltraScale Architecture Product Selection Guide for details on inter-family migration.
Notes:
1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
2. HR = High-range I/O with support for I/O voltage from 1.2V to 3.3V.
Notes:
1. Go to Ordering Information for package designation details.
2. All packages have 1.0mm ball pitch.
3. Packages with the same last letter and number sequence, e.g., A2104, are footprint compatible with all other UltraScale
architecture-based devices with the same sequence. The footprint compatible devices within this family are outlined. See the
UltraScale Architecture Product Selection Guide for details on inter-family migration.
Notes:
1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.
3. GTY transceivers in the FLGF1924 package support data rates up to 16.3Gb/s. See Table 10.
4. This block operates in compatibility mode for 16.0GT/s (Gen4) operation. Go to PG213, UltraScale+ Devices Integrated Block for PCI Express Product Guide, for
details on compatibility mode.
Notes:
1. Go to Ordering Information for package designation details.
2. All packages have 1.0mm ball pitch.
3. Packages with the same last letter and number sequence, e.g., A2104, are footprint compatible with all other UltraScale architecture-based devices with the same sequence.
The footprint compatible devices within this family are outlined. See the UltraScale Architecture Product Selection Guide for details on inter-family migration.
4. Consult UG583, UltraScale Architecture PCB Design User Guide for specific migration details.
5. GTY transceivers in the FLGF1924 package support data rates up to 16.3Gb/s.
6. These 52.5x52.5mm overhang packages have the same PCB ball footprint as the corresponding 47.5x47.5mm packages (i.e., the same last letter and number sequence) and
are footprint compatible.
Notes:
1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
2. GTY transceivers in the FLGF1924 package support data rates up to 16.3Gb/s. See Table 12.
3. This block operates in compatibility mode for 16.0GT/s (Gen4) operation. Go to PG213, UltraScale+ Devices Integrated
Block for PCI Express Product Guide, for details on compatibility mode.
Notes:
1. Go to Ordering Information for package designation details.
2. All packages have 1.0mm ball pitch.
3. Packages with the same last letter and number sequence, e.g., A2104, are footprint compatible with all other UltraScale
architecture-based devices with the same sequence. The footprint compatible devices within this family are outlined. See the
UltraScale Architecture Product Selection Guide for details on inter-family migration.
4. Consult UG583, UltraScale Architecture PCB Design User Guide for specific migration details.
Real-Time Processing Unit: Dual-core Arm Cortex-R5F with CoreSight; Single/Double Precision Floating Point;
32KB/32KB L1 Cache, and TCM
Embedded and External Memory: 256KB On-Chip Memory w/ECC; External DDR4; DDR3; DDR3L; LPDDR4; LPDDR3;
External Quad-SPI; NAND; eMMC
General Connectivity: 214 PS I/O; UART; CAN; USB 2.0; I2C; SPI; 32b GPIO; Real Time Clock; WatchDog Timers;
Triple Timer Counters
High-Speed Connectivity: 4 PS-GTR; PCIe Gen1/2; Serial ATA 3.1; DisplayPort 1.2a; USB 3.0; SGMII
System Logic Cells 103,320 154,350 192,150 256,200 469,446 504,000 599,550
CLB Flip-Flops 94,464 141,120 175,680 234,240 429,208 460,800 548,160
CLB LUTs 47,232 70,560 87,840 117,120 214,604 230,400 274,080
Distributed RAM (Mb) 1.2 1.8 2.6 3.5 6.9 6.2 8.8
Block RAM Blocks 150 216 128 144 714 312 912
Block RAM (Mb) 5.3 7.6 4.5 5.1 25.1 11.0 32.1
UltraRAM Blocks 0 0 48 64 0 96 0
UltraRAM (Mb) 0 0 13.5 18.0 0 27.0 0
DSP Slices 240 360 728 1,248 1,973 1,728 2,520
CMTs 3 3 4 4 4 8 4
Max. HP I/O(1) 156 156 156 156 208 416 208
Max. HD I/O(2) 96 96 96 96 120 48 120
System Monitor 2 2 2 2 2 2 2
GTH Transceiver 16.3Gb/s(3) 0 0 16 16 24 24 24
GTY Transceivers 32.75Gb/s 0 0 0 0 0 0 0
Transceiver Fractional PLLs 0 0 8 8 12 12 12
PCIe Gen3 x16 0 0 2 2 0 2 0
150G Interlaken 0 0 0 0 0 0 0
100G Ethernet w/ RS-FEC 0 0 0 0 0 0 0
Notes:
1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.
3. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s. See Table 14.
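The memory rows of the preceding table can be cross-checked with simple arithmetic. The sketch below assumes a block RAM size of 36Kb and an UltraRAM block size of 288Kb (values not stated in this section) and reproduces the Mb figures from the block counts listed in the table.

```python
BRAM_KBITS = 36    # assumption: one block RAM holds 36 Kb
URAM_KBITS = 288   # assumption: one UltraRAM block holds 288 Kb


def capacity_mb(blocks: int, kbits_per_block: int) -> float:
    """Convert a block count into megabits (1 Mb = 1024 Kb)."""
    return blocks * kbits_per_block / 1024


# Block counts copied from the table rows above.
bram_blocks = [150, 216, 128, 144, 714, 312, 912]
uram_blocks = [0, 0, 48, 64, 0, 96, 0]

if __name__ == "__main__":
    print([round(capacity_mb(b, BRAM_KBITS), 1) for b in bram_blocks])
    print([round(capacity_mb(u, URAM_KBITS), 1) for u in uram_blocks])
```

The same table is also consistent with two flip-flops per LUT: each CLB Flip-Flops entry is exactly twice the corresponding CLB LUTs entry.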
Notes:
1. Go to Ordering Information for package designation details.
2. FB/FF packages have 1.0mm ball pitch. SB/SF packages have 0.8mm ball pitch.
3. All device package combinations bond out 4 PS-GTR transceivers.
4. All device package combinations bond out 214 PS I/O except ZU2CG and ZU3CG in the SBVA484 and SFVA625 packages,
which bond out 170 PS I/Os. Packages that bond out 170 PS I/O support DDR 32-bit only.
5. Packages with the same last letter and number sequence, e.g., A484, are footprint compatible with all other UltraScale
architecture-based devices with the same sequence. The footprint compatible devices within this family are outlined.
6. All 58 HP I/O pins are powered by the same VCCO supply.
7. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s.
Notes:
1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.
3. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s. See Table 16.
Notes:
1. Go to Ordering Information for package designation details.
2. FB/FF packages have 1.0mm ball pitch. SB/SF packages have 0.8mm ball pitch.
3. All device package combinations bond out 4 PS-GTR transceivers.
4. All device package combinations bond out 214 PS I/O except ZU2EG and ZU3EG in the SBVA484 and SFVA625 packages, which bond out 170 PS I/Os. Packages that
bond out 170 PS I/O support DDR 32-bit only.
5. Packages with the same last letter and number sequence, e.g., A484, are footprint compatible with all other UltraScale architecture-based devices with the same
sequence. The footprint compatible devices within this family are outlined.
6. All 58 HP I/O pins are powered by the same VCCO supply.
7. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s.
Application Processing Unit: Quad-core Arm Cortex-A53 MPCore with CoreSight; NEON & Single/Double Precision
Floating Point; 32KB/32KB L1 Cache, 1MB L2 Cache
Real-Time Processing Unit: Dual-core Arm Cortex-R5F with CoreSight; Single/Double Precision Floating Point;
32KB/32KB L1 Cache, and TCM
Embedded and External Memory: 256KB On-Chip Memory w/ECC; External DDR4; DDR3; DDR3L; LPDDR4; LPDDR3;
External Quad-SPI; NAND; eMMC
General Connectivity: 214 PS I/O; UART; CAN; USB 2.0; I2C; SPI; 32b GPIO; Real Time Clock; WatchDog Timers;
Triple Timer Counters
High-Speed Connectivity: 4 PS-GTR; PCIe Gen1/2; Serial ATA 3.1; DisplayPort 1.2a; USB 3.0; SGMII
Graphic Processing Unit: Arm Mali-400 MP2; 64KB L2 Cache
Video Codec 1 1 1
System Logic Cells 192,150 256,200 504,000
CLB Flip-Flops 175,680 234,240 460,800
CLB LUTs 87,840 117,120 230,400
Distributed RAM (Mb) 2.6 3.5 6.2
Block RAM Blocks 128 144 312
Block RAM (Mb) 4.5 5.1 11.0
UltraRAM Blocks 48 64 96
UltraRAM (Mb) 13.5 18.0 27.0
DSP Slices 728 1,248 1,728
CMTs 4 4 8
Max. HP I/O(1) 156 156 416
Max. HD I/O(2) 96 96 48
System Monitor 2 2 2
GTH Transceiver 16.3Gb/s(3) 16 16 24
GTY Transceivers 32.75Gb/s 0 0 0
Transceiver Fractional PLLs 8 8 12
PCIe Gen3 x16 2 2 2
150G Interlaken 0 0 0
100G Ethernet w/ RS-FEC 0 0 0
Notes:
1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.
3. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s. See Table 18.
Notes:
1. Go to Ordering Information for package designation details.
2. FB/FF packages have 1.0mm ball pitch. SF packages have 0.8mm ball pitch.
3. All device package combinations bond out 4 PS-GTR transceivers.
4. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s.
5. Packages with the same last letter and number sequence, e.g., B900, are footprint compatible with all other UltraScale
architecture-based devices with the same sequence. The footprint compatible devices within this family are outlined.
Table 20: Zynq UltraScale+ RFSoC Device-Package Combinations and Maximum I/Os
XCZU21DR XCZU25DR XCZU27DR XCZU28DR XCZU29DR XCZU39DR XCZU43DR XCZU46DR XCZU47DR XCZU48DR XCZU49DR
Package(1) Dimensions
PSIO, HDIO, HPIO,
PS-GTR, GTY, RF-ADC, RF-DAC
214, 48, 104 214, 48, 104 214, 48, 104 214, 48, 104 214, 48, 104 214, 48, 104
FFVE1156 35x35 4, 8, 8, 8 4, 8, 8, 8 4, 8, 8, 8 4, 8, 4, 4 4, 8, 8, 8 4, 8, 8, 8
214, 48, 104 214, 48, 104 214, 48, 104 214, 48, 104 214, 48, 104 214, 48, 104
FSVE1156 35x35 4, 8, 8, 8 4, 8, 8, 8 4, 8, 8, 8 4, 8, 4, 4 4, 8, 8, 8 4, 8, 8, 8
214, 48, 299 214, 48, 299 214, 48, 299 214, 48, 299 214, 48, 299 214, 48, 299
FFVG1517 40x40 4, 8, 8, 8 4, 16, 8, 8 4, 16, 8, 8 4, 16, 4, 4 4, 16, 8, 8 4, 16, 8, 8
214, 48, 299 214, 48, 299 214, 48, 299 214, 48, 299 214, 48, 299 214, 48, 299
FSVG1517 40x40 4, 8, 8, 8 4, 16, 8, 8 4, 16, 8, 8 4, 16, 4, 4 4, 16, 8, 8 4, 16, 8, 8
Notes:
1. Packages with the same last letter and number sequence, e.g., B900, are footprint compatible with all other UltraScale architecture-based devices with the same sequence. The footprint compatible
devices within this family are outlined.
2. Of these 12 RF-ADCs, 8 can operate up to 2.5 GSPS and 4 can operate up to 5.0 GSPS.
Device Layout
UltraScale devices are arranged in a column-and-grid layout. Columns of resources are combined in
different ratios to provide the optimum capability for the device density, target market or application, and
device cost. At the core of Zynq UltraScale+ MPSoCs and RFSoCs is the processing system that displaces
some of the full or partial columns of programmable logic resources. Figure 1 shows a device-level view
with resources grouped together. For simplicity, certain resources such as the processing system,
integrated blocks for PCIe, configuration logic, and System Monitor are not shown.
[Figure 1: Device-level view of columnar resources. Recoverable labels: Transceivers; Clock Region Height]
RF-ADCs
Each of the RF-ADCs can be configured individually for real input signals or as a pair for I/Q input signals.
The RF-ADC tile has one PLL and a clocking instance. Decimation filters in the RF-ADCs can operate in
varying decimation modes at 80% of Nyquist bandwidth with 89dB stop-band attenuation. Each RF-ADC
contains a 48-bit numerically controlled oscillator (NCO) and a dedicated high-speed, high-performance,
differential input buffer with on-chip calibrated 100Ω termination.
RF-DACs
Each of the RF-DACs can be configured individually for real outputs or as a pair for I/Q output signal
generation. The RF-DAC tile has one PLL and a clocking instance. Interpolation filters in the RF-DACs can
operate in varying interpolation modes at 80% of Nyquist bandwidth with 89dB stop-band attenuation.
Each RF-DAC contains a 48-bit NCO.
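To give a sense of what a 48-bit NCO implies, the following sketch computes a frequency tuning word and the resulting frequency resolution. It assumes a conventional phase-accumulator NCO clocked at the converter sample rate; the actual register layout and clocking of the RF data converter NCO are not described here, so the function names and the 4 GSPS example rate are illustrative only.

```python
NCO_BITS = 48  # accumulator width stated in the text


def nco_tuning_word(f_out_hz: float, f_sample_hz: float, bits: int = NCO_BITS) -> int:
    """Tuning word for a phase accumulator: round(f_out / f_sample * 2**bits)."""
    if not 0 <= f_out_hz < f_sample_hz:
        raise ValueError("f_out must lie in [0, f_sample)")
    modulus = 1 << bits
    return round(f_out_hz / f_sample_hz * modulus) % modulus


def nco_resolution_hz(f_sample_hz: float, bits: int = NCO_BITS) -> float:
    """Smallest frequency step: f_sample / 2**bits."""
    return f_sample_hz / (1 << bits)


if __name__ == "__main__":
    fs = 4.0e9  # hypothetical sample rate, for illustration only
    print(hex(nco_tuning_word(1.0e9, fs)))  # one quarter of full scale
    print(nco_resolution_hz(fs))            # step size in Hz
```

With 48 bits of phase, the frequency step at multi-GSPS sample rates is far below 1 Hz, which is why such NCOs are generally treated as continuously tunable.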
LDPC Decoding/Encoding
A range of quasi-cyclic codes can be configured over an AXI4-Lite interface. Code parameter memory can
be shared across up to 128 codes. Codes can be selected on a block-by-block basis with the encoder able
to reuse suitable decoder codes. The SD-FEC uses a normalized min-sum decoding algorithm with a
normalization factor programmable from 0.0625 to 1 in increments of 0.0625. There can be between 1 and
63 iterations for each codeword. Early termination is specified for each codeword to be none, one, or both
of the following:
Soft or hard outputs are specified for each codeword to include information and optional parity with 6-bit
soft log-likelihood ratio (LLR) on inputs and 8-bit LLR on outputs.
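The normalized min-sum algorithm named above can be illustrated with a single check-node update. This is a generic textbook sketch in floating point, not the SD-FEC implementation: it ignores LLR quantization, layered scheduling, and the code parameter format, and the function names are hypothetical.

```python
STEP = 0.0625  # normalization factor granularity stated in the text


def valid_normalization(factor: float) -> bool:
    """True when factor is a multiple of 0.0625 in the range 0.0625..1."""
    steps = factor / STEP
    return 1 <= steps <= 16 and steps == int(steps)


def check_node_update(llrs: list[float], factor: float) -> list[float]:
    """Normalized min-sum update for one check node.

    For each edge, the outgoing message is the product of the signs of all
    other incoming LLRs times the minimum of their magnitudes, scaled by
    the normalization factor.
    """
    if not valid_normalization(factor):
        raise ValueError("factor must be a multiple of 0.0625 in 0.0625..1")
    if len(llrs) < 2:
        raise ValueError("a check node needs at least two incoming LLRs")
    out = []
    for i in range(len(llrs)):
        others = llrs[:i] + llrs[i + 1:]
        negatives = sum(1 for v in others if v < 0)
        sign = -1.0 if negatives % 2 else 1.0
        out.append(sign * factor * min(abs(v) for v in others))
    return out


if __name__ == "__main__":
    print(check_node_update([2.0, -3.0, 4.0], 0.5))  # [-1.5, 1.0, -1.0]
```

The normalization factor compensates for the tendency of plain min-sum to overestimate message magnitudes relative to full belief propagation.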
Turbo Decoding
In Turbo mode, the SD-FEC can use the Max, Max Scale, or Max Star algorithms. When using the Max Scale
algorithm, the scale factor is programmable from 0.0625 to 1 in increments of 0.0625. There can be
between 1 and 63 iterations for each codeword, specified using the AXI4-Stream control interface. Early
termination is specified for each codeword to be none, one, or both of the following:
• CRC passes
• No change in hard decision since last iteration
Soft or hard outputs are specified for each codeword to include systematic and optionally parity 0 and
parity 1 with 8-bit soft LLR on inputs and outputs.
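The three Turbo algorithm names correspond to well-known variants of the log-domain combining step. The sketch below assumes the usual meanings: Max is the max-log approximation, Max Star adds the Jacobian correction term, and Max Scale applies a scale factor to max-log extrinsic values. This mapping is an assumption based on common usage and is not confirmed by this data sheet.

```python
import math


def max_star(a: float, b: float) -> float:
    """Exact log(exp(a) + exp(b)): max plus the Jacobian correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def combine(a: float, b: float, algorithm: str = "max") -> float:
    """Combine two log-domain metrics with the chosen algorithm."""
    if algorithm == "max_star":
        return max_star(a, b)
    if algorithm in ("max", "max_scale"):
        return max(a, b)  # max-log approximation drops the correction term
    raise ValueError(f"unknown algorithm {algorithm!r}")


def scale_extrinsic(extrinsic: list[float], scale: float) -> list[float]:
    """Max Scale (assumed): scale max-log extrinsic LLRs by a factor <= 1."""
    if not 0.0625 <= scale <= 1.0:
        raise ValueError("scale must be in 0.0625..1")
    return [scale * x for x in extrinsic]


if __name__ == "__main__":
    print(combine(1.0, 1.5, "max"))
    print(combine(1.0, 1.5, "max_star"))
    print(scale_extrinsic([4.0, -2.0], 0.75))
```

The correction term is largest when the two metrics are equal and vanishes as they diverge, which is why scaling the max-log output recovers much of the performance gap at lower cost.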
There are four independently controllable power domains: the PL plus three within the PS (full power,
lower power, and battery power domains). Additionally, many peripherals support clock gating and power
gating to further reduce dynamic and static power consumption.
External Memory
The PS can interface to many types of external memories through dedicated memory controllers. The
dynamic memory controller supports DDR3, DDR3L, DDR4, LPDDR3, and LPDDR4 memories. The
multi-protocol DDR memory controller can be configured to access a 2GB address space in 32-bit
addressing mode and up to 32GB in 64-bit addressing mode using a single or dual rank configuration of
8-bit, 16-bit, or 32-bit DRAM memories. Both 32-bit and 64-bit bus access modes are protected by ECC
using extra bits.
The SD/eMMC controller supports 1-bit and 4-bit data interfaces at low, default, high-speed, and
ultra-high-speed (UHS) clock rates. This controller also supports 1-, 4-, or 8-bit-wide eMMC interfaces that
are compliant with the eMMC 4.51 specification. eMMC is one of the primary boot and configuration modes
for Zynq UltraScale+ MPSoCs and RFSoCs and supports boot from managed NAND devices. The controller
has a built-in DMA for enhanced performance.
The Quad-SPI controller is one of the primary boot and configuration devices. It supports 4-byte and
3-byte addressing modes. In both addressing modes, single, dual-stacked, and dual-parallel
configurations are supported. Single mode supports one quad serial NOR flash memory, while the
dual-stacked and dual-parallel modes support two quad serial NOR flash memories.
The NAND controller is based on the ONFI 3.1 specification. It has an 8-pin interface and provides 200Mb/s of
bandwidth in synchronous mode. It supports 24 bits of ECC, thus enabling support for SLC NAND
memories. It has two chip-selects to support deeper memory and a built-in DMA for enhanced
performance.
General Connectivity
There are many peripherals in the PS for connecting to external devices over industry standard protocols,
including CAN2.0B, USB, Ethernet, I2C, and UART. Many of the peripherals support clock gating and power
gating modes to reduce dynamic and static power consumption.
USB 3.0/2.0
The pair of USB controllers can be configured as host, device, or On-The-Go (OTG). The core is compliant
with the USB 3.0 specification and supports super, high, full, and low speed modes in all configurations. In host
mode, the USB controller is compliant with the Intel XHCI specification. In device mode, it supports up to
12 endpoints. While operating in USB 3.0 mode, the controller uses the serial transceiver and operates up
to 5.0Gb/s. In USB 2.0 mode, the UTMI+ Low Pin Interface (ULPI) is used to connect the controller
to an external PHY operating up to 480Mb/s. The ULPI is also connected in USB 3.0 mode to support
high-speed operations.
Ethernet MAC
The four tri-speed Ethernet MACs support 10Mb/s, 100Mb/s, and 1Gb/s operations. The MACs support
jumbo frames and time stamping through the interfaces based on IEEE Std 1588v2. The Ethernet MACs can
be connected through the serial transceivers (SGMII), the MIO (RGMII), or through EMIO (GMII). The GMII
interface can be converted to a different interface within the PL.
High-Speed Connectivity
The PS includes four PS-GTR transceivers (transmit and receive) that support data rates up to 6.0Gb/s and
can interface to the peripherals for communication over PCIe, SATA, USB 3.0, SGMII, and DisplayPort.
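Because the four PS-GTR lanes are shared among several protocols, a quick lane-budget check can be useful when planning a design. The sketch below takes its per-protocol limits from this data sheet (PCIe x1/x2/x4, two SATA ports, two DisplayPort lanes) and assumes one lane per USB 3.0 controller and one lane per SGMII MAC. It does not model which physical lane each protocol may occupy, so a passing result is necessary but not sufficient.

```python
PS_GTR_LANES = 4

# Allowed lane counts per protocol. The USB 3.0 and SGMII entries assume
# one lane per controller/MAC, which is an assumption of this sketch.
ALLOWED_LANES = {
    "pcie": {0, 1, 2, 4},
    "sata": {0, 1, 2},
    "displayport": {0, 1, 2},
    "usb3": {0, 1, 2},
    "sgmii": {0, 1, 2, 3, 4},
}


def fits_ps_gtr(allocation: dict[str, int]) -> bool:
    """Return True when the requested lanes fit within the four PS-GTR lanes."""
    for protocol, lanes in allocation.items():
        if protocol not in ALLOWED_LANES or lanes not in ALLOWED_LANES[protocol]:
            return False
    return sum(allocation.values()) <= PS_GTR_LANES


if __name__ == "__main__":
    print(fits_ps_gtr({"pcie": 2, "sata": 1, "usb3": 1}))  # True
    print(fits_ps_gtr({"pcie": 4, "displayport": 1}))      # False
```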
PCIe
The integrated block for PCIe is compliant with PCI Express base specification 2.1 and supports x1, x2, and
x4 configurations as root complex or endpoint, compliant with transaction ordering rules in both
configurations. It has built-in DMA, supports one virtual channel, and provides fully configurable base
address registers.
SATA
Users can connect up to two external devices using the two SATA host port interfaces compliant to the
SATA 3.1 specification. The SATA interfaces can operate at 1.5Gb/s, 3.0Gb/s, or 6.0Gb/s data rates and are
compliant with advanced host controller interface (AHCI) version 1.3 supporting partial and slumber
power modes.
DisplayPort
The DisplayPort controller supports up to two lanes of source-only DisplayPort compliant with the VESA
DisplayPort v1.2a specification at 1.62Gb/s, 2.7Gb/s, and 5.4Gb/s data rates. The controller
supports single stream transport (SST); video resolution up to 4Kx2K at a 30Hz frame rate; video formats
Y-only, YCbCr444, YCbCr422, YCbCr420, RGB, YUV444, YUV422, and xvYCC; and pixel color depths of 6, 8, 10,
and 12 bits per color component.
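A rough bandwidth estimate shows how the lane count and line rate relate to the supported video modes. The sketch below assumes 8b/10b channel coding on the main link (as used by DisplayPort 1.2) and takes 4Kx2K to mean 3840x2160; it ignores blanking intervals and protocol overhead, so it gives an optimistic bound rather than a compliance check.

```python
def dp_payload_gbps(lanes: int, lane_rate_gbps: float) -> float:
    """Payload bandwidth after 8b/10b coding (8 data bits per 10 line bits)."""
    return lanes * lane_rate_gbps * 8 / 10


def video_rate_gbps(width: int, height: int, fps: int,
                    bits_per_component: int, components: int = 3) -> float:
    """Active-pixel data rate, ignoring blanking (assumption of this sketch)."""
    return width * height * fps * bits_per_component * components / 1e9


if __name__ == "__main__":
    link = dp_payload_gbps(lanes=2, lane_rate_gbps=5.4)
    video = video_rate_gbps(3840, 2160, 30, bits_per_component=8)
    print(f"link payload: {link:.2f} Gb/s")
    print(f"4Kx2K30 8-bit RGB: {video:.2f} Gb/s")
    print("fits" if video <= link else "does not fit")
```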
Input/Output
All UltraScale devices, whether FPGAs, MPSoCs, or RFSoCs, have I/O pins for communicating to external
components. In addition, in the PS, there are another 78 I/Os that the I/O peripherals use to communicate
to external components, referred to as multiplexed I/O (MIO). If more than 78 pins are required by the I/O
peripherals, the I/O pins in the PL can be used to extend the MPSoC or RFSoC interfacing capability,
referred to as extended MIO (EMIO).
The number of I/O pins in UltraScale FPGAs and in the programmable logic of Zynq UltraScale+ MPSoCs
and RFSoCs varies depending on device and package. Each I/O is configurable and can comply with a large
number of I/O standards. The I/Os are classed as high-range (HR), high-performance (HP), or high-density
(HD). The HR I/Os offer the widest range of voltage support, from 1.2V to 3.3V. The HP I/Os are optimized
for highest performance operation, from 1.0V to 1.8V. The HD I/Os are reduced-feature I/Os organized in
banks of 24, providing voltage support from 1.2V to 3.3V.
All I/O pins are organized in banks, with 52 HP or HR pins per bank or 24 HD pins per bank. Each bank has
one common VCCO output buffer power supply, which also powers certain input buffers. In addition, HR
banks can be split into two half-banks, each with their own VCCO supply. Some single-ended input buffers
require an internally generated or an externally applied reference voltage (VREF). VREF pins can be driven
directly from the PCB or internally generated using the internal VREF generator circuitry present in each
bank.
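The bank rules above can be captured in a small lookup for early pin planning. This sketch encodes only the voltage ranges and pins-per-bank figures given in this section; it does not check which discrete VCCO levels a particular I/O standard requires, and the names are illustrative.

```python
import math

# (min VCCO in volts, max VCCO in volts, pins per bank), from the text above.
BANK_TYPES = {
    "HR": (1.2, 3.3, 52),
    "HP": (1.0, 1.8, 52),
    "HD": (1.2, 3.3, 24),
}


def supports_vcco(bank_type: str, vcco: float) -> bool:
    """True when the voltage falls inside the range stated for the bank type."""
    low, high, _ = BANK_TYPES[bank_type]
    return low <= vcco <= high


def banks_needed(bank_type: str, pins: int) -> int:
    """Minimum number of banks of this type required for a given pin count."""
    _, _, pins_per_bank = BANK_TYPES[bank_type]
    return math.ceil(pins / pins_per_bank)


if __name__ == "__main__":
    print(supports_vcco("HP", 3.3))  # False
    print(supports_vcco("HR", 3.3))  # True
    print(banks_needed("HD", 50))    # 3
```

Because each bank shares one VCCO supply, interfaces that need different I/O voltages generally have to be placed in different banks.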
Most signal pin pairs can be configured as differential input pairs or output pairs. Differential input pin
pairs can optionally be terminated with a 100Ω internal resistor. All UltraScale devices support differential
standards beyond LVDS, including RSDS, BLVDS, differential SSTL, and differential HSTL. Each of the I/Os
supports memory I/O standards, such as single-ended and differential HSTL as well as single-ended and
differential SSTL. UltraScale+ families add support for MIPI with a dedicated D-PHY in the I/O bank.
I/O Logic
Input and Output Delay
All inputs and outputs can be configured as either combinatorial or registered. Double data rate (DDR) is
supported by all inputs and outputs. Any input or output can be individually delayed by up to 1,250ps
with a resolution of 5–15ps. Such delays are implemented as IDELAY and ODELAY. The number of
delay steps can be set by configuration and can also be incremented or decremented while in use. The
IDELAY and ODELAY can be cascaded together to double the amount of delay in a single direction.
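Since the tap resolution is given as a range, the number of delay steps needed for a target delay is itself a range. The following sketch bounds that tap count using the 5ps and 15ps extremes and the 1,250ps limit (doubled when IDELAY and ODELAY are cascaded). It is a planning aid under those stated figures, not a model of the delay-line calibration.

```python
import math

MAX_DELAY_PS = 1250   # per delay element, from the text
TAP_MIN_PS = 5        # finest stated resolution
TAP_MAX_PS = 15       # coarsest stated resolution


def tap_count_bounds(target_ps: float, cascaded: bool = False) -> tuple[int, int]:
    """Return (fewest, most) taps needed to reach at least target_ps."""
    limit = MAX_DELAY_PS * (2 if cascaded else 1)
    if not 0 <= target_ps <= limit:
        raise ValueError(f"target must be within 0..{limit} ps")
    fewest = math.ceil(target_ps / TAP_MAX_PS)  # if each tap is 15 ps
    most = math.ceil(target_ps / TAP_MIN_PS)    # if each tap is 5 ps
    return fewest, most


if __name__ == "__main__":
    print(tap_count_bounds(300))                   # (20, 60)
    print(tap_count_bounds(2000, cascaded=True))   # (134, 400)
```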
GTH/GTY Transceivers
The serial transmitter and receiver are independent circuits that use an advanced phase-locked loop (PLL)
architecture to multiply the reference frequency input by certain programmable numbers between 4 and
25 to become the bit-serial data clock. Each transceiver has a large number of user-definable features and
parameters. All of these can be defined during device configuration, and many can also be modified
during operation.
Transmitter (GTH/GTY)
The transmitter is fundamentally a parallel-to-serial converter with a conversion ratio of 16, 20, 32, 40, 64,
or 80 for the GTH and 16, 20, 32, 40, 64, 80, 128, or 160 for the GTY. This allows the designer to trade off
datapath width against timing margin in high-performance designs. These transmitter outputs drive the
PC board with a single-channel differential output signal. TXOUTCLK is the appropriately divided serial
data clock and can be used directly to register the parallel data coming from the internal logic. The
incoming parallel data is fed through an optional FIFO and has additional hardware support for the
8B/10B, 64B/66B, or 64B/67B encoding schemes to provide a sufficient number of transitions. The
bit-serial output signal drives two package pins with differential signals. This output signal pair has
programmable signal swing as well as programmable pre- and post-emphasis to compensate for PC board
losses and other interconnect characteristics. For shorter channels, the swing can be reduced to reduce
power consumption.
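The trade-off between datapath width and timing margin follows directly from the conversion ratio. The sketch below assumes the parallel clock is simply the line rate divided by the width, and that the encoding efficiencies are 8/10, 64/66, and 64/67; TXOUTCLK source selection and internal dividers are not modeled, and the function names are illustrative.

```python
GTH_WIDTHS = (16, 20, 32, 40, 64, 80)
GTY_WIDTHS = GTH_WIDTHS + (128, 160)

ENCODING_EFFICIENCY = {
    "none": 1.0,
    "8B/10B": 8 / 10,
    "64B/66B": 64 / 66,
    "64B/67B": 64 / 67,
}


def parallel_clock_mhz(line_rate_gbps: float, width: int, transceiver: str = "GTH") -> float:
    """Parallel-side clock in MHz for a given line rate and conversion ratio."""
    widths = GTY_WIDTHS if transceiver == "GTY" else GTH_WIDTHS
    if width not in widths:
        raise ValueError(f"width {width} is not available on {transceiver}")
    return line_rate_gbps * 1000 / width


def payload_rate_gbps(line_rate_gbps: float, encoding: str) -> float:
    """User data rate after subtracting line-coding overhead."""
    return line_rate_gbps * ENCODING_EFFICIENCY[encoding]


if __name__ == "__main__":
    print(parallel_clock_mhz(16.0, 64))            # 250.0
    print(parallel_clock_mhz(16.0, 32))            # 500.0
    print(payload_rate_gbps(10.3125, "64B/66B"))
```

Doubling the width halves the clock frequency the internal logic must meet, at the cost of a wider datapath.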
Receiver (GTH/GTY)
The receiver is fundamentally a serial-to-parallel converter, changing the incoming bit-serial differential
signal into a parallel stream of words, each 16, 20, 32, 40, 64, or 80 bits in the GTH or 16, 20, 32, 40, 64, 80,
128, or 160 for the GTY. This allows the designer to trade off internal datapath width against logic timing
margin. The receiver takes the incoming differential data stream, feeds it through programmable DC
automatic gain control, linear and decision feedback equalizers (to compensate for PC board, cable,
optical and other interconnect characteristics), and uses the reference clock input to initiate clock
recognition. There is no need for a separate clock line. The data pattern uses non-return-to-zero (NRZ)
encoding and optionally ensures sufficient data transitions by using the selected encoding scheme.
Parallel data is then transferred into the device logic using the RXUSRCLK clock. For short channels, the
transceivers offer a special low-power mode (LPM) to reduce power consumption by approximately 30%.
The receiver DC automatic gain control and linear and decision feedback equalizers can optionally
“auto-adapt” to automatically learn and compensate for different interconnect characteristics. This
enables even more margin for 10G+ and 25G+ backplanes.
Out-of-Band Signaling
The transceivers provide out-of-band (OOB) signaling, often used to send low-speed signals from the
transmitter to the receiver while high-speed serial data transmission is not active. This is typically done
when the link is in a powered-down state or has not yet been initialized. This benefits PCIe, SATA/SAS,
and QPI applications.
GTM Transceivers
The serial transmitter and receiver are independent circuits that use an advanced phase-locked loop (PLL)
architecture to multiply the reference frequency input by certain programmable numbers between 16 and
160 to become the bit-serial data clock. Each transceiver has a large number of user-definable features
and parameters. All of these can be defined during device configuration, and many can also be modified
during operation.
Transmitter (GTM)
The transmitter is fundamentally a parallel-to-serial converter. These transmitter outputs drive pulse
amplitude modulated signals with either 4 levels (PAM4) or 2 levels (NRZ) to the PC board with a
single-channel differential output signal. TXOUTCLK is the appropriately divided serial data clock and can
be used directly to register the parallel data coming from the internal logic. The incoming parallel data can
optionally leverage a Reed-Solomon RS(544,514) forward error correction (FEC) encoder and/or 64b/66b data
encoder. The bit-serial output signal drives two package pins with PAM4 differential signals. This output
signal pair has programmable signal swing as well as programmable pre- and post-emphasis to
compensate for PC board losses and other interconnect characteristics. For shorter channels, the swing
can be reduced to reduce power consumption.
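The overhead of the two optional encoders mentioned above can be estimated with simple arithmetic. The following is a minimal sketch, not a description of the GTM implementation: it assumes 10-bit Reed-Solomon symbols (not stated in this document) and uses the standard Reed-Solomon property that a code with n - k parity symbols can correct up to (n - k) / 2 symbol errors. The function names are hypothetical.

```python
def rs_code_summary(n=544, k=514, symbol_bits=10):
    """Summarize an RS(n, k) code. symbol_bits=10 is an assumption."""
    parity = n - k
    return {
        "parity_symbols": parity,
        # A Reed-Solomon code corrects up to floor((n - k) / 2) symbol errors.
        "correctable_symbols": parity // 2,
        "code_rate": k / n,
        "codeword_bits": n * symbol_bits,
    }


def line_efficiency(payload_bits=64, coded_bits=66):
    """Fraction of line bits that carry payload for a block code such as 64b/66b."""
    return payload_bits / coded_bits


summary = rs_code_summary()
print(summary)
print(round(line_efficiency(), 4))
```

Under these assumptions, the RS(544,514) encoder adds 30 parity symbols per codeword, and 64b/66b encoding adds two bits per 64-bit block.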
Receiver (GTM)
The receiver is fundamentally a serial-to-parallel converter, changing the incoming PAM4 differential
signal into a parallel stream of words. The receiver takes the incoming differential data stream, feeds it
through automatic gain compensation (AGC) and a continuous-time linear equalizer (CTLE), after which it
is sampled with a high-speed analog-to-digital converter. Further equalization is completed digitally via a
decision feedback equalizer (DFE) and feed-forward equalizer (FFE) implemented in DSP logic before the
recovered bits are parallelized and provided to the PCS. This equalization provides the flexibility to receive
data over channels ranging from very short chip-to-chip to high-loss backplane applications across all
supported rates. Clock recovery circuitry generates a clock derived from the high-speed PLL to clock in
serial data and provides an appropriately divided and phase-aligned clock, RXOUTCLK, to internal logic.
Parallel data can optionally be transferred into an RS-FEC and/or 64b/66b decoder before being presented
to the FPGA interface.
UltraScale+ devices use two types of integrated blocks: PCIE4 and PCIE4C, with most using the PCIE4
blocks. PCIE4 blocks are compliant with PCI Express Base Specification v3.1, support up to Gen3 x16, and
can also be configured for lower link widths and speeds. The PCIE4 block does not support Gen4 operation.
Some devices, such as Virtex UltraScale+ HBM FPGAs, have only PCIE4C blocks or a combination of both
PCIE4 and PCIE4C blocks. The PCIE4C block can implement both PCI Express and CCIX while PCIE4 blocks
can implement only PCI Express.
PCIE4C blocks are compliant with the PCI Express Base Specification v3.1 supporting up to 8.0GT/s (Gen3)
and compatible with PCI Express Base Specification v4.0 supporting up to 16.0GT/s (Gen4). PCIE4C blocks
are also compliant with CCIX Base Specification v1.0 Version 0.9, supporting speeds up to 16.0GT/s.
PCIE4C blocks support up to 16 lanes at Gen3 or up to 8 lanes at Gen4 and can be configured for lower link
widths and speeds to conserve resources and power.
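The two maximum PCIE4C configurations (16 lanes at Gen3, 8 lanes at Gen4) carry the same raw bandwidth, which a short calculation makes explicit. This is a rough sketch: it assumes the 128b/130b line encoding defined by the PCI Express specification for 8.0GT/s and 16.0GT/s, and it ignores packet and protocol overhead. The function name is hypothetical.

```python
# 128b/130b line encoding is used at 8.0 GT/s (Gen3) and 16.0 GT/s (Gen4).
ENCODING_EFFICIENCY = 128 / 130


def pcie_link_gbps(gt_per_s, lanes):
    """Raw one-direction link bandwidth in Gb/s, before protocol overhead."""
    return gt_per_s * lanes * ENCODING_EFFICIENCY


gen3_x16 = pcie_link_gbps(8.0, 16)
gen4_x8 = pcie_link_gbps(16.0, 8)
print(round(gen3_x16, 2), round(gen4_x8, 2))
```

Halving the lane count while doubling the per-lane rate leaves the product unchanged, so a Gen4 x8 link can match Gen3 x16 throughput with half the transceivers.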
All integrated blocks for PCIe in the UltraScale architecture can be configured as Endpoint or Root Port.
The Root Port can be used to build the basis for a compatible Root Complex, to allow custom chip-to-chip
communication via the PCI Express protocol, and to attach ASSP Endpoint devices, such as Ethernet
Controllers or Fibre Channel HBAs, to the FPGA, MPSoC, or RFSoC.
The maximum lane widths and data rates per family are listed in Table 22.
Notes:
1. Transceivers in UltraScale+ devices support 16.0GT/s. Soft PCIe IP is available from Xilinx partners.
For high-performance applications, advanced buffering techniques of the block offer a flexible maximum
payload size of up to 1,024 bytes. The integrated block interfaces to the integrated high-speed
transceivers for serial connectivity and to block RAMs for data buffering. Combined, these elements
implement the Physical Layer, Data Link Layer, and Transaction Layer of the PCI Express protocol.
Xilinx provides LogiCORE™ IP options to configure the integrated blocks for PCIe in all UltraScale and
UltraScale+ devices. This includes AXI Streaming interfaces at the PCIe packet level and more advanced IP
such as AXI to PCIe Bridges and DMA engines. This IP gives the designer control over many configurable
parameters such as link width and speed, maximum payload size, and reference clock frequency. For a
complete list of features that can be configured for each of the IP, go to the specific Product Guide.
Virtex UltraScale+ HBM devices support CCIX data rates up to 16Gb/s and contain four CCIX ports and at
least four integrated blocks for PCIe. Each CCIX port requires the use of one integrated block for PCIe. If
not used with a CCIX port, the integrated blocks for PCIe can still be used for PCIe communication.
In UltraScale+ devices, the 100G Ethernet blocks contain a Reed Solomon Forward Error Correction
(RS-FEC) block, compliant with IEEE Std 802.3bj, that can be used with the Ethernet block or standalone in
user applications. These families also support OTN mapping mode in which the PCS can be operated
without using the MAC.
# SLRs                    2 2 2 3 3 3 2 2 3 3 4 1 1 2 3
SLR Width (in regions)    6 6 6 6 6 9 6 6 6 8 8 8 8 8 8
SLR Height (in regions)   5 5 5 5 5 5 5 5 5 4 4 4 4 4 4
Clock Management
The clock generation and distribution components in UltraScale devices are located adjacent to the
columns that contain the memory interface and input and output circuitry. This tight coupling of clocking
and I/O provides low-latency clocking to the I/O for memory interfaces and other I/O protocols. Within
every clock management tile (CMT) resides one mixed-mode clock manager (MMCM), two PLLs, clock
distribution buffers and routing, and dedicated circuitry for implementing external memory interfaces.
The MMCM has three sets of programmable frequency dividers (D, M, and O) that can be set at
configuration and changed during normal operation via the Dynamic Reconfiguration Port (DRP). The pre-divider D
reduces the input frequency and feeds one input of the phase/frequency comparator. The feedback
divider M acts as a multiplier because it divides the VCO output frequency before feeding the other input
of the phase comparator. D and M must be chosen appropriately to keep the VCO within its specified
frequency range. The VCO has eight equally-spaced output phases (0°, 45°, 90°, 135°, 180°, 225°, 270°, and
315°). Each phase can be selected to drive one of the output dividers, and each divider is programmable
by configuration to divide by any integer from 1 to 128.
The MMCM has three input-jitter filter options: low bandwidth, high bandwidth, or optimized mode.
Low-Bandwidth mode has the best jitter attenuation. High-Bandwidth mode has the best phase offset.
Optimized mode allows the tools to find the best setting.
The MMCM can have a fractional counter in either the feedback path (acting as a multiplier) or in one
output path. Fractional counters allow non-integer increments of 1/8 and can thus increase frequency
synthesis capabilities by a factor of 8. The MMCM can also provide fixed or dynamic phase shift in small
increments that depend on the VCO frequency. At 1,600MHz, the phase-shift timing increment is 11.2ps.
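The divider relationships described above reduce to f_VCO = f_IN × M / D and f_OUT = f_VCO / O. The sketch below works through them with exact fractions. It is illustrative only: the function names are hypothetical, no VCO range check is performed because the range is not given here, and the phase-shift helper assumes a step of 1/56 of the VCO period, a constant that is not stated in this document but reproduces the 11.2ps figure quoted for 1,600MHz.

```python
from fractions import Fraction


def is_eighth_step(value):
    """True if value is a multiple of 1/8, the fractional counter resolution."""
    return (Fraction(value) * 8).denominator == 1


def mmcm_outputs(f_in_mhz, d, m, o_dividers):
    """Return (f_vco, [f_out, ...]) in MHz for the given D, M, and O settings."""
    if not is_eighth_step(m):
        raise ValueError("M must be an integer or a multiple of 1/8")
    f_vco = Fraction(f_in_mhz) * Fraction(m) / Fraction(d)
    return f_vco, [f_vco / Fraction(o) for o in o_dividers]


def phase_step_ps(f_vco_mhz, steps_per_period=56):
    """Phase-shift increment in ps; steps_per_period=56 is an assumption."""
    period_ps = 1e6 / float(f_vco_mhz)
    return period_ps / steps_per_period


f_vco, outs = mmcm_outputs(100, 1, 16, [4, 16])
print(f_vco, outs)
print(round(phase_step_ps(f_vco), 1))

# Fractional feedback divider: M = 10.125 with a 100MHz input.
f_vco_frac, outs_frac = mmcm_outputs(100, 1, 10.125, [8])
print(float(f_vco_frac), float(outs_frac[0]))
```

Using exact fractions avoids rounding surprises when M or O is a multiple of 1/8.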
PLL
With fewer features than the MMCM, the two PLLs in a clock management tile are primarily present to
provide the necessary clocks to the dedicated memory interface circuitry. The circuit at the center of the
PLLs is similar to the MMCM, with PFD feeding a VCO and programmable M, D, and O counters. There are
two divided outputs to the device fabric per PLL as well as one clock plus one enable signal to the memory
interface circuitry.
Zynq UltraScale+ MPSoCs and RFSoCs are equipped with five additional PLLs in the PS for independently
configuring the four primary clock domains within the PS: the APU, the RPU, the DDR controller, and the I/O
peripherals.
Clock Distribution
Clocks are distributed throughout UltraScale devices via buffers that drive a number of vertical and
horizontal tracks. There are 24 horizontal clock routes per clock region and 24 vertical clock routes per
clock region with 24 additional vertical clock routes adjacent to the MMCM and PLL. Within a clock region,
clock signals are routed to the device logic (CLBs, etc.) via 16 gateable leaf clocks.
Several types of clock buffers are available. The BUFGCE and BUFCE_LEAF buffers provide clock gating at
the global and leaf levels, respectively. BUFGCTRL provides glitchless clock muxing and gating capability.
BUFGCE_DIV has clock gating capability and can divide a clock by 1 to 8. BUFG_GT performs clock division
from 1 to 8 for the transceiver clocks. In MPSoCs and RFSoCs, clocks can be transferred from the PS to the
PL using dedicated buffers.
Memory Interfaces
Memory interface data rates continue to increase, driving the need for dedicated circuitry that enables
high-performance, reliable interfacing to current and next-generation memory technologies. Every
UltraScale device includes dedicated physical interface (PHY) blocks located between the CMT and I/O
columns that support implementation of high-performance PHY blocks to external memories such as
DDR4, DDR3, QDRII+, and RLDRAM3. The PHY blocks in each I/O bank generate the address/control and
data bus signaling protocols as well as the precision clock/data alignment required to reliably
communicate with a variety of high-performance memory standards. Multiple I/O banks can be used to
create wider memory interfaces.
As well as external parallel memory interfaces, UltraScale architecture-based devices can communicate with
external serial memories, such as Hybrid Memory Cube (HMC), via the high-speed serial transceivers. All
transceivers in the UltraScale architecture support the HMC protocol, up to 15Gb/s line rates. UltraScale
devices support the highest bandwidth HMC configuration of 64 lanes with a single FPGA.
Block RAM
Every UltraScale architecture-based device contains a number of 36Kb block RAMs, each with two
completely independent ports that share only the stored data. Each block RAM can be configured as one
36Kb RAM or two independent 18Kb RAMs. Each memory access, read or write, is controlled by the clock.
Connections in every block RAM column enable signals to be cascaded between vertically adjacent block
RAMs, providing an easy method to create large, fast memory arrays, and FIFOs with greatly reduced
power consumption.
All inputs, data, address, clock enables, and write enables are registered. The input address is always
clocked (unless address latching is turned off), retaining data until the next operation. An optional output
data pipeline register allows higher clock rates at the cost of an extra cycle of latency. During a write
operation, the data output can reflect either the previously stored data or the newly written data, or it can
remain unchanged. Block RAM sites that remain unused in the user design are automatically powered
down to reduce total power consumption. There is an additional pin on every block RAM to control the
dynamic power gating feature.
FIFO Controller
Each block RAM can be configured as a 36Kb FIFO or an 18Kb FIFO. The built-in FIFO controller for
single-clock (synchronous) or dual-clock (asynchronous or multirate) operation increments the internal
addresses and provides four handshaking flags: full, empty, programmable full, and programmable empty.
The programmable flags allow the user to specify the FIFO counter values that make these flags go active.
The FIFO width and depth are programmable with support for different read port and write port widths on
a single FIFO. A dedicated cascade path allows for easy creation of deeper FIFOs.
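The four handshaking flags can be illustrated with a small behavioral model. This is a software sketch, not the hardware controller: the class name, the threshold parameters, and the exact comparison used for the programmable flags (at or above the threshold for programmable full, at or below for programmable empty) are assumptions made for illustration.

```python
from collections import deque


class FifoModel:
    """Behavioral single-clock FIFO with full, empty, and programmable flags."""

    def __init__(self, depth, prog_full_at, prog_empty_at):
        self.depth = depth
        self.prog_full_at = prog_full_at    # hypothetical threshold parameter
        self.prog_empty_at = prog_empty_at  # hypothetical threshold parameter
        self._words = deque()

    def flags(self):
        count = len(self._words)
        return {
            "full": count == self.depth,
            "empty": count == 0,
            "prog_full": count >= self.prog_full_at,
            "prog_empty": count <= self.prog_empty_at,
        }

    def write(self, word):
        """Store a word; return False (and drop it) if the FIFO is full."""
        if len(self._words) == self.depth:
            return False
        self._words.append(word)
        return True

    def read(self):
        """Return the oldest word, or None if the FIFO is empty."""
        return self._words.popleft() if self._words else None


fifo = FifoModel(depth=4, prog_full_at=3, prog_empty_at=1)
for word in (10, 20, 30):
    fifo.write(word)
print(fifo.flags())
```

The programmable flags give the surrounding logic early warning, so a producer can stop writing before the FIFO is completely full.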
UltraRAM
UltraRAM is a high-density, dual-port, synchronous memory block available in UltraScale+ devices. Both
of the ports share the same clock and can address all of the 4K x 72 bits. Each port can independently read
from or write to the memory array. UltraRAM supports two types of write enable schemes. The first mode
is consistent with the block RAM byte write enable mode. The second mode allows gating the data and
parity byte writes separately. UltraRAM blocks can be connected together to create larger memory arrays.
Dedicated routing in the UltraRAM column enables the entire column height to be connected together. If
additional density is required, all the UltraRAM columns in an SLR can be connected together with a few
fabric resources to create single instances of RAM approximately 100Mb in size. This makes UltraRAM an
ideal solution for replacing external memories such as SRAM. Cascadable anywhere from 288Kb to 100Mb,
UltraRAM provides the flexibility to fulfill many different memory requirements.
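The capacity figures above follow directly from the 4K x 72 organization. A quick check, assuming binary units (1Kb = 1,024 bits, 1Mb = 1,024Kb); the helper name is hypothetical:

```python
import math

KB = 1024
MB = 1024 * KB

URAM_DEPTH = 4 * 1024   # 4K addresses
URAM_WIDTH = 72         # bits per word
URAM_BITS = URAM_DEPTH * URAM_WIDTH


def blocks_for(target_bits):
    """Number of cascaded UltraRAM blocks needed to hold target_bits."""
    return math.ceil(target_bits / URAM_BITS)


print(URAM_BITS // KB)        # size of one block in Kb
print(blocks_for(100 * MB))   # blocks needed for roughly 100Mb
```

Whether a given device has enough UltraRAM blocks in one SLR to reach the 100Mb figure depends on the device; the product tables list the counts.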
The FPGA has 32 HBM AXI interfaces used to communicate with the HBM. Through a built-in switch
mechanism, any of the 32 HBM AXI interfaces can access any memory address on either one or both of the
HBM stacks due to the flexible addressing feature. This flexible connection between the FPGA and the
HBM stacks results in easy floorplanning and timing closure. The memory controllers perform read and
write reordering to improve bus efficiency. Data integrity is ensured through error checking and correction
(ECC) circuitry.
Each CLB contains one slice. There are two types of slices: SLICEL and SLICEM. LUTs in the SLICEM can be
configured as 64-bit RAM, as 32-bit shift registers (SRL32), or as two SRL16s. CLBs in the UltraScale
architecture have increased routing and connectivity compared to CLBs in previous-generation Xilinx
devices. They also have additional control signals to enable superior register packing, resulting in overall
higher device utilization.
Interconnect
Various length vertical and horizontal routing resources in the UltraScale architecture that span 1, 2, 4, 5,
12, or 16 CLBs ensure that all signals can be transported from source to destination with ease, providing
support for the next generation of wide data buses to be routed across even the highest capacity devices
while simultaneously improving quality of results and software run time.
Each DSP slice fundamentally consists of a dedicated 27 × 18 bit two's complement multiplier and a 48-bit
accumulator. The multiplier can be dynamically bypassed, and two 48-bit inputs can feed a
single-instruction-multiple-data (SIMD) arithmetic unit (dual 24-bit add/subtract/accumulate or quad
12-bit add/subtract/accumulate), or a logic unit that can generate any one of ten different logic functions
of the two operands.
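The SIMD modes split the 48-bit datapath into independent lanes so that a carry out of one lane does not propagate into the next. The following is a software sketch of that behavior for unsigned lane values; the helper names are hypothetical, and per-lane carry outputs are not modeled.

```python
def pack_lanes(values, lane_bits):
    """Pack lane values (lane 0 in the least significant bits) into one word."""
    mask = (1 << lane_bits) - 1
    word = 0
    for index, value in enumerate(values):
        word |= (value & mask) << (index * lane_bits)
    return word


def unpack_lanes(word, lanes, lane_bits):
    """Split a word back into its lane values."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(lanes)]


def simd_add(a, b, lanes, width=48):
    """Add two words lane by lane; each lane wraps without affecting its neighbor."""
    lane_bits = width // lanes
    sums = [
        x + y
        for x, y in zip(unpack_lanes(a, lanes, lane_bits),
                        unpack_lanes(b, lanes, lane_bits))
    ]
    return pack_lanes(sums, lane_bits)


# Quad 12-bit mode: lane 0 overflows, but lane 1 is unaffected.
a = pack_lanes([0xFFF, 1, 2, 3], 12)
b = pack_lanes([1, 1, 1, 1], 12)
print(unpack_lanes(simd_add(a, b, lanes=4), 4, 12))
```

Calling `simd_add` with `lanes=2` models the dual 24-bit mode in the same way.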
The DSP includes an additional pre-adder, typically used in symmetrical filters. This pre-adder improves
performance in densely packed designs and reduces the DSP slice count by up to 50%. The 96-bit-wide
XOR function, programmable to 12, 24, 48, or 96-bit widths, enables performance improvements when
implementing forward error correction and cyclic redundancy checking algorithms.
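The saving in symmetrical filters comes from adding the two samples that share a coefficient before multiplying, which is what a pre-adder placed ahead of the multiplier makes possible. A minimal sketch comparing a direct FIR tap sum with the folded form; the function names are hypothetical, and the multiply count stands in for the number of multipliers a fully parallel filter would need.

```python
def fir_direct(window, coeffs):
    """One output sample using one multiply per tap."""
    return sum(c * x for c, x in zip(coeffs, window)), len(coeffs)


def fir_symmetric(window, coeffs):
    """One output sample of a symmetric FIR: pre-add mirrored samples, then multiply."""
    n = len(coeffs)
    if list(coeffs) != list(coeffs)[::-1]:
        raise ValueError("coefficients must be symmetric")
    acc = 0
    multiplies = 0
    for i in range(n // 2):
        acc += coeffs[i] * (window[i] + window[n - 1 - i])  # pre-add, then multiply
        multiplies += 1
    if n % 2:  # the middle tap of an odd-length filter has no partner
        acc += coeffs[n // 2] * window[n // 2]
        multiplies += 1
    return acc, multiplies


coeffs = [1, 3, 5, 7, 7, 5, 3, 1]
window = [2, -1, 4, 0, 6, 3, -2, 5]
print(fir_direct(window, coeffs))
print(fir_symmetric(window, coeffs))
```

For an even number of taps the folded form needs exactly half the multiplies, which matches the "up to 50%" figure; an odd tap count saves slightly less.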
The DSP also includes a 48-bit-wide pattern detector that can be used for convergent or symmetric
rounding. The pattern detector is also capable of implementing 96-bit-wide logic functions when used in
conjunction with the logic unit.
The DSP slice provides extensive pipelining and extension capabilities that enhance the speed and
efficiency of many applications beyond digital signal processing, such as wide dynamic bus shifters,
memory address generators, wide bus multiplexers, and memory-mapped I/O register files. The
accumulator can also be used as a synchronous up/down counter.
System Monitor
The System Monitor blocks in the UltraScale architecture are used to enhance the overall safety, security,
and reliability of the system by monitoring the physical environment via on-chip power supply and
temperature sensors and external channels to the ADC.
All UltraScale architecture-based devices contain at least one System Monitor. The System Monitor in
UltraScale+ FPGAs and the PL of Zynq UltraScale+ MPSoCs and RFSoCs is similar to the Kintex UltraScale
and Virtex UltraScale devices but with additional features including a PMBus interface.
Zynq UltraScale+ MPSoCs contain an additional System Monitor block in the PS. See Table 24.
Table 24: Key System Monitor Features
              Kintex UltraScale     Kintex UltraScale+          Zynq UltraScale+ PS
              Virtex UltraScale     Virtex UltraScale+
                                    Zynq UltraScale+ PL
ADC           10-bit 200kSPS        10-bit 200kSPS              10-bit 1MSPS
Interfaces    JTAG, I2C, DRP        JTAG, I2C, DRP, PMBus       APB
In FPGAs and the PL of the MPSoCs and RFSoCs, sensor outputs and up to 17 user-allocated external
analog inputs are digitized using a 10-bit 200 kilo-sample-per-second (kSPS) ADC, and the measurements
are stored in registers that can be accessed via internal FPGA (DRP), JTAG, PMBus, or I2C interfaces. The
I2C interface and PMBus allow the on-chip monitoring to be easily accessed by the System Manager/Host
before and after device configuration.
The System Monitor in the PS of MPSoCs and RFSoCs uses a 10-bit, 1 mega-sample-per-second (MSPS) ADC to
digitize the sensor outputs. The measurements are stored in registers and are accessed via the Advanced
Peripheral Bus (APB) interface by the processors and the platform management unit (PMU) in the PS.
Configuration
The UltraScale architecture-based devices store their customized configuration in SRAM-type internal
latches. The configuration storage is volatile and must be reloaded whenever the device is powered up.
This storage can also be reloaded at any time. Several methods and data formats for loading configuration
are available, determined by the mode pins, with more dedicated configuration datapath pins to simplify
the configuration process.
UltraScale architecture-based devices support secure and non-secure boot with optional Advanced
Encryption Standard - Galois/Counter Mode (AES-GCM) decryption and authentication logic. If only
authentication is required, the UltraScale architecture provides an alternative form of authentication in the
form of RSA algorithms. For RSA authentication support in the Kintex UltraScale and Virtex UltraScale
families, go to UG570, UltraScale Architecture Configuration User Guide.
UltraScale architecture-based devices also have the ability to select between multiple configurations, and
support robust field-update methodologies. This is especially useful for updates to a design after the end
product has been shipped. Designers can release their product with an early version of the design, thus
getting their product to market faster. This feature allows designers to keep their customers current with
the most up-to-date design while the product is already deployed in the field.
Upon reset, the device mode pins are read to determine the primary boot device to be used: NAND,
Quad-SPI, SD, eMMC, or JTAG. JTAG can only be used as a non-secure boot source and is intended for
debugging purposes. One of the CPUs, Cortex-A53 or Cortex-R5F, executes code out of on-chip ROM and
copies the first stage boot loader (FSBL) from the boot device to the on-chip memory (OCM).
After copying the FSBL to OCM, the processor executes the FSBL. Xilinx supplies example FSBLs or users
can create their own. The FSBL initiates the boot of the PS and can load and configure the PL, or
configuration of the PL can be deferred to a later stage. The FSBL typically loads either a user application
or an optional second stage boot loader (SSBL) such as U-Boot. Users can obtain an example SSBL from Xilinx or
a third party, or they can create their own SSBL. The SSBL continues the boot process by loading code from
any of the primary boot devices or from other sources such as USB, Ethernet, etc. If the FSBL did not
configure the PL, the SSBL can do so, or again, the configuration can be deferred to a later stage.
The static memory interface controller (NAND, eMMC, or Quad-SPI) is configured using default settings.
To improve device configuration speed, these settings can be modified by information provided in the
boot image header. The ROM boot image is not user readable or executable after boot.
Configuring FPGAs
The SPI (serial NOR) interface (x1, x2, x4, and dual x4 modes) and the BPI (parallel NOR) interface (x8 and
x16 modes) are two common methods used for configuring the FPGA. Users can directly connect an SPI or
BPI flash to the FPGA, and the FPGA's internal configuration logic reads the bitstream out of the flash and
configures itself, eliminating the need for an external controller. The FPGA automatically detects the bus
width on the fly, eliminating the need for any external controls or switches. Bus widths supported are x1,
x2, x4, and dual x4 for SPI, and x8 and x16 for BPI. The larger bus widths increase configuration speed and
reduce the amount of time it takes for the FPGA to start up after power-on.
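The effect of bus width on configuration time can be estimated to first order. The sketch below assumes one transfer of the full bus width per configuration clock cycle and ignores flash latency and startup overhead; the bitstream size and clock frequency used in the example are hypothetical placeholders, not device specifications.

```python
# Bus widths in bits for the modes listed above (dual x4 moves 8 bits per cycle).
SPI_WIDTHS = {"x1": 1, "x2": 2, "x4": 4, "dual x4": 8}
BPI_WIDTHS = {"x8": 8, "x16": 16}


def config_time_s(bitstream_bits, bus_width_bits, cclk_hz):
    """First-order load time: bits to move divided by bits moved per second."""
    return bitstream_bits / (bus_width_bits * cclk_hz)


BITSTREAM_BITS = 128_000_000  # hypothetical bitstream size
CCLK_HZ = 50_000_000          # hypothetical configuration clock

for mode, width in SPI_WIDTHS.items():
    print(mode, round(config_time_s(BITSTREAM_BITS, width, CCLK_HZ), 3), "s")
```

Under these assumptions, each doubling of the bus width halves the load time.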
In master mode, the FPGA can drive the configuration clock from an internally generated clock, or for
higher speed configuration, the FPGA can use an external configuration clock source. This allows
high-speed configuration with the ease of use characteristic of master mode. The FPGA also supports slave
modes up to 32 bits wide, which are especially useful for processor-driven configuration. In
addition, the new media configuration access port (MCAP) provides a direct connection between the
integrated block for PCIe and the configuration logic to simplify configuration over PCIe.
SEU detection and mitigation (SEM) IP, RSA authentication, post-configuration CRC, and Security Monitor
(SecMon) IP are not supported in the KU025 FPGA.
Packaging
The UltraScale devices are available in a variety of organic flip-chip and lidless flip-chip packages
supporting different quantities of I/Os and transceivers. Maximum supported performance can depend on
the style of package and its material. Always refer to the specific device data sheet for performance
specifications by package type.
In flip-chip packages, the silicon device is attached to the package substrate using a high-performance
flip-chip process. Decoupling capacitors are mounted on the package substrate to optimize signal
integrity under simultaneous switching of outputs (SSO) conditions.
Ordering Information
Table 25 shows the speed and temperature grades available in the different device families. VCCINT supply
voltage is listed in parentheses.
Zynq UltraScale+

ZU19EG, EV Devices
  -3E (0.90V)
  -2E (0.85V)                     -2I (0.85V)
  -2LE(2)(3) (0.85V or 0.72V)
  -1E (0.85V)                     -1I (0.85V)
                                  -1LI(3) (0.85V or 0.72V)

ZU21DR, ZU25DR, ZU27DR, ZU28DR, ZU29DR
  -2E (0.85V)                     -2I (0.85V)
  -2LE(2)(3) (0.85V or 0.72V)     -2LI (0.72V)(4)
  -1E (0.85V)                     -1I (0.85V)
                                  -1LI(3) (0.85V or 0.72V)

ZU39DR
                                  -2I (0.85V)
                                  -2LI (0.72V)(4)

ZU43DR, ZU46DR, ZU47DR, ZU48DR, ZU49DR
  -2E (0.85V)                     -2I (0.85V)
                                  -2LI (0.72V)(4)
  -1E (0.85V)                     -1I (0.85V)
                                  -1LI(3) (0.72V)
Notes:
1. KU025 and KU095 are not available in -3E or -1LI speed/temperature grades.
2. In -2LE speed/temperature grade, devices can operate for a limited time with junction temperature of 110°C. Timing
parameters adhere to the same speed file at 110°C as they do below 110°C, regardless of operating voltage (nominal at
0.85V or low voltage at 0.72V). Operation at 110°C Tj is limited to 1% of the device lifetime and can occur sequentially
or at regular intervals as long as the total time does not exceed 1% of device lifetime.
3. In Zynq UltraScale+ MPSoCs and RFSoCs, when operating the PL at low voltage (0.72V), the PS operates at nominal
voltage (0.85V).
4. In -2LI speed/temperature grade, devices can operate for a limited time with junction temperature of 110°C. Timing
parameters adhere to the same speed file at 110°C as they do below 110°C. Operation at 110°C Tj is limited to 5% of the
device lifetime and can occur sequentially or at regular intervals as long as the total time does not exceed 5% of device
lifetime.
The ordering information shown in Figure 3 applies to all packages in the Kintex UltraScale and Virtex
UltraScale FPGAs. Refer to the Package Marking section of UG575, UltraScale and UltraScale+ FPGAs
Packaging and Pinouts User Guide for a more detailed explanation of the device markings.
Figure 3 (ordering information diagram for Kintex UltraScale and Virtex UltraScale FPGAs; legend only)
Speed Grade: -1: Slowest; -L1: Low Power; -H1: Slowest or Mid; -2: Mid; -3: Fastest
F: Flip-chip with 1.0mm Ball Pitch; S: Flip-chip with 0.8mm Ball Pitch
F: Lid; L: Lid SSI; B: Bare-die
V: RoHS 6/6; G: RoHS 6/6 with Exemption 15
1) -L1 and -H1 are the ordering codes for the -1L and -1H speed grades, respectively.
2) See UG575: UltraScale and UltraScale+ FPGAs Packaging and Pinouts User Guide for more information.
The ordering information shown in Figure 4 applies to all packages in the Kintex UltraScale+ and Virtex
UltraScale+ FPGAs, and Figure 5 applies to Zynq UltraScale+ MPSoCs and RFSoCs.
The -1L and -2L speed grades in the UltraScale+ families can run at one of two different VCCINT operating
voltages. At 0.72V, they operate at similar performance to the Kintex UltraScale and Virtex UltraScale
devices with up to 30% reduction in power consumption. At 0.85V, they consume similar power to the
Kintex UltraScale and Virtex UltraScale devices, but operate over 30% faster.
Figure 4 (ordering information diagram for Kintex UltraScale+ and Virtex UltraScale+ FPGAs; legend only)
Example: XC VU 7 P -1 F L V A2104 E
Xilinx Commercial
KU: Kintex UltraScale; VU: Virtex UltraScale
Value Index
+ (Plus)
Package Designator and Pin Count (Footprint Identifier)
Temperature Grade: E: Extended; I: Industrial
1) -L1 and -L2 are the ordering codes for the low power -1L and -2L speed grades, respectively.
Figure 5 (ordering information diagram for Zynq UltraScale+ MPSoCs and RFSoCs; legend only)
Example: XC ZU 7 E V -1 F F V C1156 E
Xilinx Commercial
ZU: Zynq UltraScale+
Value Index
Processor System Identifier: C: Dual APU, Dual RPU; D: Quad APU, Dual RPU; E: Quad APU, Dual RPU, Single GPU
Engine Type: G: General Purpose; R: RF Signal; V: Video
Speed Grade: -1: Slowest; -L1: Low Power; -2: Mid; -L2: Low Power; -3: Fastest
F: Flip-chip with 1.0mm Ball Pitch; S: Flip-chip with 0.8mm Ball Pitch
F: Lid; S: Lidless Stiffener; B: Bare-die
V: RoHS 6/6
Package Designator and Pin Count (Footprint Identifier)
Temperature Grade: E: Extended; I: Industrial
1) -L1 and -L2 are the ordering codes for the low power -1L and -2L speed grades, respectively.
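The Figure 5 ordering code can be decoded mechanically from its legend. The following is a rough sketch for Zynq UltraScale+ part numbers only. The field order (package type, then lid, then RoHS code) is inferred from the example string, and the regular expression and function name are illustrative assumptions, not a Xilinx-defined grammar.

```python
import re

PROCESSOR = {
    "C": "Dual APU, Dual RPU",
    "D": "Quad APU, Dual RPU",
    "E": "Quad APU, Dual RPU, Single GPU",
}
ENGINE = {"G": "General Purpose", "R": "RF Signal", "V": "Video"}
SPEED = {"1": "Slowest", "L1": "Low Power", "2": "Mid", "L2": "Low Power", "3": "Fastest"}
PACKAGE = {"F": "Flip-chip with 1.0mm Ball Pitch", "S": "Flip-chip with 0.8mm Ball Pitch"}
LID = {"F": "Lid", "S": "Lidless Stiffener", "B": "Bare-die"}
TEMPERATURE = {"E": "Extended", "I": "Industrial"}

# Assumed field order, inferred from "XC ZU 7 E V -1 F F V C1156 E".
PART_RE = re.compile(
    r"^XCZU(\d+)([CDE])([GRV])-(L[12]|[123])([FS])([FSB])V([A-Z]\d+)([EI])$"
)


def decode_zynq_part(part):
    """Decode a Zynq UltraScale+ ordering code; spaces are ignored."""
    match = PART_RE.match(part.replace(" ", "").upper())
    if match is None:
        raise ValueError(f"unrecognized part number: {part!r}")
    value, proc, engine, speed, pkg, lid, footprint, temp = match.groups()
    return {
        "value_index": int(value),
        "processor_system": PROCESSOR[proc],
        "engine": ENGINE[engine],
        "speed_grade": SPEED[speed],
        "package": PACKAGE[pkg],
        "lid": LID[lid],
        "footprint": footprint,
        "temperature": TEMPERATURE[temp],
    }


print(decode_zynq_part("XC ZU 7 E V -1 F F V C1156 E"))
```

Not every combination the pattern accepts is an orderable device; the device tables and Table 25 remain the reference for what is actually available.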
Revision History
The following table shows the revision history for this document:
Disclaimer
The information disclosed to you hereunder (the “Materials”) is provided solely for the selection and use of Xilinx products. To the
maximum extent permitted by applicable law: (1) Materials are made available “AS IS” and with all faults, Xilinx hereby DISCLAIMS
ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether
in contract or tort, including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related
to, arising under, or in connection with, the Materials (including your use of the Materials), including for any direct, indirect,
special, incidental, or consequential loss or damage (including loss of data, profits, goodwill, or any type of loss or damage
suffered as a result of any action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx had
been advised of the possibility of the same. Xilinx assumes no obligation to correct any errors contained in the Materials or to
notify you of updates to the Materials or to product specifications. You may not reproduce, modify, distribute, or publicly display
the Materials without prior written consent. Certain products are subject to the terms and conditions of Xilinx’s limited warranty,
please refer to Xilinx’s Terms of Sale which can be viewed at http://www.xilinx.com/legal.htm#tos; IP cores may be subject to
warranty and support terms contained in a license issued to you by Xilinx. Xilinx products are not designed or intended to be
fail-safe or for use in any application requiring fail-safe performance; you assume sole risk and liability for use of Xilinx products
in such critical applications, please refer to Xilinx’s Terms of Sale which can be viewed at http://www.xilinx.com/legal.htm#tos.
This document contains preliminary information and is subject to change without notice. Information provided herein relates to
products and/or services not yet available for sale, and provided solely for information purposes and are not intended, or to be
construed, as an offer for sale or an attempted commercialization of the products and/or services referred to herein.