19 20 IntroFPGA PDF

This document provides an introduction to field programmable gate arrays (FPGAs). It discusses the key differences between FPGAs and processors, including that FPGAs allow for user-defined parallelism rather than instruction-level parallelism. The document outlines the basic architecture of FPGAs, including lookup tables, logic blocks, routing, and I/O standards. It also discusses FPGA design flows and tradeoffs in logic and routing architectures that impact the performance, power, and area of the FPGA.

Uploaded by

baluvelp
Copyright
© Attribution Non-Commercial (BY-NC)

Introduction to FPGAs

Madhura Purnaprajna

Outline
What's different about FPGAs
Architecture: Logic, Routing, I/O

State-of-the-art: Xilinx Virtex 7

The Applications ...
Medical
Consumer
High Performance Computing
Communications

The Industry ...

The two domains ...

Processors
Sequential computing
Instruction-level parallelism
[Figure: processor datapath with Instruction Memory, Decoder, Registers, ALU, Data Memory]

FPGAs
User configurable
User-defined parallelism
[Figure: a lookup table holding the truth table 00 -> 0, 01 -> 1, 10 -> 1, 11 -> 1, beside an array of functional units (FU) separated by flip-flops (FFs)]

Application Mapping
[Figure: mapping an application onto a processor and onto an FPGA]

Temporal vs Spatial Computing
[Figure: processor datapath (Instruction Memory, Decoder, Registers, ALU, Data Memory) beside an FPGA fabric (lookup table, FUs, FFs)]

Processor
x Limited parallelism
x Fixed architecture
x Scalability?

FPGA
User-defined parallelism
Flexibility
Performance per Watt
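
The temporal/spatial contrast can be illustrated with a toy cycle-count model. This is only a sketch: it assumes the operations are independent, that each functional unit finishes one operation per cycle, and that memory and routing delays are ignored. The function names are illustrative and do not come from the slides.

```python
import math


def temporal_cycles(num_ops: int) -> int:
    """Processor-style: a single ALU is reused over time, one operation per cycle."""
    return num_ops


def spatial_cycles(num_ops: int, num_units: int) -> int:
    """FPGA-style: num_units functional units laid out in space run in parallel."""
    if num_units < 1:
        raise ValueError("need at least one functional unit")
    return math.ceil(num_ops / num_units)


if __name__ == "__main__":
    n = 9
    print(temporal_cycles(n))      # one shared ALU
    print(spatial_cycles(n, 9))    # one functional unit per operation
    print(spatial_cycles(n, 3))    # fewer functional units than operations
```

Under these assumptions the spatial version trades area (more functional units) for fewer cycles, which is the user-defined parallelism the slide refers to.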

Performance vs Adaptability
[Figure: Ease of Adaptability plotted against Performance, showing Processors, FPGA, and ASIC]
FPGA-to-ASIC gap: Area ~35x, Speed ~5x, Power ~15x
Measuring the gap between FPGAs and ASICs, Ian Kuon and Jonathan Rose, FPGA 2006

FPGA Architecture
Programmable Logic
Programmable Routing

Logic: Lookup Tables
[Figure: a slice/cluster of LUT and FF pairs; each LUT is a 2^K-bit SRAM read through a 2^K:1 MUX]

Look-up Table
2^K SRAM cells
2^(2^K) different functions
2^K:1 MUX: K levels of 2:1 muxes

Look-up Table: 2 inputs
2^2 SRAM cells
2^(2^2) = 16 different functions
2^2:1 MUX: 2 levels of 2:1 muxes

Look-up Table: 2-input NAND
4 SRAM cells, 6 transistors each, holding 1 1 1 0
4:1 MUX: ~12 transistors
Total: ~40 transistors
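
A K-input LUT can be modelled in a few lines of Python. The sketch below stores 2^K configuration bits and reads one of them through K levels of 2:1 muxes, then programs the table with the NAND contents 1 1 1 0. The class and function names, and the choice of the first input as the most significant address bit, are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence


def mux2(sel: int, a: int, b: int) -> int:
    """2:1 multiplexer: pass a when sel is 0, b when sel is 1."""
    return b if sel else a


class LUT:
    """K-input lookup table: 2^K SRAM bits read through a tree of 2:1 muxes."""

    def __init__(self, k: int, sram: Sequence[int]) -> None:
        if len(sram) != 2 ** k:
            raise ValueError("a K-input LUT needs exactly 2^K SRAM bits")
        self.k = k
        self.sram = list(sram)

    def evaluate(self, inputs: Sequence[int]) -> int:
        """Return the stored bit addressed by inputs (inputs[0] is the MSB)."""
        if len(inputs) != self.k:
            raise ValueError("expected K input bits")
        level: List[int] = self.sram
        # K levels of 2:1 muxes; each level halves the number of candidates.
        # The least significant input selects within adjacent pairs first.
        for sel in reversed(inputs):
            level = [mux2(sel, level[i], level[i + 1])
                     for i in range(0, len(level), 2)]
        return level[0]


if __name__ == "__main__":
    nand = LUT(2, [1, 1, 1, 0])          # SRAM contents for a 2-input NAND
    for a, b in product((0, 1), repeat=2):
        print(a, b, nand.evaluate((a, b)))
    print(2 ** (2 ** 2))                 # number of distinct 2-input functions
```

Changing only the four stored bits turns the same structure into any of the 16 two-input functions, for example [0, 1, 1, 1] for OR.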

Look-up Table: 2-input NAND
LUT (SRAM contents 1 1 1 0 read through a 4:1 MUX): 40 transistors
NAND gate: 4 transistors
HUGE!
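
The transistor arithmetic behind this comparison can be written out as a rough estimate. The 6 transistors per SRAM cell come from the slide; 4 transistors per 2:1 mux is an assumption chosen because it reproduces the slide's ~12 transistors for a 4:1 MUX. Buffers, the flip-flop, and configuration circuitry are ignored, and the function name is illustrative.

```python
def lut_transistor_estimate(k: int, sram_cell_t: int = 6, mux2_t: int = 4) -> int:
    """Rough transistor count for a K-input LUT.

    Assumes 6 transistors per SRAM cell (from the slide) and 4 transistors
    per 2:1 mux (an assumption); buffers and the flip-flop are not counted.
    """
    if k < 1:
        raise ValueError("K must be at least 1")
    sram = (2 ** k) * sram_cell_t
    muxes = (2 ** k - 1) * mux2_t   # a 2^K:1 mux tree uses 2^K - 1 two-input muxes
    return sram + muxes


CMOS_NAND2_T = 4  # static CMOS 2-input NAND: 2 PMOS + 2 NMOS transistors

if __name__ == "__main__":
    lut2 = lut_transistor_estimate(2)
    print(lut2)                  # 24 for SRAM + 12 for the mux tree
    print(lut2 / CMOS_NAND2_T)   # overhead relative to a dedicated NAND gate
```

With these assumptions the 2-input LUT comes to 36 transistors, in line with the slide's rounded ~40, or roughly an order of magnitude more than the dedicated gate.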

Design Flow: FPGA
Inputs: Benchmark Circuits (HDL), FPGA Architecture
Logic Synthesis -> Technology Mapping -> Pack, Place & Route
Outputs: FPGA Area, Power, Speed

LOGIC BLOCK ARCHITECTURE

Logic: Soft
Programmable Logic Blocks


Logic: Hard Blocks
Memory Blocks

Logic: Hard Blocks
DSP Blocks

Logic: Lookup Tables
[Figure: a slice/cluster of LUT and FF pairs; each LUT is a 2^K-bit SRAM read through a 2^K:1 MUX]

Design decisions
LUT size
Number of LUTs per cluster
Inputs/Outputs to/from each cluster
Area and Speed

No. of Logic Blocks vs. Logic Block Functionality

LUT size increases exponentially with K
Routing tracks surrounding the logic increase with the number of input pins

Total FPGA area vs. LUT size

Terminology
Basic logic element (BLE): a LUT paired with a FF
Cluster: a group of BLEs
Size grows quadratically (with the number of BLEs per cluster)
Local interconnect
Fewer inputs (shared)

LUTs on critical path & LUT delay vs LUT size
As functionality increases, fewer logic blocks lie on the critical path, but the internal delay of each block increases

Critical path: Function of LUT and Cluster size


Diminishing returns beyond LUT6 and cluster size 3,4

HETEROGENEOUS BLOCKS

Choice of functions
Which function?
Ratio of special function to generic logic?
What to do with special function blocks when they are not used?

Hard blocks
FFs (set, reset, enable, load, ...)
Add, sub, carry logic
Use LUTs as memories
Block RAMs/ROMs, FIFOs
Multipliers (fracturable)
Processors

Challenge
Performance, power, area as compared to ASICs
Introduce other hard blocks: floating point units, etc.
Shadow logic

ROUTING ARCHITECTURE

Routing in FPGAs
Connect logic blocks and I/O
To define a user circuit

Flexible
Support local and distant routing demands

Locality
Short, Fast, with intermediate long wires

Global clocks and resets

Routing details
Global routing
Macroscopic allocation of wires
Relative position of routing channels to logic blocks
Wires in each channel

Detailed routing
Microscopic
Length of wires
Switching quantity

Routing Architectures
Hierarchical
Island style

Hierarchical Routing
Groups of logic blocks
Interconnected levels
Used in: Altera FLEX, APEX

Hierarchical Routing
Advantages:
Predictable inter-logic block delay
Superior performance for some designs

Disadvantages:
Overuse of logic blocks (mismatch between design and FPGA hierarchy)
Large variation in inter-block delay

Island style
2-D mesh: evenly distributed routing resources
Routing channels on four sides
Each channel has W wires
Wire segments of different lengths in each channel
Used in present-day commercial FPGAs

Island style
Advantages:
Efficient connection for varying net lengths
Staggering start/end points, optimise for a tile
Regular, min delay can be estimated

Details
Switch blocks

Connection blocks

Channel segmentation distribution

Short wires: 1 block
Long wires: multiple blocks

Routing hops
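
One way to see why channel segmentation matters is to count the wire segments a connection has to chain together. The sketch below assumes a straight path, a single segment length, and one switch between consecutive segments; the function name and this simplified model are illustrative and not taken from the slides.

```python
import math


def routing_hops(distance_blocks: int, segment_length: int) -> int:
    """Wire segments needed to span distance_blocks logic blocks when each
    wire segment spans segment_length blocks (straight path assumed)."""
    if distance_blocks < 0 or segment_length < 1:
        raise ValueError("distance must be >= 0 and segment length >= 1")
    return math.ceil(distance_blocks / segment_length)


if __name__ == "__main__":
    print(routing_hops(8, 1))   # short wires only
    print(routing_hops(8, 4))   # long wires spanning 4 blocks
```

In this simplified model a distant connection built from short wires passes through many more switches than one built from long wires, while a purely local connection gains nothing from a long wire.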

Switch block: disjoint
Numerical designation of wire entering = wire exiting (0-0, 1-1)
Limits flexibility
Distinct routing domains
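
The disjoint pattern can be sketched as a small connectivity function: a wire entering on track i may only leave on track i of another side. The side names, function names, and data layout below are illustrative assumptions.

```python
SIDES = ("left", "top", "right", "bottom")


def disjoint_targets(side: str, track: int, width: int) -> list:
    """Tracks reachable from (side, track) through one disjoint switch block."""
    if side not in SIDES or not 0 <= track < width:
        raise ValueError("unknown side or track out of range")
    # Same track number on each of the other three sides.
    return [(other, track) for other in SIDES if other != side]


def reachable_tracks(start_track: int, width: int, hops: int) -> set:
    """Track numbers reachable after passing through `hops` switch blocks.

    The entry side does not affect the exit track number in a disjoint
    block, so "left" is used as a stand-in for every hop.
    """
    tracks = {start_track}
    for _ in range(hops):
        tracks = {t for cur in tracks
                  for (_side, t) in disjoint_targets("left", cur, width)}
    return tracks


if __name__ == "__main__":
    print(disjoint_targets("left", 0, 4))
    print(reachable_tracks(1, 4, 5))   # a net never leaves its starting track number
```

Because the track number never changes, a channel of W wires behaves as W separate routing domains, which is the flexibility limit noted on the slide.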

Switch block: Wilton
Allows change in domains for turns
0(left)-3(bottom)
0(left)-0(top)

I/O STANDARDS

I/O Architecture
Sets external interface rates
Occupies significant area: ~40%

Choice of I/O standard
Performance (pin capacitance)
Area

Common I/O Standards

Selection
I/O banks
Groups of I/O cells
Share supply/reference voltage
Each bank can use a different I/O standard

High-speed I/O
High-speed inter-chip signaling
SERDES (serialiser/deserialiser)
Source-synchronous clocking
Dynamic clock phase adjustment

High-bandwidth memory interface
Ethernet MAC
DLLs/PLLs
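
The serialiser/deserialiser idea can be shown with a minimal bit-level sketch: a parallel word is sent one bit at a time and reassembled at the other end. The function names and the LSB-first bit order are arbitrary choices for illustration; real SERDES blocks also deal with clocking, line encoding, and word alignment, none of which is modelled here.

```python
def serialise(word: int, width: int) -> list:
    """Parallel-to-serial: split a width-bit word into bits, LSB first."""
    if width < 1 or not 0 <= word < 2 ** width:
        raise ValueError("word does not fit in the given width")
    return [(word >> i) & 1 for i in range(width)]


def deserialise(bits) -> int:
    """Serial-to-parallel: reassemble bits that arrived LSB first."""
    word = 0
    for i, bit in enumerate(bits):
        word |= (bit & 1) << i
    return word


if __name__ == "__main__":
    stream = serialise(0b1011, 4)
    print(stream)
    print(deserialise(stream) == 0b1011)
```

A wide, slow internal bus can therefore be carried over a single fast pin pair, which is why SERDES sits under high-speed inter-chip signaling.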

PROGRAMMING TECH

Programming Technology
SRAM Cells
Reusability
Standard CMOS

Programming Technologies

Improving FPGAs
Reducing the gap: Area, Speed, Power

Alternatives to FPGAs
CGRAs
Structured ASICs

References
FPGA Architecture: Survey and Challenges
Ian Kuon, Russell Tessier, Jonathan Rose

Questions?
