VivadoHLS Overview PDF
This material exempt per Department of Commerce license exception TSU Copyright 2013 Xilinx
Objectives
High-Level Synthesis
Creates an RTL implementation from C-level source code
Extracts control and dataflow from the source code
Implements the design based on defaults and user-applied directives
[Figure: C, C++ or SystemC source code, plus constraints/directives, as inputs to Vivado HLS]
The same hardware is used for each iteration of the loop:
  Small area
  Long latency
  Low throughput
Different hardware is used for each iteration of the loop:
  Higher area
  Short latency
  Better throughput
Different iterations are executed concurrently:
  Higher area
  Short latency
  Best throughput
  acc=0;
  loop: for (i=3;i>=0;i--) {        // For-Loop Start
    if (i==0) {
      acc+=x*c[0];
      shift_reg[0]=x;
    } else {
      shift_reg[i]=shift_reg[i-1];
      acc+=shift_reg[i]*c[i];
    }
  }                                  // For-Loop End
  *y=acc;
}                                    // Function End
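The fragment above omits the function header and the declarations. A minimal, self-contained sketch of the same loop follows. It assumes a 4-tap filter, `int` data and a `static` shift register; the names `fir`, `N` and `data_t` are hypothetical and do not appear in the slides.

```c
#define N 4                 /* assumed number of taps (the loop runs i = 3..0) */
typedef int data_t;         /* assumed data type; not given in the slides */

/* Hypothetical header: the slides show only the loop body. */
void fir(data_t *y, const data_t c[N], data_t x)
{
    static data_t shift_reg[N];   /* keeps previous samples between calls */
    data_t acc = 0;
    int i;

    loop: for (i = N - 1; i >= 0; i--) {
        if (i == 0) {
            acc += x * c[0];
            shift_reg[0] = x;
        } else {
            shift_reg[i] = shift_reg[i - 1];
            acc += shift_reg[i] * c[i];
        }
    }
    *y = acc;
}
```

Feeding an impulse (x = 1 followed by zeros) returns the coefficients c[0] to c[3] on successive calls, which is a quick way to check the function in C simulation.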
From any C code example .. The loops in the C code correlated to states of behavior. This behavior is extracted into a hardware state machine.
From any C code example .. Operations are extracted. The control is known. A unified control dataflow behavior is created.
[Figure: Scheduling and Binding, guided by User Directives, produce RTL (Verilog, VHDL, SystemC)]
Scheduling
The operations in the control flow graph are mapped into clock cycles
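void foo ( ... ) {
  t1 = a * b;
  t2 = c + t1;
  t3 = d * t2;
  out = t3 - e;
}
[Figure: dataflow graph of foo: a and b feed a multiplier (*), its result and c feed an adder (+), that result and d feed a second multiplier (*), and e is subtracted (-) to give out]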
[Figure: Schedule 1, the operations *, +, *, - mapped into clock cycles]
The technology and user constraints impact the schedule
A faster technology (or slower clock) may allow more operations to occur in the same clock cycle
[Figure: Schedule 2, the operations *, +, *, - mapped into clock cycles under different constraints]
Binding is where operations are mapped to cores from the hardware library
Operators map to cores
Binding may decide to share the multipliers (each is used in a different cycle)
Or it may decide that the cost of sharing (muxing) would impact timing, and choose not to share them
It may make this same decision in the first example above too
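To make the scheduling and binding discussion concrete, here is a compilable sketch of the `foo` example. The slides do not show the argument list, so the `int` types and the pointer output are assumptions. The comments note which operations depend on each other, which is what limits how they can be packed into clock cycles.

```c
/* Assumed signature: the slides do not show the argument list or types. */
void foo(int a, int b, int c, int d, int e, int *out)
{
    int t1, t2, t3;

    t1 = a * b;      /* multiply 1: needs only the inputs a and b     */
    t2 = c + t1;     /* add: must wait for t1                         */
    t3 = d * t2;     /* multiply 2: must wait for t2, so it can never */
                     /* run at the same time as multiply 1            */
    *out = t3 - e;   /* subtract: must wait for t3                    */
}
```

The two multiplications are ordered by a data dependency. When the schedule places them in different clock cycles, binding is free to map both onto one multiplier core or to keep two, depending on the muxing cost.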
Outline
HLS
Vivado HLS determines in which cycle operations should occur (scheduling)
Determines which hardware units to use for each operation (binding)
It performs HLS by:
Obeying built-in defaults
Obeying user directives & constraints to override defaults
Calculating delays and area using the specified technology/device
Understand the priority of directives
1. Meet Performance (clock & throughput)
Vivado HLS will allow a local clock path to fail if this is required to meet throughput
It is often possible to meet timing after logic synthesis
2. Then minimize latency
3. Then minimize area
From any C code example ... Operations are extracted. The C types define the size of the hardware used: handled automatically.
void foo_top () {
  ...
  Add: for (i=3;i>=0;i--) {
    b = a[i] + b;
  ...
}
[Figure: synthesis of foo_top into an adder (+) operating on b and the array a[N]]
Loops require labels if they are to be referenced by Tcl directives
(GUI will auto-add labels)
Loops can be unrolled if their indices are statically determinable at elaboration time
Not when the number of iterations is variable
Unrolled loops result in more elements to schedule but greater operator mobility
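As a rough illustration of what unrolling means at the source level, the sketch below shows a rolled accumulation loop next to a hand-unrolled equivalent. The function names and the fixed bound `LEN` are hypothetical. In Vivado HLS the unrolling is normally requested with a directive on the labeled loop rather than written by hand; the manual version is only there to show the extra operations that result.

```c
#define LEN 4   /* fixed trip count, known at compile time (assumed) */

/* Rolled: one adder can be reused by every iteration. */
int sum_rolled(const int a[LEN])
{
    int acc = 0;
    int i;
    Add: for (i = LEN - 1; i >= 0; i--) {
        acc = a[i] + acc;
    }
    return acc;
}

/* Hand-unrolled equivalent: four separate reads and additions and no
 * loop counter. There are more elements to schedule, but the four
 * array reads no longer depend on a loop index being updated. */
int sum_unrolled(const int a[LEN])
{
    int acc = 0;
    acc = a[3] + acc;
    acc = a[2] + acc;
    acc = a[1] + acc;
    acc = a[0] + acc;
    return acc;
}
```

Both functions return the same value. The unrolled form is only possible because `LEN` is a constant; with a variable bound the number of copies would not be known.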
Let's look at an example.
Intro to HLS 11- 17 Copyright 2013 Xilinx
[Figure: schedule of the loop across Iteration 1 to Iteration 4, showing the subtract (-), add (+), coefficient read (RDc) and x read (RDx) operations]
The read X operation has good mobility
[Figure: the same schedule across Iteration 1 to Iteration 4]
Mult is very constrained
[Figure: schedule with the multiplications (*) and additions (+) grouped together, ending with the write of y (WRy)]
If data dependencies allow
If operator timing allows
Design finished faster but uses more operators
Top-Level IO Ports
[Figure: array argument implemented as RAM ports with control signals CE0, WE0, CE1, WE1]
Default RAM resource
Dual port RAM if performance can be improved, otherwise Single Port RAM
Schedule after an Array Optimization
[Figure: single-cycle schedule with three RDc reads, one RDx read, four multiplications (*), three additions (+) and the write WRy]
All reads and mults can occur in one cycle
If the timing allows
The additions can also occur in the same cycle
The write can be performed in the same cycle
Optionally the port reads and writes could be registered
Operators
Comprehensive C Support
Outline
Pre-synthesis: C Validation
Validate the algorithm is correct
Post-synthesis: RTL Verification
Verify the RTL is correct
[Figure: C, C++ or SystemC source code, plus constraints/directives, as inputs to Vivado HLS]
C validation
A HUGE reason users want to use HLS
Fast, free verification
Validate the algorithm is correct before synthesis
Follow the test bench tips given over
[Figure: RTL outputs in VHDL, Verilog and SystemC]
Recommendation is to separate test bench and design files
func_AB.c
#include "func_AB.h"
func_AB(a,b,c, *i1, *i2) {
  ...
  func_A(a,b,*i1);
  func_B(c,*i1,*i2);
}
[Figure: func_AB containing the blocks func_A and func_B]
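A compilable sketch of the design outlined above, with the checking code kept apart from the design functions. The bodies of `func_A` and `func_B`, the `int` types and the helper `check_func_AB` are placeholders invented for illustration; the slides show only the call structure.

```c
/* --- design (would live in func_AB.c, with prototypes in func_AB.h) --- */

/* Placeholder body: the slides do not show what func_A computes. */
void func_A(int a, int b, int *i1)
{
    *i1 = a + b;
}

/* Placeholder body: reads the intermediate *i1 and writes *i2. */
void func_B(int c, int *i1, int *i2)
{
    *i2 = c * (*i1);
}

/* Top-level function to be synthesized. */
void func_AB(int a, int b, int c, int *i1, int *i2)
{
    func_A(a, b, i1);
    func_B(c, i1, i2);
}

/* --- test bench side (would live in a separate file) --- */

/* Hypothetical check: returns 0 on success, non-zero on mismatch. */
int check_func_AB(void)
{
    int i1 = 0, i2 = 0;
    func_AB(1, 2, 3, &i1, &i2);
    return (i1 == 3 && i2 == 9) ? 0 : 1;
}
```

Keeping the check in its own file means the synthesis tool is only ever given the design functions, while the C compiler sees both.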
Outline
In HLS
C becomes RTL
Operations in the code map to hardware resources
Understand how constructs such as functions, loops and arrays are synthesized
HLS design involves
Synthesize the initial design
Analyze to see what limits the performance
User directives to change the default behaviors
Remove bottlenecks
Analyze to see what limits the area
The types used define the size of operators
This can have an impact on what operations can fit in a clock cycle
Use directives to shape the initial design to meet performance
Increase parallelism to improve performance
Refine bit sizes and sharing to reduce area
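The point about types defining operator size can be seen in plain C with the fixed-width types from `<stdint.h>`. This is only a sketch with hypothetical function names; the actual operator implementations chosen by the tool depend on the target device and are not shown in the slides.

```c
#include <stdint.h>

/* 8-bit operands: the full product fits in 16 bits, so a small
 * multiplier is sufficient. */
uint16_t mul_narrow(uint8_t a, uint8_t b)
{
    return (uint16_t)(a * b);
}

/* 32-bit operands: the full product needs 64 bits, which implies a
 * much wider multiplier and possibly more time per operation. */
uint64_t mul_wide(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}
```

If the data never exceeds 8 bits, declaring it as such is one way of refining bit sizes to reduce area.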
Objectives
Using Vivado HLS 12 - 4 Copyright 2013 Xilinx
Vivado HLS GUI
[Figure: GUI layout showing the Project Explorer Pane, Information Pane, Auxiliary Pane and Console Pane]
Outline
Project Wizard
The Project Wizard guides users through the steps of opening a new project
Step-by-step guide
1. Define project and directory
2. Add design source files
3. Specify test bench files
4. Specify clock and select part
// test.c
#include <stdio.h>
// Design to be synthesized
void test (int d[10]) {
  int acc = 0;
  int i;
  for (i=0;i<10;i++) {
    acc += d[i];
    d[i] = acc;
  }
}
// Test Bench
// Nothing in this ifndef will be read by Vivado HLS (will be read by gcc)
#ifndef __SYNTHESIS__
int main () {
  int d[10], i;
  for (i=0;i<10;i++) {
    d[i] = i;
  }
  test(d);
  for (i=0;i<10;i++) {
    printf("%d %d\n", i, d[i]);
  }
  return 0;
}
#endif
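The test bench above only prints its results. A self-checking variant is sketched below: it compares the output of `test` against the expected running sums and reports pass or fail through a return value. The helper name `check_test` is hypothetical, and `test` is repeated so the sketch is self-contained. A `main()` that returns this value (0 for pass) could then serve as the test bench.

```c
#include <stdio.h>

/* Design to be synthesized (same as above): running sum, in place. */
void test(int d[10])
{
    int acc = 0;
    int i;
    for (i = 0; i < 10; i++) {
        acc += d[i];
        d[i] = acc;
    }
}

#ifndef __SYNTHESIS__
/* Hypothetical self-checking helper: returns the number of mismatches,
 * so 0 means every output matched. */
int check_test(void)
{
    int d[10], i, errors = 0;

    for (i = 0; i < 10; i++) {
        d[i] = i;
    }
    test(d);
    for (i = 0; i < 10; i++) {
        int expected = i * (i + 1) / 2;   /* 0 + 1 + ... + i */
        if (d[i] != expected) {
            printf("Mismatch at %d: got %d, expected %d\n", i, d[i], expected);
            errors++;
        }
    }
    return errors;
}
#endif
```

A check of this kind can be reused unchanged when the same test bench is run again against the RTL.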
Test benches II
Information Pane
Can view and edit any file from the Project Explorer
Auxiliary Pane
Cross-referenced with the Information Pane (here it shows objects in the source code)
Project Explorer
Project files displayed in a hierarchical view
Console Pane
Displays Vivado HLS run time messages
Export RTL
Change Solution Settings
Run C/RTL Cosimulation
Run C Simulation
Run C Synthesis
Synthesis
Run C Synthesis
Console
Will show run time information
Examine for failed constraints
A syn directory is created
Verilog, VHDL & SystemC RTL
Synthesis reports for all non-inlined functions
Report opens automatically when synthesis completes
Report is outlined in the Auxiliary pane
RTL Co-Simulation
Vivado HLS provides RTL verification
Creates the wrappers and adapters to re-use the C test bench
[Figure: the test bench main.c(pp) drives dut.c(pp); after synthesis the same main.c(pp) drives the RTL through a DUT wrapper and adapters]
C/RTL Co-simulation
Start Simulation
Opens the dialog box
Select the RTL
SystemC does not require a 3rd party license
Verilog and VHDL require the appropriate simulator
Select the desired simulator
Run any or all
Options
Can output trace file (VCD format)
Optimize the C compilation & specify test bench linker flags
The SystemC simulation can always be run: no simulator license required!
RTL Export
Can be exported to one of the three types
IP-XACT formatted IP for use with Vivado System Edition (SE)
7 Series and Zynq families only
A System Generator IP block
7 Series and Zynq families only
Pcore formatted IP block for use with EDK
7 Series, Zynq, Spartan-3, Spartan-6, Virtex-4/5/6 families
Generation in both Verilog and VHDL for non-bus or non-interface based designs
Logic synthesis will automatically be performed
HLS license will use Vivado RTL Synthesis
[Figure: directory tree with the solutions solution1 ... solutionN, the directories impl, syn and sim, and the export directories ip, sysgen and pcore]
In Vivado:
1. Project Manager > IP Catalog
2. Add IP to import this block
3. Browse to the zip file inside ip
In System Generator:
1. Use XilinxBlockAdd
2. Select Vivado_HLS block type
3. Browse to the solution directory
In EDK:
1. Copy the contents of the pcore directory
2. Paste into the EDK project pcore directory
3. Project > Rescan Local Repository
Analysis Perspective
Resources Analysis
Supports the commands required to run Vivado HLS & pre-synthesis verification (gcc, g++, apcc, make)
Using Vivado HLS CLI
In interactive mode
The help command lists the man page for all commands
Auto-Complete all commands using the tab key
Vivado_hls> help add_files
SYNOPSIS
add_files [OPTIONS] <src_files>
Etc
Outline
Vivado HLS can be run under Windows XP, Windows 7, Red Hat Linux, and SUSE OS
Vivado HLS can be invoked through GUI and command line in Windows OS, and command line in Linux
Vivado HLS project creation wizard involves
Defining project name and location
Adding design files
Specifying testbench files
Selecting clock and technology
The top-level module in the test bench is main(), whereas the top-level module in the design is the function to be synthesized
Summary
Objectives
This lab uses a simple matrix multiplication example to walk you through the Vivado
HLS project creation and analysis steps. The design consists of three nested loops.
The Product loop is the innermost loop and performs the actual element products. The
Col loop encloses it and feeds the next column element data, together with the passed
row element data, to the Product loop. Finally, Row is the outermost loop. The
res[i][j]=0 statement (line 79) resets the result every time a new row element is
passed and a new column element is used.
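The loop structure described above can be sketched as follows. The lab source itself is not reproduced in this section, so the 3x3 dimensions, the `int` element type and the names `matrixmul`, `a`, `b` and `DIM` are assumptions; only the loop labels and the `res[i][j] = 0` reset come from the description.

```c
#define DIM 3   /* assumed matrix dimension */

void matrixmul(int a[DIM][DIM], int b[DIM][DIM], int res[DIM][DIM])
{
    int i, j, k;

    /* Row: outermost loop, one pass per row of a */
    Row: for (i = 0; i < DIM; i++) {
        /* Col: one pass per column of b for the current row */
        Col: for (j = 0; j < DIM; j++) {
            res[i][j] = 0;   /* reset before accumulating this element */
            /* Product: innermost loop, accumulates the element products */
            Product: for (k = 0; k < DIM; k++) {
                res[i][j] += a[i][k] * b[k][j];
            }
        }
    }
}
```

Multiplying by the identity matrix returns the input unchanged, which makes a convenient first test.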
Procedure
In this lab, you completed the major steps of the high-level synthesis design flow using
Vivado HLS. You created a project, added source files, synthesized the design,
simulated the design, and implemented the design. You also learned how to use the
Analysis perspective to understand the scheduling.