VivadoHLS Overview PDF


Introduction to High-Level Synthesis

with Vivado HLS

Vivado HLS 2013.3 Version

This material exempt per Department of Commerce license exception TSU Copyright 2013 Xilinx

Objectives

After completing this module, you will be able to:

Describe the high-level synthesis flow
Understand the control and datapath extraction
Describe scheduling and binding phases of the HLS flow
List the priorities of directives set by Vivado HLS
List comprehensive language support in Vivado HLS
Identify steps involved in validation and verification flows

Intro to HLS 11- 2 Copyright 2013 Xilinx


Outline

Introduction to High-Level Synthesis


High-Level Synthesis with Vivado HLS
Language Support
Validation Flow
Summary


High-Level Synthesis: HLS

High-Level Synthesis
Creates an RTL implementation from C-level source code
Extracts control and dataflow from the source code
Implements the design based on defaults and user-applied directives

Many implementations are possible from the same source description
Smaller designs, faster designs, optimal designs
Enables design exploration

[Figure: C, C++ or SystemC source plus constraints/directives enter Vivado HLS, which outputs VHDL, Verilog and SystemC RTL. RTL Export packages the result as IP-XACT, Sys Gen or PCore.]


Design Exploration with Directives

One body of code: many hardware outcomes

loop: for (i=3;i>=0;i--) {
  if (i==0) {
    acc+=x*c[0];
    shift_reg[0]=x;
  } else {
    shift_reg[i]=shift_reg[i-1];
    acc+=shift_reg[i]*c[i];
  }
}

Before we get into details, let's look under the hood.

The same hardware is used for each iteration of the loop:
Small area
Long latency
Low throughput

Different hardware is used for each iteration of the loop:
Higher area
Short latency
Better throughput

Different iterations are executed concurrently:
Higher area
Short latency
Best throughput
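
The latency difference between the first and third outcomes can be estimated with a little arithmetic. The sketch below is a simplified model, not something the tool reports: the function names are hypothetical, and it assumes every iteration takes the same number of cycles and that a pipelined loop starts a new iteration every `ii` cycles.

```c
/* Cycles for a rolled loop: iterations run strictly one after another. */
int rolled_latency(int trip_count, int iter_latency) {
    if (trip_count <= 0) return 0;
    return trip_count * iter_latency;
}

/* Cycles for a loop whose iterations overlap: a new iteration starts
 * every `ii` cycles, and the last one still needs iter_latency cycles. */
int pipelined_latency(int trip_count, int iter_latency, int ii) {
    if (trip_count <= 0) return 0;
    return (trip_count - 1) * ii + iter_latency;
}
```

With 4 iterations of 3 cycles each, the rolled loop needs 12 cycles under this model, while overlapping the iterations with a new start every cycle needs 6.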


Introduction to High-Level Synthesis

How is hardware extracted from C code?


Control and datapath can be extracted from C code at the top level
The same principles used in the example can be applied to sub-functions
At some point in the top-level control flow, control is passed to a sub-function
A sub-function may be implemented to execute concurrently with the top level and/or other sub-functions
How is this control and dataflow turned into a hardware design?
Vivado HLS maps this to hardware through scheduling and binding processes
How is my design created?
How are functions, loops, arrays and IO ports mapped?


HLS: Control Extraction

Code

void fir (
  data_t *y,
  coef_t c[4],
  data_t x
) {
  static data_t shift_reg[4];
  acc_t acc;
  int i;

  acc=0;
  loop: for (i=3;i>=0;i--) {
    if (i==0) {
      acc+=x*c[0];
      shift_reg[0]=x;
    } else {
      shift_reg[i]=shift_reg[i-1];
      acc+=shift_reg[i]*c[i];
    }
  }
  *y=acc;
}

Control Behavior: Finite State Machine (FSM) states
State 0: from Function Start to For-Loop Start
State 1: from For-Loop Start to For-Loop End
State 2: from For-Loop End (*y=acc) to Function End

From any C code example, the loops in the C code are correlated to states of behavior. This behavior is extracted into a hardware state machine.
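
One way to picture the extracted control is to rewrite the function as an explicit state machine in software. This is only an illustrative model, not tool output: the state names are hypothetical, and `data_t`, `coef_t` and `acc_t` are assumed to be `int` because the slides do not define them.

```c
/* Assumed types: the slides do not define data_t, coef_t or acc_t. */
typedef int data_t;
typedef int coef_t;
typedef int acc_t;

/* Hypothetical names for the three FSM states, plus a terminal marker. */
enum fir_state { S_START = 0, S_LOOP = 1, S_END = 2, S_DONE = 3 };

/* Software model of fir written as an explicit FSM.
 * Returns the number of state visits, a rough stand-in for cycles. */
int fir_fsm(data_t *y, const coef_t c[4], data_t x) {
    static data_t shift_reg[4];
    acc_t acc = 0;
    int i = 3;
    int visits = 0;
    enum fir_state state = S_START;

    while (state != S_DONE) {
        visits++;
        switch (state) {
        case S_START:                 /* state 0: initialise */
            acc = 0;
            i = 3;
            state = S_LOOP;
            break;
        case S_LOOP:                  /* state 1: one loop iteration */
            if (i == 0) {
                acc += x * c[0];
                shift_reg[0] = x;
            } else {
                shift_reg[i] = shift_reg[i - 1];
                acc += shift_reg[i] * c[i];
            }
            i--;
            state = (i >= 0) ? S_LOOP : S_END;
            break;
        case S_END:                   /* state 2: write the output */
            *y = acc;
            state = S_DONE;
            break;
        default:
            state = S_DONE;
            break;
        }
    }
    return visits;
}
```

Each call visits state 0 once, state 1 four times and state 2 once, which mirrors how the rolled loop keeps returning to the same state.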


HLS: Control & Datapath Extraction

Code: the fir function from the Control Extraction slide
Operations: RDx, RDc, >=, -, ==, +, *, WRy
Control Behavior: Finite State Machine (FSM) with states 0, 1 and 2
Control & Datapath Behavior: the control dataflow graph places the operations in the FSM states

From any C code example, operations are extracted and the control is known, so a unified control dataflow behavior is created.


High-Level Synthesis: Scheduling & Binding

Scheduling & Binding


Scheduling and Binding are at the heart of HLS
Scheduling determines in which clock cycle an operation will occur
Takes into account the control, dataflow and user directives
The allocation of resources can be constrained
Binding determines which library cell is used for each operation
Takes into account component delays, user directives
[Figure: the design source (C, C++, SystemC), the technology library and the user directives feed scheduling and binding, which produce RTL (Verilog, VHDL, SystemC).]

Scheduling

The operations in the control flow graph are mapped into clock cycles
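void foo (...) {
  t1 = a * b;
  t2 = c + t1;
  t3 = d * t2;
  out = t3 - e;
}

[Figure: dataflow graph. a and b feed the first *, its result and c feed +, that result and d feed the second *, and that result and e feed -, which produces out.]

The technology and user constraints impact the schedule
A faster technology (or slower clock) may allow more operations to occur in the same clock cycle

[Figure: Schedule 1 and Schedule 2 map the same operation sequence (* + * -) onto clock cycles in two different ways.]

The code also impacts the schedule
Code implications and data dependencies must be obeyed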
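
The effect of the clock period on a schedule can be sketched with a small model. This is not the algorithm Vivado HLS uses: the function name and the operator delays are assumptions, each operation is taken to depend on the previous one (as in foo above), and an operation longer than the clock period simply gets a cycle of its own.

```c
/* Assign a chain of dependent operations to clock cycles.
 * delay_ns[k] is the assumed delay of operation k. Operations are
 * chained within one cycle while their summed delay fits in clock_ns.
 * Writes the cycle index of each operation to cycle[] and returns the
 * number of cycles used. */
int schedule_chain(const double delay_ns[], int n, double clock_ns, int cycle[]) {
    int cur = 0;
    double used = 0.0;
    for (int k = 0; k < n; k++) {
        if (used > 0.0 && used + delay_ns[k] > clock_ns) {
            cur++;            /* does not fit: start a new cycle */
            used = 0.0;
        }
        cycle[k] = cur;
        used += delay_ns[k];
    }
    return n > 0 ? cur + 1 : 0;
}
```

With assumed delays of 4 ns for each multiplication and 2 ns for the addition and subtraction, a 7 ns clock places * and + in the first cycle and * and - in the second, whereas a 12 ns clock fits all four operations into one cycle.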



Binding

Binding is where operations are mapped to cores from the hardware library
Operators map to cores

Binding Decision: to share


Given this schedule (figure: * + * -, with both multiplications in the same cycle):
Binding must use 2 multipliers, since both are in the same cycle
It can decide to use an adder and subtractor or share one addsub

Binding Decision: or not to share


Given this schedule (figure: * + * -, with each multiplication in a different cycle):

Binding may decide to share the multipliers (each is used in a different cycle)
Or it may decide the cost of sharing (muxing) would impact timing and it may decide not to share them
It may make this same decision in the first example above too
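
The lower bound that a schedule places on binding can be computed directly: operations of one type that share a cycle cannot share a unit. The helper below is a hypothetical illustration of that rule only; it says nothing about whether sharing is worthwhile once multiplexer cost is considered.

```c
/* Minimum number of units of one operator type that binding must
 * provide: the largest number of such operations scheduled in any
 * single cycle. op_cycle[k] is the cycle assigned to operation k. */
int min_units(const int op_cycle[], int n_ops, int n_cycles) {
    int best = 0;
    for (int c = 0; c < n_cycles; c++) {
        int count = 0;
        for (int k = 0; k < n_ops; k++) {
            if (op_cycle[k] == c) {
                count++;
            }
        }
        if (count > best) {
            best = count;
        }
    }
    return best;
}
```

Two multiplications in the same cycle give a minimum of 2 multipliers; two multiplications in different cycles give a minimum of 1, which binding may still decline to share.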




Understanding Vivado HLS Synthesis

HLS
Vivado HLS determines in which cycle operations should occur (scheduling)
Determines which hardware units to use for each operation (binding)
It performs HLS by:
Obeying built-in defaults
Obeying user directives & constraints to override defaults
Calculating delays and area using the specified technology/device
Understand the priority of directives
1. Meet Performance (clock & throughput)
Vivado HLS will allow a local clock path to fail if this is required to meet throughput
It is often possible for the timing to be met after logic synthesis
2. Then minimize latency
3. Then minimize area


The Key Attributes of C code

Code example: the fir function from the Control Extraction slide

Functions: All code is made up of functions which represent the design hierarchy: the same in hardware
Top Level IO: The arguments of the top-level function determine the hardware RTL interface ports
Types: All variables are of a defined type. The type can influence the area and performance
Loops: Functions typically contain loops. How these are handled can have a major impact on area and performance
Arrays: Arrays are used often in C code. They can influence the device IO and become performance bottlenecks
Operators: Operators in the C code may require sharing to control area or specific hardware implementations to meet performance

Let's examine the default synthesis behavior of these


Functions & RTL Hierarchy

Each function is translated into an RTL block
Verilog module, VHDL entity

Source Code (my_code.c)

void A() { ..body A.. }
void B() { ..body B.. }
void C() {
  B();
}
void D() {
  B();
}
void foo_top() {
  A();
  C();
  D();
}

RTL hierarchy: foo_top contains A, C and D; C and D each contain B

Each function/block can be shared like any other component (add, sub, etc.) provided it's not in use at the same time
By default, each function is implemented using a common instance
Functions may be inlined to dissolve their hierarchy
Small functions may be automatically inlined

Types = Operator Bit-sizes

Code: the fir function from the Control Extraction slide
Operations: RDx, RDc, >=, -, ==, +, *, WRy

Types

Standard C types
long long (64-bit), int (32-bit), short (16-bit), char (8-bit)
float (32-bit), double (64-bit)
unsigned types

Arbitrary Precision types
C: ap(u)int types (1-1024)
C++: ap_(u)int types (1-1024), ap_fixed types
C++/SystemC: sc_(u)int types (1-1024), sc_fixed types
Can be used to define any variable to be a specific bit-width (e.g. 17-bit, 47-bit etc.)

From any C code example, operations are extracted. The C types define the size of the hardware used: handled automatically.
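
To see why the declared width matters, the behavior of a narrow unsigned type can be imitated in standard C by masking. This sketch does not use the Vivado HLS arbitrary precision headers; `wrap_unsigned` and `add_u17` are hypothetical helpers that only mimic the wrap-around of an unsigned type up to 32 bits wide.

```c
#include <stdint.h>

/* Keep only the low `bits` bits of a value, mimicking the wrap-around
 * of an unsigned arbitrary precision type of that width. */
uint32_t wrap_unsigned(uint32_t value, unsigned bits) {
    if (bits >= 32) {
        return value;
    }
    return value & ((UINT32_C(1) << bits) - 1);
}

/* A 17-bit addition: the result never needs more than 17 bits,
 * so the adder can be 17 bits wide instead of the 32 of an int. */
uint32_t add_u17(uint32_t a, uint32_t b) {
    return wrap_unsigned(a + b, 17);
}
```

Adding 1 to the largest 17-bit value (131071) wraps to 0 in this model, the same overflow a 17-bit hardware adder would show.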



Loops

By default, loops are rolled
Each C loop iteration: implemented in the same state
Each C loop iteration: implemented with the same resources

void foo_top (...) {
  ...
  Add: for (i=3;i>=0;i--) {
    b = a[i] + b;
  ...
}

[Figure: synthesis of foo_top yields a single adder, fed by a[N] and b, that is reused on every iteration.]

Loops require labels if they are to be referenced by Tcl directives (GUI will auto-add labels)
Loops can be unrolled if their indices are statically determinable at elaboration time
Not when the number of iterations is variable
Unrolled loops result in more elements to schedule but greater operator mobility
Let's look at an example.

Data Dependencies: Good

[Figure: default schedule of the rolled fir loop. Iterations 1 to 4 each contain RDc, -, ==, >=, * and +; RDx is scheduled in iteration 4, followed by the final + and WRy. The read X operation has good mobility.]

Example of good mobility


The read on data port X can occur anywhere from the start to iteration 4
The only constraint on RDx is that it occurs before the final multiplication
Vivado HLS has a lot of freedom with this operation
It waits until the read is required, saving a register
There are no advantages to reading any earlier (unless you want it registered)
Input reads can be optionally registered
The final multiplication is very constrained



Data Dependencies: Bad

[Figure: the same default schedule of the rolled fir loop. The multiplication (Mult) is very constrained.]

Example of bad mobility


The final multiplication must occur after the read of x and before the final addition
It could occur in the same cycle if timing allows
Loops are rolled by default
Each iteration cannot start till the previous iteration completes
The final multiplication (in iteration 4) must wait for earlier iterations to complete
The structure of the code is forcing a particular schedule
There is little mobility for most operations
Optimizations allow loops to be unrolled giving greater freedom

Schedule after Loop Optimization

With the loop unrolled (completely)
The dependency on loop iterations is gone
Operations can now occur in parallel
If data dependencies allow
If operator timing allows
Design finished faster but uses more operators
2 multipliers & 2 adders

[Figure: schedule of the unrolled fir function. The four RDc reads occur two at a time, then RDx, the four multiplications (two at a time), the additions and finally WRy.]

Schedule Summary
All the logic associated with the loop counters and index checking is now gone
Two multiplications can occur at the same time
All 4 could, but it's limited by the number of input reads (2) on coefficient port C
Why 2 reads on port C?
The default behavior for arrays now limits the schedule
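
What complete unrolling does to the fir loop can be shown by writing the unrolled form by hand. In the tool this is requested with an unroll directive on the labelled loop rather than by rewriting the code; the version below is only a sketch for comparison, with `data_t`, `coef_t` and `acc_t` assumed to be `int` and the function names chosen for illustration.

```c
/* Assumed types: the slides do not define these. */
typedef int data_t;
typedef int coef_t;
typedef int acc_t;

/* Rolled form, as on the slides. */
void fir_rolled(data_t *y, const coef_t c[4], data_t x) {
    static data_t shift_reg[4];
    acc_t acc = 0;
    int i;
    for (i = 3; i >= 0; i--) {
        if (i == 0) {
            acc += x * c[0];
            shift_reg[0] = x;
        } else {
            shift_reg[i] = shift_reg[i - 1];
            acc += shift_reg[i] * c[i];
        }
    }
    *y = acc;
}

/* Hand-unrolled equivalent: no loop counter and no index checks. */
void fir_unrolled(data_t *y, const coef_t c[4], data_t x) {
    static data_t shift_reg[4];
    shift_reg[3] = shift_reg[2];
    shift_reg[2] = shift_reg[1];
    shift_reg[1] = shift_reg[0];
    shift_reg[0] = x;
    /* The four products do not depend on each other, so a scheduler
     * is free to place them in the same cycle if resources allow. */
    *y = shift_reg[3] * c[3] + shift_reg[2] * c[2]
       + shift_reg[1] * c[1] + shift_reg[0] * c[0];
}
```

Both functions produce the same output stream for the same inputs; only the amount of exposed parallelism differs.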


Arrays in HLS

An array in C code is implemented by a memory in the RTL
By default, arrays are implemented as RAMs, optionally a FIFO

void foo_top(int x, ...)
{
  int A[N];
  L1: for (i = 0; i < N; i++)
    A[i+x] = A[i] + i;
}

[Figure: synthesis maps array A[N] (elements 0 to N-1) inside foo_top to a single-port block RAM (SPRAMB) with ports DIN, DOUT, ADDR, CE and WE.]

The array can be targeted to any memory resource in the library
The ports (Address, CE active high, etc.) and sequential operation (clocks from address to data out) are defined by the library model
All RAMs are listed in the Vivado HLS Library Guide
Arrays can be merged with other arrays and reconfigured
To implement them in the same memory or one of different widths & sizes
Arrays can be partitioned into individual elements
Implemented as smaller RAMs or registers
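
The effect of partitioning an array into individual elements can be imitated by hand. The two functions below are a hypothetical illustration with assumed `int` types: the first keeps the coefficients in one array (one memory, limited by its ports), the second passes each element separately, which is roughly what complete partitioning gives the scheduler.

```c
/* Assumed types for illustration. */
typedef int data_t;
typedef int coef_t;

/* Array form: c is a single memory, so reads are limited by its ports. */
data_t dot4_array(const data_t s[4], const coef_t c[4]) {
    data_t acc = 0;
    for (int i = 0; i < 4; i++) {
        acc += s[i] * c[i];
    }
    return acc;
}

/* Partitioned form: every coefficient is its own variable, so all
 * four values are available at the same time. */
data_t dot4_partitioned(const data_t s[4],
                        coef_t c0, coef_t c1, coef_t c2, coef_t c3) {
    return s[0] * c0 + s[1] * c1 + s[2] * c2 + s[3] * c3;
}
```

The results are identical; the difference is only in how many coefficient values can be fetched in one cycle.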


Top-Level IO Ports

Top-level function arguments


All top-level function arguments have a default hardware port type
When the array is an argument of the top-level function
The array/RAM is off-chip
The type of memory resource determines the top-level IO ports
Arrays on the interface can be mapped & partitioned
E.g. partitioned into separate ports for each element in the array

void foo_top(int A[3*N], int x)
{
  L1: for (i = 0; i < N; i++)
    A[i+x] = A[i] + i;
}

[Figure: synthesis of foo_top with array argument A mapped to an off-chip dual-port block RAM (DPRAMB). The interface has ports DIN0, DOUT0, ADDR0, CE0, WE0 and DIN1, DOUT1, ADDR1, CE1, WE1. The number of ports is defined by the RAM resource.]

Default RAM resource
Dual port RAM if performance can be improved, otherwise Single Port RAM
Schedule after an Array Optimization

Code: the fir loop shown on the earlier slides

With the existing code & defaults
Port C is a dual port RAM
Allows 2 reads per clock cycle
IO behavior impacts performance
Note: It could have performed 2 reads in the original rolled design but there was no advantage since the rolled loop forced a single read per cycle

[Figure: schedule with C as a dual-port RAM. The RDc reads occur two per cycle, followed by RDx, the multiplications two at a time, the additions and WRy.]

With the C port partitioned into (4) separate ports
All reads and mults can occur in one cycle
If the timing allows
The additions can also occur in the same cycle
The write can be performed in the same cycle
Optionally the port reads and writes could be registered

[Figure: schedule with C partitioned. The four RDc reads, RDx, the four multiplications, the three additions and WRy all occur together.]

Operators

Operator sizes are defined by the type


The variable type defines the size of the operator
Vivado HLS will try to minimize the number of operators
By default Vivado HLS will seek to minimize area after constraints are satisfied
User can set specific limits & targets for the resources used
Allocation can be controlled
An upper limit can be set on the number of operators or cores allocated for the design: This can be used to force sharing
e.g. limiting the number of multipliers to 1 will force Vivado HLS to share
Use 1 mult, but take 4 cycles even if it could be done in 1 cycle using 4 mults

[Figure: multiplications 3, 2, 1 and 0 executed one after another on a single multiplier.]

Resources can be specified
The cores used to implement each operator can be specified
e.g. Implement each multiplier using a 2 stage pipelined core (hardware)
Same 4 mult operations could be done with 2 pipelined mults (with allocation limiting the mults to 2)

[Figure: the four multiplications (3, 2, 1, 0) distributed over two pipelined multipliers.]
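
The trade-off between an allocation limit and cycle count can be estimated with a simple formula. The function below is an assumed model rather than anything reported by the tool: it treats the multiplications as independent and each multiplier as a pipelined core that accepts one new operation per cycle.

```c
/* Cycles to issue and complete n_ops independent multiplications when
 * at most `limit` multipliers are allocated and each is a pipelined
 * core with `stages` stages (stages == 1 means not pipelined). */
int mult_cycles(int n_ops, int limit, int stages) {
    if (n_ops <= 0 || limit <= 0 || stages <= 0) {
        return 0;
    }
    int issue_cycles = (n_ops + limit - 1) / limit;  /* ceil(n_ops / limit) */
    return issue_cycles + (stages - 1);
}
```

Under this model, 4 multiplications take 4 cycles on one multiplier and 1 cycle on four, matching the slide; with two 2-stage pipelined multipliers the model gives 3 cycles.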




Comprehensive C Support

A Complete C Validation & Verification Environment


Vivado HLS supports complete bit-accurate validation of the C model
Vivado HLS provides a productive C-RTL co-simulation verification solution
Vivado HLS supports C, C++ and SystemC
Functions can be written in any version of C
Wide support for coding constructs in all three variants of C
Modeling with bit-accuracy
Supports arbitrary precision types for all input languages
Allowing the exact bit-widths to be modeled and synthesized
Floating point support
Support for the use of float and double in the code
Support for OpenCV functions
Enable migration of OpenCV designs into Xilinx FPGA
Libraries target real-time full HD video processing
C, C++ and SystemC Support

The vast majority of C, C++ and SystemC is supported


Provided it is statically defined at compile time
If it's not defined until run time, it won't be synthesizable

Any of the three variants of C can be used


If C is used, Vivado HLS expects the file extensions to be .c
For C++ and SystemC it expects file extensions .cpp




C Validation and RTL Verification

There are two steps to verifying the design
Pre-synthesis: C Validation
Validate the algorithm is correct
Post-synthesis: RTL Verification
Verify the RTL is correct
C validation
A HUGE reason users want to use HLS
Fast, free verification
Validate the algorithm is correct before synthesis
Follow the test bench tips given on the following slides
RTL Verification
Vivado HLS can co-simulate the RTL with the original test bench

[Figure: the HLS flow. "Validate C" applies to the C, C++ or SystemC source and constraints/directives entering Vivado HLS; "Verify RTL" applies to the VHDL, Verilog and SystemC output, which RTL Export packages as IP-XACT, Sys Gen or PCore.]

C Function Test Bench

The test bench is the level above the function


The main() function is above the function to be synthesized
Good Practices
The test bench should compare the results with golden data
Automatically confirms any changes to the C are validated and verifies the RTL is correct
The test bench should return a 0 if the self-checking is correct
Anything but a 0 (zero) will cause RTL verification to issue a FAIL message
Function main() should expect an integer return (non-void)
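#include <stdio.h>   /* printf */
#include <stdlib.h>  /* system */

int main () {
  int ret=0;

  /* Compare the generated results with the golden data */
  ret = system("diff --brief -w output.dat output.golden.dat");
  if (ret != 0) {
    printf("Test failed !!!\n");
    ret=1;
  } else {
    printf("Test passed !\n");
  }

  return ret;
}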
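
A test bench can also hold its golden data in memory instead of calling an external diff. The sketch below is one possible shape under stated assumptions: `dut` is a hypothetical stand-in for the function to be synthesized (a running sum over 10 values), and `run_test` is a hypothetical helper whose return value follows the 0-means-pass convention described above.

```c
#include <stdio.h>

/* Hypothetical design under test: running sum over 10 values. */
void dut(int d[10]) {
    int acc = 0;
    for (int i = 0; i < 10; i++) {
        acc += d[i];
        d[i] = acc;
    }
}

/* Self-checking routine: returns 0 on pass and 1 on fail, the value
 * that main() should hand back. */
int run_test(void) {
    static const int golden[10] = {0, 1, 3, 6, 10, 15, 21, 28, 36, 45};
    int d[10];
    int errors = 0;

    for (int i = 0; i < 10; i++) {
        d[i] = i;                      /* input stimulus 0..9 */
    }
    dut(d);
    for (int i = 0; i < 10; i++) {
        if (d[i] != golden[i]) {
            printf("Mismatch at %d: got %d, expected %d\n", i, d[i], golden[i]);
            errors++;
        }
    }
    puts(errors ? "Test failed !!!" : "Test passed !");
    return errors ? 1 : 0;
}
```

A test bench built this way would have `main()` simply return `run_test()`, so that any mismatch produces a non-zero return value.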



Determine or Create the top-level function

Determine the top-level function for synthesis
If there are multiple functions, they must be merged
There can only be 1 top-level function for synthesis

Given a case where functions func_A and func_B are to be implemented in FPGA

main.c
int main () {
  ...
  func_A(a,b,*i1);
  func_B(c,*i1,*i2);
  func_C(*i2,ret);
  return ret;
}

Re-partition the design to create a new single top-level function inside main()

main.c
#include "func_AB.h"
int main (a,b,c,d) {
  ...
  // func_A(a,b,i1);
  // func_B(c,i1,i2);
  func_AB (a,b,c, *i1, *i2);
  func_C(*i2,ret);
  return ret;
}

func_AB.c
#include "func_AB.h"
func_AB(a,b,c, *i1, *i2) {
  ...
  func_A(a,b,*i1);
  func_B(c,*i1,*i2);
}

Recommendation is to separate test bench and design files
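
The slide code is schematic, so here is a compilable sketch of the same re-partitioning. The bodies of `func_A` and `func_B` are invented purely for illustration (the slides do not say what they compute), and the argument types are assumed to be `int`.

```c
/* Hypothetical bodies: the slides do not show what these compute. */
void func_A(int a, int b, int *i1) {
    *i1 = a + b;
}

void func_B(int c, const int *i1, int *i2) {
    *i2 = c * (*i1);
}

/* New single top-level function for synthesis. In the layout on the
 * slide it would live in func_AB.c, apart from the test bench. */
void func_AB(int a, int b, int c, int *i1, int *i2) {
    func_A(a, b, i1);
    func_B(c, i1, i2);
}
```

The test bench then calls `func_AB` once instead of calling `func_A` and `func_B` separately, and `func_AB` is named as the top-level function for synthesis.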




Summary

In HLS
C becomes RTL
Operations in the code map to hardware resources
Understand how constructs such as functions, loops and arrays are synthesized
HLS design involves
Synthesize the initial design
Analyze to see what limits the performance
User directives to change the default behaviors
Remove bottlenecks
Analyze to see what limits the area
The types used define the size of operators
This can have an impact on what operations can fit in a clock cycle
Use directives to shape the initial design to meet performance
Increase parallelism to improve performance
Refine bit sizes and sharing to reduce area



Using Vivado HLS

Vivado HLS 2013.3 Version

This material exempt per Department of Commerce license exception TSU Copyright 2013 Xilinx

Objectives

After completing this module, you will be able to:

List various OS under which Vivado HLS is supported


Describe how projects are created and maintained in Vivado HLS
State various steps involved in using Vivado HLS project creation wizard
Distinguish between the role of top-level module in testbench and design to be synthesized
List various verifications which can be done in Vivado HLS
List Vivado HLS project directory structure

Using Vivado HLS 12 - 2 Copyright 2013 Xilinx


Outline

Invoking Vivado HLS


Project Creation using Vivado HLS
Synthesis to IPXACT Flow
Design Analysis
Other Ways to use Vivado HLS
Summary


Invoke Vivado HLS from Windows Menu

The first step is to open or create a project
Vivado HLS GUI

[Figure: the Vivado HLS GUI, showing the Project Explorer Pane, the Information Pane, the Auxiliary Pane and the Console Pane.]



Vivado HLS Projects and Solutions

Vivado HLS is project based


A project specifies the source code which will be synthesized
Each project is based on one set of source code
Each project has a user specified name

A project can contain multiple solutions
Solutions are different implementations of the same code
Auto-named solution1, solution2, etc.
Supports user specified names
Solutions can have different clock frequencies, target technologies, synthesis directives

Projects and solutions are stored in a hierarchical directory structure
Top-level is the project directory
The disk directory structure is identical to the structure shown in the GUI project explorer (except for source code location)

[Figure: the project explorer, with the source at project level and the solutions at solution level.]

Vivado HLS Step 1: Create or Open a project

Start a new project


The GUI will start the project wizard to guide you through all the steps

Optionally use the Toolbar Button to Open New Project

Open an existing project


All results, reports and directives are automatically saved/remembered
Use Recent Project menu for quick access

Project Wizard

The Project Wizard guides users through the steps of opening a new project

Step-by-step guide
1. Define project and directory
2. Add design source files
3. Specify test bench files
4. Specify clock and select part
Steps 1 to 3 provide the project level information; step 4 provides the 1st solution information

Define Project & Directory

Define the project name


Note, here the project is given the extension .prj
A useful way of seeing it's a project (and not just another directory) when browsing
Browse to the location of the project
In this example, project directory dct.prj will be created inside directory lab1


Add Design Source Files

Add Design Source Files


This allows Vivado HLS to determine the top-level design for synthesis, from the test bench & associated files
Not required for SystemC designs
Add Files
Select the source code file(s)
The CTRL and SHIFT keys can be used to add multiple files
No need to include headers (.h) if they reside in the same directory
Select File and Edit CFLAGS
If required, specify C compile arguments using the Edit CFLAGS
Define macros: -DVERSION1
Location of any (header) files not in the same directory as the source: -I../include
There is no need to add the location of standard Vivado HLS or SystemC header files or header files located in the same project location

Specify Test Bench Files

Use Add Files to include the test bench


Vivado HLS will re-use these to verify the RTL using co-simulation
And all files referenced by the test bench
The RTL simulation will be executed in a different directory (ensures the original results are not over-written)
Vivado HLS needs to also copy any files accessed by the test bench
Input data and output results (*.dat) are shown in this example
Add Folders
If the test bench uses relative paths like sub_directory/my_file.dat you can add sub_directory as a folder/directory
Use Edit CFLAGS
To add any C compile flags required for compilation
Test benches I

The test bench should be in a separate file


Or excluded from synthesis
The Macro __SYNTHESIS__ can be used to isolate code which will not be synthesized
This macro is defined when Vivado HLS parses any code (-D__SYNTHESIS__)

// test.c
#include <stdio.h>

/* Design to be synthesized */
void test (int d[10]) {
  int acc = 0;
  int i;
  for (i=0;i<10;i++) {
    acc += d[i];
    d[i] = acc;
  }
}

#ifndef __SYNTHESIS__
/* Test bench: nothing in this ifndef will be read by Vivado HLS
   (will be read by gcc) */
int main () {
  int d[10], i;
  for (i=0;i<10;i++) {
    d[i] = i;
  }
  test(d);
  for (i=0;i<10;i++) {
    printf("%d %d\n", i, d[i]);
  }
  return 0;
}
#endif

Test benches II

Ideal test bench


Should be self checking
RTL verification will re-use the C test bench
If the test bench is self-checking
Allows RTL Verification to be run without a requirement to check the results again
RTL verification passes if the test bench return value is 0 (zero)
Actively return a 0 if the simulation passes
#include <stdio.h>   /* printf */
#include <stdlib.h>  /* system */

int main () {
  // Compare results
  int ret = system("diff --brief -w test_data/output.dat test_data/output.golden.dat");
  if (ret != 0) {
    printf("Test failed !!!\n"); return 1;
  } else {
    printf("Test passed !\n"); return 0;
  }
}

The -w option ensures the newline does not cause a difference between Windows and Linux files

Non-synthesizable constructs may be added to a synthesized function if __SYNTHESIS__ is used


#ifndef __SYNTHESIS__
image_t *yuv = (image_t *)malloc(sizeof(image_t));
#else // Workaround malloc() calls w/o changing rest of code
image_t _yuv;
#endif
Solution Configuration

Provide a solution name


Default is solution1, then solution2 etc.
Specify the clock
The clock uncertainty is subtracted from the clock to provide an effective clock period
Vivado HLS uses the effective clock period for synthesis
Provides a user-defined margin for downstream RTL synthesis, P&R
Select the part
Select a device family after applying filters such as family, package and speed grade (see next slide)

Selecting Part and Implementation Engine

Select the target part by specifying either Parts or Boards
Select RTL Tools
Auto
Will select Vivado for 7 Series and Zynq devices
Will select ISE for Virtex-6 and earlier families
Vivado
ISE
ISE Design Suite must be installed and must be included in the PATH variable


Clock Specification

Clock frequency must be specified


Only 1 clock can be specified for C/C++ functions
SystemC can define multiple clocks
Clock uncertainty can be specified
Subtracted from the clock period to give an effective clock period
The effective clock period is used for synthesis
Should not be used as a design parameter
Do not vary for different results: this is your safety margin
A user controllable margin to account for downstream RTL synthesis and P&R

[Figure: the clock period is split into the effective clock period used by Vivado HLS and the clock uncertainty, which is the margin for logic synthesis and P&R.]
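
The relationship between the three quantities is a single subtraction. The helpers below are hypothetical and only restate the rule from the slide; the numbers in the note that follows are example values, not tool defaults.

```c
/* Effective clock period used for scheduling: the requested period
 * minus the clock uncertainty (both in ns). */
double effective_clock_ns(double period_ns, double uncertainty_ns) {
    return period_ns - uncertainty_ns;
}

/* Non-zero if a chain of operations with the given total delay fits
 * in one cycle once the uncertainty margin has been removed. */
int fits_in_cycle(double chain_delay_ns, double period_ns, double uncertainty_ns) {
    return chain_delay_ns <= effective_clock_ns(period_ns, uncertainty_ns);
}
```

For a 10 ns clock with 1.25 ns of uncertainty, the effective period is 8.75 ns, so a 9 ns chain of operations must be split across two cycles even though it is shorter than the nominal clock period.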


A Vivado HLS Project

Project Explorer
Project files displayed in a hierarchical view

Information Pane
Can view and edit any file from the Project Explorer

Auxiliary Pane
Cross-referenced with the Information Pane (here it shows objects in the source code)

Console Pane
Displays Vivado HLS run time messages


Vivado HLS GUI Toolbar

The primary commands have toolbar buttons


Easy access for standard tasks
Button highlights when the option is available
E.g. cannot perform C/RTL simulation before synthesis

[Figure: the toolbar, with buttons for Create a new Project, Change Project Settings, Create a new Solution, Change Solution Settings, Run C Simulation, Run C Synthesis, Run C/RTL Cosimulation, Export RTL, Open Reports, Compare Reports and Open Analysis Viewer.]

Files: Views, Edits & Information

Open a file and it will display in the information pane
The Auxiliary pane is context sensitive with respect to the information pane
Here it displays elements in the code which can have directives specified on them



Synthesis

Run C Synthesis
Console
Will show run time information
Examine for failed constraints
A syn directory is created
Verilog, VHDL & SystemC RTL
Synthesis reports for all non-inlined functions
Report opens automatically
When synthesis completes
Report is outlined in the Auxiliary pane


Vivado HLS : RTL Verification

RTL output in Verilog, VHDL and SystemC
Automatic re-use of the C-level test bench
RTL verification can be executed from within Vivado HLS
Support for Xilinx simulators (XSim and ISim) and 3rd party HDL simulators in automated flow

RTL Verification: Under-the-Hood

RTL Co-Simulation
Vivado HLS provides RTL verification
Creates the wrappers and adapters to re-use the C test bench
[Figure: prior to synthesis, the test bench main.c(pp) calls the top-level C function dut.c(pp). After synthesis, the same test bench main.c(pp) drives a DUT wrapper in which adapters connect it to the RTL.]

Test bench: main.c(pp)
Top-level C function: dut.c(pp)
SystemC wrapper created by Vivado HLS
SystemC adapters created by Vivado HLS
RTL output from Vivado HLS
SystemC, Verilog or VHDL
There is no HDL test bench created
RTL Verification Support

Vivado HLS RTL Output


Vivado HLS outputs RTL in SystemC, Verilog and VHDL
The SystemC output is at the RT Level
The input is not transformed to SystemC at the ESL
RTL Verification with SystemC
The SystemC RTL output can be used to verify the design without the need for an HDL simulator and license
HDL Simulation Support
Vivado HLS supports HDL simulators on both Windows & Linux
The 3rd party simulator executable must be in the OS search path

Simulator                   Linux       Windows
XSim (Vivado Simulator)     Supported   Supported
ISim (ISE Simulator)        Supported   Supported
Mentor Graphics ModelSim    Supported   Supported
Synopsys VCS                Supported   Not Available
NCSim                       Supported   Not Available
Riviera                     Supported   Supported

C/RTL Co-simulation

Start Simulation
Opens the dialog box
Select the RTL
SystemC does not require a 3rd party license
Verilog and VHDL require the appropriate simulator
Select the desired simulator
Run any or all
Options
Can output trace file (VCD format)
Optimize the C compilation & specify test bench linker flags
The setup only option will not execute the simulation
The SystemC simulation can always be run: no simulator license required!
OK will run the simulator
Output files will be created in a sim directory
Simulation Results

Simulation output is shown in the console


Expect the same test bench response
If the C test bench plots, it will also do so with the RTL design (but more slowly)
Sim Directory
Will contain a sub-directory for each RTL which is verified
Report
A report is created and opened automatically

Vivado HLS : RTL Export

RTL output in Verilog, VHDL and SystemC

Scripts created for RTL synthesis tools

RTL Export to IP-XACT, SysGen, and Pcore formats

IP-XACT and SysGen => Vivado HLS for 7 Series and Zynq families
PCore => Only Vivado HLS Standalone for all families



RTL Export Support

RTL Export
Can be exported to one of three types
IP-XACT formatted IP for use with Vivado System Edition (SE)
7 Series and Zynq families only
A System Generator IP block
7 Series and Zynq families only
Pcore formatted IP block for use with EDK
7 Series, Zynq, Spartan-3, Spartan-6, Virtex-4/5/6 families
Generation in both Verilog and VHDL for non-bus or non-interface based designs
Logic synthesis will automatically be performed
HLS license will use Vivado RTL Synthesis


RTL Export: Synthesis

RTL Synthesis can be performed to evaluate the RTL
IP-XACT and System Generator formats: Vivado synthesis performed
Pcore format: ISE synthesis is performed

RTL synthesis results are not included with the IP package
Evaluate step is provided to give confidence
Timing will be as estimated (or better)
Area will be as estimated (or better)
Final RTL IP is synthesized with the rest of the RTL design
RTL Synthesis results from the Vivado HLS evaluation are not used

Directory structure (figure):
project.prj
solution1 ... solutionN
impl, syn, sim
impl contains verilog, vhdl (RTL Synthesis Results) and ip, sysgen, pcore (IP Repositories)



RTL Export: IP Repositories

Project Directory
Top-level project directory (there must be one)
project.prj (project)

Solution directories
There can be multiple solutions for each project. Each solution is a different implementation of the same source code

IP can be imported into other Xilinx tools

Directory structure (figure):
project.prj
solution1 ... solutionN
impl, syn, sim
impl contains ip, sysgen, pcore

In Vivado:
1. Project Manager > IP Catalog
2. Add IP to import this block
3. Browse to the zip file inside ip

In System Generator:
1. Use XilinxBlockAdd
2. Select Vivado_HLS block type
3. Browse to the solution directory

In EDK:
1. Copy the contents of the pcore directory
2. Paste into the EDK project pcore directory
3. Project > Rescan Local Repository


RTL Export for Implementation

Click on Export RTL


Export RTL Dialog opens
Select the desired output format

Optionally, configure the output


Select the desired language
Optionally, click on Evaluate button
for invoking implementation tools
from within Vivado HLS
Click OK to start the implementation



RTL Export (Evaluate Option) Results

Impl directory created


Will contain a sub-directory for each RTL which
is synthesized
Report
A report is created and opened automatically


RTL Export Results (Evaluate Option Unchecked)

Impl directory created


Will contain a sub-directory for both VHDL and Verilog
along with the ip directory
No report will be created
Observe the console
No packing, routing phases



Outline

Invoking Vivado HLS


Project Creation using Vivado HLS
Synthesis to IPXACT Flow
Design Analysis
Other Ways to use Vivado HLS
Summary


Analysis Perspective

Perspective for design analysis


Allows interactive analysis



Performance Analysis


Resources Analysis





Command Line Interface: Batch Mode

Vivado HLS can also be run in batch mode


Opening the Command Line Interface (CLI) will give a shell

Supports the commands required to run Vivado HLS & pre-synthesis verification (gcc, g++, apcc, make)

Using Vivado HLS CLI

Invoke Vivado HLS in interactive mode


Type Tcl commands one at a time
> vivado_hls -i
Execute Vivado HLS using a Tcl batch file
Allows multiple runs to be scripted and automated
> vivado_hls -f run_aesl.tcl
Open an existing project in the GUI
For analysis, further work or to modify it
> vivado_hls -p my.prj
Use the shell to launch Vivado HLS GUI
> vivado_hls


Using Tcl Commands

When the project is created


All Tcl commands to run the project are created in script.tcl
User specified directives are placed in directives.tcl
Use this as a template for creating Tcl scripts
Uncomment the commands before running the Tcl script



Help

Help is always available


The Help Menu
Opens User Guide, Reference Guide and Man Pages

In interactive mode
The help command lists the man page for all commands
Auto-Complete all commands using the tab key
vivado_hls> help add_files
SYNOPSIS
add_files [OPTIONS] <src_files>
Etc





Summary

Vivado HLS can be run under Windows XP, Windows 7, Red Hat Linux, and SUSE OS
Vivado HLS can be invoked through GUI and command line in Windows OS, and
command line in Linux
Vivado HLS project creation wizard involves
Defining project name and location
Adding design files
Specifying testbench files
Selecting clock and technology
The top-level module in the test bench is main(), whereas the top-level module in the design is
the function to be synthesized
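
The split between the design's top-level function and the test bench can be sketched in C as follows. This is a minimal illustration, not the course's lab code: sum_array, run_test_bench, LEN and the data values are hypothetical names and values chosen for the example. The self-checking body is written as run_test_bench(); in an actual test bench, main() would simply be "int main(void) { return run_test_bench(); }", so that a return value of 0 reports a passing run.

```c
#include <stdio.h>

#define LEN 4  /* hypothetical array length, chosen for illustration */

/* Hypothetical design top level: the function handed to synthesis. */
int sum_array(int in[LEN])
{
    int i, acc = 0;
    for (i = 0; i < LEN; i++)
        acc += in[i];
    return acc;
}

/* Hypothetical self-checking test bench body.
 * Returns 0 when the design output matches the expected value,
 * non-zero otherwise. A real test bench would return this from main(). */
int run_test_bench(void)
{
    int data[LEN] = {1, 2, 3, 4};
    int expected = 10;            /* 1 + 2 + 3 + 4 */
    int got = sum_array(data);

    if (got != expected) {
        printf("FAIL: got %d, expected %d\n", got, expected);
        return 1;
    }
    printf("PASS\n");
    return 0;
}
```

Because the same test bench is reused for C simulation and for C/RTL co-simulation, keeping the check inside the test bench means both runs can be judged by the same pass/fail result.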


Summary

Vivado HLS project directory consists of


*.prj project file
Multiple solutions directories
Each solution directory may contain
impl, syn, and sim directories
The impl directory consists of pcore, verilog, and vhdl folders
The syn directory consists of reports, systemC, vhdl, and verilog folders
The sim directory consists of testbench and simulation files



Lab1 Intro
Vivado HLS Design Flow

Vivado HLS 2013.3 Version


ZedBoard


Objectives

After completing this lab, you will be able to:

Create a project in Vivado HLS


Run C-simulation
Use debugger
Synthesize and implement the design using the default options
Use design analysis perspective to see what is going on under the hood
Understand and analyze the generated output

Lab1 Intro 12a- 2 Copyright 2013 Xilinx


The Design

This lab uses a simple matrix multiplication example to walk you through the Vivado
HLS project creation and analysis steps. The design consists of three nested loops.
The Product loop is the innermost loop and performs the actual element products. The
Col loop is the middle loop; it feeds the next column element data, together with the
passed row element data, to the Product loop. Finally, Row is the outermost loop. The
res[i][j]=0 statement (line 79) resets the result every time a new row element is passed
and a new column element is used.
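
The loop nest described above can be sketched in C as follows. This is an illustrative reconstruction rather than the lab's source file: the function name matrixmul, the dimension N, and the type mat_t are assumptions, and only the loop labels Row, Col and Product and the res[i][j]=0 reset come from the description.

```c
/* Hypothetical dimension and element type; the lab source may differ. */
#define N 3
typedef int mat_t;

/* Hypothetical top-level function: res = a * b for N x N matrices. */
void matrixmul(mat_t a[N][N], mat_t b[N][N], mat_t res[N][N])
{
    int i, j, k;

    /* Outermost loop: one iteration per row of a. */
    Row: for (i = 0; i < N; i++) {
        /* Middle loop: one iteration per column of b. */
        Col: for (j = 0; j < N; j++) {
            /* Reset the accumulator for each new (row, column) pair. */
            res[i][j] = 0;
            /* Innermost loop: accumulate the element products. */
            Product: for (k = 0; k < N; k++) {
                res[i][j] += a[i][k] * b[k][j];
            }
        }
    }
}
```

The loop labels have no effect on the C behavior; they are there so that each loop can be referred to by name, as the description above does.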


Procedure

Create a project after starting Vivado HLS in GUI mode


Run C simulation
to understand the design behavior
Run the debugger
to see how the top-level module works
Synthesize the design
Analyze the generated output using the Analysis perspective
Run C/RTL cosimulation
to perform RTL simulation
View simulation results in Vivado
to understand the IO protocol
Export RTL in the Evaluate mode and run the implementation



Summary

In this lab, you completed the major steps of the high-level synthesis design flow using
Vivado HLS. You created a project, added source files, synthesized the design,
simulated the design, and implemented the design. You also learned how to use the
Analysis perspective to understand the scheduling.

