Programmable Logic Devices
Tutorial 7
Michal Kubíček
Department of Radio Electronics, FEEC BUT Brno
Created with the support of the OP VVV project Moderní a otevřené studium techniky, CZ.02.2.69/0.0/0.0/16_015/0002430.
Tutorial 7
❑ Clock signal distribution in FPGAs
❑ Clock management
❑ Slow clock signals, clock enabling
page 2 kubicek@vutbr.cz
Clock signals in FPGA
Clock signal distribution in FPGAs
Clocking infrastructure
An FPGA contains many components dedicated to clock signal distribution and management. Together they are called CLOCK RESOURCES.
❑ Dedicated network for clock signal distribution: the clock tree (low skew, high fanout). Large FPGAs feature several levels of clock distribution (regional / global).
❑ Buffers and multiplexers for clock signal inputs, signal conditioning, switching...
❑ Blocks for clock signal modification (Clock Management), based on a PLL or DLL.
Clock tree
• Low skew, Low propagation delay: same (low) delay from a source buffer
(BUFG, BUFH...) to all destination nodes
• High fanout: capable of driving many nodes (thousands)
• Primarily for clock distribution but can be used for other high fanout signals, like
CLOCK ENABLE, SET/RESET....
[Figure: registers (REG) between combinational logic blocks KČ1, KČ2 and KČ3, all clocked by clk]
Example: Spartan-3E
Example: Virtex-7
Regional clock structure detail: Xilinx 7-series
Global and regional buffers for clock signal distribution
IO clock structure detail: Xilinx 7-series
I/O buffers for clock signal distribution
How to use the clock resources
Clock signal resources
Clock for flip-flops: VHDL code inference
PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        cnt <= cnt + 1;    -- register clocked by clk
    END IF;
END PROCESS;
The tools automatically route the clk signal on the global clock network.
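For reference, the fragment above can be placed in a complete design unit. The following is a minimal sketch; the entity name, the WIDTH generic and the q port are illustrative assumptions that do not appear in the slides. No clock buffer is instantiated, since the slide states that the global clock network is used automatically for clk.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

-- Hypothetical wrapper around the slide's counter process.
ENTITY free_counter IS
    GENERIC (
        WIDTH : positive := 8                 -- assumed counter width
    );
    PORT (
        clk : IN  std_logic;
        q   : OUT std_logic_vector(WIDTH - 1 DOWNTO 0)
    );
END ENTITY free_counter;

ARCHITECTURE rtl OF free_counter IS
    SIGNAL cnt : unsigned(WIDTH - 1 DOWNTO 0) := (OTHERS => '0');
BEGIN
    -- rising_edge(clk) is what lets synthesis recognize clk as a clock.
    PROCESS (clk) BEGIN
        IF rising_edge(clk) THEN
            cnt <= cnt + 1;                   -- wraps around at 2**WIDTH
        END IF;
    END PROCESS;

    q <= std_logic_vector(cnt);
END ARCHITECTURE rtl;
```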
Clock gating: VHDL code inference?
PROCESS (clk_250M, clk_enable) BEGIN
    IF clk_enable = '0' THEN
        clk <= '0';         -- stop clock (low power mode)
    ELSE
        clk <= clk_250M;    -- normal operation
    END IF;
END PROCESS;
Not reliable! This places combinational logic in the clock path, which can produce glitches.
Clock MUXing: VHDL code inference?
PROCESS (clk_250M, clk_10M, set_high_speed) BEGIN
    IF set_high_speed = '1' THEN
        clk <= clk_250M;    -- fast clock for processing
    ELSE
        clk <= clk_10M;     -- slow clock for idle operation
    END IF;
END PROCESS;
Not reliable! The multiplexer is built from combinational logic, so the output clock can glitch when the select signal changes.
Manual instantiation
Library UNISIM;
use UNISIM.vcomponents.all;
...
BUFGCE_inst : BUFGCE
    port map (
        O  => O,     -- Clock buffer output
        CE => CE,    -- Clock enable input
        I  => I );   -- Clock buffer input
Library UNISIM;
use UNISIM.vcomponents.all;
...
BUFGMUX_inst : BUFGMUX
    port map (
        O  => O,     -- Clock MUX output
        I0 => I0,    -- Clock0 input
        I1 => I1,    -- Clock1 input
        S  => S );   -- Clock select input
Clock resources (Xilinx 7-series)
Glitch-free clock switching (and more)
BUFGCTRL is designed to switch between two clock
inputs without the possibility of a glitch. When the
presently selected clock transitions from High to Low
after S0 and S1 change, the output is kept Low until the
other (to-be-selected) clock transitions from High to Low.
Then the new clock starts driving the output.
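A minimal sketch of BUFGCTRL used as a glitch-free 2:1 clock switch, as described above. It assumes the UNISIM port names of the primitive (I0, I1, S0, S1, CE0, CE1, IGNORE0, IGNORE1, O) and leaves the generics at their defaults. The entity and signal names (clk_switch, clk_a, clk_b, sel_b) are hypothetical. Check the wiring against the vendor documentation for the target device before use.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
LIBRARY UNISIM;
USE UNISIM.vcomponents.ALL;

-- Hypothetical wrapper: selects clk_a when sel_b = '0', clk_b when sel_b = '1'.
ENTITY clk_switch IS
    PORT (
        clk_a   : IN  std_logic;
        clk_b   : IN  std_logic;
        sel_b   : IN  std_logic;
        clk_out : OUT std_logic
    );
END ENTITY clk_switch;

ARCHITECTURE rtl OF clk_switch IS
    SIGNAL sel_a : std_logic;
BEGIN
    -- S0 and S1 are driven with complementary values, so exactly one
    -- input is selected at any time.
    sel_a <= NOT sel_b;

    BUFGCTRL_inst : BUFGCTRL
        PORT MAP (
            O       => clk_out,   -- switched clock output
            CE0     => '1',       -- both inputs permanently enabled
            CE1     => '1',
            I0      => clk_a,
            I1      => clk_b,
            IGNORE0 => '0',       -- do not bypass the glitch-free switching
            IGNORE1 => '0',
            S0      => sel_a,
            S1      => sel_b
        );
END ARCHITECTURE rtl;
```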
Clock signal input to the FPGA
Preferred method: use dedicated clock-capable pins
Library UNISIM;
use UNISIM.vcomponents.all;
IBUFG_inst : IBUFG port map (        -- PIN input buffer
    O  => clk_50,                    -- Clock buffer output
    I  => clk_50_PIN );              -- Clock buffer input
IBUFGDS_inst : IBUFGDS port map (    -- diff. pair PIN input buffer
    O  => clk_sys,                   -- Clock buffer output
    I  => clk_in_P,                  -- Diff_p clock buffer input
    IB => clk_in_N );                -- Diff_n clock buffer input
Clock input to FPGA
There are dedicated pins suitable for clock signal input (marked as GC, CC, MRCC, SRCC...). They can connect signals directly to clock resources (BUFG, CMT...). But beware of the many rules and exceptions specific to each FPGA family!
In some cases a violation of those tricky rules is not fatal. However, any workaround usually carries a penalty that may cause problems during Static Timing Analysis (STA).
Library UNISIM;
use UNISIM.vcomponents.all;
IBUF_inst : IBUF port map (      -- PIN input buffer
    O => clk_50_AUX,             -- Buffer output
    I => clk_50_PIN );           -- Buffer input
BUFG_inst : BUFG port map (      -- internal signal buffer
    O => clk_50,                 -- Clock buffer output
    I => clk_50_AUX );           -- Clock buffer input
[Figure: clk_50_PIN → IBUF → BUFG → clk_50]
Clock Management
Blocks for clock signal conditioning: CMT, DCM, PLL, DLL, MMCM...
Spartan-3: Digital Clock Manager
Clock signal conditioning (phase shifting, synthesis)
Use of DCM (signals reset and locked)
[Figure: clk_50M_pin → IBUFG → clk_50M_ibufg → DCM (reset input, locked output) → clk_50M, clk_33M, clk_100M, clk_250M]
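One common way to use the locked output is to hold the user logic in reset until the clock manager reports a stable output clock. The sketch below assumes this usage; the slide itself only names the two signals. The entity name, port names and the four-stage depth are illustrative assumptions.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Hypothetical helper: rst stays high while locked = '0' and is released
-- a few clk cycles after locked goes high.
ENTITY locked_reset IS
    PORT (
        clk    : IN  std_logic;   -- one of the clock manager output clocks
        locked : IN  std_logic;   -- locked output of the clock manager
        rst    : OUT std_logic    -- active-high reset for logic clocked by clk
    );
END ENTITY locked_reset;

ARCHITECTURE rtl OF locked_reset IS
    SIGNAL rst_pipe : std_logic_vector(3 DOWNTO 0) := (OTHERS => '1');
BEGIN
    -- Asynchronous assertion (clock may not be valid yet),
    -- synchronous release (aligned to clk).
    PROCESS (clk, locked) BEGIN
        IF locked = '0' THEN
            rst_pipe <= (OTHERS => '1');
        ELSIF rising_edge(clk) THEN
            rst_pipe <= rst_pipe(2 DOWNTO 0) & '0';
        END IF;
    END PROCESS;

    rst <= rst_pipe(3);
END ARCHITECTURE rtl;
```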
7-series: Clock Management Tile (CMT)
Up to 24 CMTs in a single FPGA
7-series: MMCM block diagram
Mixed Mode Clock Manager
7-series: PLL block diagram
Phase Locked Loop
7-series: MMCM use case
Synchronous clock domains
[Figure: clk_50M_pin → clk_50M_ibufg → DCM (reset input, Locked output) → clk_50M, clk_33M, clk_100M, clk_250M]
[Figure: clk_50M_pin, clk_50M and clk_100M]
In this case the clk_50M and clk_100M clock domains are synchronous ➔ there is no need to use synchronizers on these clock domain boundaries to transfer data or control signals. The static timing analysis tool can correctly analyze all the necessary timing parameters.
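A minimal sketch of such a direct transfer, assuming both clocks come from the same clock manager so that their phase relationship is known to the timing tools. The entity name, the data ports and the 8-bit width are hypothetical.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Hypothetical example: data registered in the clk_50M domain is captured
-- directly in the clk_100M domain, with no synchronizer in between.
ENTITY sync_domain_transfer IS
    PORT (
        clk_50M  : IN  std_logic;
        clk_100M : IN  std_logic;
        d_in     : IN  std_logic_vector(7 DOWNTO 0);
        d_out    : OUT std_logic_vector(7 DOWNTO 0)
    );
END ENTITY sync_domain_transfer;

ARCHITECTURE rtl OF sync_domain_transfer IS
    SIGNAL d_50M : std_logic_vector(7 DOWNTO 0) := (OTHERS => '0');
BEGIN
    launch : PROCESS (clk_50M) BEGIN
        IF rising_edge(clk_50M) THEN
            d_50M <= d_in;            -- launch register, clk_50M domain
        END IF;
    END PROCESS launch;

    capture : PROCESS (clk_100M) BEGIN
        IF rising_edge(clk_100M) THEN
            d_out <= d_50M;           -- capture register, clk_100M domain
        END IF;
    END PROCESS capture;
END ARCHITECTURE rtl;
```

The path from d_50M to d_out is an ordinary register-to-register path that static timing analysis can check, provided both clocks are constrained as related.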
[Figure: two flip-flops (D Q), the first clocked by clk_120M and the second by clk_100M; clk_100M with T = 10 ns, clk_120M with T = 8.33 ns; tightest edge spacing 1.66 ns]
In this case the clk_100M and clk_120M domains are also synchronous, but because of their frequency ratio there are edge combinations where the timing budget is very tight (1.66 ns in this case). This effectively requires synchronizers between these clock domains (they must be treated as asynchronous).
STA considers all possible edge combinations and requires the design to meet the strictest one (worst case) for both SETUP and HOLD.
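The tight budget can be reproduced with a short calculation. It assumes that both clocks have a rising edge aligned at t = 0, that data is launched on clk_120M and captured on clk_100M, and that routing and clock delays are ignored:

```latex
\begin{align*}
T_{100} &= \frac{1}{100\ \text{MHz}} = 10\ \text{ns} \\
T_{120} &= \frac{1}{120\ \text{MHz}} = \tfrac{25}{3}\ \text{ns} \approx 8.33\ \text{ns} \\
5\,T_{100} &= 6\,T_{120} = 50\ \text{ns}
  \quad \text{(the edge pattern repeats every 50 ns)} \\
\Delta t_{\min} &= T_{100} - T_{120} = 10\ \text{ns} - \tfrac{25}{3}\ \text{ns}
  = \tfrac{5}{3}\ \text{ns} \approx 1.67\ \text{ns}
\end{align*}
```

Within one 50 ns pattern the launch-to-capture distances are 8.33, 1.67, 3.33, 5 and 6.67 ns. The worst case is data launched at 8.33 ns and captured at 10 ns, which is the value the slide quotes truncated to 1.66 ns.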
Zynq 7000: Clocking Wizard (Vivado example)
Slow clock signals
Why use a low-frequency clock (Hz, kHz)?
Significant savings of HW resources in naturally slow-acting blocks:
❑ User interfaces: buttons, keyboards, LEDs, simple displays
❑ Slow communication interfaces: UART, SPI, I2C...
❑ ...
[Figure: clk_slow and clk_fast]
How (not) to generate a slow clock
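❑ Direct clock division using logic is not recommended:
[Figure: waveforms of clk, the ideal clk_div and the real clk_div; schematic of a D flip-flop clocked by clk whose inverted output nQ produces clk_div]
clk_slow_gen: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        clk_div <= NOT clk_div;    -- toggles on every rising edge: clk / 2
    END IF;
END PROCESS clk_slow_gen;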
The delay of the clock dividing circuitry causes the new clock to be asynchronous to the original one (the delay is unpredictable and varies with each implementation).
Derived clock signal distribution
Without special care the new clock signal is distributed over general-purpose interconnect. This results in excessive skew and subsequent setup/hold time violations.
clk_slow_gen: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        clk_div <= NOT clk_div;
    END IF;
END PROCESS clk_slow_gen;
[Figure: D flip-flop clocked by clk, producing clk_div]
proc_cnt_slow: PROCESS (clk_div) BEGIN
    IF rising_edge(clk_div) THEN
        cnt_slow <= cnt_slow + 1;    -- clocked by the derived clock
    END IF;
END PROCESS proc_cnt_slow;
Use a dedicated Clock Tree for clock signal distribution
The derived clock signal is routed through a dedicated buffer (e.g. BUFG for Xilinx FPGAs). Usually a structural description is used to instantiate one.
Using the dedicated clock buffer that drives the clock tree eliminates SKEW.
The problem of asynchronicity between the primary and the derived clock domain persists.
Library UNISIM;
use UNISIM.vcomponents.all;
BUFG_inst : BUFG                      -- internal signal buffer
    PORT MAP (
        O => clk_div_BUFGOUT,         -- Clock buffer output
        I => clk_div                  -- Clock buffer input
    );
[Figure: clk_div → BUFG → clk_div_BUFGOUT]
clk_slow_gen: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        clk_div <= NOT clk_div;
    END IF;
END PROCESS clk_slow_gen;
BUFG_inst : BUFG
    PORT MAP (
        O => clk_div_BUFGOUT,
        I => clk_div );
[Figure: clk_div → BUFG → clk_div_BUFGOUT, marked with a warning (!!!)]
❑ There is always a delay between the primary and the derived clock domain; this delay is not well defined and may change with each implementation run.
❑ The SKEW is eliminated.
❑ The source of the derived clock signal must be a REGISTER, never a LUT or other combinatorial block (as their outputs may contain glitches).
The delay between the primary and the derived clock domain may cause SETUP / HOLD timing problems on signals crossing the clock domain boundary ➔ synchronizers are a must!
A large logic load (large fan-out) on the clock net is not a problem, as the global clock tree is designed for it.
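The slides do not show a synchronizer. A minimal sketch of the common two-flip-flop synchronizer for a single-bit signal could look as follows. The entity and signal names are hypothetical, and the ASYNC_REG attribute is a Xilinx-specific hint that is assumed here to be supported by the tool in use.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Hypothetical two-flip-flop synchronizer for ONE bit crossing into the
-- clk_dst domain. It is not suitable for multi-bit buses as written.
ENTITY sync_2ff IS
    PORT (
        clk_dst  : IN  std_logic;   -- destination domain clock
        async_in : IN  std_logic;   -- signal from the other clock domain
        sync_out : OUT std_logic    -- safe to use in the clk_dst domain
    );
END ENTITY sync_2ff;

ARCHITECTURE rtl OF sync_2ff IS
    SIGNAL meta   : std_logic := '0';   -- first stage, may go metastable
    SIGNAL stable : std_logic := '0';   -- second stage

    -- Xilinx-specific hint (assumption): keep both registers close together.
    ATTRIBUTE ASYNC_REG : string;
    ATTRIBUTE ASYNC_REG OF meta   : SIGNAL IS "TRUE";
    ATTRIBUTE ASYNC_REG OF stable : SIGNAL IS "TRUE";
BEGIN
    PROCESS (clk_dst) BEGIN
        IF rising_edge(clk_dst) THEN
            meta   <= async_in;
            stable <= meta;
        END IF;
    END PROCESS;

    sync_out <= stable;
END ARCHITECTURE rtl;
```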
A better solution?
❑ Clock enabling: any lower frequency can be used (with a resolution of one primary clock period).
❑ Clock Management: using dedicated clock conditioning blocks available in FPGAs. Several clock signals with different frequencies can be derived. The lowest output frequency is usually limited to about 1 to 10 MHz (the clock conditioners are partially analog circuits based on a PLL or DLL).
❑ A combination of the Clock Management and Clock enabling techniques.
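As a worked example of the combined approach, assume a 125 MHz system clock and a hypothetical target rate of 1 kHz for the enable pulses. Here N is the number of clock periods per enable pulse:

```latex
\begin{align*}
f_{EN} &= \frac{f_{clk}}{N} \\
\text{clock enabling only:}\quad
N &= \frac{125\ \text{MHz}}{1\ \text{kHz}} = 125\,000,
  &\lceil \log_2 125\,000 \rceil &= 17\ \text{counter bits} \\
\text{10 MHz from clock management, then enabling:}\quad
N &= \frac{10\ \text{MHz}}{1\ \text{kHz}} = 10\,000,
  &\lceil \log_2 10\,000 \rceil &= 14\ \text{counter bits}
\end{align*}
```

The 10 MHz intermediate clock is chosen from the 1 to 10 MHz lower limit quoted above. Other values give different counter widths.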
Clock Enabling
All the flip-flops in the design (even those that should run on a slow clock) share a common clock signal (usually of a relatively high frequency). Switching of the flip-flops can be enabled/disabled (slowed down) using a dedicated Clock Enable signal.
Benefits: fewer clock domains, fewer synchronizers
[Figure: D flip-flop with clk and CE inputs]
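[Figure, left: a 2:1 multiplexer (inputs 0 and 1, selected by Clock Enable) in front of the D input of a flip-flop clocked by clk]
Typical implementation of the CLOCK ENABLE (CE) functionality.
[Figure, right: a D flip-flop with a dedicated CE input, clocked by clk]
Flip-flops in most FPGAs feature a dedicated CE input ➔ no additional hardware (LUTs, routing) is needed for the CE functionality.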
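In VHDL, a clock enable is usually described as a condition nested inside the clocked branch. The following sketch uses hypothetical entity and port names. Synthesis tools are expected, though not guaranteed, to map the inner IF onto the dedicated CE input of the flip-flop.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Hypothetical single-bit register with clock enable.
ENTITY ce_register IS
    PORT (
        clk : IN  std_logic;
        ce  : IN  std_logic;   -- clock enable
        d   : IN  std_logic;
        q   : OUT std_logic
    );
END ENTITY ce_register;

ARCHITECTURE rtl OF ce_register IS
BEGIN
    PROCESS (clk) BEGIN
        IF rising_edge(clk) THEN
            IF ce = '1' THEN
                q <= d;        -- update only when enabled
            END IF;            -- otherwise q keeps its value
        END IF;
    END PROCESS;
END ARCHITECTURE rtl;
```

The clock itself is never gated. The enable only decides whether the register samples d on a given rising edge.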
Clock Enabling: generate and use the CE signal
[Figure: clk, the main (system) clock signal, 125 MHz; clk_EN, 1:5 => 1/5 * 125 MHz = 25 MHz]
clk_EN_gen: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        IF cnt_div = MAX THEN
            cnt_div <= (OTHERS => '0');
            clk_EN <= '1';
        ELSE
            cnt_div <= cnt_div + 1;
            clk_EN <= '0';
        END IF;
    END IF;
END PROCESS clk_EN_gen;

slow_proc: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        IF clk_EN = '1' THEN
            ...
        END IF;
    END IF;
END PROCESS slow_proc;
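The two processes above can be combined into one self-contained unit. This is a sketch under assumptions: the entity name, the DIVIDE generic (which replaces MAX, with MAX = DIVIDE - 1), the integer counter type and the 8-bit slow counter are illustrative choices that do not appear in the slides.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

-- Hypothetical example: one enable pulse every DIVIDE clk periods,
-- used to advance a counter at clk / DIVIDE.
ENTITY ce_divider IS
    GENERIC (
        DIVIDE : positive := 5            -- 125 MHz / 5 = 25 MHz enable rate
    );
    PORT (
        clk      : IN  std_logic;
        cnt_slow : OUT std_logic_vector(7 DOWNTO 0)
    );
END ENTITY ce_divider;

ARCHITECTURE rtl OF ce_divider IS
    SIGNAL cnt_div : natural RANGE 0 TO DIVIDE - 1 := 0;
    SIGNAL clk_EN  : std_logic := '0';
    SIGNAL cnt     : unsigned(7 DOWNTO 0) := (OTHERS => '0');
BEGIN
    -- Enable generator: clk_EN is high for exactly one clk period
    -- out of every DIVIDE periods.
    clk_EN_gen : PROCESS (clk) BEGIN
        IF rising_edge(clk) THEN
            IF cnt_div = DIVIDE - 1 THEN
                cnt_div <= 0;
                clk_EN  <= '1';
            ELSE
                cnt_div <= cnt_div + 1;
                clk_EN  <= '0';
            END IF;
        END IF;
    END PROCESS clk_EN_gen;

    -- Slow logic: same clock, updated only when clk_EN = '1'.
    slow_proc : PROCESS (clk) BEGIN
        IF rising_edge(clk) THEN
            IF clk_EN = '1' THEN
                cnt <= cnt + 1;
            END IF;
        END IF;
    END PROCESS slow_proc;

    cnt_slow <= std_logic_vector(cnt);
END ARCHITECTURE rtl;
```

Both processes are clocked by clk, so the whole design stays in a single clock domain.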
[Figure: waveforms of clk and three enable signals]
clk: main (system) clock signal, 125 MHz
EN_1: 1:1 => 1/2 * 125 MHz = 62.5 MHz
EN_2: 1:3 => 1/4 * 125 MHz = 31.25 MHz
EN_3: 1:4 => 1/5 * 125 MHz = 25 MHz
Wrong technique of using clock enable
Clock Gating in VHDL code ➔ combinatorial logic in the clock signal path. Not for FPGAs!
clk_switch: PROCESS (clk_EN, clk_in) BEGIN
    IF clk_EN = '1' THEN
        clk_sys <= clk_in;
    ELSE
        clk_sys <= '0';
    END IF;
END PROCESS clk_switch;
[Figure: clk_EN and clk_in combined by gating logic (output forced to 0 when disabled) to produce clk_sys, which clocks the flip-flops (D Q)]
Allowed usage of CE
For Clock Gating it is necessary to use a dedicated glitch-free clock buffer with an enable input (not available in all FPGAs).
Library UNISIM;
use UNISIM.vcomponents.all;
...
BUFGCE_inst : BUFGCE
    port map (
        O  => clk_sys,    -- Clock buffer output
        CE => clk_EN,     -- Clock enable input
        I  => clk_in );   -- Clock buffer input
Slow clock signals
❑ Instead of generating very slow signals, it is often much more efficient to use a small microcontroller (a soft IP core) that runs at a relatively high clock frequency (the same as the rest of the design).
❑ Any slow actions (delays) are defined in software at no additional HW cost.
❑ It is much easier to write, modify and debug software (C or even assembly) than hardware (VHDL, Verilog) ➔ faster development.
❑ Once in place, the microcontroller can often take over other tasks (especially more complex algorithmic ones) to offload the logic.
❑ Using a microcontroller for such tasks usually results in a significant saving of hardware resources (LUTs, flip-flops).
Thank You for Your Attention!
Routing congestion analysis