RTL Simulation and Synthesis with PLDs: Digital Notes
Lecture Notes
M.TECH(VLSI & ES)
(I YEAR – I SEM)
(2022-2023)
Prepared by
Dr. Anup Dey, Professor
Course Objectives:
Unit I
High-Level Design Methodology Overview: ASIC Design Flow Using Synthesis, HDL Coding, RTL Behavioral
and Gate-Level Simulation, Logic Synthesis, Design for Testability, Design Re-Use, Behavioral Synthesis &
Concepts. Design Analyzer and Design Compiler, Target Library, Link Library, and Symbol Library, Cell
Names, Instance Names, and VHDL Libraries in the Synthesis Environment, Synthesis, Optimization and
Compile, Classic Scenarios.
Unit II
VHDL/Verilog Coding for Synthesis: General HDL Coding Issues, VHDL vs. Verilog: The Language Issue,
Finite State Machines, HDL Coding Examples, Classic Scenarios.
Unit III
Links to Layout: Motivation for Links to Layout, Floorplanning, Link to Layout Flow Using Floorplan
Manager, Creating Wire Load Models After Back-Annotation, Re-Optimizing Designs After P&R. Design for
Testability: Introduction to Test Synthesis, Test Synthesis Using Test Compiler.
Unit IV
Constraining and Optimizing Designs: Synthesis Background, Clock Specification for Synthesis, Design
Compiler Timing Reports, Commonly Used Design Compiler Commands, Strategies for Compiling Designs,
Typical Scenarios When Optimizing Designs, Guidelines for Logic Synthesis, Classic Scenarios.
Unit V
Constraining and Optimizing Designs for FSM: Finite State Machine (FSM) Synthesis, Fixing Min Delay
Violations, Technology Translation, Translating Designs with Black-Box Cells, Pad Synthesis, Classic
Scenarios.
Text Book
1. Pran Kurup, Taher Abbasi, Logic Synthesis Using Synopsys, 2/e, Pearson Education, 2007.
References
1. Andrew Rushton, VHDL for Logic Synthesis, 3/e, John Wiley & Sons, 2011.
2. Weng Fook Lee, VHDL Coding and Logic Synthesis with Synopsys, Academic Press, 2000.
3. Morris Mano, Michael D. Ciletti, Digital Design, 4/e, Prentice Hall of India, 2008.
4. Himanshu Bhatnagar, Advanced ASIC Chip Synthesis, Springer Science, 2013
Course Outcomes:
ASIC Design Flow Using Synthesis, HDL Coding, RTL Behavioral and
Gate-Level Simulation, Logic Synthesis, Design for Testability, Design
Re-Use, Behavioral Synthesis & Concepts. Design Analyzer and Design
Compiler, Target Library, Link Library, and Symbol Library, Cell Names,
Instance Names, and VHDL Libraries in the Synthesis Environment,
Synthesis, Optimization and Compile, Classic Scenarios.
Introduction
In today's world, faster and less costly ASIC chips are being designed at a much
quicker rate than before. Designers are constantly under pressure to come up with
faster-performing designs using fewer resources. This has led to the development of
many EDA tools that help designers complete a design in a much shorter time
frame. These EDA tools are based on the concept of designing ASIC components
using a Hardware Description Language (HDL). Today, a designer does not need
to spend much time manually drawing the circuitry involved in a design but instead
can write synthesizable HDL code. The two HDLs most commonly used in the
ASIC industry for synthesis are Very High-Speed Integrated Circuit Hardware
Description Language (VHDL) and Verilog.
Definition
RTL Simulation
Logic synthesis
Logic synthesis is the process of translating and mapping RTL code written in an
HDL (such as Verilog or VHDL) into a technology-specific gate-level
representation. An RTL model of a design is automatically turned into a
gate-level netlist by a standard EDA tool. In other words, an abstract
specification of the desired circuit behavior, typically at the register transfer
level (RTL), is turned into a design implementation in terms of logic gates by
a computer program called a synthesis tool. Common examples include synthesis of
designs specified in hardware description languages. Logic synthesis is one
aspect of electronic design automation.
The above seven steps are usually iterative, as shown in the figure. For
example, on performing a functional simulation of the source HDL code, one
might find that the code does not exactly match the desired functional
behavior. In such cases, one must return to modify the source code.
Also, after synthesis, it is possible that the netlist does not meet the timing
requirements of the clock. This implies that one must either modify the source
HDL or attempt alternate synthesis strategies. Similarly, after performing place and
route, it is required to back-annotate delay values to incorporate real-world
delays. This is followed by in-place optimization (IPO) of the netlist to meet
timing in the presence of routing delays.
After a block-level schematic capture of the design, the next step involves
HDL coding. The style of HDL coding often has a direct impact on the
results the synthesis tool delivers. A sound knowledge of how the
synthesis tool works will help the designer write better synthesizable code. For
example, one common problem arises due to partitioning of designs.
Typically, designers partition designs based on functionality. During the
integration of different modules in synthesis, one might find a large amount of
logic in the critical path. This critical path most often traverses several
hierarchical boundaries. In a typical design team scenario, these blocks are
usually designed by different engineers, thereby compounding the problem.
Logic synthesis provides the best results when the critical path lies in one
hierarchical block as opposed to traversing multiple hierarchical blocks. In
such situations, it is often required to modify the hierarchy in the source HDL
code and re-optimize the design, or to modify the design hierarchy through
Synopsys tool-specific scripts.
Example 1:
D Flip flop
VHDL Code
Verilog Code
The example shows VHDL and Verilog code which, when synthesized, infers
a positive-edge-triggered flip-flop. If one desires a negative-edge-triggered
flip-flop, the obvious change would be to replace the posedge declaration
with negedge in the Verilog code (or clock='1' with clock='0' in
VHDL). While this might appear to be an obvious solution, inferring a
negative-edge-triggered flip-flop is largely dependent on an appropriate
library cell being available.
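The listing for this example is not reproduced in these notes. As a minimal sketch of the kind of code the paragraph above describes, the following Verilog would be expected to infer a positive-edge-triggered D flip-flop. The module and port names are assumptions made for illustration only.

```verilog
// Minimal sketch: positive-edge-triggered D flip-flop.
// Module name and port names are illustrative assumptions.
module dff (
    input  wire clk,  // clock
    input  wire d,    // data input
    output reg  q     // registered output
);
    // Replacing "posedge" with "negedge" requests a negative-edge-triggered
    // flip-flop; whether such a cell is inferred directly depends on the
    // cells available in the target library.
    always @(posedge clk)
        q <= d;
endmodule
```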
Example 2
AND gate
VHDL
Verilog
The above example infers an AND gate when synthesized. In this case, the
HDL code exactly matches the logic inferred. In general, to infer an AND gate, the
recommended coding style for synthesis is the direct assignment (out = a & b;)
rather than an if statement.
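The two coding styles compared above can be sketched as follows. This is an illustrative reconstruction, not the original listing; the module names and port names are assumptions. Both modules would normally synthesize to a single two-input AND gate.

```verilog
// Style 1: AND gate described with an if statement (works, but verbose).
module and_if (
    input  wire a,
    input  wire b,
    output reg  out
);
    always @(a or b) begin
        if (a == 1'b1 && b == 1'b1)
            out = 1'b1;
        else
            out = 1'b0;   // else branch keeps the logic purely combinational
    end
endmodule

// Style 2: the recommended form, a direct continuous assignment.
module and_assign (
    input  wire a,
    input  wire b,
    output wire out
);
    assign out = a & b;
endmodule
```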
If one has another file of expected results, one can quite easily compare
the two files to ensure that the two match.
In order to simulate a synthesized gate-level netlist, VHDL simulation models
of the technology library cells are required. These can be of four kinds: unit
delay structural model (UDSM), full-timing structural model (FTSM), full-timing
behavioral model (FTBM) and full-timing optimized gate-level
simulation (FTGS). While UDSM and FTSM are used for functional verification,
the FTBM is used for accurate, detailed timing verification and the
FTGS library for fast, sign-off-quality timing verification.
Logic Synthesis
Synthesis, as referred to in present-day IC design, can be
broadly divided into logic synthesis and high-level synthesis.
1. Full scan
2. Partial scan
If all the sequential cells are replaced by scan cells, then it is called a Full
Scan test methodology. In this case, the Automatic Test Pattern Generation
(ATPG) algorithm is combinational.
Design Re-Use
Several design houses rely on re-use of large blocks of designs when
building newer versions of existing chips. In some cases, it might involve
re-targeting an existing design to a new technology library. For others, it might
involve minor tweaks to existing designs. In general, the strategy used to realize
these changes has a significant impact on the turnaround time.
Design re-use is an
effective means to achieve fast turnaround on complex designs. A clear
advantage in time is gained by re-use of designs from a library of parts rather
than designing from scratch. The upcoming generation of complex systems
will require a widespread availability of re-usable parts. Synthesis provides a
very efficient and flexible mechanism to build a library of re-usable
components. This is essentially the Synopsys DesignWare (DW) approach.
Internal to the DC, these libraries are referenced as DW01, DW02 and
DW03 respectively. In addition, there exists the generic GTECH library.
When a source HDL is read into DC, the design is converted to a netlist of
GTECH components and inferred DesignWare parts.
The GTECH library, like the DW libraries, is a technology-independent
library that helps users develop technology-independent parts. The GTECH
library, called "gtech.db", contains common logic elements such as basic logic
gates and flip-flops. In addition, gtech.db also contains a half adder and a
full adder.
should invoke the DC and show the dc_shell prompt. If not running on a
sparc, then use the corresponding architecture. A similar prompt can be seen
from the Design Analyzer command window. The command window can be
invoked from the Design Analyzer from the Setup -> Command Window
pull down menu.
When using the Design Analyzer, the command window helps the user
understand the commands executed when using the menus in the DA. The
DA, in turn, can be invoked by typing the following:
Startup Files
The DC, when invoked, reads the .synopsys_dc.setup file. The Synopsys
directory tree has a system-wide .synopsys_dc.setup file. This file is located
in the $SYNOPSYS/admin/setup directory. In addition to this system-wide file,
the user can have a local .synopsys_dc.setup file in the current working
directory or in the home directory.
Example
DC follows the paths in the search_path variable from left to right. For
example, if a link_library file, link.db exists in the lib directory and in the
vhdl directory, then the link.db file found in the lib directory is used.
If the libraries are assigned correctly and the search_path indicates the
location of these files, then, on invoking the DA, the Setup -> Default pull-down
menu should indicate the specified target, link and symbol libraries.
The target library is the ASIC vendor library whose cells are used to generate a
netlist for the design described in HDL during synthesis. The HDL code is
"mapped" to cells from this library.
Example :
The example shows a simple Verilog netlist written out by the DC. This example
should help to better understand the link library concept.
The netlist has four instances (U5, U6, U7, U8) of the IVA library cell. If one
wished to read the above netlist into DC and execute a command like
report_timing, then the link library declared by the link_library variable
must have a description for the IVA library cell. If DC is unable to find a
description for the IVA cell in the link library, the tool will be "unable to
resolve the reference IVA".
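The netlist itself is missing from these notes. A hypothetical netlist of the kind described, with four instances of the IVA cell, might look like the sketch below. The design name inv4, the bus names, and the IVA pin names A and Z are assumptions. The small IVA module at the top is only a stand-in so that the sketch is self-contained for simulation; in DC, the description of IVA would come from the link library rather than from HDL.

```verilog
// Stand-in simulation model of the IVA inverter cell (assumed pins A, Z).
// In DC this description is supplied by the link library (.db file).
module IVA (
    input  wire A,
    output wire Z
);
    assign Z = ~A;
endmodule

// Hypothetical gate-level netlist: one reference (IVA), four instances.
module inv4 (a, z);
    input  [3:0] a;
    output [3:0] z;

    IVA U5 (.A(a[0]), .Z(z[0]));
    IVA U6 (.A(a[1]), .Z(z[1]));
    IVA U7 (.A(a[2]), .Z(z[2]));
    IVA U8 (.A(a[3]), .Z(z[3]));
endmodule
```

If the stand-in IVA module is removed, a simulator reports an unresolved module reference, which is analogous to DC being unable to resolve the IVA reference when the link library lacks the cell.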
For example, on reading the above netlist into DC, followed by the link
command, the tool looks for the IVA cell in the link library to "resolve the
reference" IVA. Once it finds IVA in the link_library, the tool then looks for the
IVA symbol in the symbol library. Viewing the schematic in the DA can be done
by double-clicking on the design icon. However, the equivalent command
executed is the create_schematic command. If it is unable to find cells in the
symbol library, DA uses the generic symbol library (generic.sdb) to create
schematics.
The technology and symbol libraries must match exactly, including case, for the
cell names and their pin names. In other words, if the pin names of the IVA cell in
the technology library do not match the pin names of the cell IVA in the
symbol library, the tool will not be able to use the IVA symbol. In such a
scenario, the tool uses symbols from the Synopsys default generic library.
The compare_lib command in DC is a fast check mechanism to determine
any differences that might exist between the symbol library and the technology
library.
The target and link libraries have the .db extension, while symbol libraries have
the .sdb extension. Technology libraries are generated by the Synopsys Library
Compiler from .lib files. These in turn are text files created by the ASIC
vendor in Synopsys Library Compiler syntax. The ASIC vendor provides the
user with the .db and the .sdb files.
In DC terminology, cell names and instance names are the same. For
example, if a design uses an IVA library cell, then the tool provides it
with an instance name (or cell name) such as U1. IVA is the reference and U1
is an instance of the IVA reference, which in turn is a library cell.
VHDL Example
Verilog Example
The report_reference shows just one reference, while the report_cell shows
four instances or cells.
The VHDL language supports libraries. That is, frequently used functions
and component declarations are stored in packages, and these packages are
analyzed into libraries. The packages are then referenced via the "use" clause in
VHDL. A package must be analyzed prior to being used in another design.
The package can be a part of the VHDL code or a separate VHDL file.
If the package is a separate file, then it must be analyzed prior to being used
in a design. In general, it is recommended that one maintain separate package
files and declare them using the "use" clause when required in VHDL design
files. Since Verilog does not have a configuration management mechanism
like VHDL, this is not applicable to Verilog.
The example shows the dc_shell script that maps the States VHDL library
to the UNIX directory "lib" in the current working directory. On executing
the analyze command, the intermediate files are placed in the "lib" directory.
This is extremely useful because it prevents the working directory from being
cluttered with files. If the analyze command is used without the -lib States
option, then by default the intermediate files are written to the work library.
The work library is mapped to the current working directory by default. To
override this default, one must use the define_design_lib command to map
the work library to a particular UNIX directory. In general, it is recommended
that one create a directory called "Work" in the current working directory and
map the "work" library to it. To verify the above steps, execute the
report_design_lib command at the dc_shell prompt:
Synthesis is the process of achieving an optimal gate-level netlist from
HDL code. Therefore, synthesis includes both reading in the source HDL and
optimization of the resulting design.
Classic Scenarios
Case 1: You are linking a design and DC issues one of these warnings:
If it is a library cell, then the target technology library or the link library
must have a description for it. If it is a sub design, then the source HDL for
the sub design must be read into DC, prior to reading in this file. If a db file
exists for this sub-design, then the search path must include a path to this db
file.
Case 3: DC issues the following error on reading in the source VHDL into
DC. The library declaration in the VHDL file is as shown below.
Solution: This error occurs when the VHDL design library "test" is not
mapped to a valid UNIX directory. Use the following command at the dc_shell
prompt:
Case 6: When reading in the design database (db file), DC issues the
following error. What could be the cause?
Error: db file is corrupted. (EXPT-18)
Solution: It is likely that you are using different versions of the Synopsys tools. In
other words, the db file was generated in version 3.1b of the DC while you
are trying to read it into v3.0b. In short, db files written by an older version
can be read by newer versions, but not the reverse: db files generated by 3.0x can
be read into subsequent versions of the DC such as 3.1x, but not vice versa. To
check the version of DC, use the following command:
Case 7: In DA, the schematic shows mere boxes instead of the actual
symbols for gates.
Solution: Ensure that the symbol library (.sdb file extension) for the
technology library is available and specified by the symbol_library variable.
Also, verify that the search_path variable in your .synopsys_dc.setup file includes
the path to where the symbol library file is located. After doing the
above, execute the following steps to verify:
If DA still shows boxes instead of actual symbols, it is likely that the symbol
library and the technology library do not match in case for the pin names or
cell names.
Unit II
VHDL/Verilog Coding for Synthesis
General HDL Coding Issues, VHDL vs. Verilog: The Language Issue, Finite
State Machines, HDL Coding Examples, Classic Scenarios.
Introduction
Design issues in the coding of state machines, such as state encoding,
registered outputs, synchronous resets, asynchronous resets and "fail-safe"
behavior, are crucial for effective synthesis of state
machines. The design and synthesis of clocked synchronous
state machines using the DC is discussed.
Mealy Machines
A sequential state machine whose outputs depend on both the current state and the
inputs is called a Mealy machine.
Functionality can be expressed as:
Next state (N) = function [current state (P), inputs (I)]
Outputs (O) = function [current state (P), inputs (I)]
Moore machine
A Moore machine is one in which the outputs are a function of only the current
state and are independent of the inputs:
Outputs (O) = function [current state (P)]
State Encoding
The concepts of current state and next state are vital to any state machine.
Flip flops in a state machine serve as memory elements keeping track of the
current state. Each possible state of the state machine can be assigned a unique
binary code. This is called state encoding.
At any given instant, the current state of the state machine is determined by
the binary values in the flip-flops and their corresponding encoding. Thus n
flip-flops can encode a maximum of 2^n states. Alternatively, one can assign one
flip-flop to each state, so that n flip-flops represent n states. This is called the
one-hot method of encoding. Since the state machine can only be in one state at a
given time, the output of only one of the flip-flops is true, hence the name one-hot.
For example, a four-state machine needs two flip-flops with binary encoding but
four with one-hot encoding, so one-hot encoding can result in greater silicon area.
When X = 0, the current state of the state machine remains unchanged and output Z
remains at 0.
When X = 1, the state machine makes a transition from one state to the next binary
state, i.e., 00 -> 01 -> 10 -> 11 -> 00...
The output Z is equal to 1 only when the state is 11 and the input X is equal to 1;
otherwise Z is equal to 0, as shown in the state transition table and state transition
diagram.
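The coding examples for this state machine are not reproduced in these notes. As a minimal sketch of the behavior described above, the machine could be written in Verilog as a Mealy machine with binary state encoding. The module name, the signal names, and the choice of an active-high asynchronous reset are assumptions made for illustration.

```verilog
// Sketch of the four-state Mealy machine described above.
// Assumptions: module/port names and an active-high asynchronous reset.
module mealy_counter (
    input  wire clk,
    input  wire reset,
    input  wire x,
    output reg  z
);
    // Binary state encoding: two flip-flops encode four states.
    localparam [1:0] S0 = 2'b00, S1 = 2'b01, S2 = 2'b10, S3 = 2'b11;

    reg [1:0] state, next_state;

    // Combinational part: next state and output depend on state AND input.
    always @(*) begin
        next_state = state;   // default: hold the state when x = 0
        z          = 1'b0;    // default: output low
        if (x) begin
            case (state)
                S0: next_state = S1;
                S1: next_state = S2;
                S2: next_state = S3;
                S3: begin
                    next_state = S0;
                    z          = 1'b1;  // Z = 1 only in state 11 with X = 1
                end
                default: next_state = S0;
            endcase
        end
    end

    // Sequential part: the state register.
    always @(posedge clk or posedge reset) begin
        if (reset)
            state <= S0;
        else
            state <= next_state;
    end
endmodule
```

Keeping the combinational and sequential parts in separate always blocks mirrors the two-process style that the notes recommend for VHDL.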
Coding Example 1:
Coding Example 2:
VHDL Example
Verilog Example
The example shows another approach to coding the same state machine. This form of
coding tells DC that the design is a state machine without having to set the state
vectors after reading in the design. This is possible by use of the state_vector
attribute. The state_vector attribute on the architecture is
assigned a value which is the name of the state signal. The design has been
divided into two separate processes. The first process, COMB, describes the
combinational part of the design, while the second process, SYNCH,
describes the synchronous part of the design.
Registered Outputs
Coding Example 3:
VHDL Example
Verilog Example
Coding example 3 shows the use of enumerated types with the
enum_encoding attribute. The declaration "type state is (S0, S1, S2, S3);" defines
the list of all possible values of the type state. By default, the enumeration values
are encoded into bit_vectors, the first enumerated literal S0 being assigned 0,
S1 being assigned 1, and so on. By using the enum_encoding attribute, the encoding
of the different states is declared in the code. This approach to coding FSMs is the
recommended approach. The combinational and sequential parts are in separate
processes. Further, by the use of enum_encoding, one has control over the states
and their encodings.
One-hot Encoding
One-hot encoding often gives the fastest FSM implementation, because the
next-state decoding logic is simple, at the cost of one flip-flop per state. In DC,
the user will have to declare the FSM encoding style using the
set_fsm_encoding_style command.
Coding Example 4:
VHDL Example
Verilog Example
Coding example 4 shows the same state machine described in example 1
using both the state_vector attribute and the enum_encoding attribute.
Enum_encoding has been used such that the first flip-flop, when on, implies the
state S0, the second flip-flop implies state S1, and so on. In general, the one-hot
encoding style involves the use of one flip-flop for each state, the current state
being determined by the flip-flop which is on.
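As an illustration of the one-hot idea described above, the sketch below hard-codes a one-hot encoding for a four-state machine in Verilog, with one flip-flop per state. It does not use the state_vector or enum_encoding attributes from the original VHDL example; the module name, port names, reset style, and the recovery behavior on illegal codes are all assumptions.

```verilog
// Sketch: four-state machine with hand-written one-hot state encoding.
// Assumptions: names, active-high asynchronous reset, recovery to S0.
module onehot_counter (
    input  wire clk,
    input  wire reset,
    input  wire x,
    output reg  z
);
    // One flip-flop per state; exactly one bit is set in each legal code.
    localparam [3:0] S0 = 4'b0001,
                     S1 = 4'b0010,
                     S2 = 4'b0100,
                     S3 = 4'b1000;

    reg [3:0] state, next_state;

    always @(*) begin
        next_state = state;   // hold the state when x = 0
        z          = 1'b0;
        case (state)
            S0: if (x) next_state = S1;
            S1: if (x) next_state = S2;
            S2: if (x) next_state = S3;
            S3: if (x) begin
                    next_state = S0;
                    z          = 1'b1;
                end
            default: next_state = S0;  // illegal code: return to a known state
        endcase
    end

    always @(posedge clk or posedge reset) begin
        if (reset)
            state <= S0;
        else
            state <= next_state;
    end
endmodule
```

The default branch is one simple way to obtain "fail-safe" behavior: with four flip-flops there are twelve illegal codes, and each of them leads back to S0 on the next clock edge.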
This section discusses some basic issues related to HDL coding for synthesis, such as
VHDL types, unwanted latches, variables and signals, and priority encoding. For a
certain desired functionality, it is often possible to code HDL in a number of
different ways. However, there are several guidelines that one can follow to
develop a consistent coding style for synthesis.
VHDL Types
It is recommended that std_logic types be used for port declarations in the
entity. This convention avoids the need for conversion functions when integrating
different levels of synthesized netlists back into the design hierarchy. The std_logic
type is declared in the IEEE std_logic_1164 package. Some of the examples in this
chapter use the type "bit" for the sake of simplicity and easy understanding.
The port mode "buffer" can be used when an output must also be read internally.
Once a port is declared as a buffer, all references to that output port must be
declared as buffer throughout the hierarchy. This is often overlooked, and one can
run into problems during integration of different blocks. For the sake of
consistency, it is recommended that one avoid the use of buffer ports.
The coding example shows an effective way to avoid the use of buffer ports
by using internal signal declarations.
Unwanted latches
Ensure that all signals are initialized. Further, when using case statements
or nested if statements ensure that they are fully defined. A full specification will
prevent latches from being inferred.
VHDL Example
Verilog Example
In the above example, notice that the expected result when clk is not equal
to '1' is not specified. DC interprets this to mean "when the clk = '1' condition is
not satisfied, retain the previous value of q". Hence, latches are inferred. After
reading in the HDL, one does not have to compile the design to realize that unwanted
latches have been inferred.
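The latch-inferring pattern described above, and one way to avoid it, can be sketched in Verilog as follows. These modules are illustrative reconstructions with assumed names, not the original listings.

```verilog
// Incomplete if statement: no value is given for q when clk is 0,
// so q must hold its previous value and a latch is inferred.
module latch_inferred (
    input  wire clk,
    input  wire d,
    output reg  q
);
    always @(clk or d) begin
        if (clk)
            q = d;
        // missing else: q is not assigned on this path
    end
endmodule

// Fully specified logic: every path assigns q, so no latch is inferred.
module no_latch (
    input  wire sel,
    input  wire d,
    output reg  q
);
    always @(sel or d) begin
        if (sel)
            q = d;
        else
            q = 1'b0;
    end
endmodule
```

The same rule applies to case statements: either cover every value of the selector or provide a default branch that assigns every output.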
Motivation for Links to Layout, Floorplanning, Link to Layout Flow Using
Floorplan Manager, Creating Wire Load Models After Back-Annotation,
Re-Optimizing Designs After P&R. Design for Testability: Introduction to Test
Synthesis, Test Synthesis Using Test Compiler
Floor planning
Today's increasingly large and complex digital integrated circuit (IC) and
system-on-chip (SoC) designs often contain tens of millions of logic gates.
Ensuring that these designs will function as required demands the use of chip-level
floorplanning.
Stage 1: First of all we have the system architects who partition the design into
functional blocks, associate estimated gate-counts and areas with these blocks, and
establish an initial floorplan with associated chip-level and block-level timing
constraints.
Stage 2: Next we have a group of RTL design engineers, each of whom deals with
their own block. Each RTL block will eventually equate to around 400K gates.
Once a block of RTL has been created, InTime contends that synthesis and IPO
will take about 7 hours, followed by 3 hours to perform timing analysis and
generate a timing report (about 10 hours in total).
Stage 3: Finally, we have the system integrators, who take all of the blocks from
the RTL engineers along with their more accurate gate-count, area, and timing
values, use these to generate a more accurate floorplan, and then use this to
perform chip-level synthesis/IPO and timing analysis.
Using implementation tools to perform these activities makes the cycle times
through conventional flows too long and problematic.
In its simplest form, RTL floorplanning refers to the ability to take RTL that is
ready for synthesis following functional signoff, and to use this RTL to provide
gate-count, area, and timing estimations that are sufficiently accurate to perform
meaningful floorplanning analysis.
In order to satisfy the requirements for RTL floorplanning, InTime has two related
applications called Time Planner and Time Director. Time Planner is used by
system architects and system integrators to perform chip-level floorplanning and
analysis functions. Time Director is used by RTL design engineers to provide gate-
count, area, and timing estimations at the block level.
Stage 1: The flow starts as the system architect uses Time Planner to establish the
initial floorplan and to generate the associated chip and block-level timing budgets
and constraints. At this stage of the process, Time Planner will accept gate-count,
area, and timing estimates for blocks for which RTL is not yet available, and it will
generate gate-count, area, and timing predictions for blocks whose RTL is
available.
Stage 2: As for a conventional flow, the RTL design engineers create and
functionally verify the RTL corresponding to their blocks. In the conventional
flow, however, the engineers would now have to run compute-intensive and time-
consuming implementation tools that take about 10 hours per iteration in order to
obtain accurate timing information.
After the circuit partitioning phase, the area occupied by each block
(subcircuit) can be estimated, possible shapes of the blocks can be ascertained and
the number of terminals (pins) required by each block is known. In addition, the
netlist specifying the connections between the blocks is also available. In order to
complete the layout, we need to assign a specific shape to a block and arrange the
blocks on the layout surface and interconnect their pins according to the netlist.
The arrangement of blocks is done in two phases: the floorplanning phase, which
consists of planning and sizing of blocks and interconnect, and the placement
phase, which assigns a specific location to each block. The interconnection is completed
in the routing phase. In the placement phase, blocks are positioned on a layout
surface in such a fashion that no two blocks overlap and enough space is
left on the layout surface to complete the interconnections.
The blocks are positioned so as to minimize the total area of the layout. In
addition, the locations of pins on each block are also determined. The input to the
Floorplanning phase is a set of blocks, the area of each block, possible shapes of
each block and the number of terminals for each block and the netlist.
If the layout of the circuit within a block has been completed then the
dimensions (shape) of the block are also known. The blocks for which the
dimensions are known are called fixed blocks and the blocks for which dimensions
are yet to be determined are called flexible blocks. Thus we need to determine an
appropriate shape for each block (if shape is not known), location of each block on
the layout surface, and determine the locations of pins on the boundary of the
blocks. The problem of assigning locations to fixed blocks on a layout surface is
called the Placement problem. If some or all of the blocks are flexible then the
problem is called the floorplanning problem. Hence, the placement problem is a
restricted version of the floorplanning problem. If one asks for planning of the
interconnect in addition to floorplanning, then it is referred to as the chip planning
problem. Thus floorplanning is a restricted version of the chip planning problem.
RTL simulators provide information about a chip's speed but are more
useful for functional verification and do not provide the kind of information
needed for design planning. Designing at the RTL and at the gate level are very
different. RTL design descriptions, such as the Verilog example in Figure a, include
logic operation on a clock-cycle basis along with an implied design architecture. A
logic-synthesis tool takes the RTL description and converts the design to a
gate-level description (Figure b). Synthesis preserves the architecture and attempts
to meet user-defined constraints, such as area and timing, in the gate-level description.
Although statistical wire-load models may have been adequate for most
designs at geometries larger than 0.5 µm, with deep-submicron processes at 0.35 µm
and smaller, these models are inaccurate. After you physically implement a design with
place-and-route tools, the resulting logic may have very different timing
characteristics, resulting in either a waste of silicon or a design that fails to meet
timing requirements. The former problem wastes money; the latter definitely
means redesign, resynthesis, and another place-and-route run.
In addition, RTL estimation helps you decide which cell library to use for
your design. You can supply information to the logic-synthesis tool that can help
achieve timing convergence and minimize synthesis and place-and-route iterations.
Finally, you can get an estimate of chip size for a specific process technology that,
along with speed and power estimates, helps you decide which chip package to use
and gives an indication of chip cost.
Back-Annotation
The DC can both read in SDF as well as output SDF. The SDF file written out
from the DC can be read into Synopsys VSS (and other simulation tools) for
simulation. Shown below are the steps to write SDF from DC after one has
initially read in the source VHDL.
If a Verilog simulator is used, one can write out the SDF without
any name changes, along with the Verilog netlist from DC, for simulation. Also, after
place and route, if one can generate layout delays in SDF, DC provides a
means to back-annotate the SDF information. This can be done using the
read_timing command as shown below.
After the floorplanning is complete, and before the design is passed to physical
layout (place and route), the design's timing behavior can be verified once more
within the synthesis environment. This time, the more accurate net parasitics
(capacitance), net delays, and cell delays are used in place of the values estimated
by DC. The estimates (provided by a floorplanner) for the wire capacitance, net
delays, and cell delays can be back-annotated into DC.
Scan Styles
The four commonly used scan styles are as follows:
1. Multiplexed Flip-flop
2. Level Sensitive Scan Design
3. Clocked Scan Cell
4. Auxiliary LSSD cell
Multiplexed Flip-flop
The figure shows a multiplexed flip-flop scan cell. The multiplexed flip-flop
scan style is supported by most ASIC vendors. For a
multiplexed flip-flop scan style, the scan ports required are the scan-in, scan-enable
and scan-out ports. The normal clock is also used in test mode in this scan style.
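The behavior of a multiplexed flip-flop scan cell can be sketched in Verilog as below. In practice the scan cell is a library cell substituted by the test synthesis tool rather than hand-written RTL, so this is only a behavioral model; the module and port names are assumptions.

```verilog
// Behavioral sketch of a multiplexed flip-flop scan cell.
// Assumed names; real scan cells come from the ASIC vendor library.
module scan_dff (
    input  wire clk,          // the normal clock, also used in test mode
    input  wire d,            // functional data input
    input  wire scan_in,      // serial test data input
    input  wire scan_enable,  // 1 = scan (shift) mode, 0 = normal mode
    output reg  q             // functional output, also used as scan-out
);
    // A 2-to-1 multiplexer in front of the D input selects the data source.
    always @(posedge clk)
        q <= scan_enable ? scan_in : d;
endmodule
```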
Scan Insertion
Scan cells have two different modes of operation : The Normal Mode and The Scan
Mode. In the normal mode, the scan cell's functionality is same as that of the
sequential non-scan cell. In the scan mode, the scan cells are linked in the form of
a shift register.
When scan cells are linked to form a scan chain as shown in the above figure,
all the scan cells are controllable and observable. Since shifting of data into the
scan chain is performed serially, it takes N clock cycles to shift a pattern into the
scan chain, where N is the maximum length of the scan chain. Configurations with
multiple scan chains are supported by most synthesis tools.
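To make the serial shifting concrete, the following behavioral sketch models a single scan chain of N multiplexed scan cells as a shift register. The module name, port names, and default chain length are assumptions, and the shift expression requires N to be at least 2.

```verilog
// Behavioral sketch of an N-cell scan chain (assumes N >= 2).
module scan_chain #(
    parameter N = 4           // number of scan cells in the chain
) (
    input  wire         clk,
    input  wire         scan_enable,  // 1 = shift mode, 0 = normal mode
    input  wire         scan_in,      // serial input to the first cell
    input  wire [N-1:0] d,            // functional inputs in normal mode
    output reg  [N-1:0] q,            // cell outputs
    output wire         scan_out      // serial output of the last cell
);
    assign scan_out = q[N-1];

    always @(posedge clk) begin
        if (scan_enable)
            q <= {q[N-2:0], scan_in};  // shift one position per clock
        else
            q <= d;                    // capture functional data
    end
endmodule
```

With scan_enable held high, a bit presented on scan_in needs N clock edges to reach scan_out, which is why loading a full pattern takes N cycles.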
Scan insertion results in design overheads such as, the use of extra scan
ports, an increase in silicon area due to use of scan flops, and greater timing delays
due to the insertion of the scan cells for the sequential non-scan cells.It is possible
to reduce the port overheads by sharing the scan ports with functional ports.
After the scan logic is inserted in the design, an ATPG algorithm is used to
generate test patterns. The full-scan ATPG algorithm is combinational, while the
partial-scan ATPG algorithm is sequential. Test patterns can be generated in the
format supported by the simulator used to simulate the test vectors. The common
test formats are VHDL, Verilog, and TSSI.
Some of the critical issues which involve the ASIC vendor are:
1. What scan style does the ASIC vendor support?
ASIC vendors usually support only some scan styles and not all the available scan
styles. Most ASIC vendors support the multiplexed scan flip-flop style.
2. How many clocks are supported by the tester when in test mode? Is there a limit
on the number of waveforms supported by the tester?
3. Is there a limit on the number of scan chains allowed? Most ASIC vendors
impose a restriction on the number of scan chains. Is there a limit on the length
of the scan chain?
4. Is sharing of functional ports with test_scan_in and test_scan_out ports
supported?
5. Does the vendor require that during scan-shift all the outputs have no switching,
i.e. that all outputs are three-state outputs?
6. Does the vendor library support automatic pad synthesis?
7. What is the format of the test vectors required by the vendor?
8. Does the vendor accept parallel vectors or serial vectors for sign-off simulation?
9. What is the maximum number of scan bits supported by the vendor's tester?
The total number of scan bits is simply the number of scan vectors multiplied
by the number of flip-flops in the scan chain.
10. Do the formatted vector files require a specific naming convention?
11. Is there a limit on the size of the vector files?
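The scan-bit count mentioned in item 9 is a simple product, and a quick check against the tester limit can be sketched in Python. The function names, the vector count, the chain length, and the tester limit below are all hypothetical values chosen for illustration.

```python
def total_scan_bits(num_scan_vectors, flops_in_chain):
    """Total scan bits = number of scan vectors x flip-flops in the scan chain."""
    return num_scan_vectors * flops_in_chain


def fits_tester(num_scan_vectors, flops_in_chain, tester_limit_bits):
    """True if the total scan bits do not exceed the tester's limit."""
    return total_scan_bits(num_scan_vectors, flops_in_chain) <= tester_limit_bits


# Hypothetical numbers: 500 vectors, 1200 scan flops, 1 Mbit tester memory.
bits = total_scan_bits(500, 1200)
print(bits, fits_tester(500, 1200, 1_000_000))
```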
check_test: This command infers a default test protocol and performs a DRC check
by simulating the test protocol. One must execute the check_test command before
scan insertion as well as after scan insertion.
create_test_clock: This command is similar to the create_clock command for the
DC. TC automatically infers clocks during check_test by backtracking from the
clock pins of registers. The create_test_clock command is used to specify the
waveform and clock period in the test mode.
insert_test: The insert_test command replaces the non-scan sequential cells with
scan equivalent cells and connects the scan cells to form a scan chain.
create_test_patterns: This command is used to generate the test patterns for the
specified design. The command also writes out a .vdb file in the current working
directory.
Identifying Scan Ports
The TC uses the signal_type attribute to identify scan ports. Functional ports
can be identified as scan ports by assigning this attribute using the
set_signal_type command. The TC creates scan ports automatically if no functional
ports are identified with the signal_type attribute. In the muxed flip-flop scan
style, where the normal clock is used as the test clock, one must not associate a
signal_type attribute of "test_clock" with the clock port.
2. Set your current_design to the top level and specify the test methodology and
scan-style.
3. Another simpler alternative is to use the "Test Smart Compile" approach. In this
approach the user must specify the scan style before compile. The Test Smart
Compile is turned on by specifying both the scan style (using the set_scan_style
command) and the test methodology (using the set_test_methodology command)
before compile.
6. Analyze the testability of the design prior to scan insertion using the check_test
command. A default test protocol is inferred and simulated on executing the
check_test command.
8. Set the current_design to each of the different sub-designs and specify the scan
chain allocation.
11. The next phase involves testability analysis after scan insertion. Analyze
the testability of your design using the check_test command.
12. Execute ATPG on a sample fault list to check for any ATPG conflicts which
might exist. The command shown below generates test patterns for 5% of the faults
in the design:
13. The next step is the JTAG synthesis phase. Group all the core logic, except the
three-state cells associated with three-state and bi-directional ports, into a separate
level of hierarchy.
The variable three_state_cell_list is a user-defined variable which lists the
instances of three-state cells.
14. Set current_design to the top level of the design hierarchy.
15. Specify the order of the boundary scan register (BSR) cells using the
set_jtag_port_routing_order command.
16. Perform JTAG insertion with the required options. Use the -no_pads option
if the ASIC vendor library does not have pad cells.
17. Perform testability analysis after JTAG insertion using the check_test command.
19. Group all the JTAG logic into a separate level of hierarchy, and assign a
test_dont_fault attribute to it, so that it is not considered in the fault coverage
calculation. The design consists of the core instance surrounded by all the JTAG
logic and three-state buffers.
20. In order to control and specify the characteristics of the desired pad cell, use
the set_pad_type command. The DC inserts pads for all ports in the design which
have the port_is_pad attribute. This attribute can be applied using the
set_port_is_pad command.
21. Execute ATPG using the following command. This creates a .vdb file which is
a binary vector file.
22. Finally, generate the test vectors in the required format.
Unit IV
Constraining and Optimizing Designs
Introduction
After a design has been described in HDL and functionally simulated, the next
step involves logic synthesis using DC. The core of the synthesis process is the
constraints specified on designs and the timing reports generated by DC.
Synthesis Background
The Design Compiler attempts to meet two basic constraints or goals for
optimization in the following default order of priority:
1. Design Rule Constraints
2. Optimization Constraints
Figure 1 shows the two types of synthesis constraints and the related dc_shell
commands. Optimization constraints are user-specified constraints. The two
optimization constraints are speed and area constraints. In addition to optimization
constraints, the synthesis tool is required to meet another set of constraints called
Design Rule Constraints (DRC). DRC are constraints imposed upon the design by
requirements specified in the target ASIC vendor library.
Consider an example
Max_fanout
The example shows the output pin of an AND gate driving the input pins of three
inverters, as shown in the figure above. The input pin of each of these three gates
has a fanout_load attribute specified in the library. The sum of the fanout loads of
the three input pins must not exceed the max_fanout of the output pin of the AND
gate. It is typically an integer, although each of these fanout_load values implies
a certain standard load. One can find the fanout load on a specific
input pin of a library cell (say, AND2 in library libA) using the following
dc_shell command:
Instead, if the library has a default fanout_load attribute set on the technology
library, we can find this value using the following command:
Max_transition
Max_transition is the longest time allowed for a transition from logic level 0 to 1,
or vice versa, for an entire design or for a specific net in a design. To be more
specific, it is the RC time, which is the product of the resistance (R) and the
capacitive load (C). In the DC terminology, max_transition can be defined as the
product of rise/fall resistance and the capacitive load on a net. When the user
specifies a max_transition constraint in addition to the one already specified in
the technology library, the more restrictive constraint will apply.
For example, if the library has a max_transition of 5 and the user were to specify
a max_transition of 3, then the DC will try to meet a max_transition requirement
of 3.
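The two design rule checks described so far (summing fanout loads against max_fanout, and choosing the more restrictive max_transition) can be sketched in Python. This is only an illustration of the arithmetic, not how DC is invoked; the function names and the fanout_load values of 1 per inverter input are assumptions.

```python
def violates_max_fanout(fanout_loads, max_fanout):
    """True if the summed fanout_load of the driven pins exceeds max_fanout."""
    return sum(fanout_loads) > max_fanout


def effective_max_transition(library_limit, user_limit=None):
    """The more restrictive (smaller) of the library and user limits applies."""
    if user_limit is None:
        return library_limit
    return min(library_limit, user_limit)


def transition_time(resistance, capacitance):
    """RC estimate of transition time: rise/fall resistance x capacitive load."""
    return resistance * capacitance


# AND gate output driving three inverter inputs, each with an assumed fanout_load of 1.
print(violates_max_fanout([1, 1, 1], max_fanout=3))  # sum equals the limit
print(effective_max_transition(5, 3))                # library 5, user 3
```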
Max_capacitance
The max_transition design rule constraint does not provide a direct control over
the actual capacitance of nets. The max_capacitance design rule constraint was
introduced to provide a means to limit capacitance directly. This constraint behaves
similarly to max_transition, but the cost is based on the total capacitance of the net
instead of the transition time. The max_capacitance constraint is fully independent,
so one can use it in conjunction with max_transition. The max_capacitance attribute
can be specified on designs or ports. Max_transition, max_fanout and
max_capacitance can be used to control buffering in a design. Max_fanout,
max_transition and max_capacitance constraints can be
specified using the following commands:
Optimization Constraints
Speed and area constraints as specified by the user are the optimization
constraints. The speed constraints are specific delay constraints. One can specify
timing constraints from one specific port/pin in the design to another provided such
a timing path exists between the two specified points. In general, detailed timing
constraints help get the best results from synthesis. To specify a max delay of 0
and expect the fastest design is not the best optimization strategy. Similarly for
area, specify the expected area or a lower value than expected.
Specifying all clocks in the design using the create_clock command will constrain
all synchronous paths in the design. To constrain the asynchronous paths in the
design, one can use the max_delay and min_delay commands.
Prior to version 3.0a of DC, max_delay and min_delay commands were
used to specify timing constraints. But with 3.0a and subsequent versions, the
recommended methodology is to use set_input_delay and set_output_delay
commands instead. The max_delay and min_delay commands should be used only to
specify point-to-point delays for asynchronous paths.
Cost Functions
All the remaining paths are grouped into the default path group. If no clocks
are specified, then all paths default to the default path group. Since the synthesis
tool is primarily path based, it is possible to attach different weights to different
path groups. For example, consider a design with three clocks. If the weight of
each path group is the default of 1, and the worst violation in each group is
1, 2 and 3 respectively, then the max delay cost is calculated as follows:
Max_Delay Cost = (1 x 1) + (1 x 2) + (1 x 3) = 6.0
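The same weighted sum can be written as a short Python sketch. The function name is hypothetical, and clamping a group with no violation to zero is an assumption of this sketch rather than something stated in these notes.

```python
def max_delay_cost(path_groups):
    """Sum of (weight x worst violation) over all path groups.

    path_groups: list of (weight, worst_violation) tuples, one per path group.
    A group whose worst path meets timing is assumed to contribute nothing.
    """
    return sum(weight * max(violation, 0.0) for weight, violation in path_groups)


# Three clock groups, default weight 1, worst violations of 1, 2 and 3.
default_weights = [(1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
print(max_delay_cost(default_weights))

# Raising the weight of the third group makes its violation dominate the cost.
print(max_delay_cost([(1.0, 1.0), (1.0, 2.0), (5.0, 3.0)]))
```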
If no explicit area constraints are specified, then area optimization occurs only if
timing constraints are not met. Since synthesis results are dependent to a large
extent on a number of factors such as constraints, libraries and coding styles,
optimization of a design is an iterative process.
In the event of a hand-instantiated clock tree, during synthesis, one must
place a dont_touch attribute on the clock network using the dont_touch_network
command. This command ensures that the entire clock network in the design
inherits a dont_touch attribute.
Let us analyze these timing reports. The report gives the point in the
design, which is usually a port or a pin of a library cell, the incremental delay
through the cell (listed in the "Incr" column), and the path delay (listed under the
"Path" column), that is, the delay in the path up to that point. In other words, the
path delays are calculated by adding up the incremental delays.
Consider the first path, beginning at the first sequential element f_reg
and ending at the next sequential element d_reg in the figure. The rising edges of the
clock are at 0 and 5 ns. So for f_reg, assuming no clock network delays (which is
the default condition), the clock rise occurs at 0 ns, the clock-to-Q delay of the FD2
flop is 1.42, and the delay through the AND gate is 0.82, giving a data arrival time
of 2.24 at the data pin of d_reg. The register d_reg has its first rising edge at 0. At
this stage, data from f_reg has not yet arrived. However, for the next rising edge at 5,
the situation is different, since data from f_reg arrived at 2.24 ns. Since the rising
edge is at 5 ns and the library has a setup requirement of 0.85 for the FD2 flop, the
latest a signal can arrive to avoid setup time violations is 5 - 0.85 = 4.15. Since
4.15 - 2.24 = 1.91, the constraint has been met with a positive slack of 1.91 ns.
By specifying a clock period of 5 ns, we have implicitly placed a max_delay
constraint of 4.15 from clock pin of f_reg to data pin of d_reg. In the second report,
data arrives at 0 ns, and the clock requires that data arrive latest by 4.15 ns
implying that setup time is met with a slack of 4.15 ns.
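The arithmetic in the two reports can be reproduced with a small Python sketch. The function name is hypothetical, and the sketch assumes an ideal clock (no clock network delay), as stated above.

```python
def setup_slack(clock_period, setup_time, path_delays, launch_edge=0.0):
    """Setup slack = data required time - data arrival time (ideal clock).

    path_delays: incremental delays along the path (the "Incr" column).
    """
    arrival = launch_edge + sum(path_delays)
    required = launch_edge + clock_period - setup_time
    return round(required - arrival, 2)


# f_reg -> AND gate -> d_reg: clock-to-Q of 1.42 plus AND delay of 0.82.
print(setup_slack(5.0, 0.85, [1.42, 0.82]))

# Second report: data arrives at 0 ns, so the slack equals the required time.
print(setup_slack(5.0, 0.85, []))
```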
The following report was generated after specifying an input delay of 3 ns on the
input port A using the set_input_delay command. Set_input_delay is similar to
set_output_delay, except that it accounts for timing delays at the input. For
example, an input_delay of 3 ns on input port 'A' implies that, relative to the rising
edge of the clock, clk, there is a delay of 3 ns, due to logic or otherwise, prior to the
port 'A'.
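The effect of an input delay on slack can be sketched in Python. The function name is hypothetical, and the logic delay from port A to the capturing register is not given in these notes; the sketch assumes it is zero, consistent with the earlier report in which data arrives at 0 ns.

```python
def input_path_slack(clock_period, setup_time, input_delay, logic_delay=0.0):
    """Setup slack for a path starting at an input port (ideal clock).

    The input delay is added to the arrival time, as if external logic
    consumed that much of the clock period before the port.
    """
    arrival = input_delay + logic_delay
    required = clock_period - setup_time
    return round(required - arrival, 2)


# 5 ns clock, 0.85 ns setup, 3 ns input delay, assumed zero logic delay.
print(input_path_slack(5.0, 0.85, 3.0))
```

Under these assumptions the 3 ns input delay reduces the slack from 4.15 ns to 1.15 ns.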
Commonly Used Design Compiler Commands
A few basic DC commands and switches, and their usage, are discussed below.
1. dont_touch
This is a very useful command, particularly when dealing with hierarchical
designs. After one has specified the constraints and compiled a design to achieve
the required results, it is often required that this design not be reoptimized when
used in a larger design. In such cases, one would specify a dont_touch
attribute on the instance of that design in the higher level design.
For example, say block A has been optimized to satisfaction and has been used in
another design TOP as shown in Figure.
Say the instance name of block A in TOP is u1, then the following dc_script
performs the dont_touch step:
The dont_touch attribute can be removed using the remove_attribute command.
Structuring on the other hand, is used to improve the area or gate count of a
design. It involves the addition of intermediate terms which are then shared by
different outputs. In this sense, it can be considered as a reverse process of
flattening.
Structuring is of two kinds, namely, timing-driven structuring and Boolean structuring.
While timing-driven structuring is executed by default, the latter is not.
Timing-driven structuring takes into account time delays when structuring the
design, while Boolean structuring does not. Further, Boolean structuring results in a
2X to 4X increase in compile time. This can be very significant when dealing with
large designs.
After structuring,
5. prefer
The "prefer" command changes the priority of cells chosen by the Design
Compiler during technology translation. Technology translation is essentially the
process of mapping a netlist from one technology library to another. This
command assigns the prefer attribute to the specified cells. For example, one can
place a prefer attribute on the library cell IVB in library libB. This
causes the DC to infer the IVB cell each time a cell of that functionality is
required.
6. set_default_register_type
The set_default_register_type command specifies the default flip-flop or latch
to be used from the target library during technology translation. One can force DC
to select a particular latch or flip-flop from the target library by using the same
command with the -exact option as shown below.
7. Characterize
The characterize command is used extensively in hierarchical designs. For
example, consider a design TOP with two sub-blocks sub1 and sub2. Let us
assume that both sub1 and sub2 have been compiled individually and have met
their constraints. However, when instantiated in TOP, sub1 and sub2 have different
constraints depending on constraints on TOP and the logic surrounding sub1 and
sub2 in TOP.
The characterize command helps capture the constraints imposed on the sub-design
by the surrounding logic.
1. Capturing the entire design in one large HDL file, reading that file into DC,
specifying the following constraint,
2. Dividing the design into too many hierarchical sub-blocks. This is the other
extreme of strategy 1. This is not recommended for two reasons.
Firstly, managing the design with several sub-blocks can be rather cumbersome.
Scenario 1-- You have a design written in HDL. You have a very limited idea of
the timing requirements. You simply wish to attain the fastest possible design.
A simple strategy to realize the optimal design is to experiment first with a default,
medium effort compile, specifying absolutely no constraints before the compile
step. This should give you a feel for the timing/area performance of your block.
Then you specify your approximate timing (clocks and point to
point timing constraints, if any) requirements.
A large number is used so that DC lists all the paths that fail to meet timing
requirements in the design. If a number of paths are violated by a large margin,
then you know right away that meeting your timing is likely to be a
difficult/impossible task. On the other hand, if very few paths violate timing, then
the next step would be to execute another compile with the default medium effort.
Then, reassess your paths in the report. If you see serious timing violations or very
little improvement over the first timing report, then consider one or more of the
following:
• Reassess your code and consider alternate design partitioning.
• Check whether the technology library has cells that can meet your timing.
• Make the timing requirements more realistic with regard to the capabilities of cells
in the technology library.
• Identify any functional false paths or multi-cycle paths that might exist and
specify them.
Scenario 2 -- You have written your source code, you know the detailed timing
requirements, from characterize or otherwise.
Assess your results. Use the group_path command to assign higher weightage to
paths which show greater violations. Use the compile_default_critical_range
variable. The final step could be an incremental compile. This is used only to make
very minimal improvements in timing, usually less than 2 ns. In general, the more
specific you can be in specifying constraints, the better the synthesis results.
Scenario 3 -- You have fairly accurate timing requirements, but your main motive
is to improve rather than merely meet the requirements. You are confident from
knowledge of your library cells and earlier compile iterations that DC can meet
timing, but your intent is to get the fastest possible design.
If constraints are already close to being met, then specify tighter constraints.
compile
You now meet timing but wish to improve upon this.
Now specify tighter constraints -- a faster clock or tighter max_delay constraints for
asynchronous paths. Execute report_timing again; the paths should now violate your
delay constraint. Do not specify unrealistic constraints, like max_delay 0 for
instance. Instead, gradually tighten constraints.
Scenario 4 -- Area is extremely critical in your design. While you think you could
meet timing, area is an issue you would like to monitor right from the very start of
your synthesis process. Given below are some tips for effective area optimization:
• Prior to the initial compile, one must try to specify very accurate constraints to
prevent DC from over-optimizing non-critical paths.
• After synthesis, execute the check_design command. Analyze the results to make
sure there is no unused logic in the design. Useful details about the design such as
unconnected ports, feedthroughs, and multiple drivers are provided by this
command.
• Use the report_resources command to check the implementations of resources in the
design and how many resources are inferred. There might be scope for
sharing of resources by modifying the HDL code.
• You could try ungrouping the hierarchy. Although this might improve area, it
might make place and route task extremely difficult.
• Flatten appropriate unstructured random logic blocks using the set_flatten
command on these blocks.
1. For better results from synthesis, specify accurate point to point delays for
asynchronous paths. Use the create_clock and group_path commands to
constrain synchronous paths in the design. In general, the synthesis tool is
tailored towards path optimization. Hence, it responds better to a greater
detail of constraints.
2. Try to register outputs of the different design modules. This saves the
designer from having to perform painstaking time budgeting. Constraining
different hierarchical modules becomes easier for two reasons.
First, the drive strength on the inputs to a block is equal to the drive strength of
the average flip-flop. Secondly, the input delays are equal to the path delays
through a flip-flop, given that the outputs of the driving hierarchical block
are registered.
3. Separate negative and positive edge flip-flops into separate hierarchical
blocks. In other words, avoid having both kinds of flops in the same
hierarchical module. This makes the debug process and timing analysis
during synthesis much simpler. Moreover, this can help simplify test
insertion.
4. Group finite state machines and optimize them separately. State machine
extraction and optimization process is more effective when the fsm is
isolated. The group -fsm command can be used to achieve this.
5. The recommended size of a module for synthesis is in the range 250-5000 gates.
There are bound to be exceptions to this generalized recommendation.
6. Avoid having too many hierarchical blocks. Optimization across
hierarchical boundaries is far less effective than when the boundaries do not
exist. On the other hand having a large flat design with no hierarchy is not
the solution.
7. Try to capture logic in the critical path into a separate level of hierarchy. DC
does a better job of optimization when the critical path does not traverse
hierarchical boundaries. This can be done by ungrouping existing blocks and
re-grouping them using dc_shell scripts.
8. Compile Time: If your compile time is too long, then it is most likely due
to one of the following reasons:
• You are using high map effort. Try the default medium effort. This is the
recommended compile effort and hence is the default. The compile time for
high map effort is dependent on the machine configuration and the size of
the design.
• Your design is too large and must be broken down into smaller hierarchical
modules.
• You have declared false paths which traverse hierarchical boundaries, or
other path exceptions in the design such as multicycle paths (set_multicycle_path).
• You have glue logic at the top level of your design. Consider incorporating
this into hierarchical sub-modules using the ungroup/group commands.
• You are trying to flatten a design which is not appropriate for flattening. In
general, use the "flatten" switch only for random logic. For a design with
over twenty inputs, flattening is almost never completed. If the number of
inputs is less than ten, then flattening is more likely to complete.
• You have boolean optimization turned on. Again, this is appropriate only
for random logic. If you do have random logic in your design, consider
grouping it into a separate level of hierarchy and compile it separately with
the flatten or boolean structuring switch turned on.
9. For datapath logic, consider the option of instantiating logic (like gates
and muxes) or inferring them through user-developed DesignWare libraries.
10. Partitioning the design is extremely crucial to get the best out of
synthesis. Identify signals with large fanouts and attempt to group the
driving logic with the logic being driven into one hierarchical block.
11. It is always advisable to perform a preliminary round of synthesis and
place and route so as to identify any serious issues which may require re-
writing the HDL code.
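The remark in guideline 8 that flattening rarely completes for designs with many inputs follows from how quickly a two-level sum-of-products form can grow. As a rough illustration in Python (not a dc_shell script), the sketch below enumerates the minterms of an n-input parity (XOR) function. No two of these minterms can be merged, so a flattened parity function needs 2^(n-1) product terms. The function names are hypothetical.

```python
from itertools import product


def parity_minterms(num_inputs):
    """All input combinations for which an n-input XOR evaluates to 1."""
    return [bits for bits in product((0, 1), repeat=num_inputs)
            if sum(bits) % 2 == 1]


def parity_product_terms(num_inputs):
    """Closed form for the number of product terms in flattened n-input XOR."""
    return 2 ** (num_inputs - 1)


# The enumeration agrees with the closed form for small n.
for n in (2, 4, 8):
    print(n, len(parity_minterms(n)), parity_product_terms(n))

# For ten and twenty inputs the closed form is used directly.
print(parity_product_terms(10), parity_product_terms(20))
```

Not every function grows this badly, but the exponential worst case is why flattening is recommended only for small blocks of random logic.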
Classic Scenarios.
Case 1 : You wish to find all the clocks defined in your design and their clock
periods within a dc_shell script file. Using this information, you then wish to
specify some constraints and attributes related to the clocks.
Case 2: Can one specify dont_care conditions for the condition branches of a case
statement?
Solution: A typical scenario is when one cares only about certain inputs in a
particular state but not the other inputs. DC does not support dont_cares for case
statement conditions because of simulation mismatches. In the simulation world,
string-to-string matching is performed, and this applies to the dont_care
conditions as well.
Case 3: The DC is unable to meet the timing for the path which is the worst
violator. However, it does not seem to improve on other paths in the design which
most certainly can be improved by merely swapping cells in those paths.
Solution : By default, DC creates a default path group and a clock group for each
clock created. The default path group contains paths that do not terminate at a
clock. Only the worst violator in each path group affects the synthesis cost
function. This can be changed by using the group_path command or modifying the
value of the compile_default_critical_range variable from the default of 0.0 to a
larger value. In general, set the compile_default_critical_range variable only in the
last compile step. In other words, set constraints and perform one or more compile
steps until the DC does not seem to improve its results. Then set this variable to a
value (usually 2 or 3) then re-compile.
Setting this to a large value can increase the compile time significantly. The
group_path command can be used to create explicitly a path group and specify the
weight and critical_range of that group. No path can exist in more than one path
group.
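The effect of the critical range can be illustrated with a small Python sketch. It is a simplified model and not DC's actual algorithm: it assumes that, within one path group, the violating paths whose slack lies within critical_range of the worst violator are the ones that contribute to optimization. The path names and slack values are hypothetical.

```python
def paths_in_critical_range(path_slacks, critical_range=0.0):
    """Names of violating paths within critical_range of the worst violator.

    path_slacks: dict mapping path name to slack (negative means violated).
    With critical_range 0.0 only the worst violator is returned.
    """
    worst = min(path_slacks.values())
    return sorted(name for name, slack in path_slacks.items()
                  if slack < 0 and slack <= worst + critical_range)


# Hypothetical slacks (ns) for four paths in one path group.
slacks = {"p1": -3.0, "p2": -1.5, "p3": -0.4, "p4": 0.8}
print(paths_in_critical_range(slacks))        # default range of 0.0
print(paths_in_critical_range(slacks, 2.0))   # range of 2
print(paths_in_critical_range(slacks, 3.0))   # range of 3
```

A larger range brings more sub-critical paths into consideration, which matches the observation that compile time grows with the value of this variable.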
Case 4: A top level module has a few submodules and a hand-crafted clock
circuitry at the top level. You wish to synthesize this design to gates but leave the
clock logic at the top level intact. How does one go about accomplishing this?
Case 5: How does one find all the cells of a particular reference in a hierarchical
design? In other words, you have a hierarchical design with the FD1 (flip-flop)
library cell used several times and you wish to get an actual count to identify if it is
worth requesting a special low drive cell of the same functionality.
Solution: The simplest way would be to ungroup the design from the top level and
use the report_reference command. Alternatively, if one prefers not to ungroup the
design, a script which finds all the cell instances which reference FD1 should
accomplish the same. Then the total number of cells in this list is counted.
Solution: Resource sharing is done during the first compile and the license used is
the HDL-Compiler license. One can actually prevent an HDL Compiler license
from being used during compile. Another way to accomplish the same is to execute
the replace_synthetic command before compile. This will, however, disable the
high-level optimizations that occur during compile (including timing-driven
resource sharing). This can impact the quality of results.
Case 7: A design has an address bus 32 bits wide of which only 2 bits go into a
module. You create an extra level of hierarchy in DC using the group command
and only the 2 bits of the address needed go into the newly created module. DC brings
all 32 bits into the module and does not connect the top 30. Is there a way to get
rid of the unused bus ports?
Solution: One can remove unused ports using the remove_port command as shown
below:
Case 8: In a state machine process, if a state is supposed to remain the same under
a certain condition, does the user have to explicitly write next_state <=
current_state? Since nothing new is assigned to it, shouldn't it maintain the state
even if not specified?
Solution: If you do not have the next_state <= current_state statement in the
combinational process statement, DC will infer latches for the next_state signal.
Solution: No, there is no way to control instance names except by adding pre-fixes
and suffixes. This can be achieved using the following variables:
Unit V
Constraining and Optimizing Designs for FSM
First the HDL source code is mapped to cells from a target technology
library. Then the flip-flops in the design which hold the current state of the FSM
must be identified.
The set_fsm_encoding command allows the designer control over the state
encoding. While several encoding styles for FSMs exist, we will discuss the
auto encoding style, which is the default. The encoding style
can be assigned using the set_fsm_encoding_style command. The group -fsm
command groups the state flip-flops and the associated combinational logic
into a separate level of hierarchy. On extraction, the state machine can be
written out in state table format.
Figure shows the top level design generated from the VHDL code.
Notice that there are three flip-flops.
1. Before a state machine has been extracted and after the group command, DC
sometimes fails to group some of the surrounding logic which would have
made the state machine logically more optimal. The cells which are grouped are
those in the transitive fan-in/fan-out of the state vector cells. After grouping, one
might find two inputs to this grouped level of hierarchy which are the opposite
(inverted) of each other. In this case, one must use the characterize -connections
command with the current_design set to the top level, so that the connection
attributes are passed on to the newly grouped level of hierarchy.
2. Once the state machine is extracted, the design can be written out in state
machine format or to the original RTL VHDL format by the following steps:
3. While the flip-flops inferred in the examples are all D flip-flops, it is possible to
force DC to map to specific flip-flops from the target_library using the
set_register_type -flip_flop cell_name or the set_register_type -flip_flop cell_name
-exact commands.
4. When the concerned nets are inputs to the state machine, set the
variable write_name_nets_same_as_ports to true (it is false by default) before
writing the design in EDIF. Then read the EDIF back into DC and follow the steps
up to extraction of the FSM. After reading in the EDIF file, the ports and the nets
connected to them should have the same names. It is advisable to run compare_design
between the new design read in and the old design in memory, to ensure that no
changes occurred during the write step.
5. reduce_fsm and set_fsm_minimize are two commands users tend to confuse.
reduce_fsm is more of a command, while set_fsm_minimize is more of a switch.
reduce_fsm should be executed after the extract command to reduce the transition
logic between states. set_fsm_minimize is turned on prior to compile so that
the tool infers the minimum number of states required for the FSM.
6. Lastly, effort must be made to clearly partition the design into control logic
and data path elements. Reading in a large netlist and executing the extract
command is not an effective methodology.
Once the max_delay requirements imposed by the setup constraints for
the sequential cells have been met, DC then attempts to fix the minimum path
delay requirements. Since path delays are at their maximum under "worst-case"
operating conditions, max delay requirements must be met under those
conditions.
The minimum delay requirements are set by the hold constraints for the
sequential cells. Hold time problems are caused by short delay paths between
registers, which cause the data signal to propagate through two adjacent flip-flops
on a single clock edge. Since path delays are the shortest under "best-case"
operating conditions, hold time problems are most severe in these conditions. Hence,
hold violations have to be fixed under these conditions.
One approach to fixing both the setup and hold constraints is a
two-pass compile. In the first pass, fix the setup violations under
"worst-case" operating conditions. Then set the operating conditions to
"best-case" for the second pass. Use the set_fix_hold command to set the
fix_hold attribute on the clock objects for which hold constraints have to be
met. Run the second pass with the compile option -only_design_rule turned on.
This should fix the hold violations in your design. Note that under
"best-case" operating conditions the max_delay paths have large
positive slack, so hold constraints may be fixed at the cost of setup constraints.
This can be avoided by adjusting the constraints so that the critical paths in
the design still appear critical under best-case conditions.
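As a rough sketch, the two-pass approach might look like this in a dc_shell script (dcsh-mode syntax). The operating condition names WORST and BEST and the clock name clk are hypothetical and depend on the library and the design. set_fix_hold is the command that places the fix_hold attribute on a clock.

```
/* Pass 1: fix setup (max delay) violations under worst-case conditions. */
/* WORST is a hypothetical operating condition name from the library.    */
set_operating_conditions WORST
compile

/* Pass 2: fix hold (min delay) violations under best-case conditions.   */
/* BEST is a hypothetical operating condition name; clk is hypothetical. */
set_operating_conditions BEST
set_fix_hold find(clock, "clk")
compile -only_design_rule
```

Because hold fixing can consume setup slack, it is worth re-checking setup timing under worst-case conditions after the second pass.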
Technology Translation
Technology Translation in DC
In DC, technology translation is performed by the translate command. To
translate from one technology to another, the first requirement
is the availability of both the existing library to which the netlist has been mapped
and the target_library. Consider translating a design
top from technology library libA to technology library libB.
The translate command replaces each cell in the design with the closest matching
functional cell from the target_library. If no matching cell is found,
the cell is converted to a cell from the generic library. The set_dont_use,
set_dont_touch, set_default_register_type and set_prefer commands are useful
commands that affect the translation process.
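The translation of top from libA to libB might be scripted along these lines (dcsh-mode syntax). The file names top.db, libA.db and libB.db are assumed for illustration; only the design and library names come from the text.

```
/* Library the existing netlist is mapped to (needed to resolve its cells). */
link_library = {libA.db}
/* Library to translate into. */
target_library = {libB.db}

/* Read the mapped design (hypothetical file name) and select it. */
read -format db top.db
current_design top

/* Replace each cell with the closest functional match from target_library. */
translate
```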
Translating Designs with Black-Box Cells
The translate command cannot map black-box cells directly. However, there
exists a simple trick to translate a black-box cell. Black-box cells are cells
whose function cannot currently be described in the Synopsys Library Compiler
syntax, or which do not have a function attribute specified. Such cells have
the 'b' attribute attached to them, which marks them as black-box cells. The
report_lib command reports the attributes on the cells in a library and can be
used to identify them.
It is possible that the design prior to translation has one such cell instantiated
in it and the target library contains an identical cell. Since DC does not see
any functionality described, it is unable to translate this particular instance. A
design with black-box cells can be translated by the following steps:
1. Identify the black-box cells in your design and then find the equivalent cells in
the target_library.
2. Create a translation library for these black-box cells. For example, if your netlist
has a black-box cell "mem" and your target_library contains an equivalent cell
"mem_new", create a translation library, which is essentially a Verilog module
named mem that instantiates the target cell mem_new but keeps the same interface
as the original mem cell.
3. After the translation library has been created, convert the design to the db format
using the read and the write commands. You have now created a block around the
cell in the target_library with an interface similar to the interface of the black-box
cell in the current library netlist. Assuming that your translation library is called
translation.db, your original library original.db, and your new target technology
library new.db, set the link_library variable to list translation.db first, then
original.db, then new.db. Also, ensure that the search_path variable points to all
the directories containing these libraries.
4. Execute the link command to translate the black-box cells in the netlist to the
library new.db. During the link operation, DC searches the link_library in order,
beginning with translation.db, followed by original.db and finally
new.db. On finding the mem design in translation.db, it links to the newly created
design.
During translation, the mem cell is nothing but a sub-block with the
mem_new instantiated in it, and mem_new is a cell in the target_library new.db.
This level of hierarchy can later be removed with the ungroup command.
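Steps 3 and 4 above can be sketched as a dc_shell script (dcsh-mode syntax). The file names mem.v and top.db, the directory ./libs and the instance name U_mem are hypothetical; the library names are the ones used in the text.

```
/* Step 3: convert the wrapper module (hypothetical file mem.v) to db format. */
read -format verilog mem.v
write -format db -output translation.db

/* Search order matters: the wrapper must be found before the original cell. */
link_library = {translation.db original.db new.db}
/* ./libs is a hypothetical directory holding the db files. */
search_path = search_path + {"./libs"}

/* Step 4: read the netlist (hypothetical file top.db) and link it. */
read -format db top.db
current_design top
link

/* Optionally remove the wrapper level; U_mem is a hypothetical instance name. */
ungroup find(cell, "U_mem")
```

Placing translation.db first is what makes the link step pick up the wrapper named mem instead of the original black-box cell.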
If there is no exact equivalent cell in the target library, you can create a
structural model of the black box cell using primitives from the target technology
library. Then, as in the above case, create a translation library with the same
interface as the black box.
Pad Synthesis
Adding pads to your design is an essential part of the design process. One
option is to instantiate pads after the core of the design has been implemented and
simulated.
Figure shows an ASIC core with the pad cells. DC provides a means for
automatic pad insertion. However, this is entirely dependent on the ASIC vendor
library having appropriately modeled pad cells.
A pad cell in the Synopsys library is one which has the pad_cell attribute
set to true. Also, one or more of the pins of the pad cell will have the is_pad
attribute set on them. Hence, the first step in attempting pad synthesis is to ensure
that the technology library has pad cells modeled appropriately. The pad cells
in the technology library can be identified by checking for these attributes.
Having determined the pad cells available in the technology library, the
next step is pad insertion. This is done using the insert_pads command.
If we wish to control the kind of pad cell inserted by DC, this can be
achieved using the set_pad_type command. This command controls the attributes
and properties of the pad cell synthesized by DC. To provide greater control, the
set_pad_type command has an -exact option, which lets the user explicitly
specify the pad cell to be inserted from the library.
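A pad insertion run might be scripted roughly as follows (dcsh-mode syntax). The pad cell name IBUF_X and the port name data_in are hypothetical. The set_port_is_pad step is not mentioned in the text; it is included on the assumption that ports must be marked as pad ports before insert_pads acts on them.

```
current_design top

/* Assumption: mark the top-level ports as pad ports first. */
set_port_is_pad all_inputs() + all_outputs()

/* Force a specific library pad cell on one port (hypothetical names). */
set_pad_type -exact IBUF_X find(port, "data_in")

/* Insert pads, honoring the pad type settings above. */
insert_pads
```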
The insert_pads command does not bus inputs together into the same pad
using bused pad cells. Such pad cells have to be instantiated by hand. DC does not
map to pad cells during a regular compile, provided the pad cells carry the required
attributes.
Classic Scenarios
Case 1: We are performing trial compile runs. We do not want wire loads to be
considered in these trial runs. Can one prevent DC from selecting a wire_load
model for a design, or does it default to a particular wire load model?
Case 2: Our design has a number of internally generated signals which drive
the enable pins of latches. For example, we have a state machine generated
signal which drives the enable pin of a latch. The output of the latch drives a
block of combinational logic, which in turn drives a primary output as shown
in the figure below. The time delay in the signal reaching the primary output
depends on how soon the enable signal can be generated, and on the delay
through the combinational logic after the data is latched. We wish to
constrain the entire path along the enable line to the primary output.
Solution: There are no constraints on the enable (clock) line since there are no
setup requirements on the clock pin. Consider a two-step constraint approach
using the set_output_delay and the set_max_delay commands. The path from the
enable pin of the latch to the primary output can be constrained using the
set_output_delay command. The path to the enable pin of the latch can be
constrained using the set_max_delay command.
Case 3: After characterizing and compiling a sub-block, the max_fanout violations
seen at the top level on the nets driven by its output ports are still not fixed.
Solution: The fanout violations seen at the top level were not fixed on
compiling the characterized sub-block because no fanout_load values were
applied to the output ports of the lower level. In other words, characterize
does not capture the fanout_load drive capability required by the output ports
E, F and G in the above figure. The values that were applied by characterize were
load values, which are not taken into account when fixing max_fanout
violations. The characterize command captures this information when it is run
with the -constraints option instead of as plain characterize. This
ensures that the fanout_load values are passed down in addition to the load values
on the nets.
The characterize command also has another useful option, namely
-connections. This is useful in a scenario where two inputs are identical
except that one of them is an inversion of the other, as shown in the above figure.
Input X2 is inverted and drives Block A at pins A and B. This information is
captured when Block A is characterized with the -connections option. One
can also explicitly specify that two ports are opposites of each other using the
set_opposite command.
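A sketch of how these characterize options might be used (dcsh-mode syntax). The instance name U_blockA and the design name blockA are hypothetical; ports A and B are the ones named in the text.

```
current_design top

/* Pass down constraints (including fanout_load) and port relationships */
/* from the parent context. U_blockA is a hypothetical instance name.   */
characterize -constraints -connections find(cell, "U_blockA")

/* Recompile the sub-block in its captured context. */
current_design blockA

/* If -connections was not used, the relationship can be stated by hand. */
set_opposite {A B}

compile
```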
Case 4: We wish to find all data pins of latches in our design. Is there a single
command which will accomplish this?
Case 5: We have several instances in our design which have dont_touch attributes
placed on them. We now wish to ungroup them, but are unable to remove the
dont_touch attribute on an instance using the remove_attribute command.
Solution: It is likely that the instance has inherited the dont_touch attribute from
its reference. If this is indeed the case, we should first remove the dont_touch
attribute from the reference. Use the remove_attribute command together with the
find command to do this.
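One way this might look in dc_shell (dcsh-mode syntax). The reference name sub_block and the instance name U_sub are hypothetical.

```
/* Remove dont_touch from the reference the instance inherits it from. */
/* sub_block is a hypothetical reference name.                         */
remove_attribute find(reference, "sub_block") dont_touch

/* The instance (hypothetical name U_sub) can now be ungrouped. */
ungroup find(cell, "U_sub")
```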
Case 6: Our technology library has a default_max_fanout specified, but DC does
not seem to buffer our clock line accordingly during synthesis.
Case 7: After inserting pads using the insert_pads command, we find clock pads
inserted for some of the inputs.
Solution: Clock pads should normally be inserted only for ports with a clock
object created on them. However, they might be inserted on other ports if those ports
are part of clock gating logic. If the pads are being inserted on a compiled netlist
that contains clock enable buffers, then the ports connected to the clock enable
buffers may also have clock pads inserted on them. For other regular inputs, clock
pads should not be used. This problem can be avoided by applying
set_pad_type -no_clock to all inputs, except the clock input, prior to pad
insertion.
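For instance, assuming the clock port is named clk (a hypothetical name), the restriction might be applied as follows (dcsh-mode syntax):

```
/* Forbid clock pads on every input except the clock port itself. */
/* clk is a hypothetical clock port name.                         */
set_pad_type -no_clock all_inputs() - find(port, "clk")
insert_pads
```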