PD

At a glance
The key takeaways are that PD involves both mechanical and electrical activities for preparing artwork and meeting design constraints at the silicon level. It also requires knowledge of semiconductor technology, SoC architecture and design, EDA tools, and system design.

The main steps involved in the PD process are feasibility analysis, planning, implementation, verification, tapeout, fabrication, packaging, testing, characterization, validation, and respin if required.

Some considerations for power management in chip design include using a multi-power domain design, power reduction techniques like a power optimized library, and average power dissipation being as important as maximum power dissipation for product competitiveness.

PD Introduction

What is PD?
Mechanical activity: preparing the artwork for mask preparation
Electrical activity: making the design meet its constraints at silicon level

Knowledge requirements for PD



Semiconductor knowledge (minimal)
SoC architecture (the more the better)
SoC design (knowledge of constraints)
Technology library (mandatory; don't play with strange things)
Methodology (good understanding of why this way and not the other way)
EDA tool (mandatory, to work hands-on)
Scripting (to reduce pain)
System design (to understand the application requirement)

Full-custom Vs Semi-custom
Criterion            Favoured style
Integrity            Full custom
Area                 Full custom
Power                Full custom
Performance          Full custom
Yield                Full custom
Cost                 Full custom (for large volumes)
Development cycle    Semi custom
Development cost     Semi custom
Reusability          Semi custom

Myth: all digital designs are semi custom and all analog designs are full custom.

Full custom Inverter

Abstract view of the inverter used for semi custom

PD Project cycle
ASIC / SoC requirement -> feasibility -> plan -> implementation -> verification -> tapeout -> fabrication -> package -> assembly -> tester -> characterization -> validation -> respin (if required)

ASIC / SoC PD requirements


At the top level:
  Size
  Power
  Speed
  Cost (equally important)
At lower levels:
  Many more, which contribute to the above parameters

PD Feasibility analysis
At an introductory level, feasibility analysis is needed:
  To avoid surprises halfway into the design
  For better planning and prediction of time, resources and success rate upfront
Scope of feasibility analysis:
  Can the constraints be met?
  Can the chip be designed?
  Will the chip work?
  What are the risks involved?
Feasibility analysis will be revisited in detail.

PD Plan
Project management comprises:
  Requirement to meet
  Time to complete
  Cost for the project, including resources and tools

Requirements can be:
  RTL to GDS / netlist to GDS
  DFT: BS / internal scan / MBIST / LBIST
  Macros: PLLs, memories, special IOs and cells
  Methodology: technology related, SI, EM/IR
  Design based: split power, multi Vt
  Package: BGA / QFP / flip chip, standard / custom

Planning will be revisited in detail.

PD Implementation
Full chip / subchip / top level / chip closure?
Topics involved: die size, die shape, timing, IR, EM, power plan, floorplan, IO, CTS, routing, SI, rail, DRC/LVS, STA, dynamic simulation, XRC, sign-off, tapeout, metal slot, metal fill, PO/OD fill, GDSII, seal and scribe, constraints analysis
What more? Where to start and where to end?

PD verification
What to verify?
Will the chip work?
Will the chip perform?
Will the chip be reliable?
Can the chip be manufactured?
Will all the required functions be met?
Will the chip be testable?

PD design closure
What is design closure?
Timing closure
Power closure
DRC/LVS cleanup
SI closure

PD chip closure
What is chip closure?
Dummy metal fill
Poly / oxide fill
Wirebond check
Seal ring and seal scribe
Stress release pattern
Scribe and scribe lane

Ready to upload the GDSII to fab

PD tools
What are PD tools for?
  Implementation
  Analysis
  Verification
Major tool vendors:
  Cadence
  Synopsys
  Mentor Graphics
  Magma
Tools are task simplifiers. They are not problem solvers; they can also be a source of problems.

PD Methodology
What is methodology?
What is flow?
Need for flow flush

PD Requirement analysis

Customer Input (General input)

Name of the design
Application domain
New design / respin design
If it is a respin design:
  a. Reasons for failure of the earlier design
  b. Nature of enhancements in the respin design
Synthesis tool used
Design tool and sign-off tool to be used:
  a. Physical design
  b. Physical verification
  c. STA
  d. SI
  e. Power analysis
Targeted package
Name of the foundry
Data input for the design (RTL / netlist / GDSII / ECO):
  a. RTL
  b. Netlist with DFT
  c. Netlist without DFT
  d. Optimization
  f. IO list and pad order

Customer input (general input)


Formal verification reports
Provision of test setup for physical design
Time frame for the design
Flat or hierarchical design
On site / off site design
Sign-off criteria
Technical point of contact (name / telephone)

Customer input (design details)

Design details:
  Explanation of the design, including architecture
  Estimated no. of logic gates in the design
  Total gate count of the design
  No. of IO pins in the design
  Max input frequency of the design
  Max output frequency of the design
  Max internal frequency of the design
  No. of clocks in the design
  High fan-out nets defined that require buffer tree synthesis
  Gated clock used or not
  Synchronous / asynchronous reset used in the design
  Any latch used in the design (specify)
  Wirebond or flip chip
  Timing report of synthesis
  Design constraint format
  Tool used for dumping the SDC
  Usage of through (-through) constraints in SDC
  Need for recovery / removal

Customer input
Library details:
  Name of the library vendor
  Process technology to be used (e.g. 0.18u, 0.13u)
  Target library to be used (e.g. LV / generic / multi Vt library)
  Core and IO voltage to be used
  No. of metal layers to be used and routing guidelines
  Type of pads to be used

Customer input
Macro details:
  a. SP RAMs (size, mux factor, no. of instances)
  b. DP RAMs (size, mux factor, no. of instances)
  c. Register files (size, mux factor, no. of instances)
Analog IP blocks used (provide details)
LEF, DEF, .lib, .db, constraints and GDSII for hard macros
On-chip memory requirements (provide details)
Soft macros list
Intended package

Customer inputs
Floorplan details:
  Die size requirement
  Die aspect ratio
  Floorplan details with pad order and location
  Inline / staggered pad or flip chip design
  Floorplan guidelines (any specific requirements)
  Data flow diagram
  Intended package

Customer input
Power plan details:
  Estimated core power
  Estimated IO power
  IO types and voltages
  SSOs in the IOs
  Need for multi Vt
  Anticipated total power dissipation
  Need for multiple power domains
  Need for clock gating

Customer input
Clock tree synthesis details:
  Propagated clock details
  Clock domains with multicycle paths
  Budgeted clock duty cycle variation
  Budgeted clock jitter margin
  Clock tree diagram for the design
  Inter and intra clock phase matching constraints
  Clock domains with half cycle paths
  Budgeted clock skew in the design
  No. of clocks in the design
  Required setup margin
  Target insertion delay
  Clock gating details
  Clock details
  Derived clocks details
  Clock frequencies

DFT input

JTAG pin list (muxed / non-muxed)
No. of scan chains
Length of each scan chain
No. of MBIST controllers
MBIST controller details (groupings)
Max scan clock frequency
Estimated switching activity during scan test
Clock gating details
Test SDC
Common functional and test mode paths with constraints
Size of DFT logic:
  a. MBIST logic
  b. LBIST logic
  c. JTAG controller
Scan chain report with pipeline logic for scan tracing
Required setup margin
Required hold margin

Technology input
Runsets for the sign-off tool:
  a. DRC
  b. LVS
  c. Antenna
  d. Metal slotting
  e. Metal filling
  f. Poly and oxide filling
  g. Wire bond DRC
  h. Sample GDSII of seal ring with CSR
Standard cell library: full view (Synopsys Milkyway)
Design rules for the technology / process
IO pad library
Library for special IO pads
Bond pads
De-cap and endcap cells
Macros
Filler cells and pad fillers

Points to be gathered from tech files


From the standard cell library:
  a. Number of library entities
  b. Clock buffers and inverters
  c. Rise / fall time match in the clock components
  d. Gate density for the technology
  e. DC and AC characteristics of the standard cells
  f. Power rating of the standard cells
  g. Noise margins
  h. Characteristics at min-max conditions
  i. Intrinsic delay statistics of non-sequential components
  j. Clock-to-Q delays for the flip-flops
  k. Max fanout, max trans and max cap parameters
From the IO pad library:
  a. Types of IO pad entities
  b. Operating voltage
  c. Operating current
  d. Dimensions
  e. Pitch of the pads
  f. Pad delay characteristics
From the design rules:
  a. Current density of different layers
  b. Sheet resistance
  c. Electrical rules
  d. Poly and oxide related rules
  f. Dummy metal filling
  g. Metal slotting
  h. Wire bond rules
  i. Scribe and seal ring

Memory input
Sl. No. | Memory instance name | Memory (type / size / mux factor) | X dimension | Y dimension | Area (um2) | Bit cells | Power * | Clock 1 | Clock 2 | BIST group

Checklist for memories


Technology library of the memory tallies with the standard cell library for the design
.lib, .db, .tf, .itf, .tlf, FRAM and GDSII libraries are available
The dimensions mentioned are correct
Power numbers specified are correct
A document on the data flow among the memory blocks is made available, for the purpose of floorplanning
BIST logic for the memory is incorporated
Refer to the memory data sheet for the layer details, power ring details and the placement and routing blockages for the memory macros
Check the foundry MT form for the required memory details and generate them upfront
Check the maximum dimensions and the feasibility of fitting in the die area

Macro input
Sl. No. | Macro instance name | Macro type | X dimension | Y dimension | Area (um2) | Logic gates | Power | No. of clocks | Clk. freq.

Checklist for macros


Technology library of the macro tallies with the standard cell library for the design
.lib, .db, .tf, .itf, .tlf, FRAM and GDSII libraries are available
The dimensions mentioned are correct
Power numbers specified are correct
A document on the data flow among the macro blocks is made available, for the purpose of floorplanning
Refer to the macro data sheet for the layer details, power ring details and the placement and routing blockages for the macros
Check the maximum dimensions and the feasibility of fitting in the die area
In the layout viewer, open the FRAM view and check the connectivity pins
In the GDS view, check the metal layers, blockages and the power plan used internally by the macro
Check the clocks, clock constraints, and any matching or latency requirements at the macro input
Run the DRC, LVS and antenna runsets on the macros in stand-alone mode and check that they pass before going for the full design

IO input
Sl. No. | Signal group | Pad instance name | Direction | SSO group | Toggle rate | Pad order | Pin / ball map

Checklist for IO pads


Check that the type of IO pads used belongs to the technology files used
Check the pad size and the minimum pitch to be used for the pads
Check for special pads, such as analog pads, which normally may not be part of the free library
Check for the requirement of power cut diodes and power-on control cells for the design
Check the type and availability of the bond pads to be used in the design (staggered / in-line)
Check the pre-driver and post-driver power requirements
Check the power-on sequence for the core and IO powers
Check the availability of proper power pads for the IO and core power
Analyze the current surges that could occur due to simultaneous switching of the SSO group signals
Check the availability of pad fillers in the technology library for forming the pad ring
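
The SSO current-surge check above can be approximated with the usual lumped-inductance relation V = N * L * dI/dt. The sketch below is only an illustration: the function name, its parameters and all numbers are assumptions, not values from any pad library, and a real analysis would use the package and pad models.

```python
def sso_bounce_volts(n_switching, l_pad_henry, di_dt_amp_per_s, n_return_pads):
    """Estimate supply/ground bounce for one SSO group.

    Assumes each return (power/ground) pad contributes the same lumped
    inductance and that the pads share the switching current equally.
    """
    if n_return_pads < 1:
        raise ValueError("at least one return pad is required")
    effective_l = l_pad_henry / n_return_pads   # equal inductances in parallel
    return n_switching * effective_l * di_dt_amp_per_s


# Hypothetical numbers: 8 outputs, 5 nH per pad, 20 mA/ns per driver.
bounce_two_pads = sso_bounce_volts(8, 5e-9, 2e7, n_return_pads=2)
bounce_four_pads = sso_bounce_volts(8, 5e-9, 2e7, n_return_pads=4)
print(f"2 return pads: {bounce_two_pads:.2f} V, 4 return pads: {bounce_four_pads:.2f} V")
```

Under these assumptions, doubling the number of return pads for the group halves the estimated bounce, which is why the availability of power pads and the SSO grouping are checked together.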

Clock input
Sl. No. | Clock name | Frequency | Classification* | Insertion delay | Jitter | Duty cycle | Skew limit | Fanout | Max trans | Max cap | Use of clock edge | Sync. points | Inter domains

Checklist for clock


Obtain a clock tree diagram indicating the different logic blocks being driven by the clocks
Obtain the number of sequential elements used in those blocks and estimate the clock tree size
When the tree size is very large, there is a likelihood of clock duty cycle distortion as well as large clock path delay and clock skew
Check that the design library is suitably edited, with clock tree components separated from standard cell components
For critical clocks with a tight duty cycle requirement, ensure rise time / fall time matching of the clock buffers and inverters
Check for inter-clock domains and the corresponding false path declarations in the SDC
Look for MCP declarations in the SDC and the associated clock domains
Check that the MCPs are declared for both setup and hold time; the MCP for hold is one less than the MCP for setup
Check whether recovery / removal timing closure is required for the clock domains and note it for the CTS design flow
Check whether preset / clear paths also need timing closure and set the flow accordingly
Check for the test / scan clocks and their SDCs
Analyze the common paths for functional and test modes and the impact of the MCP declarations on them
In derived clocks, check for the Q-to-D feedback path and the impact of hold time closure on phase distortion
Check whether clock gating is used in any clock domain and set the CTS flow appropriately
Check for propagated clocks in the design and set the flow for the respective CTS accordingly
Check for async resets in the design
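
The MCP rule in the checklist above (hold multiplier one less than the setup multiplier, and both declared) is easy to check mechanically. The following is a minimal sketch under stated assumptions: `McpDecl` is a hypothetical record filled in by hand or by some other script, not an SDC parser, and the path names are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class McpDecl:
    """One multicycle-path declaration (hypothetical record, not an SDC parser)."""
    path: str
    setup_mult: int
    hold_mult: Optional[int] = None  # None means no hold MCP was declared


def mcp_issues(decls: List[McpDecl]) -> List[str]:
    """Report paths whose hold MCP is missing or is not setup MCP minus 1."""
    issues = []
    for d in decls:
        expected_hold = d.setup_mult - 1
        if d.hold_mult is None:
            issues.append(f"{d.path}: hold MCP missing (expected {expected_hold})")
        elif d.hold_mult != expected_hold:
            issues.append(f"{d.path}: hold MCP {d.hold_mult}, expected {expected_hold}")
    return issues


decls = [
    McpDecl("clk_a -> clk_b", setup_mult=2, hold_mult=1),
    McpDecl("cfg_reg/*", setup_mult=3),
    McpDecl("dbg_bus/*", setup_mult=4, hold_mult=2),
]
for line in mcp_issues(decls):
    print(line)
```

With this sample data the first declaration is consistent and the other two are flagged, one for a missing hold declaration and one for a hold value that does not match the rule.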

Feasibility Analysis

Technology information

Floorplanning

PD- Floorplanning
FP is the critical part of PD
A high-quality FP ensures accurate circuit timing and performance
A poor FP results in timing failure, routing congestion, larger power, larger area, huge IR drop and SI issues

PD- Floor plan


Floor plan involves decisions on:
  Pin / pad location
  Hard macro placement
  Placement and routing blockages
  Location and area of the soft macros and their pin locations
  Number of power pads and their locations

Floor plan tips


While fixing the location of a pin or pad, always consider the surrounding environment with which the block or chip interacts; this avoids routing congestion and also benefits circuit timing
Provide a sufficient number of power/ground pads on each side of the chip for effective power distribution
In deciding the number of power/ground pads, the power report and the IR drop in the design should also be considered
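
As a first-cut illustration of how the power report feeds the pad count, the sketch below divides the supply current by a per-pad current rating. The function names, the 50 mA pad rating and the power figure are assumptions for illustration; the real rating comes from the IO pad library, and IR drop may require more pads than this lower bound.

```python
import math


def supply_current_ma(power_mw, vdd_v):
    """Average supply current in mA from the power report (I = P / V)."""
    return power_mw / vdd_v


def min_power_pad_pairs(current_ma, pad_limit_ma):
    """Smallest number of power/ground pad pairs that keeps each pad
    within its current rating. Ignores IR drop, which may demand more."""
    if pad_limit_ma <= 0:
        raise ValueError("pad current limit must be positive")
    return math.ceil(current_ma / pad_limit_ma)


# Hypothetical values: 1.5 W core power at 1.5 V, 50 mA allowed per pad.
core_current = supply_current_ma(1500, 1.5)
pairs = min_power_pad_pairs(core_current, 50)
print(f"{core_current:.0f} mA needs at least {pairs} power/ground pad pairs")
```

The result is a starting number to distribute across the sides of the chip; the IR-drop analysis then decides whether it is enough.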

Floor plan tips


Flyline analysis should be done while placing the macros
Orientation of these macros forms an important part of floorplanning

Floor plan tips


Avoid spreading standard cells over several areas and creating small placement traps: pockets and isolated regions between the macros that can trap a standard cell and limit routing access
A physical design engineer must focus on having a homogeneous standard cell area with aligned macros

Floor plan tips


Create a standard cell placement blockage at the corners of macros, because this area is more prone to routing congestion
Also create a standard cell placement blockage in long thin channels between macros
Avoid uneven routing resources in the design by using a proper aspect ratio (width / height) for the chip
For designs that have horizontal overflow, cell row separation is increased, which in turn helps increase horizontal routing resources

Floor plan tips


In hierarchical design, cluster-based implementation makes it possible to place the standard cells of a given module in a predefined region
Analog blocks are more susceptible to noise, and signal routes going over such blocks cause signal integrity issues; define routing blockages on all layers for analog blocks
Time and effort put into floorplanning save iterations and make the design cycle faster

More FP tips
At any level, avoid routing that goes against the preferred routing direction for that level
When creating metal rings around cores and blocks, remember to allow room for routing access to pins

More FP tips
When placing blocks, avoid creating four-way intersections in top-level channels; T intersections create much less congestion
This consideration can be critical to leaving the necessary space for routing channels, depending on how much over-the-cell routing is possible
Using flylines can help determine optimized placement and orientation

More FP tips
For placing block-level pins:
  First determine the correct layer for the pins
  Spread out the pins to reduce congestion
  Avoid placing pins in corners where routing access is limited
  Use multiple pin layers for less congestion
Never place cells within the perimeter of hard macros
To keep from blocking access to signal pins, avoid placing cells under power straps unless the straps are on metal layers higher than metal2
Use density constraints or placement-blockage arrays to reduce congestion
Avoid creating any blockage that increases congestion

More FP tips
Supply power and ground to areas where they might be useful for placing buffers or repeaters during the post-placement timing-convergence optimization, and for top-level buffers
Consider grouping multiple instances of any logical hierarchical element to form one hierarchical physical element
Look for logical modules in the RTL design representation that can be grouped into hierarchical blocks
Also group small blocks into one larger block
It is easier to floorplan with same-sized blocks; try to work with midsized blocks
A design partitioned into six to 12 roughly equal-sized blocks is a reasonable candidate for floorplanning
Depending on the package design, you usually want to start the floorplan with the I/Os at the periphery

More tips on FP
Consider parts of the design that are not typical standard cells:
  Memories
  Analog circuitry (PLLs)
  Logic that works with a double-speed clock
  Blocks that require a different voltage
  Exceptionally large blocks
  Unusual design-specific instances (flash)
Place these elements first to ensure that their special needs are accommodated

More tips on FP
If two or more large blocks or other features make a reasonable floorplan impossible, you may have to increase the die size or rearrange the I/Os
If any of the large blocks are soft IP, consider repartitioning that block into smaller pieces
Arrange the rest of the blocks in the remaining space based on their I/Os and power consumption
Avoid placing blocks that consume a lot of power near the center
For average libraries, the utilization is around 70%
A high percentage of registers or hard IP increases the percentage
Large numbers of multiplexers or other small, pin-dense cells decrease the percentage
Run initial synthesis to find out how big the blocks are
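
Once initial synthesis has reported the cell area, the utilization figure above gives a first estimate of the core size. The sketch below is a rough illustration: the function name and the sample areas are assumptions, and hard macros are simply added at their full area, which is a simplification.

```python
import math


def core_dimensions_um(cell_area_um2, macro_area_um2, utilization, aspect_ratio):
    """Return (width, height) of the core in um.

    cell_area_um2: standard cell area reported by initial synthesis
    macro_area_um2: total hard macro area (assumed to be fully utilized)
    utilization: target standard cell utilization, e.g. 0.70
    aspect_ratio: width / height
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    core_area = cell_area_um2 / utilization + macro_area_um2
    height = math.sqrt(core_area / aspect_ratio)
    width = aspect_ratio * height
    return width, height


# Hypothetical block: 700,000 um2 of cells, 600,000 um2 of macros, square core.
w, h = core_dimensions_um(700_000, 600_000, utilization=0.70, aspect_ratio=1.0)
print(f"core is about {w:.0f} um x {h:.0f} um")
```

Lowering the utilization for a pin-dense block, or raising it for a register-heavy one, changes only the `utilization` argument, so the effect on the block dimensions is easy to compare.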

POWERPLAN

Power planning

Power plan guidelines


Core power
Based on the availability of routing resources, decide how the standard cell rails are to be butted
Plan the core ring and mesh / strap widths and separation based on the power estimate and the metal layer chosen for the power network
Identify the high-power macros and high-speed blocks in the design
Enhance the power supply for the high-power circuits by creating an additional ring around them
Do a pre-placement power analysis and adjust the mesh widths and separation, or add meshes, wherever a larger IR drop is anticipated
Decide whether a placement blockage is necessary below the power network
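
The link between strap width and IR drop can be sketched from the sheet resistance of the chosen layer (R = Rsheet * length / width). This is a simplified illustration only: the function name and the numbers are assumptions, and a real power analysis works on the whole mesh rather than a single strap carrying all the current.

```python
def strap_ir_drop_mv(current_ma, sheet_res_ohm_per_sq, length_um, width_um):
    """IR drop along one strap, treating it as a uniform resistor that
    carries the full current end to end (a pessimistic simplification)."""
    squares = length_um / width_um            # number of squares of metal
    resistance_ohm = sheet_res_ohm_per_sq * squares
    return current_ma * resistance_ohm        # mA * ohm = mV


# Hypothetical strap: 10 mA over 1000 um on a 0.05 ohm/sq layer.
for width in (5, 10, 20):
    drop = strap_ir_drop_mv(10, 0.05, 1000, width)
    print(f"width {width:>2} um -> {drop:.0f} mV")
```

In this model the drop is inversely proportional to the width, so the same reduction can come from a wider strap or from additional straps that share the current.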

Power-plan guidelines
Place filler cells before cell placement so that the rails are formed correctly
Do a power-only DRC after completing the power plan and before placement
Do metal slotting for the thick power network prior to placement
Plan for power cut diodes wherever isolation is required between two power domains (e.g. analog and digital)
Plan for power-on sequence cells as required in the design

Power-plan guidelines
Over-design of the power network results in a suboptimal die size
Placement blockages under the power mesh result in congestion
Select higher metal layers, which have higher current density, for the power network
Higher metal layers used for power also act as a heat sink
If the power network is inadequate, IR and EM violations can be expected
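
The electromigration side of these guidelines can be illustrated by checking strap widths against a per-layer current density limit. The sketch assumes the limit is expressed as allowed current per um of wire width; the layer names, limits and strap data are hypothetical, and real limits come from the design rules.

```python
def min_em_width_um(current_ma, jmax_ma_per_um):
    """Narrowest wire width that keeps current density within the EM limit.
    jmax_ma_per_um is the allowed current per um of width for the layer."""
    return current_ma / jmax_ma_per_um


def em_violations(straps, jmax_by_layer):
    """Return names of straps whose width is below the EM minimum.
    straps: iterable of (name, layer, width_um, current_ma) tuples."""
    return [name for name, layer, width_um, current_ma in straps
            if width_um < min_em_width_um(current_ma, jmax_by_layer[layer])]


# Hypothetical limits and straps; real limits come from the design rules.
jmax = {"M2": 1.0, "M6": 4.0}
straps = [("vdd_m2", "M2", 4.0, 8.0), ("vdd_m6", "M6", 4.0, 8.0)]
print(em_violations(straps, jmax))
```

With these made-up limits, the same width and current pass on the upper layer and fail on the lower one, which is the reason for preferring layers with a higher current density rating.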

Powerplan guidelines
Fill any open spaces with power-mesh metal
Make sure the extra metal does not push signal wires closer together, which would increase capacitance, power consumption, and signal-integrity problems
Floorplanning and power planning constitute an integrated process

Power planning guidelines


If possible, metal width should be limited to avoid the need for metal slotting
Power and ground rings should be created around any hard macro to enable orientation independence and eliminate the need for the chip's power structure to conform to the macro's power structure
Once the power rings have been established, power and ground must be routed to the standard cell rows
The lowest horizontal metal layer should be used for these additional rails
Insert filler cells temporarily to get a complete grid; after insertion of the rails, the filler cells are removed
The rail spacing is consistent with the standard cell height, but the designer must specify the rail width
Straps and trunks distribute power across the chip and represent the most important means to address specific IR drop issues
Designers must determine the appropriate spacing, width, and layer of these straps and trunks
It is better to use many thin routes rather than fewer wide routes, especially in the lowest metal layers, to improve overall routability

Placement

PD Placement flows

PD placement

Placement guidelines
In order to meet tight design specifications under schedule and design resource constraints, the decision whether to implement a specific block in a semi-custom flow or an ASIC flow has to be made at the early definition stages. Each datapath is first synthesized in a standard ASIC flow to check timing and power feasibility. Only where the results do not meet the design target is it necessary to implement the block in the semi-custom design flow.

Placement
It is easier to analyze and work at block level for complex designs, even if a hierarchical flow is not required
It helps in identifying issues and bottlenecks at the block level
It eases the design by allowing constraint relaxation where margins are available
Flat implementation gives better results than hierarchical
For block-level analysis, block-level constraints need to be generated separately

Placement
Even for flat designs, look at the RTL hierarchy for grouping datapaths
Where constraints are stringent, consider creating cluster groups and regions for their placement
For boundary scan cells, create space close to the IO pads for their placement; this helps reduce routing congestion
Also, for high-fanout inputs, consider buffering at the input pad

Placement
Do a zero-RC analysis upfront to check the consistency of the timing results with the synthesis results
Perform a scan trace and compare it with the DFT report to check the consistency of the scan chain report
If scan tracing gets stuck, get it closed by seeking help from the DFT team
Detach the scan chain even before the pre-placement stage
With the pre-placement results, analyze the trans, cap and fanout numbers and set them optimally for the flow
Consider grouping the MBIST controller logic close to the corresponding memory macros
During in-place optimization runs, study the convergence with respect to the optimization options
Suitably order the sequence of optimization options and the number of iterations required; this yields better results and reduces run time
Mark the optimized placements as don't touch when working with subsequent optimizations
It is good practice to do a DRC/LVS check after each optimization

Placement
Set the setup margins to a small positive value (a few ps); this takes care of CTS degradation (non-ideal clock)
Alternatively, set a clock uncertainty value for the clocks to account for CTS needs
Enabling hold timing checks during placement is a waste of time
Check the placement against both functional and DFT constraints
Choose an appropriate report summary so that report generation time and report size are minimized
Try to achieve timing closure at the placement stage; legal placement of the cells happens during placement
Non-convergence at the placement stage calls for a floorplan change
No significant timing improvement is expected in the subsequent stages
Too large an underperformance may need a re-look at the design or consideration of a better library
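
The two ways of reserving room for CTS mentioned above, a positive setup margin or a clock uncertainty, can be seen in a simple ideal-clock slack calculation. This is a minimal sketch with assumed names and numbers; it is not how any particular STA tool reports slack, and either knob can be left at zero.

```python
def setup_slack_ps(period_ps, data_arrival_ps, setup_time_ps, uncertainty_ps=0):
    """Setup slack with an ideal clock (zero skew, zero insertion delay).
    The uncertainty term reserves room for the clock tree built later."""
    return period_ps - uncertainty_ps - setup_time_ps - data_arrival_ps


def meets_placement_target(slack_ps, margin_ps):
    """True when the slack covers the extra margin kept for CTS degradation."""
    return slack_ps >= margin_ps


# Hypothetical path: 2000 ps clock, 1800 ps data arrival, 80 ps setup time.
slack = setup_slack_ps(2000, 1800, 80, uncertainty_ps=100)
print(slack, meets_placement_target(slack, margin_ps=10))
```

A path that only just meets timing with zero margin and zero uncertainty at placement is likely to fail once the real clock tree adds skew, which is the point of closing timing with some reserve at this stage.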

PD Journey
Fanout-based delay modeling is no longer valid
Layout-based delay modeling is needed
The latest trend is physical virtual prototyping

Power management

Multi-power domain design

Power reduction

Power optimized library


The VIP PowerSaver library includes cells specifically optimized for high-performance, low-voltage operation as well as the level shifters and isolation gates that allow the designer to create electrically independent power islands capable of operating at different voltage levels and frequencies. The current library contains over 700 cells for the TSMC CL013G (130 nm) process, characterized for operation at 0.8, 1.0 and 1.2 volts. Libraries for additional processes will become available over time.

Front-end and Back-end


The more front-end teams consider the constraints imposed by implementation-level physical effects, the fewer iterations are likely to be required to achieve closure.

Considerations for Die-Size


Placement optimization simultaneously adjusts block placement, aspect ratio, rotation and mirroring to achieve a realistic die size estimate
A process shrink may cause the maximum power density limits to be exceeded
Average power dissipation is at least as important as maximum power dissipation for the overall competitiveness of a processor product
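
The power density concern with a process shrink can be illustrated with simple arithmetic: area falls with the square of the linear shrink, so density rises unless power falls at least as fast. The sketch below uses assumed function names and made-up numbers, and the power scaling factor is an input rather than something the code predicts.

```python
def power_density_w_per_mm2(power_w, area_mm2):
    """Average power density of the die."""
    return power_w / area_mm2


def density_after_shrink(power_w, area_mm2, linear_shrink, power_scale):
    """Power density after a shrink.
    linear_shrink: ratio of new to old feature size (area scales with its square)
    power_scale: ratio of new to old power (an input, not predicted here)"""
    new_area = area_mm2 * linear_shrink ** 2
    return power_density_w_per_mm2(power_w * power_scale, new_area)


# Hypothetical chip: 10 W on 100 mm2; 0.7x linear shrink, power falls to 0.8x.
before = power_density_w_per_mm2(10, 100)
after = density_after_shrink(10, 100, linear_shrink=0.7, power_scale=0.8)
print(f"before {before:.3f} W/mm2, after {after:.3f} W/mm2")
```

In this example the total power drops, yet the density increases, so a limit that was met before the shrink has to be checked again.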

Methodology evolution

Utilization Area
Note that the utilization ratio is the total cell area divided by the bounding-box area.
On the one hand, the layout area has to leave enough room for physical synthesis to buffer interconnects, size drivers and restructure logic, and for CTS and routing to route the design; not leaving enough room will either hurt timing or cause the route to fail.
On the other hand, the layout area should be minimized to minimize the die size and thus the cost of the chip.

Therefore it is hard for the user to come up with the right utilization ratios.
