document
document
Abstract—In this paper, we present DeFer—a fast, high-quality, hierarchical framework is preferred by modern application-
scalable, and nonstochastic fixed-outline floorplanning algorithm. specific integrated circuit designs. Nevertheless, fixed-outline
DeFer generates a nonslicing floorplan by compacting a slicing
floorplanning has been shown to be much more difficult, com-
floorplan. To find a good slicing floorplan, instead of searching
through numerous slicing trees by simulated annealing as in pared with classical outline-free floorplanning, even without
traditional approaches, DeFer considers only one single slicing considering wirelength optimization [3].
tree. However, we generalize the notion of slicing tree based
on the principle of deferred decision making (DDM). When
two subfloorplans are combined at each node of the generalized A. Previous Work
slicing tree, DeFer does not specify their orientations, the left– Simulated annealing has been the most popular method of
right/top–bottom order between them, and the slice line direction. exploring good solutions on the fixed-outline floorplanning
DeFer even does not specify the slicing tree structure for small
subfloorplan. In other words, we are deferring the decisions on problem. Using sequence pair representation, Adya et al. [4]
these factors, which are specified arbitrarily at an early step in modified the objective function, and proposed a few new
traditional approaches. Because of DDM, one slicing tree actually moves based on slack computation to guide a better local
corresponds to a large number of slicing floorplan solutions, all of search. To improve the floorplanning scalability and initially
which are efficiently maintained in one single shape curve. With optimize the interconnections, in [5] the original circuit is
the final shape curve, it is straightforward to choose a good
floorplan fitting into the fixed outline. Several techniques are first cut into multiple partitions by a min-cut partitioner.
also proposed to further optimize the wirelength. For both fixed- Simultaneously, the chip region is split into small bins. After
outline and classical floorplanning problems, experimental results that, the annealing-based floorplanner [4] performs fixed-
show that DeFer achieves the best success rate, the best wirelength, outline floorplanning on each partition within its associated
the best runtime, and the best area on average compared with bin. In [6], Chen et al. adopted the B*-tree [7] representation
all other state-of-the-art floorplanners.
to describe the geometric relationships among modules, and
Index Terms—Deferred decision making, fixed outline, floor- performed a novel three-stage cooling schedule to speed up
planning, layout optimization. the annealing process. In [8] a multilevel partitioning step
is performed beforehand on the original circuit. Different
I. Introduction from [5], the annealing-based fixed-outline floorplanner is
performed iteratively at each level of the multilevel framework.
F LOORPLANNING has become a very crucial step in
modern very large scale integration (VLSI) designs. As
the start of physical design flow, floorplanning not only deter-
By enumerating the positions in sequence pairs, Chen et al. [9]
applied insertion after remove (IAR) to accelerate the simu-
lated annealing. As a result, both the runtime and success rate1
mines the top-level spatial structure of a chip, but also initially
are enhanced dramatically. Recently, using Ordered Quadtree
optimizes the interconnections. Thus, a good floorplan solution
representation, He et al. [10] adopted quadratic equations to
among circuit modules definitely has a positive impact on the
solve the fixed-outline floorplanning problem.
placement, routing, and even manufacturing. In the nanometer
All of the above techniques are based on simulated anneal-
scale era, the ever-increasing complexity of integrated circuits
ing. Generally, the authors tried various approaches to improve
(ICs) promotes the prevalence of hierarchical design. However,
the algorithm efficiency. But one common drawback is that
as pointed out by Kahng [1], classical outline-free floorplan-
these techniques do not have a good scalability. They become
ning [2] cannot satisfy such requirements of modern designs.
quite slow when the size of circuits grows large, e.g., 100
In contrast with this, fixed-outline floorplanning enabling the
modules. Additionally, the annealing-based techniques always
Manuscript received January 31, 2009; revised May 31, 2009 and October have a hard time handling circuits with soft modules, because
7, 2009. Current version published February 24, 2010. This work was partially they need to search a large solution space, which can be time-
supported by International Business Machines, Faculty Award, and National
Science Foundation under Grant CCF-0540998. This paper was recommended consuming.
by Associate Editor L. Scheffer. Some researchers have adopted nonstochastic methods.
The authors are with the Department of Electrical and Computer Engi- In [11], a slicing tree is first built up by recursively partitioning
neering, Iowa State University, Ames, IA 50010 USA (e-mail: zijunyan@
iastate.edu; cnchu@iastate.edu). the original circuit until each leaf node contains at most
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org. 1 Success rate is defined as the ratio of the number of runs resulting a layout
Digital Object Identifier 10.1109/TCAD.2010.2041850 within fixed die, to the total number of runs.
0278-0070/$26.00
c 2010 IEEE
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
368 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 3, MARCH 2010
two modules. Then the authors rely on various heuristics to nodes, both orders between them, and both horizontal and
determine the geometry relationships among the modules and vertical slice lines. Note that the work in [17] and [18]
output a final floorplan solution. Sassone et al. [12] proposed only generalized the orientation for individual module
an algorithm containing two phases. First the modules are and the slice line direction, respectively. In order to carry
grouped together only based on connectivity. Second the mod- out the combination of generalized slicing trees, we also
ules are packed physically by a row-oriented block packing extend original shape curve operation to curve Flipping
(ROB) technique which organizes the modules by rows based and curve Merging.3
on their dimensions. But this technique cannot handle soft • Enumerative Packing: To defer the decision on the slicing
modules. In [13], Zhan et al. applied a quadratic analytical tree structure within small subfloorplan, we develop the
approach similar to those used for placement problems. To enumerative packing (EP) technique. It enumerates all
generate a nonoverlapping floorplan, the quadratic approach possible slicing structures, and builds up one shape curve
relies on a legalization process. However, this legalization capturing all slicing layouts among the modules of small
is very difficult for circuits with big hard macros. Cong subfloorplan. The naive enumeration is very expensive
et al. [14] presented an area-driven look-ahead floorplanner in terms of CPU time and memory usage. But using the
in a hierarchical framework. Two main techniques are used technique of dynamic programming, EP can be efficiently
in their algorithm: the ROB and zero-dead space (ZDS). To applied to up to 10 modules.
handle both hard and soft modules, ROB is extended from • Block Swapping and Mirroring: To make the decision
[12]. ZDS is used to pack soft modules. But, ROB may on the subfloorplan order (left–right/top–bottom), we
generate a layout with large whitespace when the module sizes adopt three techniques: Rough Swapping, Detailed Swap-
in a subfloorplan are quite different from each other, e.g., a ping [11], and Mirroring. The motivation is to greedily
design with big hard macros. optimize the wirelength. As far as we know, we are
the first proposing the Rough Swapping technique and
B. Our Contributions showing that without Rough Swapping Detailed Swapping
This paper presents a fast, high-quality, scalable, and non- may degrade the wirelength.
stochastic fixed-outline floorplanner called DeFer.2 It can Additionally, we adopt the following three methods to
efficiently handle both hard and soft modules. enhance the robustness and quality of DeFer.
DeFer generates a final nonslicing floorplan by compacting • Terminal Propagation (TP): DeFer accounts for fixed pins
a slicing floorplan. It has been proved in [16] that any by using TP [19] during partitioning process.
nonslicing floorplan can be generated by compacting a slicing • Whitespace-Aware Pruning (WAP): A pruning method is
floorplan. In traditional annealing-based approaches, obtaining proposed to systematically control the number of points
a good slicing floorplan usually takes a long time, because the on each shape curve.
algorithms have to search many slicing trees. By comparison, • High-Level EP: Based on EP, we propose the high-level
DeFer considers only one single slicing tree generated by EP technique to further improve the packing quality.
recursive partitioning. However, to guarantee that a large By switching the strategy of selecting the points on the final
solution space is explored, we generalize the notion of slicing shape curve, we extend DeFer to handle other floorplanning
tree [2] based on the principle of deferred decision making problems, e.g., classical outline-free floorplanning,
(DDM). When two subfloorplans are combined at each node For fixed-outline floorplanning, experimental results on Gi-
of the generalized slicing tree, DeFer does not specify their gaScale Systems Research Center (GSRC) Hard-Block, GSRC
orientations, the left–right/top–bottom order between them, Soft-Block, hybrid blocks (HB) (containing both hard and soft
and the slice line direction. For small subfloorplan, DeFer modules), and HB+ (a hard version of HB) benchmarks show
even does not specify its slicing tree structure, i.e., the skeletal that DeFer achieves the best success rate, the best wirelength,
structure (not including tree nodes) in the slicing tree. In other and the best runtime on average, compared with all other state-
words, we are deferring the decisions on these four factors of-the-art floorplanners. The runtime difference between small
correspondingly: 1) subfloorplan orientation; 2) subfloorplan and large circuits shows DeFer’s good scalability. For classical
order; 3) slice line direction; and 4) slicing tree structure. outline-free floorplanning, using a linear combination of area
Because of DDM, one slicing tree actually represents a large and wirelength as the objective, DeFer achieves 12% better
number of slicing floorplan solutions. In DeFer, all of these so- cost value than Parquet 4.5 with 76× faster runtime.
lutions are efficiently maintained in a single shape curve [17]. The rest of this paper is organized as follows. Section II
With the final shape curve, it is straightforward to choose a describes the algorithm flow. Section III introduces the Gen-
good slicing floorplan fitting into the fixed outline. To realize eralized Slicing Tree. Section IV describes the Whitespace-
the DDM idea, we propose the following techniques. Aware Pruning. Section V describes the Enumerative Pack-
• Generalized Slicing Tree: To defer the decisions on these ing technique. Section VI illustrates the Block Swapping
three factors: 1) subfloorplan orientation; 2) subfloorplan and Mirroring. Section VII introduces the extension of
order; and 3) slice line direction, we generalize the DeFer on other floorplanning problems. Section VIII ad-
original slicing tree. In the generalized slicing tree, one dresses the implementation details. Experimental results are
tree node can represent both orientations of its two child
3 In this paper, all slicing trees and shape curve operation stand for the
2A preliminary version of DeFer was presented in [15]. generalized version by default.
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
YAN AND CHU: DEFER: DEFERRED DECISION MAKING ENABLED FIXED-OUTLINE FLOORPLANNING ALGORITHM 369
Fig. 1. Pseudocode on algorithm flow of DeFer. Fig. 3. Final shape curve with fixed outline and candidate points.
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
370 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 3, MARCH 2010
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
YAN AND CHU: DEFER: DEFERRED DECISION MAKING ENABLED FIXED-OUTLINE FLOORPLANNING ALGORITHM 371
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
372 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 3, MARCH 2010
into the fixed outline, and the first and last points P1 , Pd
out of δ are on the right and top boundary of the fixed
outline [see Fig. 7(b)]. For various points/layouts, Wo is
different. We use the one of P1 to approximate Wo . As
in pruning we always keep the point that is the closest to
(1 + βi )hip , here we can assume h1p+1 = (1 + β1 )h1p . So we
have
Wo = A1 · ((1 + β1 )δ−1 − 1). (8)
M
βi
N
Eventually, the pruning parameters βi returned by WAP are
subject to Ai · + Wcj + Wo ≤ Wb (9)
2 j=1 used to systematically prune the points on the shape curve of
i=1
each subpartition i. Best of all, we do not need to worry about
Wo = A1 · ((1 + β1 )δ−1 − 1) the over-pruning and degradation of the packing quality.
βi ≥ 0 i = 1, . . . , M.
V. Enumerative Packing
C. Solving WAP In order to defer the decision on the slicing tree structure,
To solve WAP (9), we relax the constraint related with Wb we propose the EP technique that can efficiently enumerate all
by Lagrangian relaxation. Let possible slicing layouts among a set of modules, and finally
λ be the nonnegative Lagrange
multiplier, and W = Wb − N j=1 Wcj − Wo
keep all of them into one shape curve.
√
ln( Ai /h )
M i M
βi
Lλ (βi ) = 1
+λ·( Ai · − W ) A. Naive Approach of Enumeration
ln(1 + βi ) 2
i=1 i=1 In this section, we plot out a naive way to enumerate
LRS : Minimize Lλ (βi ) all slicing packing solutions among n modules. We first
subject to βi ≥ 0 i = 1, . . . , M. enumerate all slicing tree structures and then enumerate all
permutations of the modules. Let L(n) be the number of
LRS is the Lagrangian relaxation subproblem associated with different slicing tree structures for n modules. So we have
λ. Let the function Q(λ) be the optimal value of LRS. The n
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
YAN AND CHU: DEFER: DEFERRED DECISION MAKING ENABLED FIXED-OUTLINE FLOORPLANNING ALGORITHM 373
TABLE I
Comparison on # of ‘‘⊕’’ Operation
n # of ⊕ # of ⊕
by Naive Approach With DP
2 1 1
3 6 6
4 45 25
5 400 90
6 4155 301
7 49 686 966 Fig. 9. Illustration of high-level EP.
8 674 877 3025
9 10 295 316 9330
10 174 729 015 28 501
However, if the set contains too many modules, two prob-
lems appear in EP: 1) the memory to store results from subsets
slicing tree structure. Finally, these curves from all slicing tree can be expensive; and 2) since the interconnections among the
structures are merged into one curve that captures all possible modules are not considered, the wirelength may be increased.
slicing layouts among these n modules. To show the amount Due to these two concerns, in the first step of DeFer, we
of computations in this process, we list the number of “⊕” apply hMetis to recursively cut the original circuit into multiple
operations for different numbers of modules in the second smaller subcircuits. This process not only helps us to cut
column of Table I. down the number of modules in each subcircuit, but initially
optimizes the wirelength as well. Later on as applying EP
B. Enumeration by Dynamic Programming on each subcircuit, the wirelength would not become a big
concern, because this is only a locally packing exploration
Table I shows that the naive approach can be very expensive among a small number of modules. In other words, in the
in both runtime and memory usage. Alternatively, we notice spirit of DDM, instead of deferring the decision on the slicing
that the shape curve for a set of modules (M) can be defined tree structure among all modules in the original circuit, first we
recursively as follows: fix the high-level slicing tree structure among the subcircuits
S(M) = MERGE (S(A) ⊕ S(B)). (11) by partitioning, and then defer the decision on the slicing tree
A⊂B,B= M−A structure among the modules within each subcircuit.
S(M) is the shape curve capturing all slicing layouts among
D. High-Level EP
modules in M, MERGE() is similar to the Merging in Fig. 5(c),
but operates on shape curves from different sets. In the modern system-on-a-chip design, the usage of intel-
Based on (11), we can use dynamical programming (DP) to lectual property becomes more and more popular. As a result,
implement the shape curve generation. First of all, we generate a circuit usually contains numbers of big hard macros. Due
the shape curve representing the outline(s) of each module. to the big size differences from other small modules, they
For hard modules, there are two points7 in each curve. For may produce some large whitespace. For example in Fig. 9(a),
soft modules, only several points from each original curve are after partitioning, the original circuit has been cut into four
evenly sampled.8 And then starting from the smallest subset of subcircuits A, B, C, and D. A contains a big hard macro.
modules, we proceed to build up the shape curves for the larger Respecting the slicing tree structure of T4b , you may find that
subsets step by step, until the shape curve S(M) is generated. no matter how hard EP explores various packing layouts within
Since in this process the previously generated curves can be A or B, there is always a large whitespace, such as Q, in the
reused for building up the curves of larger subsets of modules, parent subfloorplan. This is because the high-level slicing tree
many redundant computations are eliminated. After applying structure among subcircuits has been fixed by partitioning, so
DP, the resulted numbers of “⊕” operations are listed in the that some small subcircuit is forced to combine with some big
third column of Table I. subcircuit. Thus, to solve this problem, we need to explore
other slicing tree structures among the subcircuits.
To do so, we apply EP on a set of subfloorplans, instead
C. Impact of EP on Packing
of a set of modules. As the input of EP is actually a set of
To control the quality of packing in EP, we can adjust shape curves, and shape curves can represent the shape of
the number of modules in the set. Consequently the impact both subfloorplans and modules, it is capable of using EP to
on packing is: The more modules a set contains, the more explore the layouts among subfloorplans. In Fig. 9(b), EP is
different slicing tree structures we explore, the more slicing applied on the four shape curves coming from subfloorplans
layout possibilities we have, and thus the better quality of A, B, C, and D, respectively. So all slicing tree structures (T4a
packing we will gain at the top level. and T4b ) and permutations among these subfloorplans can be
7 One completely explored. Eventually one tightly-packed layout can
point if the hard module is a square.
8 The number of sampled points on the whole curve is determined by be chosen during back-tracing step [see Fig. 9(c)].
Ai
A0 ρ + 4, where Ai is the area of soft block i, A0 is the total block area, Before we describe the criteria of triggering high-level EP,
and ρ is a constant (ρ = 10 000 by default). some concepts are introduced here as follows.
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
374 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 3, MARCH 2010
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
YAN AND CHU: DEFER: DEFERRED DECISION MAKING ENABLED FIXED-OUTLINE FLOORPLANNING ALGORITHM 375
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
376 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 3, MARCH 2010
Fig. 16. Circuit n300 layouts generated by DeFer. (a) n300 hard block γ = 10%. (b) n300 hard block γ = 10%. (c) n300 hard block γ = 10%. (d) n300
soft block γ = 1%.
TABLE II
Comparison on GSRC Hard-Block Benchmarks [22] (γ = 10%)
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
YAN AND CHU: DEFER: DEFERRED DECISION MAKING ENABLED FIXED-OUTLINE FLOORPLANNING ALGORITHM 377
TABLE III
Comparison on GSRC Soft-Block Benchmarks [22] (γ = 1%)
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
378 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 3, MARCH 2010
TABLE IV
Comparison on HB Benchmarks [24] (γ = 10%)
Circuit #Soft./#Hard. Aspect PATOMA [14] Capo 10.5 [5] DeFer #Valid Point
/#Net. Ratio Suc% WL (e+06) Time (s) Suc% WL (e+06) Time (s) Suc% WL (e+06) Time (s) /#Total Point
665 1 100% 2.84 7.04 0% – 183 100% 2.66 1.44 16/1571
ibm01 /246 2 0% – – 0% – 977 100% 2.70 1.28 11/1482
/4236 3 100% 5.60 1.66 0% – 696 100% 2.82 1.30 12/1490
1200 1 0% – – 0% – 456 85% 6.55 14.48 6/2348
ibm02 /271 2 0% – – – – > 2 days 100% 6.21 3.33 7/1161
/7652 3 0% – – 0% – 3726 100% 6.29 3.52 10/1144
999 1 100% 12.59 5.42 100% 10.70 566 100% 8.77 3.60 59/2684
ibm03 /290 2 100% 12.94 5.58 100% 12.01 1874 100% 8.89 3.49 40/2503
/7956 3 0% – – 0% – 2028 100% 8.99 3.59 44/2630
1289 1 0% – – 0% – 2752 100% 8.94 3.04 4/1492
ibm04 /295 2 0% – – 100% 17.77 5253 100% 8.96 3.12 9/1514
/10 055 3 0% – – 100% 16.32 2262 100% 9.64 6.31 12/2685
564 1 100% 12.27 14.21 0% – 458 100% 12.61 3.55 46/3369
ibm05 /0 2 100% 12.60 13.68 0% – 358 100% 12.73 3.52 46/3371
/7887 3 100% 13.19 13.85 0% – 411 100% 13.45 3.53 46/3371
571 1 0% – – 0% – 235 100% 7.87 3.66 53/2187
ibm06 /178 2 0% – – 0% – 592 100% 7.76 3.66 41/2235
/7211 3 0% – — 0% – 2831 100% 8.91 3.60 36/2196
829 1 0% – – 0% – 1094 100% 13.81 3.87 12/1527
ibm07 /291 2 100% 24.64 7.85 0% – 1270 100% 13.91 4.48 22/1625
/11 109 3 100% 24.34 8.68 0% – 2274 100% 14.32 4.26 18/1590
968 1 0% – – 0% – 2527 100% 13.95 5.44 15/1333
ibm08 /301 2 0% – – 0% – 1110 100% 14.16 5.40 17/1290
/11 536 3 0% – – 0% – 1958 100% 14.43 5.55 19/1309
860 1 0% – – 0% – 2273 100% 12.85 2.60 3/1495
ibm09 /253 2 0% – – 0% – 2670 100% 12.57 3.77 17/1486
/11 008 3 0% – – 100% 34.48 6652 100% 12.98 3.54 14/1486
809 1 100% 48.47 21.71 0% – 2353 100% 33.25 11.63 9/2576
ibm10 /786 2 0% – – Crashed Crashed Crashed 100% 34.23 18.00 14/2897
/16 334 3 0% – – 100% 53.64 2014 100% 36.59 16.52 9/2725
1124 1 100% 20.87 33.87 0% – 8070 100% 21.99 4.84 12/2218
ibm11 /373 2 0% – – 0% – 4732 100% 22.13 4.96 8/2207
/16 985 3 0% – – 0% – 2245 100% 22.83 4.67 7/2174
582 1 0% – – 0% – 3085 100% 29.72 10.95 20/2909
ibm12 /651 2 0% – – 0% – 864 100% 31.53 7.71 18/3011
/11 873 3 0% – – 0% – 19 952 100% 32.16 4.59 8/1957
530 1 0% – – 0% – 3401 100% 25.92 6.03 12/2553
ibm13 /424 2 100% 43.81 9.84 0% – 3662 100% 25.46 3.79 10/2048
/14 202 3 0% – – 0% – 3201 100% 26.47 3.83 8/2095
1021 1 100% 71.87 23.59 0% – 4253 100% 50.83 9.69 30/2976
ibm14 /614 2 100% 55.99 35.65 0% – 10 373 100% 51.67 9.70 34/2971
/26 675 3 100% 61.65 35.12 0% – 4976 100% 53.71 9.70 36/2971
1019 1 0% – – 0% – 3634 100% 64.18 9.71 25/1651
ibm15 /393 2 0% – – 0% – 6827 100% 63.17 9.13 19/1580
/28 270 3 0% – – 0% – 2902 100% 66.06 9.46 20/1623
633 1 0% – – Crashed Crashed Crashed 100% 56.88 16.79 18/3823
ibm16 /458 2 100% 88.33 16.55 0% – 8928 100% 58.55 14.55 24/4833
/21 013 3 100% 98.77 22.94 0% – 11 675 100% 59.91 12.84 18/4093
682 1 100% 102.45 41.75 Crashed Crashed Crashed 100% 95.92 10.43 32/3253
ibm17 /760 2 100% 96.46 46.63 0% – 2250 100% 95.48 10.41 27/3252
/30 556 3 100% 98.18 42.45 Crashed Crashed Crashed 100% 100.82 10.42 29/3252
658 1 100% 50.28 38.24 0% – 1083 100% 49.12 7.93 42/3106
ibm18 /285 2 100% 49.74 39.15 0% – 4630 100% 49.29 7.97 41/3128
/21 191 3 100% 52.26 36.97 0% – 5262 100% 51.39 7.97 41/3128
Normalized 0.43 1.28 3.28 0.12 1.72 789.79 1 1 1
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
YAN AND CHU: DEFER: DEFERRED DECISION MAKING ENABLED FIXED-OUTLINE FLOORPLANNING ALGORITHM 379
Fig. 17. Circuit ibm03 layouts generated by PATOMA, Capo 10.5, and DeFer (γ = 10% and τ = 2). (a) By PATOMA. (b) By Capo 10.5. (c) By DeFer.
TABLE V
Comparison on HB+ Benchmarks [23]
Circuit White- Aspect PATOMA [14] Capo 10.5 [5] DeFer #Valid Point
space γ Ratio Suc% WL (e+06) Time (s) Suc% WL (e+06) Time (s) Suc% WL (e+06) Time (s) /#Total Point
ibm01 26% 1 100% 4.67 4.44 – – > 2 days 100% 3.09 1.84 120/10 860
ibm02 25% 1 0% – – 100% 7.86 124 100% 6.17 15.28 45/3380
ibm03 30% 1 0% – – 100% 12.75 343 100% 9.19 4.01 102/5020
ibm04 25% 1 0% – – 100% 12.03 147 100% 10.26 14.15 63/5170
ibm06 25% 1 0% – – 100% 10.09 155 100% 8.78 5.01 84/3560
ibm07 25% 1 100% 16.38 23.41 100% 16.41 99 100% 15.48 4.55 12/3780
ibm08 26% 1 0% – – 100% 18.29 284 100% 18.73 19.25 106/5070
ibm09 25% 1 100% 16.62 25.45 100% 17.85 100 100% 16.66 4.22 12/3070
ibm10 20% 1 0% – – 100% 81.27 1685 100% 45.12 6.32 27/6880
ibm11 25% 1 100% 25.86 38.72 100% 28.26 149 100% 26.99 7.07 19/4150
ibm12 26% 1 0% – – 100% 52.46 126 100% 50.17 5.54 69/6880
ibm13 25% 1 100% 36.74 29.08 100% 40.22 299 100% 35.51 5.85 15/3860
ibm14 25% 1 100% 68.30 51.79 100% 73.89 410 100% 64.50 12.01 36/7870
ibm15 25% 1 0% – – 100% 92.79 474 100% 84.29 14.66 182/9900
ibm16 25% 1 100% 95.97 47.14 100% 153.02 595 100% 98.66 8.08 10/5770
ibm17 25% 1 100% 142.41 65.06 100% 146.03 440 100% 144.56 14.70 41/9540
ibm18 25% 1 100% 73.76 47.71 100% 75.92 224 100% 71.86 11.30 44/9160
Normalized 0.53 1.07 4.76 1.00 1.19 46.66 1 1 1
TABLE VI
Comparison on Linear Combination of HPWL and Area
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
380 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 3, MARCH 2010
TABLE VII
Estimation on Contributions of Main Techniques and Runtime Breakdown in DeFer
by 100% and the area of remaining soft macros are reduced X. Conclusion
to preserve the total cell area. As a result, the circuits become As the earliest stage of VLSI physical design, floorplanning
even harder to handle. Due to the same reason, we set has numerous impacts on the final performance of ICs. In
Capo 10.5 to “-SCAMPI” mode, and run it only once. The this paper, we have proposed a fast, high-quality, scalable and
results are shown in Table V. DeFer achieves the 100% success nonstochastic fixed-outline floorplanner DeFer.
rate on all circuits, which is 1.89× better than PATOMA. Based on the principle of Deferred Decision Making, DeFer
Capo 10.5 also achieves 100% success rate, expect for one outperforms all other state-of-the-art floorplanners in every
circuit it takes more than two days to finish. In terms of aspect. It is hard to accurately calculate how much each
the HPWL comparison, DeFer is 7% and 19% better than technique in DeFer contributes to the overall significant im-
PATOMA and Capo 10.5. DeFer is also 5× and 47× faster provement. But we do have a rough estimation in Table VII,
than PATOMA and Capo 10.5. in which we also show the runtime breakdown of DeFer for
Both HB and HB+ benchmarks are considered to be very each set of benchmarks. Note that, the DDM idea is the soul
hard to handle, because these circuits not only contain both of DeFer. Without it, those techniques cannot be integrated in
hard and soft modules, but also big hard macros. As far as we such a nice manner and produce promising results.
know, only the above floorplanners can handle these circuits. Such a high-quality and efficient floorplanner is expected
Obviously, DeFer reaches the best result. We also monitor to handle the increasing complexity of modern designs. The
the memory usage of DeFer on such large-scale circuits, and source code of DeFer and all benchmarks are publicly avail-
observe that the peak memory usage in DeFer is only 53 MB. able at [25]. In the future, we will integrate DeFer into
5) Analysis of Points in DeFer: In Tables II–V, for each placement tools to handle large-scale mixed-size designs.
test case we list the number of valid points (#VP) within
the fixed outline and the total number of points (#FP) in the
final shape curve. Both #VP and #FP are averaged over all
successful runs. We have three observations as follows. Acknowledgment
• As the circuit size grows, #FP increases. The authors would like to thank Dr. I. Markov, S. Chen,
• For the same circuit with various τ, ideally #FP should T.-C. Chen, and the University of California, Los Angeles
be the same. But they are actually different in some test Computer-Aided Design group, for their help with Capo 10.5,
cases. It is because high-level EP reconstructed the final IARFP, IMF, and PATOMA, respectively. They are also grate-
shape curve for some hard-to-pack instances. As you can ful to Dr. T.-C. Wang and the anonymous reviewers for their
see high-level EP can significantly increase #FP, e.g., helpful suggestions and comments on this paper.
ibm12 in Table IV, which means it improves packing
quite effectively.
• Sometimes while other algorithms cannot satisfy the
fixed-outline constraint, #VP of DeFer is more than 100, References
e.g., ibm15 in Table V. This shows DeFer’s superior [1] A. B. Kahng, “Classical floorplanning harmful?,” in Proc. Int. Symp.
packing ability. Phys. Design, 2000, pp. 207–213.
[2] R. H. J. M. Otten, “Efficient floorplan optimization,” in Proc. Int. Conf.
B. Experiments on Classical Outline-Free Floorplanning Comput. Design, 1983, pp. 499–502.
[3] S. N. Adya and I. L. Markov, “Fixed-outline floorplanning through better
For the classical outline-free floorplanning problem, as far local search,” in Proc. Int. Conf. Comput. Design, 2001, pp. 328–334.
as we know, only Parquet 4.5 can handle GSRC benchmarks, [4] S. N. Adya and I. L. Markov, “Fixed-outline floorplanning: Enabling
hierarchical design,” IEEE Trans. Very Large Scale Integrat. Syst.,
so we compare it with DeFer on GSRC Hard-Block bench- vol. 11, no. 6, pp. 1120–1135, Dec. 2003.
marks. The results are averaged over 100 runs. The objective [5] J. A. Roy, S. N. Adya, D. A. Papa, and I. L. Markov, “Min-cut
function is a linear combination of the HPWL and area, which floorplacement,” IEEE Trans. Comput.-Aided Design Integrat. Circuits
Syst., vol. 25, no. 7, pp. 1313–1326, Jul. 2006.
are equally weighted. We add “-minWL” to the Parquet 4.5 [6] T.-C. Chen and Y.-W. Chang, “Modern floorplanning based on B*-
command line. As shown in Table VI, DeFer produces 32% trees and fast simulated annealing,” IEEE Trans. Comput.-Aided Design
less whitespace than Parquet 4.5, with 18% less wirelength. Integrat. Circuits Syst., vol. 25, no. 4, pp. 637–650, Apr. 2006.
[7] Y. C. Wang, Y. W. Chang, G. M. Wu, and S. W. Wu, “B*-Tree: A
Overall, DeFer is 12% better in the total cost, and 76× faster new representation for nonslicing floorplans,” in Proc. Design Automat.
than Parquet 4.5. Conf., 2000, pp. 458–463.
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.
YAN AND CHU: DEFER: DEFERRED DECISION MAKING ENABLED FIXED-OUTLINE FLOORPLANNING ALGORITHM 381
[8] T.-C. Chen, Y.-W. Chang, and S.-C. Lin, “A new multilevel framework [24] HB Floorplan Benchmarks [Online]. Available: http://cadlab.cs.ucla.edu/
for large-scale interconnect-driven floorplanning,” IEEE Trans. Comput.- cpmo/HBsuite.html
Aided Design Integrat. Circuits Syst., vol. 27, no. 2, pp. 286–294, Feb. [25] DeFer Source Code [Online]. Available: http://www.public.iastate.edu/∼
2008. zijunyan/
[9] S. Chen and T. Yoshimura, “Fixed-outline floorplanning: Enumerating
block positions and a new objective function for calculating area costs,”
Jackey Z. Yan received the B.S. degree in automa-
IEEE Trans. Comput.-Aided Design Integrat. Circuits Syst., vol. 27, no.
tion from the Huazhong University of Science and
5, pp. 858–871, May 2008.
Technology, Wuhan, China, in 2006. He is currently
[10] O. He, S. Dong, J. Bian, S. Goto, and C.-K. Cheng, “A novel fixed-
pursing the Ph.D. degree in computer engineering
outline floorplanner with zero deadspace for hierarchical design,” in
at the Department of Electrical and Computer Engi-
Proc. Int. Conf. Comput. Aided Design, 2008, pp. 16–23.
neering, Iowa State University, Ames.
[11] A. Ranjan, K. Bazargan, S. Ogrenci, and M. Sarrafzadeh, “Fast floor-
His research interests include very large scale
planning for effective prediction and construction,” IEEE Trans. Very
integration physical designs, specifically in algo-
Large Scale Integrat. Syst., vol. 9, no. 2, pp. 341–352, Apr. 2001.
rithms for floorplanning and placement, and physical
[12] P. G. Sassone and S. K. Lim, “A novel geometric algorithm for fast wire-
synthesis integrated system-on-a-chip designs.
optimized floorplanning,” in Proc. Int. Conf. Comput. Aided Design,
Mr. Yan’s work on fixed-outline floorplanning was
2003, pp. 74–80.
nominated for the Best Paper Award at the Design Automation Conference
[13] Y. Zhan, Y. Feng, and S. S. Sapatnekar, “A fixed-die floorplanning
in 2008. He received the Ultra-Excellent Student Award from Renesas
algorithm using an analytical approach,” in Proc. Asia South Pacific-
Technology Corp., Tokyo, Japan, in 2005.
Design Automat. Conf., 2006, pp. 771–776.
[14] J. Cong, M. Romesis, and J. R. Shinnerl, “Fast floorplanning by look-
ahead enabled recursive bipartitioning,” IEEE Trans. Comput.-Aided
Design Integrat. Circuits Syst., vol. 25, no. 9, pp. 1719–1732, Sep. 2006. Chris Chu received the B.S. degree from the Uni-
[15] J. Z. Yan and C. Chu, “DeFer: Deferred decision making enabled fixed- versity of Hong Kong, Hong Kong, in 1993, and
outline floorplanner,” in Proc. Design Automat. Conf., 2008, pp. 161– the M.S. and Ph.D. degrees from the University of
166. Texas, Austin, in 1994 and 1999, respectively, all in
[16] M. Lai and D. F. Wong, “Slicing tree is a complete floorplan represen- computer science.
tation,” in Proc. Design, Automat. Test Eur., 2001, pp. 228–232. Since 1999, Chu has been a Faculty with Iowa
[17] L. Stockmeyer, “Optimal orientations of cells in slicing floorplan de- State University, Ames. He is currently an Associate
signs,” Informat. Control, vol. 57, pp. 91–101, May–Jun. 1983. Professor with the Department of Electrical and
[18] G. Zimmerman, “A new area and shape function estimation technique Computer Engineering, Iowa State University. His
for very large scale integration layouts,” in Proc. Design Automat. Conf., research interests include computer-aided design of
1988, pp. 60–65. very large scale integration physical designs, and
[19] A. E. Dunlop and B. W. Kernighan, “A procedure for placement of design and analysis of algorithms.
standard-cell very large scale integration circuits,” IEEE Trans. Comput.- Dr. Chu received the IEEE Transactions on Computer-Aided Design
Aided Design Integrat. Circuits Syst., vol. 4, no. 1, pp. 92–98, Jan. 1985. of Integrated Circuits and Systems Best Paper Award in 1999 for
[20] G. Karypis and V. Kumar, “Multilevel k-way hypergraph partitioning,” his work in performance-driven interconnect optimization. He received the
in Proc. Design Automat. Conf., 1999, pp. 343–348. International Symposium on Physical Design (ISPD) Best Paper Award in
[21] S. Goto, “An efficient algorithm for the 2-D placement problem in 2004 for his work in efficient placement algorithm. He received the Bert Kay
electrical circuit layout,” IEEE Trans. Circuits Syst., vol. 28, no. 1, pp. Best Dissertation Award in 1998–1999 from the Department of Computer
12–18, Jan. 1981. Sciences, University of Texas. He has served on the technical program
[22] GSRC Floorplan Benchmarks [Online]. Available: http://vlsicad.eecs. committees of several major conferences including the Design Automation
umich.edu/BK/GSRCbench/ Conference, the International Conference on Computer-Aided Design, the
[23] A. Ng, I. L. Markov, R. Aggarwai, and V. Ramachandran, “Solving hard ISPD, the International Symposium on Circuits and Systems, the Design,
instances of floorplacement,” in Proc. Int. Symp. Phys. Design, 2006, Automation and Test in Europe, the Asia and South Pacific Design Automation
pp. 170–177. Conference, and the system level interconnect prediction.
Authorized licensed use limited to: Iowa State University. Downloaded on April 29,2010 at 19:31:34 UTC from IEEE Xplore. Restrictions apply.