A Comprehensive Review On The VLSI Design Performance


Available online at www.sciencedirect.com

ScienceDirect
Materials Today: Proceedings 11 (2019) 1001–1009 www.materialstoday.com/proceedings

I2CN_2018

A comprehensive review on the VLSI design performance of different Parallel Prefix Adders
Rakesh S a,b,*, K. S. Vijula Grace a

a Dept. of ECE, Noorul Islam Centre for Higher Education, Thuckalay, Kanyakumari, Tamil Nadu, India
b Dept. of ECE, Mangalam College of Engineering, Ettumanoor, Kottayam, Kerala, India

Abstract

Adders are an important part of digital systems. In VLSI digital circuits, adders must satisfy design constraints such as
low power and high speed. In this paper we present a review of the performance of some conventional adders and parallel adders.
Parallel Prefix Adders (PPA) are considered to be among the fastest adders that have been designed and developed, and they
have been established as among the most efficient circuits for binary addition. These adders, which are also called Carry Tree Adders,
have been found to perform well in VLSI designs. This paper investigates the performance of four different Parallel Prefix
Adders, namely the Kogge Stone Adder (KSA), Brent Kung Adder (BKA), Han Carlson Adder (HCA) and Hybrid Han Carlson
Adder (HHCA). The key contribution of this paper is information about the structure of the Parallel Prefix Adders and their
performance parameters. This paper can serve as a reference for beginners in digital electronics and VLSI who wish to learn
more about Carry Tree Adders.

© 2018 Elsevier Ltd. All rights reserved.


Selection and/or Peer-review under responsibility of International Multi- Conference on Computing, Communication, Electrical &
Nanotechnology: Materials Science.

Keywords: Parallel Prefix Adders; Carry Tree Adders; Kogge Stone Adder; Brent Kung Adder; Han Carlson Adder; Hybrid Han Carlson Adder

* Corresponding author. Tel.: +919995771518


E-mail address: raksarian@yahoo.com

2214-7853 © 2018 Elsevier Ltd. All rights reserved.


1002 Rakesh S / Materials Today: Proceedings 11 (2019) 1001–1009

1. Introduction

Binary addition is the most basic and most frequently used arithmetic operation in microprocessors, digital signal
processors (DSP) and data processing application specific integrated circuits, so the binary adder is a crucial element in
most digital circuit designs. Parallel Prefix Adders
(PPA) are variations of the well known carry look ahead adder (CLA). The two differ in the way their carry
generation block is designed and implemented [1]. The parallel prefix carry look ahead adder was first proposed
some twenty years ago as a means of accelerating n-bit addition in VLSI technology. It has been widely considered to be
the fastest adder and is used for high speed arithmetic circuits in VLSI designs [2]. These adders employ a three-stage
structure: a pre-processing stage, a carry computation stage and finally a post-processing stage, as
shown in Fig. 1.

Fig 1. Structure of a Parallel Prefix Adder

2. Parallel Prefix Adder

Peter M. Kogge et al. developed a parallel algorithm for the solution of recurrence equations [3].

2.1. Kogge Stone Adder

According to Sunil M et al., the Kogge-Stone adder is a parallel prefix form of the carry look-ahead adder. Its carry
generation delay grows as O(log2 N), where N is the number of bits, and it is popularly considered the
fastest adder design. This architecture is widely used for high-performance adder designs in the electronics
industry. In the Kogge-Stone adder, fast parallel computation of carries is achieved at the cost of increased area. The carry
generation block is the most important part of a tree adder, and it consists of three components: the black cell, the
grey cell and the buffer. Black cells compute both generate and propagate signals. Grey cells compute only the
generate signals that are needed to calculate the sum in the post-processing stage. Buffers are
used to control the effect of loading. The work described a novel way of modifying the existing Kogge-Stone
adder by re-routing and black cell reduction to increase the speed of execution. The design removes the
redundant black cells and compensates for the removed cells by rerouting. The
proposed design has a lower delay than the architectures with which it is compared, and it also uses fewer logic
levels. The logic and routing delay is reduced, and speed improves by 9.84%
compared with the normal Kogge-Stone adder. Parallel prefix adders of this type are therefore a good choice in many VLSI
applications where speed is the main constraint [4]. A general structure of a 4 bit Kogge Stone adder is shown in
Fig. 2.

Fig 2. Structure of a 4 bit Kogge Stone Adder

Propagate bit, $P_i = A_i \oplus B_i$ (1)

Generate bit, $G_i = A_i \cdot B_i$ (2)

Figure 3 shows the functions performed by the black circle and grey circle.

Fig 3. Functions performed by the circles

The black circle generates

Group Generate, $G = G_i + P_i \cdot G_{\text{previous}}$ (3)
Group Propagate, $P = P_i \cdot P_{\text{previous}}$ (4)

The grey circle generates

Group Generate, $G = G_i + P_i \cdot G_{\text{previous}}$ (5)
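
Equations (1) to (5) and the three-stage structure of Fig. 1 can be illustrated with a short Python sketch. This is only an illustration, not the implementation used in any of the reviewed works: the function names are hypothetical, the carry-in is assumed to be 0, and the carry computation stage is written as a serial chain of grey cells for readability, whereas a tree adder evaluates the same operator in parallel.

```python
def black_cell(g_i, p_i, g_prev, p_prev):
    """Eqs. (3) and (4): return (group generate, group propagate)."""
    return g_i | (p_i & g_prev), p_i & p_prev


def grey_cell(g_i, p_i, g_prev):
    """Eq. (5): return the group generate only."""
    return g_i | (p_i & g_prev)


def three_stage_add(a, b, n):
    """Add two n-bit unsigned integers using the three stages of Fig. 1."""
    # Stage 1, pre-processing: Eqs. (1) and (2) for every bit position.
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]

    # Stage 2, carry computation: carry[i] is the carry into bit i.
    # A serial chain is used here for clarity (assumption of this sketch).
    carry = [0] * (n + 1)
    for i in range(n):
        carry[i + 1] = grey_cell(g[i], p[i], carry[i])

    # Stage 3, post-processing: sum bit = propagate XOR incoming carry.
    total = 0
    for i in range(n):
        total |= (p[i] ^ carry[i]) << i
    return total | (carry[n] << n)  # carry-out becomes bit n


print(three_stage_add(5, 3, 4))  # 8
```

The tree adders discussed below differ only in how they arrange black and grey cells in stage 2; the first and last stages stay the same.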

Raghumanohar Adusumilli et al. implemented the Kogge Stone Adder (KSA) for 4, 8, 16, 32 and 64 bits using the Verilog
Hardware Description Language. The designs were coded in Xilinx ISE 14.7 and simulated using the
ISim simulator and Altera ModelSim 10.1d. The efficiency of the KSA was compared with the Carry Look Ahead adder
(CLA) and the Carry Skip Adder (CSA) with respect to speed and the number of slice LUTs used. The Kogge Stone Adder
was found to have low logic depth, high node count, and minimal fan-out. The low logic depth and minimal fan-out
correspond to faster performance, but the high node count implies a larger area. The authors regard the Kogge Stone adder as the fastest
design because its depth scales logarithmically: every additional combining stage doubles the number of bits
that can be added. The main issue with the KSA is that the number of wires grows as the number of bits increases.
According to the results, the delay of the KSA is close to that of the CLA for small bit widths, because the wiring
or routing delays are larger in the KSA. For higher-order KSAs the delay is much lower
than that of the other adders, because the next higher-order adder needs only one additional stage of
computation [5].
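
The logarithmic scaling can be made concrete with a small Python model of the Kogge-Stone carry network. This is a behavioural sketch under stated assumptions, not the Verilog design of [5]: the function name is hypothetical, the carry-in is 0, and every prefix operator (black or grey) is counted as one node.

```python
def kogge_stone_add(a, b, n):
    """Add two n-bit unsigned integers; return (sum, levels, nodes)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # Eq. (1)
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]  # Eq. (2)
    G, P = g[:], p[:]
    levels = nodes = 0
    d = 1
    while d < n:
        # One level: every bit i >= d merges with the group ending at i - d.
        new_G, new_P = G[:], P[:]
        for i in range(d, n):
            new_G[i] = G[i] | (P[i] & G[i - d])
            new_P[i] = P[i] & P[i - d]
            nodes += 1
        G, P = new_G, new_P
        levels += 1
        d *= 2  # the span covered by each group doubles per level
    # G[i] is now the carry out of bit i, so bit i of the sum uses G[i - 1].
    total = p[0]
    for i in range(1, n):
        total |= (p[i] ^ G[i - 1]) << i
    return total | (G[n - 1] << n), levels, nodes


print(kogge_stone_add(40000, 30000, 16))  # (70000, 4, 49)
```

Doubling the width from 16 to 32 bits adds one level in this model, while the node count grows faster than the width, which is the area and wiring cost noted above.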

Nurdiani Zamhari et al. have stated that the Kogge Stone Adder (KSA) has a regular layout, which makes it
suitable for electronics applications. The second reason for the popularity of the KSA is its minimum fan-out and
minimum logic depth. As a result, the KSA is a fast adder, but it has the drawback of a large area [6-9].

Shanil Mohamed N. et al. proposed a high speed fault tolerant parallel prefix adder. Here, the high speed of
operation is achieved in the Sparse Kogge-Stone adder by incorporating Carry Select Adders instead of Ripple Carry
Adders. Fault tolerance is achieved by using two additional Carry Select Adders. Synthesis and simulation for an
FPGA platform were carried out. The sparse Kogge-Stone adder offered superior performance over a Ripple Carry
Adder for very large bit widths when implemented on an FPGA [10].

Soumya Banerjee et al. proposed a design framework for a family of Parallel Prefix Adders, based on the concept of
sparsity. A generalization of the specific property of Group Segregation is used to achieve the proposed design.
Here the computations of even and odd bits (forming two groups) are mutually disjoint. Furthermore, each of the two
disjoint groups can produce the computations of the other with a reasonable hardware and performance overhead
[11].

Feng Liu et al. presented a comparative study of a new end-around carry (EAC) adder implemented with different
parallel prefix trees, targeting FPGA technology. This new adder is based on the fast 128-bit binary floating-point
EAC adder which has been implemented in the fused multiply-add unit of the IBM POWER6 microprocessor. The
parallel prefix tree implemented in IBM’s EAC adder is a Kogge-Stone tree, which was chosen for its high
performance and its low power consumption [12].

2.2. Brent Kung Adder

Richard P. Brent et al. proposed the design of parallel “carry look ahead adders” suitable for implementation in
VLSI architecture. The basis of the method was the reduction of carry computation to a “prefix” computation [13].
The number of levels in the Brent Kung Adder (BKA) is high, which reduces the operational speed of the adder. The BKA is,
however, power efficient because of its low node count and low area for a large number of input bits. The delay of the BKA
is 2 log2 N − 2 and its area is 2N − 2 − log2 N, where N is the number of input bits. The BKA is known
for its high fan-out capability. It has high logic depth with minimum area [6-9].
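
The two expressions quoted for the BKA can be checked against a small Python model of a Brent-Kung carry network. The sketch below rests on several assumptions that are not stated in the reviewed works: N is a power of two with N ≥ 4, the carry-in is 0, area is counted as the number of prefix operators, and delay is counted as the longest chain of prefix operators feeding any output. The function name is hypothetical.

```python
def brent_kung_add(a, b, n):
    """Add two n-bit unsigned integers (n a power of two).

    Returns (sum, depth, nodes) for a Brent-Kung style prefix network.
    """
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # Eq. (1)
    G = [((a >> i) & (b >> i)) & 1 for i in range(n)]  # Eq. (2)
    P = p[:]
    depth = [0] * n  # operators on the longest path to each position
    nodes = 0

    def combine(i, j):
        """Merge the group ending at i with the adjacent group ending at j < i."""
        nonlocal nodes
        G[i], P[i] = G[i] | (P[i] & G[j]), P[i] & P[j]
        depth[i] = max(depth[i], depth[j]) + 1
        nodes += 1

    d = 1
    while d < n:  # forward tree: group sizes 2, 4, 8, ...
        for i in range(2 * d - 1, n, 2 * d):
            combine(i, i - d)
        d *= 2
    d = n // 4
    while d >= 1:  # inverse tree: fill in the remaining prefixes
        for i in range(3 * d - 1, n, 2 * d):
            combine(i, i - d)
        d //= 2

    total = p[0]
    for i in range(1, n):
        total |= (p[i] ^ G[i - 1]) << i
    return total | (G[n - 1] << n), max(depth), nodes


print(brent_kung_add(40000, 30000, 16))  # (70000, 6, 26)
```

For N = 16 the model gives a depth of 6 = 2 log2 16 − 2 and 26 = 2·16 − 2 − log2 16 operators, in agreement with the expressions above under these counting assumptions.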

K. Babulu et al. investigated the performance of the Brent Kung adder and compared it with the Ripple Carry adder,
Kogge Stone adder and Sparse Kogge Stone adder. The Brent Kung adder was found to have less delay but to occupy
more area. Xilinx ISE 10.1 was used to code the design in Verilog and to synthesize it onto a
Spartan 3E FPGA. The design was simulated using ModelSim [14].

Er. Aradhana Raju et al. presented 4-, 8-, 16- and 32-bit designs of the Carry Look Ahead adder (CLA), Carry Save Adder (CSA),
Kogge Stone Adder (KSA), Sparse Kogge Stone Adder (SKSA), Brent Kung Adder (BKA), Sklansky Adder,
Ladner Fischer Adder (LFA) and Han Carlson Adder (HCA). The adders were categorized and ranked based on
delay, device utilization and cell usage. They were implemented in VHSIC Hardware Description
Language (VHDL) using the Xilinx Integrated Software Environment (ISE) 9.2i Design Suite [15].

Darjn Esposito et al. proposed five novel Variable Latency Adder (VLA) architectures, based on the Brent-Kung,
Ladner-Fischer, Sklansky, Hybrid Han-Carlson, and Carry increment parallel-prefix topologies. They implemented
an efficient error detection and correction technique in the proposed VLAs, which makes them suitable for applications
using 2’s complement representation. The proposed architectures were synthesized using the UMC 65 nm
library for operand lengths ranging from 32 to 128 bits, and their performance was investigated. The results
showed that the proposed Variable Latency Adders performed better than previous speculative and non-speculative
architectures when high speed is required [16]. The structure of a 4 bit Brent Kung adder is given in Fig. 4.

Fig 4. Structure of a 4 bit Brent Kung Adder

2.3. Han Carlson Adder

T. Han et al. presented a combinational construction of a parallel prefix adder [17] using two designs: the Kogge-Stone
construction, which has log2 N stages, and the Brent-Kung construction, which has 2 log2 N − 1 stages, where N
is the number of input bits. The Brent-Kung adder occupies a smaller area for the combinational circuits than the Kogge-Stone
design. The Han-Carlson adder takes the best feature of the Kogge-Stone adder, which is high speed, and the best
feature of the Brent-Kung design, which is low area, and combines both to provide reasonably good speed at low
complexity [18].

Sreenivaas Muthyala Sudhakar et al. explained the 16-bit Han-Carlson adder design, which uses a single
Brent-Kung stage to compute the first stage of the parallel prefixes, followed by three Kogge-Stone stages
and terminated by another Brent-Kung stage for the final stage of the prefix computation. The
number of prefix computation stages for the 16 bit Han-Carlson adder is therefore five (log2 16 + 1), which is one more than the
Kogge-Stone design (log2 16 = 4) for the same word size. However, the number of prefix operations is smaller in the
Han-Carlson design (32 prefix operations) than in the Kogge-Stone design (49 prefix operations). Thus, the Han-Carlson
adder introduces an extra stage of delay but reduces the area used by the adder circuitry compared with the
Kogge-Stone adder. As the number of prefix operations increases, the routing requirement also increases, and this
represents a barrier when designing adders of very large word sizes, where the routing demand may lead to very
long wires and buffer requirements [18-27].
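
The stage and operation counts quoted for the 16-bit Han-Carlson adder can be reproduced with a short Python model of the structure described above: one Brent-Kung style stage on the odd bit positions, Kogge-Stone style stages among the odd positions only, and one final stage that completes the even positions. This is a behavioural sketch, not the design of [18]; the function name is hypothetical, the carry-in is assumed to be 0, and n is assumed to be a power of two.

```python
def han_carlson_add(a, b, n):
    """Add two n-bit unsigned integers; return (sum, stages, prefix_ops)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # Eq. (1)
    G = [((a >> i) & (b >> i)) & 1 for i in range(n)]  # Eq. (2)
    P = p[:]
    stages = ops = 0

    # First stage (Brent-Kung style): each odd bit absorbs the even bit below.
    for i in range(1, n, 2):
        G[i], P[i] = G[i] | (P[i] & G[i - 1]), P[i] & P[i - 1]
        ops += 1
    stages += 1

    # Middle stages (Kogge-Stone style) restricted to the odd positions.
    d = 2
    while d < n:
        new_G, new_P = G[:], P[:]
        for i in range(d + 1, n, 2):
            new_G[i] = G[i] | (P[i] & G[i - d])
            new_P[i] = P[i] & P[i - d]
            ops += 1
        G, P = new_G, new_P
        stages += 1
        d *= 2

    # Final stage: each even bit takes the finished prefix of the odd bit below.
    for i in range(2, n, 2):
        G[i] = G[i] | (P[i] & G[i - 1])
        ops += 1
    stages += 1

    total = p[0]
    for i in range(1, n):
        total |= (p[i] ^ G[i - 1]) << i
    return total | (G[n - 1] << n), stages, ops


print(han_carlson_add(40000, 30000, 16))  # (70000, 5, 32)
```

With n = 16 the model uses five stages and 32 prefix operations, matching the figures in the text; a Kogge-Stone network of the same width needs one stage fewer but 49 operations.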

S. A. H. Ejtahed et al. proposed a new superset adder structure with three major
contributions. First, the design preserves the best performance points while giving up the others in exchange for a
reduction in total area. Second, the MUX block is removed and ROM control logic is introduced for topology
selection. Third, the building blocks of the main core, including the DOT and Semi-DOT cells, are designed with Source
Coupled Logic (SCL) [28]. Figure 5 shows the graph representation of a 16 bit Han-Carlson adder.

Fig 5. Graph representation of prefix computation in a 16-bit Han-Carlson adder [18]

2.4. Hybrid Han Carlson Adder

Sreenivaas Muthyala Sudhakar et al. explored a second type of Han-Carlson design, which has two
Brent-Kung stages at the beginning and two at the end, with Kogge-Stone stages in the middle. It was
designed at the RTL level using Verilog and compiled using Synopsys VCS. The simulations were done using
the FreePDK45nm academic library. PrimeTime and Synopsys Design Vision were used to perform timing and area
analysis. The power analysis used the Power Compiler tool from Synopsys. Hybrid Han-Carlson adders showed a
slightly higher delay (roughly 10%) than the Han-Carlson adders, but the complexity was reduced by around 10% to
18%. The reduced complexity is significant and hence provides a reasonable trade-off for the construction of
adders. In addition, in the era of “interconnect driven design” the wires contribute a greater delay than the
gates. Coupled with the increasing difference in wire length between the Han-Carlson and Hybrid Han-Carlson
designs for larger word sizes, this may lead to a situation in which the Han-Carlson design becomes slower than the Hybrid
Han-Carlson design due to the dominant wire delay [29-30]. The graph representation of a 32 bit Hybrid Han-Carlson
adder is given in Fig. 6.

Fig 6. Graph representation of prefix computation in a 32-bit Hybrid Han-Carlson adder [18]

Table 1. Comparison of Han Carlson and Hybrid Han Carlson Adders [20]

Parameters              Han Carlson Adder    Hybrid Han Carlson Adder
Number of bits          32                   32
Number of gates         940                  855
Area (× 10^-10 m^2)     1299.02              1147.44
Power (µW)              445.98               405.23

Table 1 shows the comparison of the 32 bit Han Carlson adder and Hybrid Han Carlson adder in terms of number of
gates, area and power. Figures 7, 8 and 9 present bar chart comparisons of the two adders.
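
The relative savings implied by Table 1 can be worked out directly from its entries. The following Python snippet is a simple arithmetic sketch; the dictionary keys and the function name are illustrative choices, and the percentages are derived here from the table rather than quoted from [20].

```python
# Values copied from Table 1: (Han Carlson, Hybrid Han Carlson), 32-bit adders.
table1 = {
    "gates": (940, 855),
    "area_1e-10_m2": (1299.02, 1147.44),
    "power_uW": (445.98, 405.23),
}


def reduction_percent(han_carlson, hybrid):
    """Percentage by which the hybrid design lowers a Han Carlson figure."""
    return 100.0 * (han_carlson - hybrid) / han_carlson


savings = {name: round(reduction_percent(*pair), 1) for name, pair in table1.items()}
print(savings)  # {'gates': 9.0, 'area_1e-10_m2': 11.7, 'power_uW': 9.1}
```

On these figures the hybrid design saves roughly 9% of the gates, 12% of the area and 9% of the power for a 32 bit adder.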

Fig 7. Gate count comparison

Fig 8. Area comparison

Fig 9. Power comparison

3. Discussion

Each of the adder structures reviewed in this paper has certain limitations. In the Kogge Stone adder
the node count is high, which implies more area. It also consumes more power due to the larger number of nodes.
The power consumed can be reduced in the Brent Kung adder at the cost of speed. It has a high logic depth, which
contributes to a longer computation time, but the number of nodes is lower and hence the area occupied is
smaller. The Han Carlson Adder, which is a combination of the Kogge Stone Adder and the Brent Kung Adder, has an additional
stage compared with the Kogge Stone Adder. It is slower than the Kogge Stone Adder and occupies more area than the Brent
Kung Adder. In the Hybrid Han Carlson adder, even though the number of prefix operations and the area occupied are
smaller, the delay is higher than in the Han Carlson Adder.

4. Conclusion

This survey has given an overview of different Parallel Prefix Adders, emphasizing the structure of the
adders and their performance parameters. The review mainly focused on four Carry Tree Adders, namely the
Kogge Stone Adder, Brent Kung Adder, Han Carlson Adder and Hybrid Han Carlson Adder. It is evident from the
work that an improvement in some performance parameters comes at the expense of
others. The trade-off between these performance parameters is still a major concern. New techniques have to
be designed and developed to achieve an improvement in all the key performance parameters.

References

[1] Konstantinos Vitoroulis and Asim J. Al-Khalili, “Performance of Parallel Prefix Adders implemented with FPGA technology”, IEEE
Northeast Workshop on Circuits and Systems (NEWCAS), Canada, pp 499 – 501, Aug 2007.
[2] K.Nehru, A.Shanmugam and S.Vadivel, “Design of 64-Bit Low Power Parallel Prefix VLSI Adder for High Speed Arithmetic Circuits”,
Proc. IEEE Int. Conference on Computing, Communication and Applications (ICCCA), 2012.
[3] P. Kogge and H. Stone, “A parallel algorithm for the efficient solution of a general class of recurrence relation”, IEEE transactions on
computers, vol. C-22, no.8, pp.786-793, Aug 1973.
[4] Sunil M, Ankith R D, Manjunatha G D and Premananda B S, “Design and Implementation of Faster Parallel Prefix Kogge-Stone adder”,
Int. Journal of Electrical and Electronic Engineering & Telecommunications, Vol.4, No.1, pp 116-121, 2015.
[5] Raghumanohar Adusumilli and Vinod Kumar K, “Design and Implementation of a high speed 64 bit Kogge-Stone adder using Verilog
HDL”, Int. Journal of Electrical and Electronic Engineering & Telecommunications, Vol.3, No.1, pp 13 – 18, 2014.
[6] Nurdiani Zamhari, Peter Voon, Kuryati Kipli, Kho Lee Chin and Maimun Huja Husin, “Comparison of Parallel Prefix Adder (PPA)”, Proc.
World Congress on Engineering 2012 (WCE 2012) Vol II, London, July 2012.
[7] David H. K. Hoe, Chris Martinez and Sri Jyothsna Vundavalli, “Design and Characterization of Parallel Prefix Adders using FPGAs”,
IEEE 43rd Southeastern Symposium on System Theory (SSST), USA, pp 168 – 172, March 2011.
[8] Jasmine Saini, Somya Agarwal and Aditi Kansal, “Performance, Analysis and Comparison of Digital Adders”, Proc. IEEE Int. Conference
on Advances in Computer Engineering and Applications (ICACEA), Ghaziabad, India, pp 80 – 84, 2015.
[9] Sudheer Kumar Yezerla and B Rajendra Naik, “Design and Estimation of delay, power and area for Parallel prefix adders”, Proc. IEEE
2014 RAECS UIET Panjab University, Chandigarh, March 2014.
[10] Shanil Mohamed.N and Siby T Y, “16-bit Velocious Fault Lenient Parallel Prefix Adder”, Proc. IEEE Int. Conference on Electronics,
Communication and Computational Engineering (ICECCE), Hosur, India, pp 8 – 11, November 2014.
[11] Soumya Banerjee and Wenjing Rao, “A General Design Framework for Sparse Parallel Prefix Adders”, IEEE Computer Society Annual
Symposium on VLSI, Bochum, Germany, pp 231 – 236, July 2017.
[12] Feng Liu, Fariborz F.F, Otmane Ait Mohamed, Gang Chen, Xiaoyu Song and Qingping Tan, “A Comparative Study of Parallel Prefix
Adders in FPGA Implementation of EAC”, IEEE 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools,
Patras, Greece, pp 281 – 285, August 2009.
[13] R.Brent and H.Kung, “A regular layout for parallel adders”, IEEE Transaction on Computers, vol. C-31, no.3, pp 260 – 264, March 1982.
[14] K.Babulu and Y.Gowthami, “Implementation and Performance Evaluation of Prefix Adders using FPGAs”, IOSR Journal of VLSI and Signal
Processing, Vol. 1, Iss.1, pp 51 - 57, 2012.
[15] Er. Aradhana Raju, Richi Patnaik, Ritto Kurian Babu and Purabi Mahato, “Parallel Prefix Adders- A Comparative Study For Fastest
Response”, Proc. IEEE International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, pp 1 – 6,
October 2016.
[16] Darjn Esposito, Davide De Caro and Antonio Giuseppe Maria Strollo, “Variable Latency Speculative Parallel Prefix Adders for Unsigned
and Signed Operands”, IEEE Transactions on Circuits and Systems—I: Regular Papers, Vol. 63, No. 8, pp 1200 – 1209, August 2016.
[17] Tackdon Han and David A. Carlson, “Fast Area - Efficient VLSI Adders”, IEEE 8th Symposium on Computer Arithmetic (ARITH), Italy,
pp 49 – 56, May 1987.
[18] Sreenivaas Muthyala Sudhakar, Kumar P. Chidambaram and Earl E. Swartzlander Jr., “Hybrid Han-Carlson Adder”, IEEE, pp 818 – 821,
2012.
[19] Darjn Esposito, Davide De Caro, Ettore Napoli, Nicola Petra and Antonio Giuseppe Maria Strollo, “Variable Latency Speculative Han-
Carlson Adder”, IEEE Transactions on Circuits and Systems—I: Regular Papers, Vol. 62, No.5, pp 1353 – 1361, May 2015.
[20] Darjn Esposito, Davide De Caro, Michele De Martino and Antonio Giuseppe Maria Strollo, “Variable Latency Speculative Han-Carlson
Adder Topologies”, Proc. IEEE Conference on Ph.D Research in Microelectronics and Electronics (PRIME), Glasgow, UK, pp 45 – 48,
July 2015.
[21] Gayathri.G, Raju S.S and Suresh.S, “Parallel Prefix Speculative Han Carlson Adder”, IOSR Journal of Electronics and Communication
Engineering, Vol. 11, No. 3, pp 38 – 43, Jun 2016.
[22] C.Dhanalakshmi and C. Manjula, “An Area Efficient, Low Power and High Speed Speculative Han-Carlson Adder”, International Journal
of Innovative Research in Science, Engineering and Technology, Volume 5, Special Issue 2, pp 110 – 117, March 2016.
[23] S. Sri Katyayani, Dr.M.Chandramohan Reddy and Murali.K, “Design of Efficient Han-Carlson-Adder”, International Journal of
Innovations in Engineering and Technology, Special issue on ETiCE, pp 69 – 75, 2016.
[24] Deepa Yagain, Vijaya Krishna A, and Akansha Baliga, “Design of High-Speed Adders for Efficient Digital Design Blocks”, International
Scholarly Research Network, Vol. 2012, Article ID 253742, pp 1-9, 2012.
[25] K. Kaarthik and C. Vivek, “Hybrid Han Carlson Adder Architecture for Reducing Power and Delay”, Middle-East Journal of Scientific
Research, 24(Special Issue on Innovations in Information, Embedded and Communication Systems), pp 308-313, 2016.
[26] P.Ramanathan and P.T.Vanathi, “Hybrid Prefix Adder Architecture for Minimizing the Power Delay Product”, World Academy of Science,
Engineering and Technology, 28, PP 1057 – 1061, 2009.
[27] Andrew Beaumont-Smith and Cheng-Chew Lim, “Parallel Prefix Adder Design”, Proc. IEEE Symposium on Computer Arithmetic, Vail,
USA, pp 218 – 225, June 2001.
[28] S. A. H. Ejtahed and M. B. Ghaznavi-Ghoushchi, “Design and Implementation of a Power and Area Optimized Reconfigurable Superset
Parallel Prefix Adder”, IEEE 24th Iranian Conference on Electrical Engineering (ICEE), Shiraz, Iran, pp 1655 – 1660, May 2016.
[29] Swapna Gedam, Pravin Zode and Pradnya Zode, “FPGA Implementation of Hybrid Han-Carlson Adder”, Proc. IEEE Int. Conference on
Devices, Circuits and Systems (ICDCS), 2014.
[30] M. Gomathi, “A Parallel Algorithm for Design of Hybrid Modular Parallel Prefix Adder”, International Journal of Advanced Research
Trends in Engineering and Technology (IJARTET) Vol. 3, Special Issue 2, pp 1164 – 1168, March 2016.
