# Performance Comparison of High-Speed High-Order (n:2) and (n:3) CNFET-Based Compressors

Shima Mehrabi, Keivan Navi, and Omid Hashemipour

*Abstract*—Compressors, as one of the Processing Elements (PEs), are the fundamental building blocks for accumulating partial products during the multiplication process. In this paper, various high-speed high-order compressors such as 6:2 and 7:2 compressors are compared with 6:3 and 7:3 ones in terms of design and hardware requirements. To evaluate their performance, they are analyzed and compared with respect to delay and Power-Delay Product (PDP). The comparison uses a recent transistor-level Full-Adder cell design in Carbon Nanotube Field Effect Transistor (CNFET) technology. Simulations are carried out using Synopsys HSPICE with 32 nm CNTFET technology and a 0.65 V supply voltage. The simulation results demonstrate the superiority of the n:3 structures, which reduce propagation delay and power consumption by around 41% and 8%, respectively.

Index Terms—CNFET, compressor, counter, full adder, multiplier.

#### I. INTRODUCTION

Multiplication is inherently a slow operation, as a large number of partial products are added to produce the product. In applications like digital signal processing, this delay is unacceptable, particularly in the context of ever-increasing throughput requirements [1]. Many studies have addressed the implementation of fast and efficient adders and multipliers, known as the arithmetic building blocks of microprocessors and digital signal processors (DSPs), and choosing appropriate implementation techniques and technologies are two major concerns in today's VLSI circuit design [2].

Fast multipliers are generally composed of three subfunctions: partial product generation, partial product accumulation, and carry-propagating addition [3], [4]. At the first step, Booth encodings are often used to reduce the number of partial products. A summation tree, which is called the Carry Save Adder (CSA), is used in the second sub-function to further reduce the partial products to two rows. The last step is normally fulfilled by a fast carry propagate adder, such as carry look-ahead adder or carry-skip adder [5].
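The three sub-functions can be illustrated with a minimal Python sketch. It assumes unsigned operands and plain AND-array partial products (no Booth encoding), and it models each row of 3:2 compressors as a bitwise carry-save step on whole rows. The function names are illustrative and are not taken from the paper.

```python
def partial_products(a: int, b: int) -> list[int]:
    """Step 1: one shifted copy of `a` for every set bit of `b` (unsigned, no Booth)."""
    rows = []
    shift = 0
    while b >> shift:
        if (b >> shift) & 1:
            rows.append(a << shift)
        shift += 1
    return rows


def carry_save(x: int, y: int, z: int) -> tuple[int, int]:
    """A row of 3:2 compressors: three rows in, one sum row and one carry row out."""
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (x & z) | (y & z)) << 1  # carries move one column left
    return sum_row, carry_row


def multiply(a: int, b: int) -> int:
    """Unsigned multiplication following the three sub-functions."""
    rows = partial_products(a, b)
    # Step 2: accumulate partial products until only two rows remain.
    while len(rows) > 2:
        x, y, z = rows[:3]
        rows = list(carry_save(x, y, z)) + rows[3:]
    # Step 3: final carry-propagating addition of the remaining rows.
    return sum(rows)
```

Each `carry_save` call removes one row without propagating any carry along the row, which is why the slow carry-propagating addition is needed only once, at the end.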

To implement fast multipliers, various architectures of Processing Elements (PEs) have been presented to perform arithmetic addition and multiplication. Compressors, as one of the PEs, are the fundamental building blocks used for accumulating partial products during the multiplication process.

A compressor is a combinational circuit which is mostly used in multipliers to reduce the number of operands while adding terms of partial products. A typical (m:n) compressor takes m equally weighted input bits and produces an n-bit binary number [6]. In other words, it counts the number of 1s in the input and outputs the binary count value. Note that the outputs of the compressor have different power-of-2 weights. The weight of the LSB of the compressor output is the same as the weight of each of the inputs, and the remaining bits have increasingly higher weights.
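The counting behaviour described above can be sketched behaviourally in Python. This is only a functional model under the stated definition, not a gate-level design; the function name `compressor_count` is hypothetical.

```python
def compressor_count(bits: list[int], n_out: int) -> list[int]:
    """Return n_out output bits, LSB first, encoding how many inputs are 1.

    Output bit k has weight 2**k relative to the common weight of the inputs.
    """
    ones = sum(bits)
    if ones >= 1 << n_out:
        raise ValueError("count does not fit in the requested number of output bits")
    return [(ones >> k) & 1 for k in range(n_out)]
```

For example, `compressor_count([1, 0, 1, 1, 0, 1, 1], 3)` counts five 1s and returns `[1, 0, 1]`, that is, binary 101 with the LSB first.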

The simplest and most widely used one is the 3:2 compressor (also known as a Full Adder cell), which has 3 inputs to be summed up and provides 2 outputs. Similarly, a 4:2 compressor can be built from two cascaded 3:2 compressors [7].
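As an illustration, the following Python sketch models a 3:2 compressor and a 4:2 compressor built from two cascaded 3:2 compressors. The wiring shown is the commonly used arrangement and is an assumption here, since the paper does not spell it out; the names `full_adder` and `compressor_4_2` are illustrative.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 compressor: returns (sum, carry) with a + b + c == sum + 2 * carry."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> tuple[int, int, int]:
    """4:2 compressor from two cascaded 3:2 compressors (assumed wiring).

    Returns (sum, carry, cout); carry and cout both have twice the input weight.
    """
    s1, cout = full_adder(x1, x2, x3)   # cout feeds the next column
    s, carry = full_adder(s1, x4, cin)  # cin arrives from the previous column
    return s, carry, cout
```

In this arrangement `cout` does not depend on `cin`, so a row of such cells does not form a carry-propagation chain.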

In this paper, we analyze two high-speed high-order n:2 compressors against their n:3 counterparts, with the design and comparison carried out at the transistor level in Carbon Nanotube Field Effect Transistor (CNFET) technology. Since all the architectures are based on the conventional design of compressors with cascaded Full Adder cells, a recent custom transistor-level Full Adder design is employed [8]. To evaluate the performance of all these compressors, the cascaded models are considered. Moreover, they have been comprehensively compared at a 0.65 V supply voltage and a 100 MHz operating frequency.

The rest of the paper is organized as follows: in section II, the conventional design and architecture of 6:2 and 7:2 compressors and of 6:3 and 7:3 compressors are reviewed. Section III focuses on their cascaded implementations. In section IV, experimental results, analyses and comparisons are presented, and finally section V concludes the paper.

## II. PERFORMANCE COMPARISON OF N: 2 AND N: 3 COMPRESSORS IN PARALLEL MULTIPLIER

## A. Conventional Architectures: n:2 Compressors Versus n:3 Compressors

At present, the most widely used compressors are the 3:2 and 4:2 compressors. Both are ideal for constructing a regularly structured Wallace tree with low complexity [9], but for the compression of a larger number of bits, higher-order compressors are needed. Many studies show that multipliers with high-order compressors have better performance [10].

Conventional structures of 6:2 and 7:2 compressors are shown in Fig. 1 and Fig. 2 respectively [11].

Manuscript received May 20, 2013; July 5, 2013.

Shima Mehrabi is with Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran. (e-mail: sh.mehrabi@srbiau.ac.ir).

Keivan Navi is with Department of Electrical and Computer Engineering, University of California, Irvine, USA (e-mail: knavi@uci.edu).

Omid Hashemipour is with Faculty of Electrical and Computer Engineering, Shahid Beheshti University, G.C., Tehran, Iran (e-mail: hashemipour@sbu.ac.ir).



Fig. 1. Block diagram of 6:2 compressor.



Fig. 2. Block diagram of 7:2 compressor.

As depicted in Fig. 1 and Fig. 2, the 6:2 and 7:2 compressors have six and seven primary inputs, respectively, together with two carry inputs ($C_{in}$) from columns (k-1) and (k-2) [12]. They also generate two primary outputs, denoted by $2^k$ and $2^{k+1}$ to reflect their weights, and two outgoing carries ($C_{out}$), $2^{k+1}$ and $2^{k+2}$, to columns (k+1) and (k+2), respectively. A 6:3 compressor essentially comprises a combinational logic circuit with six inputs and three outputs. Similarly, a 7:3 compressor has seven primary inputs and three outputs. According to Fig. 3, the conventional 6:3 compressor consists of three cascaded Full-Adder cells and one Half-Adder, while the conventional 7:3 compressor consists of four cascaded Full-Adder cells.



Fig. 2. Block diagrams of (a) 6:3 and (b) 7:3 compressors.

As is obvious from Fig. 2, there are no carry-in and carry-out signals for the 6:3 and 7:3 compressors. This is a great advantage, since carry-in and carry-out signals cause extra interconnections, more power dissipation, coupling effects, and routing difficulties. Moreover, the total required hardware has been counted: one Full Adder cell is eliminated in both the 6:3 and 7:3 architectures when compared with their 6:2 and 7:2 counterparts (Table I).

| Compressors | Full Adder | Half Adder |
|-------------|------------|------------|
| 6:2         | 4          | 1          |
| 6:3         | 3          | 1          |
| 7:2         | 5          | 0          |
| 7:3         | 4          | 0          |
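The cell counts listed for the 6:3 and 7:3 compressors can be checked with a small Python sketch. The wiring below is one standard way to cascade three Full Adders plus one Half Adder (6:3) and four Full Adders (7:3); it is an assumption, since the block diagrams themselves are not reproduced here, and all function names are illustrative.

```python
from itertools import product


def half_adder(a, b):
    """Returns (sum, carry) with a + b == sum + 2 * carry."""
    return a ^ b, a & b


def full_adder(a, b, c):
    """Returns (sum, carry) with a + b + c == sum + 2 * carry."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def compressor_7_3(x):
    """Four Full Adders; returns the output bits of weight 1, 2 and 4."""
    s1, c1 = full_adder(x[0], x[1], x[2])
    s2, c2 = full_adder(x[3], x[4], x[5])
    w1, c3 = full_adder(s1, s2, x[6])
    w2, w4 = full_adder(c1, c2, c3)
    return w1, w2, w4


def compressor_6_3(x):
    """Three Full Adders and one Half Adder; returns bits of weight 1, 2 and 4."""
    s1, c1 = full_adder(x[0], x[1], x[2])
    s2, c2 = full_adder(x[3], x[4], x[5])
    w1, c3 = half_adder(s1, s2)
    w2, w4 = full_adder(c1, c2, c3)
    return w1, w2, w4


def counts_correctly(compressor, n_inputs):
    """Exhaustively check that the weighted outputs equal the number of 1s."""
    for bits in product((0, 1), repeat=n_inputs):
        w1, w2, w4 = compressor(bits)
        if w1 + 2 * w2 + 4 * w4 != sum(bits):
            return False
    return True
```

Neither function takes a carry input or produces a carry for a neighbouring column, which reflects the absence of carry-in and carry-out signals noted above.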

## B. Discussions: Performance Analysis on Cascade Models

6:2 and 7:2 compressors are used for partial product reduction [10]. One row of 6:2 or 7:2 compressors can reduce 6 or 7 rows of partial products, respectively, into two rows. Fig. 3 shows how 4 columns out of the 6 rows of the partial product array are reduced by four 6:2 compressors in a $16 \times 16$-bit multiplication process.



Fig. 3. Partial product reduction.

Fig. 4 and Fig. 5 depict implementations of two cascaded 6:2 compressors and 7:2 compressors using 3:2 counters.



Fig. 4. Two cascaded 6:2 compressors in parallel manner.



Fig. 5. Two cascaded 7:2 compressors in parallel manner.

Clearly, the main factor affecting the performance of the second cascaded 6:2 compressor is that all of its carry-in signals must be valid when they are needed. Whether the second primary output signal or the first outgoing carry signal ($C_{out}$) of the first 6<sup>th</sup> column (Fig. 4) is taken as the two input carries, the delay increases in both cases.

In this case, if all the inputs are applied to both compressors at time $\tau_0$, the delay of the second compressor is not just the delay of two Full Adder (FA) cells and one Half Adder (HA). The last FA cell of the second cascaded 6:2 compressor needs both carry-in signals at time $\tau_2$, but they only become available at time $\tau_3$. The second cascaded compressor therefore has to wait for the delayed signals from its neighbor, and its critical path is the delay of $3Sum + 1C_{out}$.

The same scenario applies to two cascaded 7:2 compressors. This problem does not exist for two cascaded 6:3 or 7:3 compressors, because there are no carry-in/carry-out signals to create a dependency between the cascaded stages (Fig. 6).



Fig. 6. Conventional structure of (a) 6:3 and (b) 7:3 compressors.

Table II exhibits a comparison of the maximum delays of all mentioned compressors.

TABLE II: DELAY COMPARISON (N:2 COMPRESSORS VS. N:3 COMPRESSORS)

| Compressors | Delay (based on Sum and Cout) |
|-------------|-------------------------------|
| 6:2         | $3 \cdot t_{p,sum} + 1 \cdot t_{p,cout(HA)}$ |
| 6:3         | $2 \cdot t_{p,sum} + 1 \cdot t_{p,cout(HA)}$ |
| 7:2         | $3 \cdot t_{p,sum} + 1 \cdot t_{p,cout(FA)}$ |
| 7:3         | $2 \cdot t_{p,sum} + 1 \cdot t_{p,cout(FA)}$ |
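Subtracting the delay expressions of Table II pairwise makes the gap explicit. This is only a restatement of the table entries; the symbols $t_{6:2}$, $t_{6:3}$, $t_{7:2}$ and $t_{7:3}$ are introduced here as shorthand for the four critical-path delays.

$$
\begin{aligned}
t_{6:2} - t_{6:3} &= \left(3\,t_{p,sum} + t_{p,cout(HA)}\right) - \left(2\,t_{p,sum} + t_{p,cout(HA)}\right) = t_{p,sum},\\
t_{7:2} - t_{7:3} &= \left(3\,t_{p,sum} + t_{p,cout(FA)}\right) - \left(2\,t_{p,sum} + t_{p,cout(FA)}\right) = t_{p,sum}.
\end{aligned}
$$

Under this model, each n:3 compressor saves one Full Adder Sum propagation delay on its critical path relative to its cascaded n:2 counterpart.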

### **III. SIMULATION RESULTS**

## A. Technology Constraints

Since the dimensional downscaling of CMOS transistors is reaching its fundamental physical limits, a great deal of research has been carried out to find an alternative way to keep following Moore's law [13]. Among the many candidate emerging technologies, CNFET is a promising one for three main reasons. First, its operating principle and device structure are similar to those of current CMOS devices, so the established CMOS design infrastructure can be reused. Second, the CMOS fabrication process can also be reused. Last but not least, CNFET has the best experimentally demonstrated device current-carrying ability to date [14].

According to the reported literature, CNFETs have superior properties such as excellent current handling capability and high thermal conductivity [13]-[15]. Because of their miniaturized dimensions, they can be used as reliable switches that consume much less power than silicon-based devices, in response to the increasing sensitivity to voltage scaling variations in today's VLSI circuit designs. The unique feature that distinguishes CNTFETs from MOSFETs is that the threshold voltage can be controlled by changing the chirality vector or the diameter of the CNTs [16]. This feature eases the design process of VLSI circuits.
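The dependence of the threshold voltage on chirality can be illustrated with a first-order approximation that is commonly used in the CNFET literature. It is not stated in this paper, so the formula and both constants below are assumptions: $D_{CNT} = a\sqrt{n_1^2 + n_1 n_2 + n_2^2}/\pi$ and $V_{th} \approx (\sqrt{3}/3)\, a V_\pi / (e\, D_{CNT})$, with $a \approx 0.249$ nm and $V_\pi \approx 3.033$ eV.

```python
import math

A_LATTICE_NM = 0.249  # assumed graphene lattice constant (about 2.49 angstrom)
V_PI_EV = 3.033       # assumed carbon pi-pi bond energy in the tight-binding model


def cnt_diameter_nm(n1: int, n2: int) -> float:
    """Diameter of a CNT with chirality vector (n1, n2), in nanometres."""
    return A_LATTICE_NM * math.sqrt(n1**2 + n1 * n2 + n2**2) / math.pi


def threshold_voltage_v(n1: int, n2: int) -> float:
    """First-order threshold voltage estimate in volts (half the band gap over e).

    V_PI_EV is in electron-volts, so dividing by the electron charge yields
    the same number expressed in volts.
    """
    return (math.sqrt(3) / 3) * A_LATTICE_NM * V_PI_EV / cnt_diameter_nm(n1, n2)
```

Under these assumptions the estimate is inversely proportional to the diameter, so a tube with a smaller chirality number, such as (13, 0), gets a higher threshold voltage than a (19, 0) tube.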

Moreover, simulation results confirm improvements in performance metrics such as delay, power and Power-Delay-Product (PDP) over MOSFET-based gates. Additionally, excellent robustness to Process, Voltage, and Temperature (PVT) variations is obtained [17].

### B. Simulation-Based Performance Comparison

In this section, the 6:2 and 7:2 compressors are analyzed and compared with their 6:3 and 7:3 counterparts. Simulations are carried out using the Synopsys HSPICE simulator with 32 nm CMOS technology for CMOS circuits and the Compact SPICE Model [17], [18] for 32 nm CNTFET-based circuits, including all non-idealities. This standard model has been designed for unipolar, MOSFET-like CNFET devices, in which each transistor may have one or more CNTs. The model also accounts for Schottky barrier effects; parasitics, including CNT, drain/source, and gate resistances and capacitances; and CNT charge screening effects. The parameters of the CNFET model and their values, with brief descriptions, are shown in Table III.

TABLE III: CNFET MODEL PARAMETERS

| Parameter         | Description                                                           | Value   |
|-------------------|-----------------------------------------------------------------------|---------|
| L<sub>ch</sub>    | Physical channel length                                               | 32 nm   |
| L<sub>geff</sub>  | The mean free path in the intrinsic CNT channel                       | 100 nm  |
| L<sub>ss</sub>    | The length of doped CNT source-side extension region                  | 32 nm   |
| L<sub>dd</sub>    | The length of doped CNT drain-side extension region                   | 32 nm   |
| K<sub>gate</sub>  | The dielectric constant of high-k top gate dielectric material        | 16      |
| T<sub>ox</sub>    | The thickness of high-k top gate dielectric material                  | 4 nm    |
| C<sub>sub</sub>   | The coupling capacitance between the channel region and the substrate | 20 pF/m |
| E<sub>fi</sub>    | The Fermi level of the doped S/D tube                                 | 6 eV    |

Simulation results, shown in Table IV, ease the comprehensive comparison. All the circuits are simulated at a 0.65 V supply voltage, at a 100 MHz operating frequency, and with 2 fF output loads. The same input and output conditions are used for all the structures.

TABLE IV: SIMULATION RESULTS (32 NM CNFET, V<sub>DD</sub> = 0.65 V)

| Compressor | Delay (×10<sup>-11</sup> s) | Power (×10<sup>-7</sup> W) | PDP (×10<sup>-17</sup> J) |
|------------|-----------------------------|----------------------------|---------------------------|
| 6:2 Comp.  | 18.72                       | 3.14                       | 5.87                      |
| 7:2 Comp.  | 19.12                       | 3.69                       | 7.05                      |
| 6:3 Comp.  | 10.84                       | 2.88                       | 3.12                      |
| 7:3 Comp.  | 11.25                       | 3.27                       | 3.68                      |
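The relative improvements implied by Table IV can be recomputed with a few lines of Python. The delay and power values are copied from the table; the helper names `pdp_joules` and `reduction_percent` are illustrative.

```python
# Values copied from Table IV (delay in units of 1e-11 s, power in units of 1e-7 W).
DELAY_S = {"6:2": 18.72e-11, "7:2": 19.12e-11, "6:3": 10.84e-11, "7:3": 11.25e-11}
POWER_W = {"6:2": 3.14e-7, "7:2": 3.69e-7, "6:3": 2.88e-7, "7:3": 3.27e-7}


def pdp_joules(name: str) -> float:
    """Power-Delay Product of one compressor, in joules."""
    return DELAY_S[name] * POWER_W[name]


def reduction_percent(values: dict, baseline: str, candidate: str) -> float:
    """Relative reduction of `candidate` with respect to `baseline`, in percent."""
    return 100.0 * (values[baseline] - values[candidate]) / values[baseline]


for n2, n3 in (("6:2", "6:3"), ("7:2", "7:3")):
    print(
        f"{n3} vs {n2}:",
        f"delay -{reduction_percent(DELAY_S, n2, n3):.1f}%,",
        f"power -{reduction_percent(POWER_W, n2, n3):.1f}%,",
        f"PDP {pdp_joules(n3):.2e} J",
    )
```

With these values the delay reductions come out at roughly 42% and 41% and the power reductions at roughly 8% and 11%, and the recomputed products agree with the PDP column of the table to within rounding.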

## IV. CONCLUSION

In this paper, we have carried out a comprehensive analysis and comparison of cascaded n:2 and n:3 compressors operating in parallel at the transistor level, using a new CNFET-based full adder cell design. The simulation results demonstrate considerable improvements in delay, power consumption and PDP when cascaded n:3 compressors are used instead of n:2 compressors for designing fast parallel multipliers.

#### REFERENCES

- [1] M. Mehta, V. Parmer, and E. Swartzlander, "High-speed Multiplier Design Using Multi-Input Counter and Compressor Circuits," in *Proc. 10<sup>th</sup> IEEE Symposium on Computer Arithmetic*, USA, 1991, pp. 43-50.
- [2] S. Mehrabi, K. Navi, and O. Hashemipour, "Performance Analysis and Simulation of Two Different Architectures of (6:3) and (7:3) Compressors Based on Carbon Nano-Tube Field Effect Transistors," in *Proc. 5<sup>th</sup> IEEE International Nanoelectronics Conference*, Singapore, 2013, pp. 322-325.
- [3] K. Prasad and K. K. Parhi, "Low-Power 4-2 and 5-2 Compressors," in Proc. the 35<sup>th</sup> IEEE Asilomar Conf. on Signals, Systems and Computers, vol. 1, 2001, pp. 129-133.
- [4] A. M. Shams and M. A. Bayoumi, "A structured approach for designing low-power adders," in *Proc. the 31<sup>st</sup> IEEE Asilomar Conf. on Signals, Systems and Computers*, vol. 1, 1997, pp. 757-761.
- [5] J. Gu and C. H. Chang, "Low voltage, Low power (5:2) compressor cell for fast arithmetic circuits," in *Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing*, vol. 2, 2003, pp. 661-664.
- [6] S. B Sukhavasi, S. B Sukhasavi, V. B Madivada, H. Khan, and S. R S. Kalavakolanu, "Implementation of low power parallel compressor for multiplier using self resetting logic," *International Journal of Computer Applications*, vol. 47, no. 3, June, 2012.

- [7] P. D. Gopineedi, H. Thapliyal, M. B. Srinivas, and H. R. Arabnia, "Novel and Efficient 4: 2 and 5: 2 Compressors with Minimum Number of Transistors Designed for Low-Power Operations," in *Proc.*, *ESA*, 2006, pp. 160-168.
- [8] S. Mehrabi, R. F. Mirzaee, M. H. Moayeri, K. Navi, and O. Hashemipour, "CNFET-Based Design of Energy-Efficient Symmetric Three-Input XOR and Full Adder Circuits," Accepted in AJSE Journal, Springer, 2013.
- [9] G. Cho, Y. B. Kim, and F. Lombardi, "Assessment of CNTFET Based Circuit Performance and Robustness to PVT Variations," in *Proc. 52<sup>nd</sup> IEEE International Midwest Symposium on Circuits and Systems*, vol. 06, 2009, pp. 1106-1109.
- [10] W. Ma and S. H. Li, "A new high compression compressor for large multiplier," in Proc. 9th International Conference on Solid-State and Integrated-Circuit Technology, 2008, pp. 1877-1880.
- [11] A. Akoushideh, A. Najafi, and B. Mazloom-nezhad Maybodi, "Modified Architecture for 27:2 Compressors," *Canadian Journal on Electrical and Electronics Engineering*, vol. 3, no. 8, October, 2012.
- [12] O. Kavehie, A. Mirbaha, N. Dadkhahi, and K. Navi, "Novel Architecture for IEEE-754 Standard," presented at the Information and Communication Technologies, ICTTA '06. 2nd, 2006.
- [13] Y. B. Kim, "Integrated Circuit Design Based on Carbon Nanotube Field Effect Transistor," *Transaction of Electrical and Electronic Material*, vol. 12, no. 5, pp. 175-188, 2011.
- [14] Y. B. Kim, Y. B. Kim, and F. Lombardi, "A novel design methodology to optimize the speed and power of the CNTFET circuits," in *Proc. IEEE International Midwest Symposium on Circuits and Systems, Cancun, Mexico*, 2009, pp. 1130-1133.
- [15] P. G. Collins and P. Avouris, "Nanotubes for Electronics", *Scientific American*, pp. 62-69, 2000.
- [16] S. Lin, Y. B. Kim, and F. Lombardi, "A Novel CNTFET Based Ternary Logic Gate Design," in *Proc. IEEE International Midwest Symposium on Circuits and Systems, Cancun, Mexico*, 2009, pp. 435-438.
- [17] J. Deng and H. S. P. Wong, "A Compact SPICE Model for Carbon-Nanotube Field-Effect Transistors Including Nonidealities and Its Application—Part I: Model of the Intrinsic Channel Region," *IEEE Trans. on Electron Devices*, vol. 54, no. 12, pp. 3186-3194, 2007.
- [18] J. Deng and H. S. P. Wong, "A Compact SPICE Model for Carbon-Nanotube Field-Effect Transistors Including Nonidealities and Its Application—Part II: Full Device Model and Circuit Performance Benchmarking," *IEEE Trans. on Electron Devices*, vol. 54, no. 12, 2007, pp. 3195-3205.



Shima Mehrabi was born in Tehran, Iran, in 1981. She received the Bachelor's degree in Hardware Engineering from the South Tehran Branch of Islamic Azad University, Tehran, Iran, in 2004 and the Master's degree in Computer Architecture from the Science and Research Branch of Islamic Azad University, Tehran, Iran, in 2007. She received her Ph.D. degree in April 2013 from the same university. Her research interests mainly focus on nanoelectronics with emphasis on CNFET, VLSI implementation of arithmetic logic circuits, mixed-mode circuit design and Nano-FPGA.



Keivan Navi received his M.Sc. degree in electronics engineering from Sharif University of Technology, Tehran, Iran, in 1990. He also received his Ph.D. degree in computer architecture from Paris XI University, Paris, France, in 1995. He is currently an associate professor in the Faculty of Electrical and Computer Engineering of Shahid Beheshti University. His research interests include nanoelectronics with emphasis on CNFET, QCA and SET, computer arithmetic, interconnection network design, and quantum computing and cryptography.



**Omid Hashemipour** received his B.S. degree in 1985, his M.S. degree in 1987, and his Ph.D. in 1991, all in electrical engineering from the University of Arkansas at Fayetteville, USA. Since 1991, he has been with the Electrical and Computer Engineering Faculty at Shahid Beheshti University, G.C., Tehran, Iran, as an associate professor. His research interests include low-power, low-voltage analog and digital integrated circuits.