Fast Redundant Binary Partial Product Generators for Booth Multiplication
electronics seminars Active In SP Posts: 694 Joined: Nov 2009 
09012010, 05:45 PM
Fast Redundant Binary Partial Product Generators for Booth Multiplication Bijoy Jose and Damu Radhakrishnan Department of Electrical and Computer Engineering State University of New York New Paltz, New York, USA 12561 bijoyaj@gmail.com, damu@engr.newpaltz.edu Abstractâ€ The use of signeddigit number systems in arithmetic circuits has the advantage of constant time addition irrespective of word length. In this paper, we present the design of a binary signeddigit partial product generator, which expresses each normal binary operand in oneâ„¢s complement form with an extra bit denoting the sign bit of the operand. The carry free nature of RBAs is exploited to add the extra bits with the partial products, without using additional adder stages. This technique allows RB encoding to be used in multipliers with operand widths which are perfect powers of two, without increasing the delay of partial product accumulation. The proposed partial product generator achieves the highest reduction in the number of partial products for a radix4 multiplier (78%), by combining advantages of RB encoding with RB addition. The proposed partial product generator together with a set of fast redundant binary adder stages configured in a binary tree fashion can result in the design of high performance multipliers. I. INTRODUCTION The fast growth in computing power has enabled us to develop sophisticated algorithms and perform complicated functions. It also resulted in a requirement to increase processing power even more, in order to do these operations more efficiently. In this regard, researchers have always been trying to develop faster and faster computational hardware. This demands the development of faster arithmetic circuits, such as adders, multipliers etc. Many adder and multiplier designs have been proposed in the past to satisfy various objectives such as speed, area and power. Even though earlier designs focused more on minimizing the Silicon area, the focus has shifted more towards speed and power in the last decade. The use of signed digit arithmetic for the design of adders is worth mentioning in this regard. Signed digit arithmetic has its property, which makes it possible to have constant time addition, thereby eliminating the perpetual problem of carry propagation during addition. Redundant binary (RB) number systems are used to perform signed digit arithmetic in binary [1]. In an RB number system with the digit set [1, 0, 1], each digit is coded with two bits. An RB digit Xi can therefore be represented by the bit pair (Xi , Xi +). The constant time addition property has spurred interest among researchers to perform addition and multiplication of normal binary (NB) operands in RB number systems. Special conversion circuits have also been designed to enable fast NB to RB and RB to NB conversion. II. MULTIPLIERS A multiplier essentially consists of two operands, a multiplicand ËœYâ„¢ and a multiplier ËœXâ„¢, and produces a product ËœPâ„¢. In a conventional multiplier, a number of partial products are formed first by multiplying the multiplicand with each bit of the multiplier. These partial products are then added together to generate the product ËœPâ„¢. In short, we can break down multiplication into two parts, namely partial product generation and partial product accumulation. Speeding up multiplication therefore must aim (i) speeding up partial product generation (PPG), (ii) reduce the number of partial products, (iii) speeding up partial product summation or (iv) a combination of one or more of the above. An algorithm to reduce the number of partial products was first proposed by Booth [2]. The Booth algorithm was based on the fact that only fewer partial products need to be generated for a multiplier consisting of consecutive ones or zeros. An efficient algorithm was later proposed by McSorley, known as the modified Booth algorithm or radix 4 Booth multiplication, which reduced the number of partial products by half [3]. The next step involves the addition of these partial products in the fastest manner possible. Speeding up the addition of partial products required faster adders. The major problem with fast addition was carry propagation. This spurred lot of interest in the design of arithmetic circuits using RB number system. One of the early works in this field is a highspeed multiplier using a tree of redundant binary adders by Harata et al. [4]. The addition of two RB numbers was performed in constant time in a twostep method. A Redundant Binary Adder (RBA) cell was defined and a 298 16x16bit highspeed multiplier using 2â„¢s complement binary numbers was implemented. The NB digits were converted into RB during partial product generation. A 50% reduction was achieved in the number of partial products. Further reduction in the number of partial products was obtained by Makino et al. [5] by using radix4 Booth algorithm and by pairing the partial products to form single ones. Furthermore, a modified version of an efficient RB to NB converter proposed by Yen et al. [6] was used in their design. A new RBA cell was also defined by Makino et al. to attain high speed addition. Besli et al. used the above RBA cell to design a 54x54bit multiplier based on a RBSD radix 8 Booth encoder [7,8]. The number of partial products was reduced to 66% in their design. A 54x54bit radix64 multiplier using the least number of transistors was designed by Lee et al., which expressed each partial product as a combination of y, 2y and 3y, where y is the multiplicand [9]. The computation of 3y created an overhead in the partial product generation block. Among the available multiplier designs, Makino multiplier achieved the greatest reduction in the number of partial products using their Redundant Binary Partial Product Generator (RBPPG) [5]. In their design, the sum of two partial products A and B represented in twoâ„¢s complement form is converted to RB notation by a simple grouping of the bits in A and B. This is illustrated as follows: A B A B 1 (1) where B is the oneâ„¢s complement of B . Equation 1 can be rewritten as: A B (A,B) 1 (2) where (A,B) A B. Using the RB Encoding 1 given in Table I, A B (A,B) (0,1) (3) One of the objectives of the above approach was to avoid the time consuming twoâ„¢s complement addition. The sum was obtained in constant time, at the expense of a few inverters, with the added advantage of converting the result in RB form. But this does not cancel the delay in obtaining negative NB partial products from the Booth encoder in twoâ„¢s complement form. The negative NB partial products are obtained by adding 1 to the oneâ„¢s complement of the numbers. The resulting carry propagation introduces extra delay during partial product generation. A second delaying factor is due to the nonunique coding of Ëœ0â„¢ as seen in Table I. Because of this, every (1,1) pair generated has to be reconverted back to a (0,0) pair in the Makino RBA. Kim et al. [10] offered a solution for the negative NB representation by the use of an alternate RB encoding and an errorcorrection word. This alternate RB Encoding 2 is shown in Table I, which defines (A, B) as (A,B) A B . TABLE I. REDUNDANT BINARY ENCODING RB digit Encoding 1 Encoding 2 Xi (Xi +, Xi ) (Xi , Xi +) 0 (0,0) (0,1) 1 (1,0) (1,1) 1 (0,1) (0,0) 0 (1,1) (1,0) From (1), A B 1 A B , and hence A B (A,B) 1 (4) Thus, by pairing up NB operands and by including a Ëœ1â„¢ in the errorcorrecting word at the corresponding position for each RB partial product, the sum of A and B is obtained. NB operands were expressed in oneâ„¢s complement format, which requires an additional 1 to be added into the errorcorrecting word for every negative NB operand. The errorcorrecting word was of the form Â¦0X0Y0X0Y, where X {0, 1} and Y {0, 1}. Both X and Y are functions of RB and Booth recoding terms. Although the above method eliminated the carry propagate operation, it added an extra errorcorrection block into the partial product reduction tree. Also, the errorcorrection method described in this multiplier put restrictions in the number of bits that can be multiplied. For a 64bit multiplier, there will be more than 16 RB partial products including the errorcorrection term. This will require the use of an extra stage of RBAs, thereby significantly increasing the multiplication time. In general, if the number of partial products were perfect powers of 2, the multiplier will be inefficient. III. FAST PARTIAL PRODUCT GENERATOR The proposed partial product generator generates RB partial products, without any carry propagation delay or any additional hardware. For a multiplicand Ëœyâ„¢ the radix4 Booth encoder will have five different NB partial products{2y,y,0, y,2y}. Instead of generating â€œ2y and â€œy in twoâ„¢s complement form the multiplier retains the partial products in their oneâ„¢s complement form and introduces an extra bit Ëœ1â„¢ along with the partial products. The NB partial product â€œy obtained from Booth encoder is expressed as (y,1) , where y is the oneâ„¢s complement of y. The set of partial products obtained from Booth encoder is represented as {(2y,1),( y,1), (0,0), (y,0), (2y,0)}. A NB partial product A can be represented as A (A*a) (5) 299 TABLE II. ENCODING FOR NB PARTIAL PRODUCTS A B a b Z + + 0 0 1 +  0 1 0  + 1 0 0   1 1 1 where A* A and a = 1, when A is negative and A* A and a = 0, when A is positive. The sum of two NB partial products A and B can now be expressed as A B (A*a) (B*b) A* B * a b (A* B *) 1a b Using the RB Encoding 2 shown in Table I, the above equation can be expressed as A B (A*,B*) 1a b (6) For different positive and negative numbers A and B, the values of a and b will be chosen according to Table II. It can be observed that a and b are nothing but the sign bits of A and B respectively. If Z = a + b  1, Equation 6 can be modified as A B (A*,B*) Z (7) where Z can be coded according to Table II. The extra RB digit from each RB operand forms an extra operand, which can be fed into the next partial product accumulation stage as shown in Makino [5]. This correctionword will be having the format Â¦0Z000Z000Z, where Z {1, 0, 1}. The addition of two NB partial products A = 10 and B = 20 according to Table II encoding is shown in Fig. 1. The two partial products are grouped along with the extra bit. In this case both numbers are expressed in their oneâ„¢s complement representations. The extra bits are Ëœ1â„¢ for both, and are shown separately in Fig. 1. The bit pair (A,B) is also shown in Fig 1. These bit pairs represent the sum A+B in RB notation. The equivalent RB number can be obtained using Encoding 2 in Table I and is shown at the bottom. The extra bit position is also assigned unit weight. The RB result obtained can be reconverted into its equivalent decimal value using a negative weight for the MSB bit. This results in the final sum of â€œ30. Figure 1. Example of RBPPG using oneâ„¢s complement arithmetic The above method avoids any kind of carry propagate operation during partial product generation, and simply expresses the partial products in oneâ„¢s complement NB format for a negative number. The extra bit for each NB partial product is same as the sign bit of each operand. Contrary to Kimâ„¢s technique [10], the correction bit Z is found directly from the grouping, instead of a combination of RB and Booth recoding terms. Also, the correction digit is limited to one per RB partial product when compared to one per NB partial product in earlier designs. The RBPPG does not use any gates (including inverters) for obtaining the corrected RB partial product. The comparison of various PPG blocks for a 54x54bit and 64x64bit multiplier is shown in Table III. The number of partial products after the encoding is shown for each multiplier. Our PPG design exhibits the highest amount of reduction among 54x54bit multipliers. The details regarding the number of partial products for 54bit multipliers was given in [5, 8, 10], whereas those for 64bit multipliers were computed by us. It may be noted that in the case of 64bit multipliers all the earlier multiplier formats exceed the optimum number of partial products for a 4 stage partial product accumulator. TABLE III. NUMBER OF PARTIAL PRODUCTS IN DIFFERENT MULTIPLIER FORMATS PPG Design 54x54 bit Reduction (%) 64x64 bit Reduction (%) Besli [8] 18 33.3 22 34.3 Makino [5] 15 27.7 17 26.5 Kim [10] 15 27.7 17 26.5 Ours 14 25.9 16 25 300 Figure 2. 64bit multiplier architecture. IV. PARTIAL PRODUCT ACCUMULATION Partial product accumulation of previous multipliers, Makino [5] and Kim [10] were similar in structure. Our correctionword can replace its counterpart in Kim [10], thereby reducing the effort to combine RB and Booth recoding terms. Another method of improving partial product accumulation would be to exploit the carryfree addition scheme of the RBA. The carry propagation in RBAs is limited to 2 RBA cells for Encoding 2, and 3 RBA cells for Encoding 1. Since the correction bits have three zeros in between them, they can be fed into the carry digit of an RBA array without causing any carry propagation. This will avoid the errorcorrecting operand required in previous multipliers. The block diagram of a 64bit multiplier using the new technique is shown in Fig. 2. Fast RBAs defined in Jose et al. [11] can be used for RB addition using the alternate Encoding 2 in Table I. Our design could also eliminate the errorcorrection block used in the partial product accumulator [10]. This will enable us to expand the number of bits in the multiplier to 64 bits, while keeping the number of adder stages to four. Similar methods can be followed in the design of any multiplier having the number of partial products as perfect powers of two (64x64, 32x32, 16x16, etc). This design will accommodate RB encoding in such multipliers while enjoying the benefits of both lesser number of partial products with optimum number of adder stages. V. CONCLUSIONS A new partial product generation technique for Booth multipliers has been proposed by eliminating the carry propagation delay encountered in generating the negative partial products in twoâ„¢s complement form. This is achieved by expressing the partial products in oneâ„¢s complement form together with an extra bit. This technique replaces the error correcting word in earlier designs with one error digit per RB operand, which can be added along with the RBA tree. The multiplier width can now be extended to perfect powers of 2, without increasing the number of stages of RBAs in the partial product accumulation stage. The use of radix4 Booth encoding combined with our technique results in 78% reduction in the number of partial products generated. The selection of the particular RB encoding also allows us to take advantage of a faster RBA cell, thereby speeding up multiplication from all fronts. REFERENCES [1] I. Koren, Computer Arithmetic Algorithm, Prentice Hall: New York, 1993. [2] A. D. Booth, A signed binary multiplication technique, Quarterly Journal of Mechanics and Applied Mathematics, vol. 4, Part 2 , pp. 236240, 1951. [3] O. L. McSorley, Highspeed arithmetic in binary computers, Proc. of Institute of Radio Engineers (IRE), vol. 49, no. 1, pp. 6791, 1961. [4] Y. Harata, Y. Nakamura, H. Nagase, M. Takigawa, and N. Takagi, A high speed multiplier using a redundant binary adder tree, IEEE J. SolidState Circuits, vol. sc22, pp. 2834. Feb. 1987. [5] H. Makino, Y. Nakase, H. Suzuki, H. Morinaka, H. Shinohara, and K. Mashiko, An 8.811s 54x54bit multiplier with high speed redundant binary architecture, IEEE J. Solidstate Circuits, vol. 31, no. 6, pp. 773783, June 1996. [6] S. M. Yen, C. S. Laih, C. H. Chen, and J. Y. Lee, An efficient redundantbinary number to binary number converter, IEEE Journal of SolidState Circuits, vol. 27, no. 1, pp. 109112, 1992. [7] N. Besli and R. G. Deshmukh, A novel redundant binary signeddigit (RBSD) Boothâ„¢s encoding, Proc. of IEEE Southeast Conference (SECON2002), pp. 426431, Columbia, SC, 2002. [8] N. Besli and R. G. Deshmukh, A 54x54bit multiplier with a new redundant binary Boothâ„¢s encoding, Proc. of Canadian Conf. on Electrical and Computer Engineering, vol. 2, pp. 597602, Winnipeg, MB, Canada, 2002. [9] S. Lee, S. Bae, and H. Park, A Compact Radix64 5454 CMOS Redundant Binary Parallel Multiplier, IEICE Trans. on Electronics, vol. E85C, no. 6, pp. 13421350, June 2002. [10] Y. Kim, B Song, J. Grosspietsch, and S. Gillig, A CarryFree 54bx54b multiplier using equivalent bit conversion algorithm, IEEE J. SolidState Circuits, vol. 36, no. 10, pp. 15381545, Oct. 2001. [11] B. Jose and D. Radhakrishnan, Delay optimized redundant binary adders, IEEE Proc. of 13th IEEE International Conf. on Electronics, Circuits and Systems (ICECS2006), pp. 514517, Nice, France, 2006. Use Search at http://topicideas.net/search.php wisely To Get Information About Project Topic and Seminar ideas with report/source code along pdf and ppt presenaion



