
75 results about "Carry-save adder" patented technology

A carry-save adder is a type of digital adder, used in computer microarchitecture to compute the sum of three or more n-bit numbers in binary. It differs from other digital adders in that it outputs two numbers of the same dimensions as the inputs, one which is a sequence of partial sum bits and another which is a sequence of carry bits.
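
The definition above can be illustrated with a minimal sketch on non-negative Python integers. The function names `carry_save_add` and `add_many` are hypothetical, chosen only for this example. The point is that each step produces a partial-sum word and a carry word without propagating carries, and a single ordinary addition at the end resolves them.

```python
def carry_save_add(a, b, c):
    """One 3:2 carry-save step: three inputs in, (partial_sum, carry) out."""
    partial_sum = a ^ b ^ c                        # per-bit sum, no carry propagation
    carry = ((a & b) | (a & c) | (b & c)) << 1     # per-bit majority, weighted one position up
    return partial_sum, carry


def add_many(values):
    """Sum any number of integers, keeping the running total in carry-save form."""
    s, c = 0, 0
    for v in values:
        s, c = carry_save_add(s, c, v)
    return s + c                                   # the only carry-propagate addition
```

For example, `carry_save_add(5, 3, 6)` yields a partial-sum word and a carry word whose ordinary sum equals 5 + 3 + 6.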

Floating point multiply-accumulate unit

A floating point unit 10 provides a multiply-accumulate operation to determine a result B+(A*C). The multiplier 20 takes several processing cycles to determine the product (A*C). Whilst the multiplier 20 and its subsequent carry-save-adder 26 operate, an aligned value B' of the addend B is generated by an alignment-shifter 34. The aligned-addend B' may only partially overlap with the product (A*C) to which it is to be added using an adder 44. Any high-order-portion HOP of the aligned-addend B' that does not overlap with the product (A*C) must be subsequently concatenated with the output of the adder 44 that sums the product (A*C) with the overlapping portion of the aligned-addend B'. If the sum performed by the adder 44 generates a carry then it is an incremented version IHOP of the high-order-portion that should be concatenated with the output of the adder 44. This incremented-high-order-portion is generated by the adder 44 during otherwise idle processing cycles present due to the multiplier 20 operating over multiple cycles.
Owner:ARM LTD
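
The selection between the high-order portion and its incremented version, as described in the abstract above, can be sketched as follows. This is an illustrative model on Python integers, not the patented circuit: the function name `mac_with_hop` and the `width` parameter (the number of bits in which the product and the addend overlap) are assumptions made for the example.

```python
def mac_with_hop(product, aligned_addend, width):
    """Return product + aligned_addend, assuming 0 <= product < 2**width."""
    mask = (1 << width) - 1
    low = aligned_addend & mask         # portion of B' that overlaps the product
    hop = aligned_addend >> width       # high-order portion (HOP), no overlap
    ihop = hop + 1                      # incremented HOP (IHOP), prepared ahead of time
    total = product + low               # the overlapping addition
    carry_out = total >> width          # 0 or 1
    high = ihop if carry_out else hop   # select a prepared value instead of adding again
    return (high << width) | (total & mask)
```

Because `ihop` is computed independently of the product, it can be prepared while the multiplier is still busy, and only a selection remains once the carry is known.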

Processor which Implements Fused and Unfused Multiply-Add Instructions in a Pipelined Manner

Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused / unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
Owner:ORACLE INT CORP

Microarchitecture of an arithmetic unit

The microarchitecture of the arithmetic unit includes two cascaded N bit adders to provide an N bits result in an accumulator. The arithmetic unit also includes a carry save adder, followed by an adder, which, along with the accumulator, are extended to N+1 bits. A circuit for determining the output carry value associated with the result is also provided.
Owner:STMICROELECTRONICS SRL

Logic cell supporting addition of three binary words

Logic circuits that support the addition of three binary numbers using hardwired adders are described. In one embodiment, this is accomplished by using a 3:2 compressor (i.e., a Carry Save Adder method), using hardwired adders to add the sums and carrys produced by the 3:2 compression, and sharing carrys data calculated in one logic element (“LE”) with the following LE. In such an embodiment, with the exception of the first and last LEs in a logic array block (“LAB”), each LE in effect lends one look-up table (“LUT”) to the LE below (i.e., the following LE) and borrows one LUT from the LE above (i.e., the previous LE). The LUT being lent or borrowed is one that implements the carry function in the 3:2 compressor model. In another aspect, an embodiment of the present invention provides LEs that include selectors to select signals corresponding to the addition of three binary numbers mode.
Owner:ALTERA CORP
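
A bit-level sketch of the three-word addition described above, assuming each logic element (LE) holds one sum LUT, one carry LUT and one hardwired adder bit. The function name `add_three_words` and the loop structure are hypothetical, and the model ignores how a real LAB is wired.

```python
def add_three_words(a, b, c, width):
    """Add three `width`-bit words through a modelled chain of logic elements."""
    result = 0
    borrowed_carry = 0   # carry-LUT output lent by the previous LE (0 for the first LE)
    ripple = 0           # carry chain of the hardwired adders
    for i in range(width + 2):                 # two extra positions absorb the overflow
        ai, bi, ci = (a >> i) & 1, (b >> i) & 1, (c >> i) & 1
        sum_lut = ai ^ bi ^ ci                          # 3:2 compressor sum function
        carry_lut = (ai & bi) | (ai & ci) | (bi & ci)   # 3:2 compressor carry function
        total = sum_lut + borrowed_carry + ripple       # hardwired adder bit in this LE
        result |= (total & 1) << i
        ripple = total >> 1
        borrowed_carry = carry_lut                      # lent to the following LE
    return result
```

The carry LUT of position i feeds the adder of position i + 1, which mirrors the lend-and-borrow arrangement between neighbouring LEs.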

Shared integer, floating point, polynomial, and vector multiplier

A multiplier for performing multiple types of multiplication including integer, floating point, vector, and polynomial multiplication. The multiplier includes a modified booth encoder within the multiplier and unified circuitry to perform the various types of multiplication. A carry save adder tree is modified to route sum outputs to one part of the tree and to route carry outputs to another part of the tree. The carry save adder tree is also organized into multiple carry save adder trees to perform vector multiplication.
Owner:APPLE INC
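
One reason a shared multiplier might keep sum and carry paths apart (an assumption made here, not something the abstract states) is that combining partial products through the sum path alone gives polynomial multiplication over GF(2), while using both paths gives integer multiplication. The sketch below shows that relationship; `partial_products`, `csa` and `multiply` are hypothetical helper names.

```python
def partial_products(a, b, width):
    """Shifted copies of a, one per set bit of the `width`-bit multiplier b."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]


def csa(x, y, z):
    """3:2 carry-save step returning (sum word, carry word)."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def multiply(a, b, width, polynomial=False):
    sums, carries = 0, 0
    for pp in partial_products(a, b, width):
        if polynomial:
            sums ^= pp                           # sum path only, carries never generated
        else:
            sums, carries = csa(sums, carries, pp)
    return sums if polynomial else sums + carries
```

With `a = b = 0b11`, the integer product is 9, whereas the polynomial product (x + 1)(x + 1) = x^2 + 1 is `0b101`.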

Method and apparatus for implementing processor instructions for accelerating public-key cryptography

In response to executing an arithmetic instruction, a first number is multiplied by a second number, and a partial result from a previously executed single arithmetic instruction is fed back from a first carry save adder structure generating high order bits of the current arithmetic instruction to a second carry save adder tree structure being utilized to generate low order bits of the current arithmetic instruction to generate a result that represents the first number multiplied by the second number summed with the high order bits from the previously executed arithmetic instruction. Execution of the arithmetic instruction may instead generate a result that represents the first number multiplied by the second number summed with the partial result and also summed with a third number, the third number being fed to the carry save adder tree structure.
Owner:ORACLE INT CORP

Modular-multiplication computing unit and information processing unit

Inactive | US20060008080A1 | Shorten operation time | Reduction of the operation time without increasing circuit size | Digital data processing details | Secret communication | Information processing | Binary multiplier
The bit strings of multipliers B and N are converted using Booth's algorithm in units of a predetermined number of bits. A carry save adder then executes the operation A×B+u×N, using the integral multiple of multiplicand A that corresponds to the product of the converted multiplier B and A, and the integral multiple of multiplicand u that corresponds to the product of the converted multiplier N and u. An adder adds the result of A×B+u×N supplied by the carry save adder to the previous result of A×B+u×N, and the sum is supplied as the result of the modular-multiplication operation S=S+A×B+u×N.
Owner:NEC ELECTRONICS CORP +1

DSP block for implementing large multiplier on a programmable integrated circuit device

A programmable integrated circuit device includes a plurality of specialized processing blocks. Each specialized processing block may be small enough to occupy a single row of logic blocks. The specialized processing blocks may be located adjacent one another in different logic block rows, forming a column of adjacent specialized processing blocks. Each specialized processing block includes one or more multipliers based on carry-save adders whose outputs are combined using compressors. Chain-in and chain-out connections to the compressors allow adjacent specialized processing blocks to be cascaded to form arbitrarily large multipliers. Each specialized processing block also includes a carry-propagate adder, and the carry-propagate adder in the final specialized processing block of the chain provides the final result. The size of the multiplication that may be performed is limited only by the number of specialized processing blocks in the programmable integrated circuit device.
Owner:ALTERA CORP

Apparatus and method for multiple pass extended precision floating point multiplication

A floating point multiplier circuit includes partial product generation logic configured to generate a plurality of partial products from multiplicand and multiplier values. The plurality of partial products corresponds to a first and second portion of the multiplier value during respective first and second partial product execution phases. The multiplier also includes a plurality of carry save adders configured to accumulate the plurality of partial products generated during the first and second partial product execution phases into a redundant product during respective first and second carry save adder execution phases. The multiplier further includes a first carry propagate adder coupled to the plurality of carry save adders and configured to reduce a first and second portion of the redundant product to a multiplicative product during respective first and second carry propagate adder phases. The first carry propagate adder phase begins after the second carry save adder execution phase completes.
Owner:GLOBALFOUNDRIES INC

System and method of bypassing unrounded results in a multiply-add pipeline unit

A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.
Owner:ORACLE INT CORP

Modular-multiplication computing unit and information-processing unit

Inactive | US20060008081A1 | Shorten operation time | Reduction of the operation time without increasing the circuit size | Digital data processing details | Secret communication | Information processing | Modularity
Selectors choose either multiplicand A or 0 according to the value of multiplier B, and either multiplicand u or 0 according to the value of multiplier N; both multipliers are supplied in units of q bits. A carry save adder performs the operation A×B+u×N on the values successively supplied by the selectors. The result of A×B+u×N, supplied by the carry save adder in units of q bits, is added to the previous result of A×B+u×N, also supplied in units of q bits, and the sum is issued as the result of the modular-multiplication operation S.
Owner:RENESAS ELECTRONICS CORP +1
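
The selector-plus-carry-save structure described above can be sketched for the simplest case of one-bit units (q = 1, an assumption made for brevity). The names `csa`, `select` and `ab_plus_un` are hypothetical, and the sketch covers only the A×B+u×N accumulation, not the surrounding modular-multiplication loop.

```python
def csa(x, y, z):
    """3:2 carry-save step returning (sum word, carry word)."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def select(multiplicand, bit):
    """Selector: pass the multiplicand when the multiplier bit is 1, else 0."""
    return multiplicand if bit else 0


def ab_plus_un(a, b, u, n, width):
    """Compute a*b + u*n for `width`-bit multipliers b and n, one bit per step."""
    s, c = 0, 0
    for i in range(width):
        s, c = csa(s, c, select(a, (b >> i) & 1) << i)   # contribution of bit i of B
        s, c = csa(s, c, select(u, (n >> i) & 1) << i)   # contribution of bit i of N
    return s + c                                         # resolve the redundant form once
```

The running value stays in redundant (sum, carry) form throughout the loop, so no carry chain is exercised until the final addition.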

System and method of bypassing unrounded results in a multiply-add pipeline unit

Active | US20120233234A1 | Data merging | Rounding | Multiplexer
A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.
Owner:ORACLE INT CORP

Division and square root arithmetic unit

A division and square root arithmetic unit carries out a division operation of a higher radix and a square root extraction operation of a lower radix. A certain bit number (determined on the basis of a radix of an operation) of data selected from upper bits of the output of a carry save adder and the output of the adder are input to convert the data into twos complement representation data, and the twos complement representation data is shifted a certain bit number (determined on the basis of the radix of the operation) to use the shifted data for a partial remainder of the next digit. Hence, a large number of parts such as registers of a divisor and a partially extracted square root can be commonly used in a divider and a square root extractor to realize an effective and high performance arithmetic unit.
Owner:NEC CORP

High-radix multiplier-divider

The high-radix multiplier-divider provides a system and method utilizing an SRT digit recurrence algorithm that provides for simultaneous multiplication and division using a single recurrence relation. When A, B, D and Q are fractions (e.g., $Q = 0.q_{-1}q_{-2}\ldots q_{-n}$), the algorithm provides for computing $S = AB/D$ to yield a w-bit quotient Q and w-bit remainder R by: (1) determining the next quotient digit $q_{-j}$ using a quotient digit selection function; (2) generating the product $q_{-j}D$; and (3) performing the triple addition of $rR_{j-1}$, $(-q_{-j}D)$ and $b_{-(j-1)}(A/r)$, where $R_0 = b_{-1}Ar^{-1}$. The recurrence relation may be implemented with carry-save adders for computation using bitwise logical operators (AND, OR, XOR).
Owner:KING FAHD UNIVERSITY OF PETROLEUM AND MINERALS

A 6-to-3 carry-save adder

A 6-to-3 carry-save binary adder is disclosed. The 6-to-3 carry-save adder includes a means for receiving six data inputs and a means for simultaneously adding the six data inputs to generate a first data output, a second data output, and a third data output. The first data output is a SUM output, the second data output is a CARRY-2 output, and the third data output is a CARRY-4 output.
Owner:IBM CORP
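
A behavioural sketch of a 6-to-3 reduction, assuming the three outputs carry weights 1, 2 and 4 at each bit position as the output names suggest. The function name `csa_6_to_3` is hypothetical, and the per-column population count stands in for whatever gate structure the patent actually uses.

```python
def csa_6_to_3(inputs):
    """Reduce six non-negative integers to (sum, carry2, carry4) words.

    The original total equals sum + 2*carry2 + 4*carry4.
    """
    if len(inputs) != 6:
        raise ValueError("expected exactly six inputs")
    width = max(x.bit_length() for x in inputs)
    s = c2 = c4 = 0
    for i in range(width):
        ones = sum((x >> i) & 1 for x in inputs)   # 0..6 ones in this bit column
        s |= (ones & 1) << i                        # weight-1 bit of the count
        c2 |= ((ones >> 1) & 1) << i                # weight-2 bit of the count
        c4 |= ((ones >> 2) & 1) << i                # weight-4 bit of the count
    return s, c2, c4
```

Each column's count of ones fits in three bits, which is why six inputs compress to exactly three output words.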

Modular ($2^n-3$) multiplier

The invention belongs to the field of computers and integrated circuits and discloses a modular ($2^n-3$) multiplier, which specifically comprises an n-bit binary multiplier (1), an n-bit carry save adder (CSA) compressor array (2), an n-bit binary adder (3) with carry input, a two-bit adder (4), a first n-bit binary adder (5) and a second n-bit binary adder (6). The modular ($2^n-3$) multiplier takes the result of binary multiplication as an operand P for further processing, so the repeated corrections of the conventional modular ($2^n-3$) multiplier are reduced to a single correction, the resources consumed by the modular ($2^n-3$) multiplier are greatly reduced, and its operation speed is improved.
Owner:UNIV OF ELECTRONICS SCI & TECH OF CHINA
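
The arithmetic behind reducing a product modulo 2^n - 3 can be sketched with the identity 2^n ≡ 3 (mod 2^n - 3). This is a software illustration of that identity only, not the adder arrangement claimed in the patent; the function name `modmul_2n_minus_3` is hypothetical and n is assumed to be at least 3.

```python
def modmul_2n_minus_3(x, y, n):
    """Return (x * y) mod (2**n - 3) by folding the high bits, assuming n >= 3."""
    modulus = (1 << n) - 3
    mask = (1 << n) - 1
    p = x * y                                # operand P: the plain binary product
    while p >> n:                            # fold: high * 2**n is congruent to 3 * high
        p = (p & mask) + 3 * (p >> n)
    if p >= modulus:                         # at most one final subtraction is needed
        p -= modulus
    return p
```

After folding, the value is below 2^n, so it exceeds the modulus by at most 2 and a single conditional subtraction finishes the reduction.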

Processor which implements fused and unfused multiply-add instructions in a pipelined manner

Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused / unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
Owner:ORACLE INT CORP

Carry save adder and its system

A 4-to-2 carry save adder that reduces output sum and carry delays. The 4-to-2 carry save adder may include a lower-order full adder coupled to a higher-order full adder. The carry save adder may also include a logic unit coupled to the higher-order full adder, wherein the logic unit is configured to generate a carry input to the higher-order full adder that is typically generated by a carry save adder at a previous stage. By generating the carry input bit in the current stage, rather than the previous stage, the delay of the input bit to the higher-order full adder is reduced, and thus the delay of the sum and carry outputs of the higher-order full adder is reduced.
Owner:INT BUSINESS MASCH CORP

Base-16 fixed point divider based on carry-save adder

The invention discloses a base-16 fixed point divider based on a carry-save adder and belongs to the technical field of digital computing. The divider comprises a detecting-relocating module, a quotient loop generating module, a quotient conversion module, a quotient / remainder adjusting module and an execution control module. The detecting-relocating module receives the data, regularizes it and shifts it leftwards. The regularized data is used for loop operation, and the loop iteration generates redundant data. The redundant-form quotient value generated by the quotient loop generating module is received and converted from carry-save form to standard binary complement form. Sign adjustment is conducted on the quotient result and the remainder result according to the RNS algorithm, and the quotient is adjusted. Finally, after the operation is complete, the result is shifted rightward by the corresponding number of digits and input to a counter, and the number of loop executions is calculated. The path delay for generating one bit can be greatly shortened, a single loop operation can generate a four-bit quotient value owing to the simple configuration of the divider, and the operating efficiency is improved.
Owner:INSPUR GROUP CO LTD

High-radix multiplier-divider

The high-radix multiplier-divider provides a system and method utilizing an SRT digit recurrence algorithm that provides for simultaneous multiplication and division using a single recurrence relation. When A, B, D and Q are fractions (e.g., $Q = 0.q_{-1}q_{-2}\ldots q_{-n}$), the algorithm provides for computing $S = AB/D$ to yield a w-bit quotient Q and w-bit remainder R by: (1) determining the next quotient digit $q_{-j}$ using a quotient digit selection function; (2) generating the product $q_{-j}D$; and (3) performing the triple addition of $rR_{j-1}$, $(-q_{-j}D)$ and $b_{-(j-1)}(A/r)$, where $R_0 = b_{-1}Ar^{-1}$. The recurrence relation may be implemented with carry-save adders for computation using bitwise logical operators (AND, OR, XOR).
Owner:KING FAHD UNIVERSITY OF PETROLEUM AND MINERALS

4:2 Carry save adder and 4:2 carry save adding method

Provided are a simplified 4:2 carry save adder (CSA) cell and a 4:2 carry save adding method. The 4:2 CSA cell is formed of an odd detector and first through sixth switches through logic optimization. The odd detector generates an XOR of the first through fourth input signals, outputs the XOR as an odd signal, generates an XOR of the first and second input signals, and outputs the XOR as a first XOR signal. The first switch outputs the third input signal as a carry output signal in response to the first XOR signal. The second switch outputs the first input signal as the carry output signal in response to an inverted first XOR signal. The third switch outputs the carry input signal as a carry signal in response to the odd signal. The fourth switch outputs the fourth input signal as the carry signal in response to an inverted odd signal. The fifth switch outputs an inverted carry input signal as a sum signal in response to the odd signal. The sixth switch outputs the carry input signal as the sum signal in response to the inverted odd signal.
Owner:SAMSUNG ELECTRONICS CO LTD
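
The switch behaviour listed in the abstract above can be transcribed directly into a one-bit truth-function sketch. The function name `csa_4_2_cell` is hypothetical, and Python conditionals stand in for the six switches; the sketch models logic values only, not the transistor-level cell.

```python
def csa_4_2_cell(a, b, c, d, cin):
    """One-bit 4:2 cell: returns (sum, carry, carry_out), each 0 or 1."""
    odd = a ^ b ^ c ^ d              # odd detector: XOR of the four inputs
    xor1 = a ^ b                     # first XOR signal
    cout = c if xor1 else a          # switches 1 and 2 drive the carry output
    carry = cin if odd else d        # switches 3 and 4 drive the carry signal
    s = (cin ^ 1) if odd else cin    # switches 5 and 6 drive the sum signal
    return s, carry, cout
```

For every input combination the outputs satisfy a + b + c + d + cin = sum + 2 × (carry + carry_out), which is the defining property of a 4:2 compressor cell.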

Montgomery modular multiplier and method thereof using carry save addition

A method of reducing power consumption and / or enhancing computation speed in the modulus multiplication operation of a Montgomery modulus multiplication module. A coding scheme reduces the need for an adder or memory element for obtaining multiple modulus values, and the use of carry save addition with carry propagation addition enhances the computational speed of the multiplication module.
Owner:SAMSUNG ELECTRONICS CO LTD

High-radix multiplier-divider

The high-radix multiplier-divider provides a system and method utilizing an SRT digit recurrence algorithm that provides for simultaneous multiplication and division using a single recurrence relation. When A, B, D and Q are fractions (e.g., $Q = 0.q_{-1}q_{-2}\ldots q_{-n}$), the algorithm provides for computing $S = AB/D$ to yield a w-bit quotient Q and w-bit remainder R by: (1) determining the next quotient digit $q_{-j}$ using a quotient digit selection function; (2) generating the product $q_{-j}D$; and (3) performing the triple addition of $rR_{j-1}$, $(-q_{-j}D)$ and $b_{-(j-1)}(A/r)$, where $R_0 = b_{-1}Ar^{-1}$. The recurrence relation may be implemented with carry-save adders for computation using bitwise logical operators (AND, OR, XOR).
Owner:KING FAHD UNIVERSITY OF PETROLEUM AND MINERALS

4-to-2 carry save adder using limited switching dynamic logic

A 4-to-2 carry save adder using limited switching dynamic logic (LSDL) to reduce power consumption while reducing the delay of outputting the sum and carry bits. The 4-to-2 carry save adder may include a first LSDL circuit configured to output a sum bit. The carry save adder may further include a second LSDL circuit configured to output a carry bit. Both the first and second LSDL circuits use a carry generated in the current stage that was previously generated in the previous stage (next lower order bit position). Since the carry is generated in the current stage and not in the previous stage, the delay in outputting the sum and carry bits is reduced and hence the performance of carry save adders is improved. Further, since LSDL circuits were used in the carry save adder, power consumption was reduced while using a small amount of area.
Owner:IBM CORP

Floating point computing unit

A Sweeney Robertson Tocher (SRT) divider and a square root extractor of floating point double-precision bit width, including a selector of single-precision and double-precision, a carry propagation adder (CPA) for conducting carry propagation of a partial remainder, a quotient digit selector circuit for making selection on a quotient digit, and a selector of a divisor or a partial square root extractor circuit, in a lower side thereof. A selector for selecting the propagation of carry between a carry save adder (CSA) in the upper side and the lower side thereof is provided, and a selector of a starting position within a quotient production circuit is provided, thereby enabling the execution of two (2) calculations, such as, division or square root extraction of the floating point single-precision, at the same time, but without increasing the bit width of a computing unit. Also, with the square root extraction, it is possible to execute two (2) calculations of single-precision in parallel, in a similar manner, by adding a partial square root extraction circuit thereinto.
Owner:RENESAS ELECTRONICS CORP

Field programmable gate array (FPGA)-based metric floating-point multiplier design

Inactive | CN102073473A | Save resources | Fix conversion precision issues | Digital data processing details | Imaging processing | Densely packed decimal
The invention discloses a field programmable gate array (FPGA)-based metric floating-point multiplier design. The design adopts advanced and quick algorithms such as densely-packed decimal (DPD) coding, novel binary-coded decimal (BCD) coding, signed-digit radix-5, decimal 32:2 carry-save adder (CSA) and the like, is realized by programming through a Verilog hardware description language and can perform multiplication of 64-digit decimal floating-point numbers in accordance with the Institute of Electrical and Electronics Engineers (IEEE) 754-2008 new standard. The design effectively solves the problem of conversion accuracy existing in binary / decimal operation on the conventional hardware platform and the time problem of the realization of decimal floating-point multiplication by using software, consumes a small number of hardware resources and has high operation speed and a simple structure; moreover, according to the performance and characteristic of the FPGA, a system can be developed repeatedly, and a decimal floating-point unit which is accordant with the IEEE 754-2008 standard specification can be further developed and designed. The design is mainly applied to industries such as bank finance, image processing, medical treatment and the like.
Owner:YUNNAN UNIV

Modular multiplier

The invention discloses a modular ($2^{3n}-2^n$) multiplier, which comprises a 3n-bit binary multiplier, a 2n-bit CSA (Carry Save Adder) compressor array, a first 2n-bit binary adder, a one-bit phase inverter and a second 2n-bit binary adder. In the modular ($2^{3n}-2^n$) multiplier disclosed by the invention, the result P of binary multiplication is taken as an operand for reprocessing, and the modulo addition operation is corrected by adding 1 in advance, so that the operation speed is increased greatly. Compared with the prior art, the modular ($2^{3n}-2^n$) multiplier saves one multiplier and a combinational logic circuit in resource cost, and removes one multiplier from the critical path.
Owner:UNIV OF ELECTRONICS SCI & TECH OF CHINA

Computing carry-in bit to most significant bit carry save adder in current stage

A 4-to-2 carry save adder with a reduction in the delay of outputting the sum and carry bits. The 4-to-2 carry save adder may include a lower order full adder coupled to a higher order full adder. The carry save adder may further include a logic unit coupled to the higher order full adder where the logic unit is configured to generate a carry bit to be inputted to the higher order full adder that normally would be generated from the carry save adder located in the previous stage. By generating this carry bit (carry-in bit) in the current stage and not in the previous stage, the delay of the carry-in bit inputted to the higher order full adder is reduced thereby reducing the delay of outputting the sum and carry bits by the higher order full adder.
Owner:IBM CORP
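
A functional sketch of a 4-to-2 cell whose carry-in is recomputed locally, assuming "previous stage" means the next lower bit position and that the logic unit sees that position's three inputs. The names `full_adder`, `cell_4_to_2` and `add_four` are hypothetical; the sketch checks only logical equivalence and says nothing about the delay benefit, which is a circuit-level property.

```python
def full_adder(x, y, z):
    """Return (sum, carry) of three bits."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)


def cell_4_to_2(cur, prev):
    """cur = (a, b, c, d) bits at this position; prev = (a, b, c) one position lower."""
    a, b, c, d = cur
    s1, _ = full_adder(a, b, c)            # lower order full adder
    _, carry_in = full_adder(*prev)        # logic unit: neighbour's carry, computed here
    return full_adder(s1, d, carry_in)     # higher order full adder -> (sum, carry)


def add_four(a, b, c, d, width):
    """Add four `width`-bit words using one cell per bit position."""
    s = carry = 0
    prev = (0, 0, 0)
    for i in range(width + 2):             # extra positions absorb the overflow
        bits = tuple((v >> i) & 1 for v in (a, b, c, d))
        s_i, c_i = cell_4_to_2(bits, prev)
        s |= s_i << i
        carry |= c_i << (i + 1)            # carry has weight one position up
        prev = bits[:3]
    return s + carry
```

The sum and carry words form the usual redundant pair, so a single carry-propagate addition recovers the four-operand total.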