A 4-bit exact lookup table multiplier

By optimizing the structure of the 4-bit exact lookup table multiplier and employing parallel computing and carry signal compression, the problems of latency and excessive area were solved, resulting in lower power consumption and higher hardware efficiency.

CN117591067B · Active · Publication Date: 2026-06-12 · YUNNAN UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: YUNNAN UNIV
Filing Date: 2023-11-24
Publication Date: 2026-06-12

IPC: G06F7/523

CPC: G06F7/523; Y02D10/00

AI Tagging

Application Domain

Digital data processing details; Energy efficient computing

AI Technical Summary

Technical Problem

Existing 4-bit exact lookup table multipliers suffer from excessive delay and area, resulting in high power consumption and limiting their efficiency in integrated circuit applications.

Method Used

The design adopts a structure of thirteen lookup tables and one carry chain, divided into a first product calculation layer and a second product calculation layer that operate in parallel, followed by a carry signal compression layer. The layout of the lookup tables is optimized to improve their utilization rate.

Benefits of Technology

It achieves a shorter critical path and a smaller area, reduces computational latency and power consumption, and improves hardware efficiency.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a 4-bit exact lookup table multiplier and relates to the technical field of exact multipliers. A first, second, third and fourth lookup table constitute a first product calculation layer, which operates on the multiplier and the low two bits of the multiplicand to obtain a first partial product signal set; a fifth, sixth, seventh and eighth lookup table constitute a second product calculation layer, which operates on the multiplier and the high two bits of the multiplicand to obtain a second partial product signal set; a ninth, tenth, eleventh, twelfth and thirteenth lookup table together with a carry chain constitute a carry signal compression layer, which sequentially performs carry signal calculation and signal compression based on the first and second partial product signal sets to obtain the final product result. The application reduces the calculation delay and power consumption of the multiplier.

Description

Technical Field

[0001] This invention relates to the field of exact multiplier technology, and in particular to a 4-bit exact lookup table multiplier.

Background Technology

[0002] With advancements in manufacturing processes, integrated circuits (ICs) are becoming increasingly efficient and smaller. However, power consumption has gradually become a bottleneck restricting IC development. Therefore, reducing IC power consumption has become a crucial issue that needs to be addressed. Many current applications involve a large number of multiplication operations, such as machine learning, data processing, and multimedia, and the power consumption of multiplication accounts for a significant portion of the overall application power consumption. Therefore, reducing the power consumption of multipliers in applications can, to some extent, reduce the overall application power consumption, thereby improving application efficiency.

[0003] A lookup table multiplier is a type of multiplier based on an FPGA (Field Programmable Gate Array). For existing exact lookup table multipliers, the main factors that drive power consumption are latency and area. The latency of an exact lookup table multiplier is primarily determined by its critical path, while the area is mainly determined by the number of lookup tables. Improving lookup table utilization and optimizing lookup table layout can shorten the critical path and reduce the number of lookup tables. Current exact 4-bit lookup table multipliers have low lookup table utilization and significant room for optimization in lookup table layout. These two factors lead to excessive latency and area, resulting in excessive power consumption and ultimately limiting the hardware efficiency of the multiplier in applications.

Summary of the Invention

[0004] The purpose of this invention is to provide a 4-bit exact lookup table multiplier to reduce the computational latency and power consumption of the multiplier.

[0005] To achieve the above objectives, the present invention provides the following solution:

[0006] This invention provides a 4-bit exact lookup table multiplier, wherein both the multiplier and multiplicand of the 4-bit exact lookup table multiplier are four bits; the 4-bit exact lookup table multiplier includes thirteen lookup tables and one carry chain;

[0007] The first lookup table, the second lookup table, the third lookup table, and the fourth lookup table constitute the first product calculation layer. The first product calculation layer is used to perform operations on the multiplier and the lower two bits of the multiplicand to obtain the first partial product signal set.

[0008] The fifth lookup table, the sixth lookup table, the seventh lookup table, and the eighth lookup table constitute the second product calculation layer; the second product calculation layer is used to perform operations on the multiplier and the two most significant bits of the multiplicand to obtain the second partial product signal set; the first product calculation layer and the second product calculation layer are parallel operations;

[0009] The ninth lookup table, the tenth lookup table, the eleventh lookup table, the twelfth lookup table, the thirteenth lookup table, and the carry chain constitute the carry signal compression layer; the carry signal compression layer is used to perform carry signal calculation and signal compression sequentially based on the first partial product signal set and the second partial product signal set to obtain the final product result.

[0010] According to specific embodiments provided by the present invention, the present invention discloses the following technical effects:

[0011] This invention discloses a 4-bit exact lookup table multiplier. By setting up a first product calculation layer, a second product calculation layer, and a carry signal compression layer, and by having the first and second product calculation layers operate in parallel, a shorter critical path and lower latency can be achieved. This invention requires only thirteen lookup tables and one carry chain, compared to the 15 lookup tables used in existing multipliers. The multiplier has a smaller area, which means higher lookup table utilization and reduced computational power consumption. In summary, this invention optimizes the arrangement of lookup tables in the multiplier and improves lookup table utilization, thereby reducing multiplier latency, area, and power consumption, and ultimately improving the hardware efficiency of the multiplier.

Attached Figure Description

[0012] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0013] Figure 1 is a schematic diagram of a five-input, single-output lookup table;

[0014] Figure 2 is a schematic diagram of a six-input, single-output lookup table;

[0015] Figure 3 is a schematic diagram of a six-input, two-output lookup table;

[0016] Figure 4 is a schematic diagram of the carry chain;

[0017] Figure 5 is a schematic diagram of the structure of the 4-bit exact lookup table multiplier of the present invention.

Detailed Implementation

[0018] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0019] This invention provides a 4-bit precise lookup table multiplier, particularly a 4-bit precise multiplier based on a lookup table circuit, which solves the problems of excessive power consumption caused by excessively long critical paths and large area in existing multipliers.

[0020] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0021] The following is an introduction to the basic knowledge:

[0022] (a) Lookup table.

[0023] Three lookup tables are taken as examples: LUT5 (5-input, 1-output lookup table), LUT6 (6-input, 1-output lookup table), and LUT6_2 (6-input, 2-output lookup table). Their structures are shown in Figure 1, Figure 2, and Figure 3, respectively.

[0024] For LUT5, if a logic function has five binary inputs, the first step in implementing it on a LUT5 is to list the outputs of the function for all 32 combinations of the five inputs as a 32-bit binary number, and then write that number as 8 hexadecimal digits. This gives the INIT value, which is equivalent to storing every input-output combination of the logic function. The LUT5 then finds the required output from the INIT value and the current inputs, and thus implements the logic function.
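The INIT construction described above can be sketched in Python. This is a minimal illustration, not vendor tooling: the helper names `lut5_init` and `lut5_eval` are hypothetical, and the bit ordering (input i0 as the least significant index bit) is an assumption chosen for the example.

```python
def lut5_init(func):
    """Return the INIT value of a 5-input function as 8 hexadecimal digits.

    Bit k of INIT holds func(i4, i3, i2, i1, i0), where k is the integer
    whose binary digits are i4 i3 i2 i1 i0 (assumed ordering, i0 is the LSB).
    """
    init = 0
    for k in range(32):
        i4, i3, i2, i1, i0 = [(k >> j) & 1 for j in (4, 3, 2, 1, 0)]
        if func(i4, i3, i2, i1, i0):
            init |= 1 << k
    return f"{init:08X}"


def lut5_eval(init_hex, i4, i3, i2, i1, i0):
    """Look up the stored output bit for one input combination."""
    index = (i4 << 4) | (i3 << 3) | (i2 << 2) | (i1 << 1) | i0
    return (int(init_hex, 16) >> index) & 1


# Example: a 5-input AND is 1 only for index 31, so only the top bit is set.
and5 = lambda i4, i3, i2, i1, i0: i4 & i3 & i2 & i1 & i0
print(lut5_init(and5))  # 80000000
```

Evaluating the table is then a single indexed bit read, which is why the contents of the function do not affect the delay of the lookup.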

[0025] LUT6 and LUT6_2 are similar in that they are both composed of two LUT5s. The difference is that all six inputs of LUT6 can be variables, with only one output, while five of the six inputs of LUT6_2 can be variables, and the remaining one is fixed at 1, but it has two outputs. Therefore, for two functions with the same five or fewer inputs, a single LUT6_2 can be used, thereby improving the utilization of the lookup table and reducing the area of the lookup table multiplier.
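A small behavioural model may make the sharing concrete. The sketch below assumes the common convention for a two-output LUT6: O5 reads the lower 32 INIT bits using I4..I0, and O6 reads the full 64-bit INIT using I5..I0, so tying I5 to 1 makes O6 read the upper 32 bits. The function names are hypothetical.

```python
def lut6_2(init, i5, i4, i3, i2, i1, i0):
    """Return (O6, O5) for a 64-bit INIT value (assumed bit convention)."""
    low = (i4 << 4) | (i3 << 3) | (i2 << 2) | (i1 << 1) | i0
    o5 = (init >> low) & 1                  # lower LUT5: INIT[31:0]
    o6 = (init >> ((i5 << 5) | low)) & 1    # with i5 = 1 this reads INIT[63:32]
    return o6, o5


def pack_two_functions(f_o6, f_o5):
    """Build one INIT that yields f_o6 on O6 (I5 tied to 1) and f_o5 on O5."""
    init = 0
    for k in range(32):
        args = [(k >> j) & 1 for j in (4, 3, 2, 1, 0)]  # i4, i3, i2, i1, i0
        init |= (f_o5(*args) & 1) << k
        init |= (f_o6(*args) & 1) << (32 + k)
    return init


# Example: a half adder on (i1, i0) packed into one LUT6_2.
xor2 = lambda i4, i3, i2, i1, i0: i1 ^ i0
and2 = lambda i4, i3, i2, i1, i0: i1 & i0
INIT = pack_two_functions(xor2, and2)
print(lut6_2(INIT, 1, 0, 0, 0, 1, 1))  # (0, 1)
```

Two functions that would otherwise need two lookup tables fit in one, which is the utilization gain the paragraph describes.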

[0026] (b) Carry chain.

[0027] As shown in Figure 4, a carry chain typically uses the carry-generate and carry-propagate signals produced by lookup tables to implement the function of a carry-lookahead adder. Its working principle is explained with an example: the lookup table connected to the carry chain produces two outputs, where O6 corresponds to Prop (the carry-propagate signal) and O5 corresponds to Gen (the carry-generate signal). The Prop signal is mainly responsible for calculating the sum of the signals in its column, and the Gen signal is mainly responsible for calculating the carry of the signals in its column. The carry chain has one input signal $C_{in}$, which is generally set to 0 by default. Taking the first stage of the carry chain as an example: when Prop0 is 0 and Gen0 is 1, the carry chain passes Gen0 to the next stage, and Prop0 is XORed with $C_{in}$ to give the sum S0 of the signals in this column; when Prop0 is 1 and Gen0 is 0, the carry chain passes $C_{in}$ to the next stage, and Prop0 is again XORed with $C_{in}$ to give S0.
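The stage behaviour just described (sum is Prop XOR carry-in; the outgoing carry is the carry-in when Prop is 1 and Gen otherwise) can be modelled in a few lines of Python. The function name `carry_chain` is hypothetical, and the adder example uses the usual choice Prop = a XOR b, Gen = a AND b, which is an assumption for illustration rather than something taken from the text.

```python
def carry_chain(props, gens, c_in=0):
    """Return (sums, carry_out) for a chain of Prop/Gen pairs."""
    sums = []
    carry = c_in
    for prop, gen in zip(props, gens):
        sums.append(prop ^ carry)        # S_i = Prop_i XOR incoming carry
        carry = carry if prop else gen   # mux: keep carry if Prop_i = 1, else take Gen_i
    return sums, carry


# Example: 0b0111 + 0b0001 on a four-stage chain.
a, b = 0b0111, 0b0001
props = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]
gens = [((a >> i) & (b >> i)) & 1 for i in range(4)]
sums, c_out = carry_chain(props, gens)
total = sum(s << i for i, s in enumerate(sums)) | (c_out << 4)
print(total)  # 8
```

Because each stage is only a multiplexer and an XOR, the chain resolves carries much faster than routing them through general lookup tables.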

[0028] As shown in Figure 5, the present invention provides a 4-bit exact lookup table multiplier, wherein both the multiplier and the multiplicand of the 4-bit exact lookup table multiplier are four bits; the 4-bit exact lookup table multiplier includes thirteen lookup tables and one carry chain.

[0029] Specifically, both the multiplier and multiplicand range from 0 to 15. In application, they must first be converted to four-bit binary numbers: the four bits of the multiplier (four-bit binary number) are A3A2A1A0; and the four bits of the multiplicand (four-bit binary number) are B3B2B1B0. After this conversion, subsequent multiplication operations can be performed.

[0030] This invention divides the multiplication process into two stages: Stage 1 is the generation and initial compression of the partial product, and Stage 2 is the further compression of the compressed partial product to produce the final product result. Stage 1 divides the generation and compression of the partial product into two layers: a first product calculation layer and a second product calculation layer.

[0031] The first lookup table, second lookup table, third lookup table, and fourth lookup table constitute the first product calculation layer. This first product calculation layer performs operations on the multiplier and the least significant two bits of the multiplicand to obtain the first partial product signal set. That is, the first product calculation layer corresponds to (A3A2A1A0)×(B1B0), and its compressed partial products are $L_{1i}$, where i ranges from 1 to 5.

[0032] The fifth, sixth, seventh, and eighth lookup tables constitute the second product calculation layer. This second product calculation layer performs operations on the multiplier and the two most significant bits of the multiplicand to obtain a second partial product signal set. The first and second product calculation layers operate in parallel. That is, the second product calculation layer corresponds to (A3A2A1A0)×(B3B2), and its compressed partial products are $L_{2i}$, where i ranges from 1 to 5.
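As a rough arithmetic check of this two-layer split (not a reproduction of the lookup table contents), the Python sketch below forms the two 4×2-bit products and confirms that they recombine into the full product. It assumes that signals 1 to 5 of each layer are simply bits 1 to 5 of that layer's product, which the text implies but does not state outright; the function names are hypothetical.

```python
def split_layers(a, b):
    """Return the two 4x2-bit partial products of Stage 1."""
    x = a * (b & 0b11)          # first layer:  (A3A2A1A0) x (B1B0)
    y = a * ((b >> 2) & 0b11)   # second layer: (A3A2A1A0) x (B3B2)
    return x, y


def layer_bits(value):
    """Assumption: signals 1..5 of a layer are bits 1..5 of its product."""
    return [(value >> i) & 1 for i in range(1, 6)]


# Exhaustive check over all 4-bit operands.
for a in range(16):
    for b in range(16):
        x, y = split_layers(a, b)
        assert x <= 45 and y <= 45        # 15 * 3 = 45, so six bits suffice
        assert x + (y << 2) == a * b      # second layer is weighted by 4

x, y = split_layers(13, 11)
print(layer_bits(x), layer_bits(y))
```

Each layer depends on only six input bits (four of A, two of B), which is why every signal in a layer fits in a single six-input lookup table.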

[0033] Furthermore, in practical applications, the 4-bit exact lookup table multiplier of this invention includes the following steps:

[0034] (1) The first lookup table is a six-input, two-output lookup table that computes the partial products $L_{11}$ and $L_{12}$. Computing $L_{11}$ requires the inputs A0, A1, B0, and B1. Computing $L_{12}$ requires the inputs A0, A1, A2, B0, and B1. Computing both requires five inputs in total, so one LUT6_2 (six-input, two-output lookup table) computes $L_{11}$ and $L_{12}$ simultaneously.

[0035] The logical expression for the first lookup table is:

[0036]

[0037]

[0038] Here, $L_{11}$ and $L_{12}$ are both output signals of the first lookup table, and $L_{11}$ is a one-bit value of the product result.

[0039] Using the above logical expressions, a single lookup table computes both $L_{11}$ and $L_{12}$, which saves one lookup table. To distinguish between lookup tables, the one that computes $L_{11}$ and $L_{12}$ is named LUT1 (first lookup table). Its inputs are A0, A1, A2, B0, and B1, and its outputs are $L_{11}$ and $L_{12}$. As Figure 5 shows, $L_{11}$ equals P1, so $L_{11}$ is output directly as P1.

[0040] (2) The second lookup table, LUT2, is a six-input, single-output lookup table. Computing $L_{13}$ requires the inputs A0, A1, A2, A3, B0, and B1, six inputs in total. Therefore, a LUT6 (six-input, single-output lookup table) is used to compute $L_{13}$.

[0041] The logical expression for the second lookup table is:

[0042]

[0043] Here, $L_{13}$ is the output signal of the second lookup table.

[0044] (3) The third lookup table, LUT3, is a six-input, single-output lookup table. Computing $L_{14}$ requires the inputs A0, A1, A2, A3, B0, and B1, six inputs in total. Therefore, a LUT6 (six-input, single-output lookup table) is used to compute $L_{14}$.

[0045] The logical expression for the third lookup table is:

[0046]

[0047] Here, $L_{14}$ is the output signal of the third lookup table.

[0048] (4) The fourth lookup table, LUT4, is a six-input, single-output lookup table. Computing $L_{15}$ requires the inputs A0, A1, A2, A3, B0, and B1, six inputs in total. Therefore, a LUT6 (six-input, single-output lookup table) is used to compute $L_{15}$.

[0049] The logical expression for the fourth lookup table is:

[0050] $L_{15} = A_3 B_0 B_1 A_2 + A_3 B_0 B_1 A_1 A_0$.

[0051] Here, $L_{15}$ is the output signal of the fourth lookup table; the first partial product signal set includes $L_{11}$, $L_{12}$, $L_{13}$, $L_{14}$, and $L_{15}$.
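The expression for the fourth lookup table can be checked exhaustively. The Python sketch below compares it with bit 5 of the product (A3A2A1A0)×(B1B0), under the assumption that the fifth signal of the first layer is that bit, and reading "+" as logical OR; the function name `l15` is hypothetical.

```python
def l15(a, b_low):
    """Evaluate L15 = A3*B0*B1*A2 + A3*B0*B1*A1*A0 with + read as OR."""
    a0, a1, a2, a3 = [(a >> i) & 1 for i in range(4)]
    b0, b1 = b_low & 1, (b_low >> 1) & 1
    return (a3 & b0 & b1 & a2) | (a3 & b0 & b1 & a1 & a0)


# Compare against bit 5 of the 4x2-bit product for every input combination.
mismatches = [
    (a, b_low)
    for a in range(16)
    for b_low in range(4)
    if l15(a, b_low) != ((a * b_low) >> 5) & 1
]
print(mismatches)  # []
```

The result is intuitive: the product reaches 32 only when B1B0 = 11 and A is at least 11, i.e. when A3 is set together with either A2 or both A1 and A0.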

[0052] (5) With the first layer of Stage 1 described, the second layer of Stage 1 follows, i.e., the calculation of the second partial product signal set. The fifth lookup table, LUT5 (here the name of the fifth lookup table, not the five-input primitive), is a six-input, two-output lookup table. Computing $L_{21}$ requires the inputs A0, A1, B2, and B3. Computing $L_{22}$ requires the inputs A0, A1, A2, B2, and B3. Computing both therefore requires five inputs in total, so one LUT6_2 (six-input, two-output lookup table) computes $L_{21}$ and $L_{22}$ simultaneously, which saves one lookup table.

[0053] The logical expression for the fifth lookup table is:

[0054]

[0055]

[0056] Here, $L_{21}$ and $L_{22}$ are both output signals of the fifth lookup table.

[0057] (6) The sixth lookup table, LUT6 (here the name of the sixth lookup table), is a six-input, single-output lookup table. Computing $L_{23}$ requires the inputs A0, A1, A2, A3, B2, and B3, six inputs in total. Therefore, a LUT6 primitive (six-input, single-output lookup table) is used to compute $L_{23}$.

[0058] The logical expression for the sixth lookup table is:

[0059]

[0060] Here, $L_{23}$ is the output signal of the sixth lookup table.

[0061] (7) The seventh lookup table, LUT7, is a six-input, single-output lookup table. Computing $L_{24}$ requires the inputs A0, A1, A2, A3, B2, and B3, six inputs in total. Therefore, a LUT6 (six-input, single-output lookup table) is used to compute $L_{24}$.

[0062] The logical expression for the seventh lookup table is:

[0063]

[0064] Here, $L_{24}$ is the output signal of the seventh lookup table.

[0065] (8) The eighth lookup table, LUT8, is a six-input, single-output lookup table. Computing $L_{25}$ requires the inputs A0, A1, A2, A3, B2, and B3, six inputs in total. Therefore, a LUT6 (six-input, single-output lookup table) is used to compute $L_{25}$.

[0066] The logical expression for the eighth lookup table is:

[0067] $L_{25}\ (\mathrm{Gen3}) = A_3 B_2 B_3 A_2 + A_3 B_2 B_3 A_1 A_0$.

[0068] Here, $L_{25}$ is the output signal of the eighth lookup table; the second partial product signal set includes $L_{21}$, $L_{22}$, $L_{23}$, $L_{24}$, and $L_{25}$.

[0069] Through steps (1)-(8) above, Stage 1 is complete and the partial products are compressed into two layers: the first layer is $L_{1i}$, where i ranges from 1 to 5, and the second layer is $L_{2i}$, where i ranges from 1 to 5. Stage 2 further compresses the two layers of partial products generated in Stage 1 to obtain the final product result. Specifically, the ninth lookup table, the tenth lookup table, the eleventh lookup table, the twelfth lookup table, the thirteenth lookup table, and the carry chain constitute the carry signal compression layer; the carry signal compression layer is used to perform carry signal calculation and signal compression sequentially based on the first partial product signal set and the second partial product signal set to obtain the final product result.

[0070] The specific structure of the carry signal compression layer is as follows:

[0071] The first output terminal of the first lookup table is connected to the first input terminal of the ninth lookup table, the first input terminal of the tenth lookup table, and the first input terminal of the thirteenth lookup table, respectively; the output terminal of the second lookup table is connected to the second input terminal of the tenth lookup table and the second input terminal of the thirteenth lookup table, respectively; the output terminal of the third lookup table is connected to the first input terminal of the eleventh lookup table; the output terminal of the fourth lookup table is connected to the first input terminal of the twelfth lookup table; the first output terminal of the fifth lookup table is connected to the third input terminal of the tenth lookup table and the third input terminal of the thirteenth lookup table, respectively; the second output terminal of the fifth lookup table is connected to the second input terminal of the eleventh lookup table; and the output terminal of the sixth lookup table is connected to the second input terminal of the twelfth lookup table.

[0072] The outputs of the seventh, eighth, tenth, eleventh, twelfth, and thirteenth lookup tables are all connected to the carry chain. The carry chain is used to compress the output signals of the seventh, eighth, tenth, eleventh, twelfth, and thirteenth lookup tables to obtain a five-bit value of the product result.

[0073] The output of the ninth lookup table is used to output the two-bit value of the product result; the second output of the first lookup table is used to output the one-bit value of the product result; the five-bit value, two-bit value and one-bit value of the product result constitute the final product result.

[0074] It should be noted that the first input, second input, first output, and second output terminals of each lookup table mentioned above are only used to illustrate the differences between input ports and output ports, and are not related to the port numbers in each lookup table. In practical applications, the connections between lookup tables are determined based on the bit values of each port's input or output.

[0075] Based on the above structure, the 4-bit exact lookup table multiplier of the present invention includes the following steps in practical applications:

[0076] (9) The ninth lookup table, LUT9, is a six-input, two-output lookup table. Computing P0 requires the inputs A0 and B0, and computing P2 requires the inputs $L_{12}$, A0, and B2. Computing P0 and P2 together therefore needs no more than five inputs, so one LUT6_2 is used to compute both.

[0077] The logical expression for the ninth lookup table is:

[0078] P0 = A0B0.

[0079]

[0080] P0 and P2 are both output signals of the ninth lookup table, which are the two-bit values of the product result.
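The two outputs of the ninth lookup table can be checked against the true product bits. In the Python sketch below, P0 = A0·B0 comes from the text, while the P2 expression is not reproduced in the text, so the form P2 = L12 XOR (A0·B2) is an assumption derived from the listed inputs; L12 is likewise assumed to be bit 2 of (A3A2A1A0)×(B1B0). The function name `lut9` is hypothetical.

```python
def lut9(l12, a0, b0, b2):
    """Return (P0, P2). The P2 form is an assumed reconstruction."""
    p0 = a0 & b0
    p2 = l12 ^ (a0 & b2)   # bit 2 of the low product plus bit 0 of the high product
    return p0, p2


errors = 0
for a in range(16):
    for b in range(16):
        l12 = ((a * (b & 0b11)) >> 2) & 1   # assumption: bit 2 of A x (B1B0)
        p0, p2 = lut9(l12, a & 1, b & 1, (b >> 2) & 1)
        product = a * b
        if p0 != (product & 1) or p2 != ((product >> 2) & 1):
            errors += 1
print(errors)  # 0
```

No carry can enter column 2, because the second-layer product is shifted left by two bits and contributes nothing to columns 0 and 1.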

[0081] (10) The tenth lookup table, LUT10, is a six-input, two-output lookup table. Computing Prop0 (Carry-Propagate Signal 0) and Gen0 (Carry-Generate Signal 0) requires the inputs A0, B2, $L_{12}$, $L_{13}$, and $L_{21}$, five inputs in total, so a LUT6_2 (six-input, two-output lookup table) is used to compute Prop0 and Gen0.

[0082] When Prop0 is 1, the carry chain selects $C_{in}$ (the carry chain input signal, 0 by default) as the carry passed to the next stage. When Prop0 is 0, the carry chain selects Gen0 as the carry passed to the next stage. If Prop0 and Gen0 were both 1, the correct behavior would be to pass Gen0 to the next stage, but the carry chain would pass $C_{in}$ instead, so Gen0 would be missed and the carry would be lost. Therefore, Prop0 and Gen0 must not both be 1 at the same time. When A0, B2, $L_{12}$, $L_{13}$, and $L_{21}$ are all 1, a situation arises in which Prop0 and Gen0 are both 1. To avoid this, this invention XORs the product term $L_{13} L_{21} L_{12} A_0 B_2$ into the logical expression of Prop0, so that when A0, B2, $L_{12}$, $L_{13}$, and $L_{21}$ are all 1, Prop0 changes from 1 to 0 while Gen0 remains 1. The carry chain then passes Gen0 correctly, ensuring accurate transmission of the carry-generate signal. The logical expression for the tenth lookup table is:

[0083]

[0084] $\mathrm{Gen0} = L_{13} + L_{21} + L_{12} A_0 B_2$.

[0085] Here, $L_{12}$ is an output signal of the first lookup table; Prop0 and Gen0 are both output signals of the tenth lookup table, namely the corresponding carry-propagate signal and carry-generate signal; $L_{13}$ is the output signal of the second lookup table; and $L_{21}$ is an output signal of the fifth lookup table.

[0086] (11) The eleventh lookup table, LUT11, is a six-input, two-output lookup table. Computing Prop1 (Carry-Propagate Signal 1) and Gen1 (Carry-Generate Signal 1) requires the inputs $L_{14}$ and $L_{22}$, two inputs in total, so a LUT6_2 (six-input, two-output lookup table) is used to compute Prop1 and Gen1.

[0087] The logical expression for the eleventh lookup table is:

[0088]

[0089] $\mathrm{Gen1} = L_{14} L_{22}$.

[0090] Here, Prop1 and Gen1 are both output signals of the eleventh lookup table, namely the corresponding carry-propagate signal and carry-generate signal.

[0091] (12) The twelfth lookup table, LUT12, is a six-input, two-output lookup table. Computing Prop2 (Carry-Propagate Signal 2) and Gen2 (Carry-Generate Signal 2) requires the inputs $L_{15}$ and $L_{23}$, two inputs in total, so a LUT6_2 (six-input, two-output lookup table) is used to compute Prop2 and Gen2.

[0092] The logical expression for the twelfth lookup table is:

[0093]

[0094] $\mathrm{Gen2} = L_{15} L_{23}$.

[0095] Here, Prop2 and Gen2 are both output signals of the twelfth lookup table, namely the corresponding carry-propagate signal and carry-generate signal.

[0096] (13) The thirteenth lookup table, LUT13, is a five-input lookup table. P3 is generated by XORing Prop0 with $C_{in}$; under normal circumstances $C_{in}$ is 0 and does not affect the result of P3. However, when A0, B2, $L_{12}$, $L_{13}$, and $L_{21}$ are all 1, the operation of the tenth lookup table changes Prop0 from 1 to 0, which would cause P3 to be calculated incorrectly. Therefore $C_{in}$ (the carry chain input signal) is needed to correct this error. $C_{in}$ is set equal to $L_{13} L_{21} L_{12} A_0 B_2$ instead of the default 0, so that when A0, B2, $L_{12}$, $L_{13}$, and $L_{21}$ are all 1, the value of $L_{13} L_{21} L_{12} A_0 B_2$ is 1. XORing Prop0 with $C_{in}$ then yields the correct P3, correcting the error caused by Prop0. Computing $C_{in}$ requires the inputs A0, B2, $L_{12}$, $L_{13}$, and $L_{21}$, five inputs in total, so a LUT5 (five-input, single-output lookup table) is used to compute $C_{in}$.

[0097] The logical expression for the thirteenth lookup table is:

[0098] $C_{in} = L_{13} L_{21} L_{12} A_0 B_2$.

[0099] Here, $C_{in}$ is the output signal of the thirteenth lookup table.
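The interplay of Prop0, Gen0, and $C_{in}$ in the first carry chain stage can be checked over all eight cases of the three bits that meet in column 3. In the Python sketch below, Gen0 and $C_{in}$ follow the expressions given in the text (with "+" read as OR), whereas the Prop0 expression is not reproduced in the text, so the form used here is an assumption built from the described XOR correction. The function name `column3` is hypothetical.

```python
def column3(l13, l21, t):
    """Model stage 0 of the carry chain; t stands for L12*A0*B2.

    Returns (P3, carry passed to the next stage).
    """
    c_in = l13 & l21 & t               # thirteenth lookup table
    prop0 = l13 ^ l21 ^ t ^ c_in       # assumed form: sum bit with the XOR correction
    gen0 = l13 | l21 | t               # Gen0 with + read as OR
    p3 = prop0 ^ c_in                  # sum output of the stage
    carry = c_in if prop0 else gen0    # carry chain multiplexer
    return p3, carry


# The stage must behave like a full adder on the three column bits.
for l13 in (0, 1):
    for l21 in (0, 1):
        for t in (0, 1):
            p3, carry = column3(l13, l21, t)
            assert p3 + 2 * carry == l13 + l21 + t

print(column3(1, 1, 1))  # (1, 1)
```

Under these assumptions the all-ones case yields sum 1 and carry 1, as a three-bit column sum of 3 requires, while Prop0 and $C_{in}$ are never 1 at the same time.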

[0100] (14) $L_{24}$ and $L_{25}$ depend on the same six inputs. It is unnecessary to compute separate carry-propagate and carry-generate signals for the carry chain from $L_{24}$ and $L_{25}$; instead, $L_{24}$ and $L_{25}$ are used directly as Prop3 (Carry-Propagate Signal 3) and Gen3 (Carry-Generate Signal 3), respectively.
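One way to see why $L_{24}$ and $L_{25}$ can feed the carry chain directly is that they are never 1 at the same time, so the last stage never has to pass both a propagated and a generated carry. The Python sketch below checks this under the assumption that $L_{24}$ and $L_{25}$ are bits 4 and 5 of (A3A2A1A0)×(B3B2); this reasoning is an interpretation and is not spelled out in the text.

```python
# Assumption: L24 and L25 are bits 4 and 5 of the second-layer product.
both_set = []
for a in range(16):
    for b_high in range(4):
        y = a * b_high              # (A3A2A1A0) x (B3B2), at most 45
        l24 = (y >> 4) & 1
        l25 = (y >> 5) & 1
        if l24 and l25:
            both_set.append((a, b_high))

print(both_set)  # []
```

Bits 4 and 5 are both set only for values of 48 or more, and the largest possible second-layer product is 15 × 3 = 45.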

[0101] The carry chain further compresses $C_{in}$, the carry-generate signals Gen0, Gen1, Gen2, and Gen3, and the carry-propagate signals Prop0, Prop1, Prop2, and Prop3 to obtain P3, P4, P5, P6, and P7.

[0102] Combining P7 to P0 yields the binary output of the 4-bit exact lookup table multiplier of this invention. Table 1 below is a summary table of the logical expressions of each lookup table in this invention.
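The whole datapath can be exercised with a small behavioural model in Python. This is a sketch under stated assumptions, not the patented netlist: the Stage 1 signals are taken as bits of the two 4×2-bit products because the expressions for most of LUT1 to LUT8 are not reproduced in the text; the forms of P2, Prop0, Prop1, and Prop2 are assumed; and P7 is assumed to be the carry-out of the last chain stage. The function names are hypothetical.

```python
def bit(value, i):
    return (value >> i) & 1


def multiply_4bit(a, b):
    """Behavioural model of the two-stage structure (assumed signal forms)."""
    a0, b0, b2 = bit(a, 0), bit(b, 0), bit(b, 2)

    # Stage 1 (LUT1-LUT8): two parallel layers, modelled arithmetically.
    x = a * (b & 0b11)          # (A3A2A1A0) x (B1B0)
    y = a * (b >> 2)            # (A3A2A1A0) x (B3B2)
    l11, l12, l13, l14, l15 = (bit(x, i) for i in range(1, 6))
    l21, l22, l23, l24, l25 = (bit(y, i) for i in range(1, 6))

    # LUT9 outputs P0 and P2; LUT1 outputs P1 directly.
    p0 = a0 & b0
    p1 = l11
    p2 = l12 ^ (a0 & b2)        # assumed form

    # LUT13 (carry-in) and LUT10-LUT12 (Prop/Gen pairs); stage 3 uses L24, L25.
    t = l12 & a0 & b2
    c_in = l13 & l21 & t
    props = [l13 ^ l21 ^ t ^ c_in, l14 ^ l22, l15 ^ l23, l24]  # first three assumed
    gens = [l13 | l21 | t, l14 & l22, l15 & l23, l25]

    # Carry chain: sum = Prop XOR carry; next carry = carry if Prop else Gen.
    carry = c_in
    sums = []
    for prop, gen in zip(props, gens):
        sums.append(prop ^ carry)
        carry = carry if prop else gen
    p3, p4, p5, p6 = sums
    p7 = carry                  # assumption: P7 is the chain's carry-out

    bits = [p0, p1, p2, p3, p4, p5, p6, p7]
    return sum(v << i for i, v in enumerate(bits))


wrong = [(a, b) for a in range(16) for b in range(16) if multiply_4bit(a, b) != a * b]
print(wrong)  # []
```

Under these assumptions the model reproduces all 256 products, which is consistent with the description of the multiplier as exact.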

[0103] Table 1

[0104]

[0105]

[0106] It should be noted that the accurate multiplier is coded in Verilog and implemented on Vivado 2019.1. Hardware metrics are Power, Latency, and Area (measured by the number of lookup tables). Table 2 below compares the hardware metrics of the multiplier of this invention with those of existing multipliers.

[0107] Table 2

[0108]

[0109] Compared with the prior art, the present invention has the following advantages:

[0110] (1) Low latency.

[0111] The critical path of the existing standard 4-bit exact lookup table multiplier consists of three lookup tables and one carry chain, while the critical path of the 4-bit exact lookup table multiplier of this invention consists of one lookup table from the parallel first and second product calculation layers, one lookup table from the carry signal compression layer, and one carry chain. The shorter critical path results in lower latency.

[0112] Existing exact lookup table multipliers compress partial products in stage 1, but 4-bit multiplication involves 8-bit inputs, while LUT6 can only have a maximum of six inputs. Therefore, two LUT6s need to be cascaded for computation, resulting in two cascaded LUT6s on the critical path of the first layer. This invention divides stage 1 into two parallel computation layers, each with six-bit inputs, which conforms to the structure of LUT6. This allows for accurate calculation of the sum and carry of each layer, and both layers are computed in parallel. Thus, the first stage occupies only one lookup table on the critical path of the overall multiplier.

[0113] (2) Small area.

[0114] Existing 4-bit exact lookup table multipliers have an area of 15 LUTs, but the 4-bit exact lookup table multiplier of this invention has an area of only 13 LUTs. The smaller area means higher LUT utilization. This advantage stems from the fact that for two functions with the same five or fewer inputs, this invention uses a single LUT6_2 to implement both functions simultaneously, thereby improving the utilization of the lookup table and reducing the multiplier area.

[0115] Therefore, combining the above two advantages, the 4-bit precise lookup table multiplier of this invention consumes 37.09% less power than existing multipliers, thus achieving higher hardware efficiency in applications.

[0116] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. For the systems disclosed in the embodiments, since they correspond to the methods disclosed in the embodiments, the descriptions are relatively simple; relevant parts can be referred to the method section.

[0117] This document uses specific examples to illustrate the principles and implementation methods of the present invention. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of the present invention. Furthermore, those skilled in the art will recognize that, based on the ideas of the present invention, there will be changes in the specific implementation methods and application scope. Therefore, the content of this specification should not be construed as a limitation of the present invention.

Claims

1. A 4-bit exact lookup table multiplier, characterized in that the multiplier and the multiplicand of the 4-bit exact lookup table multiplier are both four bits wide, and the 4-bit exact lookup table multiplier comprises thirteen lookup tables and one carry chain;
the first lookup table, the second lookup table, the third lookup table, and the fourth lookup table constitute a first product calculation layer, which operates on the multiplier and the two least significant bits of the multiplicand to obtain a first partial product signal set;
the fifth lookup table, the sixth lookup table, the seventh lookup table, and the eighth lookup table constitute a second product calculation layer, which operates on the multiplier and the two most significant bits of the multiplicand to obtain a second partial product signal set; the first product calculation layer and the second product calculation layer operate in parallel;
the ninth lookup table, the tenth lookup table, the eleventh lookup table, the twelfth lookup table, the thirteenth lookup table, and the carry chain constitute a carry signal compression layer, which performs carry signal calculation and signal compression in sequence on the first partial product signal set and the second partial product signal set to obtain the final product result;
the specific structure of the carry signal compression layer is as follows:
the first output terminal of the first lookup table is connected to the first input terminal of the ninth lookup table, the first input terminal of the tenth lookup table, and the first input terminal of the thirteenth lookup table, respectively;
the output terminal of the second lookup table is connected to the second input terminal of the tenth lookup table and the second input terminal of the thirteenth lookup table, respectively;
the output terminal of the third lookup table is connected to the first input terminal of the eleventh lookup table;
the output terminal of the fourth lookup table is connected to the first input terminal of the twelfth lookup table;
the first output terminal of the fifth lookup table is connected to the third input terminal of the tenth lookup table and the third input terminal of the thirteenth lookup table, respectively;
the second output terminal of the fifth lookup table is connected to the second input terminal of the eleventh lookup table;
the output terminal of the sixth lookup table is connected to the second input terminal of the twelfth lookup table;
the output terminals of the seventh lookup table, the eighth lookup table, the tenth lookup table, the eleventh lookup table, the twelfth lookup table, and the thirteenth lookup table are all connected to the carry chain; the carry chain compresses the output signals of these six lookup tables to obtain a five-bit value of the product result;
the output terminal of the ninth lookup table outputs a two-bit value of the product result; the second output terminal of the first lookup table outputs a one-bit value of the product result; the five-bit value, the two-bit value, and the one-bit value together constitute the final product result.
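The arithmetic behind the two product calculation layers of claim 1 can be illustrated with a small behavioral model. This is only a sketch of the decomposition, not the claimed lookup table contents: the function names are hypothetical, and the assignment of the one-bit, two-bit, and five-bit values to product bits 0, 1 to 2, and 3 to 7 is an assumption that the claim text does not state explicitly.

```python
def partial_products(a: int, b: int) -> tuple[int, int]:
    """Split a 4x4 multiply into two 4x2 multiplies.

    a is the 4-bit multiplier, b is the 4-bit multiplicand.
    """
    assert 0 <= a < 16 and 0 <= b < 16
    low = a * (b & 0b11)           # multiplier x two least significant multiplicand bits
    high = a * ((b >> 2) & 0b11)   # multiplier x two most significant multiplicand bits
    return low, high               # the two values are independent, so they can be formed in parallel


def multiply_4bit(a: int, b: int) -> int:
    """Combine the two partial products; the high one carries a weight of 4."""
    low, high = partial_products(a, b)
    return low + (high << 2)


def split_result(product: int) -> tuple[int, int, int]:
    """Split an 8-bit product into one-bit, two-bit, and five-bit fields.

    ASSUMPTION: the fields are bit 0, bits 1..2, and bits 3..7, in that order.
    """
    return product & 1, (product >> 1) & 0b11, (product >> 3) & 0b11111
```

Each partial product is at most 15 x 3 = 45, so it fits in six bits, and the recombined value matches the ordinary product for all 256 operand pairs.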

2. The 4-bit exact lookup table multiplier according to claim 1, characterized in that the four bits of the multiplier are A3A2A1A0, and the four bits of the multiplicand are B3B2B1B0; the logical expression for the first lookup table is: ; ; the logical expression for the second lookup table is: ; the logical expression for the third lookup table is: ; the logical expression for the fourth lookup table is: ; wherein , are both output signals of the first lookup table, and is a one-bit value of the product result; is the output signal of the second lookup table; is the output signal of the third lookup table; is the output signal of the fourth lookup table; the first partial product signal set includes , , , , .

3. The 4-bit exact lookup table multiplier according to claim 1, characterized in that the four bits of the multiplier are A3A2A1A0, and the four bits of the multiplicand are B3B2B1B0; the logical expression for the fifth lookup table is: ; ; the logical expression for the sixth lookup table is: ; the logical expression for the seventh lookup table is: ; the logical expression for the eighth lookup table is: ; wherein , are both output signals of the fifth lookup table; is the output signal of the sixth lookup table; is the output signal of the seventh lookup table; is the output signal of the eighth lookup table; the second partial product signal set includes , , , , .

4. The 4-bit exact lookup table multiplier according to claim 1, characterized in that the four bits of the multiplier are A3A2A1A0, and the four bits of the multiplicand are B3B2B1B0; the first partial product signal set includes , ; the second partial product signal set includes ; the logical expression for the ninth lookup table is: ; ; the logical expression for the tenth lookup table is: ; ; wherein , are both output signals of the ninth lookup table, namely the two-bit value of the product result; is the output signal of the first lookup table; , are both output signals of the tenth lookup table, namely the corresponding carry propagate signal and carry generate signal; is the output signal of the second lookup table; is the output signal of the fifth lookup table.

5. The 4-bit exact lookup table multiplier according to claim 1, characterized in that the four bits of the multiplier are A3A2A1A0, and the four bits of the multiplicand are B3B2B1B0; the first partial product signal set includes , , ; the second partial product signal set includes , , ; the logical expression for the eleventh lookup table is: ; ; the logical expression for the twelfth lookup table is: ; ; the logical expression for the thirteenth lookup table is: ; wherein , are both output signals of the eleventh lookup table, namely the corresponding carry propagate signal and carry generate signal; is the output signal of the third lookup table; is the output signal of the fifth lookup table; , are both output signals of the twelfth lookup table, namely the corresponding carry propagate signal and carry generate signal; is the output signal of the fourth lookup table; is the output signal of the sixth lookup table; is the output signal of the thirteenth lookup table; is the output signal of the second lookup table; is the output signal of the fifth lookup table; is the output signal of the first lookup table.
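Claims 4 and 5 describe lookup table outputs that serve as carry propagate and carry generate signals for the carry chain. The following generic sketch shows how such a pair of signals drives a ripple carry chain. It assumes a multiplexer-style chain of the kind common in FPGA carry logic, and it derives the propagate and generate bits from a plain two-operand addition; it does not reproduce the logical expressions of the claimed lookup tables, and all names are hypothetical.

```python
def carry_chain(propagate, generate, carry_in=0):
    """Ripple through a multiplexer-style carry chain.

    propagate, generate: lists of bits, least significant bit first.
    Returns (sum_bits, carry_out).
    """
    sums = []
    carry = carry_in
    for p, g in zip(propagate, generate):
        sums.append(p ^ carry)         # sum bit = propagate XOR incoming carry
        carry = carry if p else g      # pass the carry along if propagating, else take generate
    return sums, carry


def add_with_chain(x: int, y: int, width: int) -> int:
    """Add two width-bit numbers by feeding propagate/generate bits to the chain."""
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(width)]   # propagate = x_i XOR y_i
    g = [((x >> i) & (y >> i)) & 1 for i in range(width)]   # generate  = x_i AND y_i
    sums, carry_out = carry_chain(p, g)
    return sum(bit << i for i, bit in enumerate(sums)) | (carry_out << width)
```

In this model the propagate bit decides whether a stage forwards the incoming carry or replaces it with the generate bit, which is why each stage needs both signals.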

6. The 4-bit exact lookup table multiplier according to claim 1, characterized in that the first lookup table is a six-input, dual-output lookup table; the second lookup table is a six-input, single-output lookup table; the third lookup table is a six-input, single-output lookup table; the fourth lookup table is a six-input, single-output lookup table; the fifth lookup table is a six-input, dual-output lookup table; the sixth lookup table is a six-input, single-output lookup table; the seventh lookup table is a six-input, single-output lookup table; the eighth lookup table is a six-input, single-output lookup table; the ninth lookup table is a six-input, dual-output lookup table; the tenth lookup table is a six-input, dual-output lookup table; the eleventh lookup table is a six-input, dual-output lookup table; the twelfth lookup table is a six-input, dual-output lookup table; and the thirteenth lookup table is a five-input, single-output lookup table.
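As an illustration of what an n-input lookup table stores, the sketch below tabulates a Boolean function into a truth table and reads it back by input index. The helper names are hypothetical, and the example function (bit 1 of a product, which equals (A1 AND B0) XOR (A0 AND B1)) is chosen for illustration only; it is not taken from the logical expressions of the claimed lookup tables. A dual-output lookup table could be modeled in the same way as two such tables that share their inputs.

```python
def make_lut(func, n_inputs: int) -> list[int]:
    """Tabulate func over all 2**n_inputs input combinations (input 0 is the index LSB)."""
    table = []
    for index in range(1 << n_inputs):
        bits = [(index >> i) & 1 for i in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def read_lut(table: list[int], *bits: int) -> int:
    """Look up the stored output for the given input bits."""
    index = sum(bit << i for i, bit in enumerate(bits))
    return table[index]


def product_bit1(a0: int, a1: int, b0: int, b1: int) -> int:
    """Illustrative function only: bit 1 of the product of two unsigned numbers."""
    return (a1 & b0) ^ (a0 & b1)


# A four-input table has 16 entries; a six-input table would have 64.
LUT_P1 = make_lut(product_bit1, 4)
```

Reading the table with the low two bits of each operand returns bit 1 of their product, so the stored entries fully replace the gate-level expression.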