Converters, chips, electronic devices and methods for converting data types
By introducing a first-stage and a second-stage data type converter into the AI chip, and leveraging the universality of the intermediate results, the problems of low data type conversion efficiency and poor scalability in traditional AI chips are solved, achieving more efficient data conversion and lower circuit complexity and power consumption.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ANHUI CAMBRICON INFORMATION TECH CO LTD
- Filing Date
- 2019-10-25
- Publication Date
- 2026-06-23
Abstract
Description
Technical Field
[0001] This disclosure relates to the field of data processing technology, and more specifically, to the conversion of data types.
Background Technology
[0002] Traditional arithmetic units typically only handle fixed-precision floating-point and integer conversions, offering limited functionality. However, in artificial intelligence (AI) chips, the number of data type conversion instructions executed far exceeds that of traditional processing units, and programmers' demands for conversion capabilities have increased significantly. Consequently, the increased volume of software computations exacerbates the weaknesses of software-based data type conversion—low efficiency, high memory access overhead, and high power consumption—making its processing speed a performance bottleneck for the entire processor core.
[0003] Meanwhile, traditional arithmetic units implemented through instructions are all single-function. If the processor core needs to implement new data type conversion functions, it is necessary to add logical expressions according to the multiplication principle based on the new functions. Its scalability is poor: once new functional requirements appear, the area of the arithmetic units in the chip will increase exponentially, and there will be a lot of repetitive calculation logic, which will affect the overall performance of the processor.
[0004] For example, when there are M types of input data and N types of output data, there are usually M*N data conversion paths required. Therefore, the corresponding circuit design will be relatively complex and power consumption will be high. Moreover, whenever a new data type appears, the converter needs to be redesigned, which increases the workload and reduces production efficiency.
[0005] Therefore, traditional data type conversion methods are ill-suited to AI chip applications, and the computing units in AI chips cannot simply follow traditional implementations.
Summary of the Invention
[0006] One objective of this disclosure is to overcome the shortcomings of low data conversion efficiency and poor scalability in the prior art.
[0007] According to a first aspect of this disclosure, a converter for converting data types is provided, comprising: a first conversion stage L1 configured to receive first type data and first description information about the first type data, and convert the first type data into an intermediate result according to the first description information; and a second conversion stage L2 configured to receive second description information about second type data, and convert the intermediate result into second type data according to the second description information.
[0008] According to a second aspect of this disclosure, a chip is provided that includes the converter described above.
[0009] According to a third aspect of this disclosure, an electronic device is provided, which includes the chip described above.
[0010] According to a fourth aspect of this disclosure, a method for converting data types is provided, comprising: receiving first type data and first description information about the first type data, and converting the first type data into an intermediate result according to the first description information; and receiving second description information about second type data, and converting the intermediate result into second type data according to the second description information.
[0011] According to a fifth aspect of this disclosure, an electronic device is provided, comprising: one or more processors; and a memory storing computer-executable instructions that, when executed by the one or more processors, cause the electronic device to perform the method described above.
[0012] According to a sixth aspect of this disclosure, a computer-readable storage medium is provided, including computer-executable instructions that, when executed by one or more processors, perform the method described above.
[0013] At least one beneficial effect of the technical solution provided in this disclosure is that it can improve the efficiency of data type conversion in AI chips, reduce the computational burden, and reduce the required circuit area.
Attached Figure Description
[0014] The above and other objects, features, and advantages of this disclosure will become readily apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings. In the drawings, several embodiments of this disclosure are illustrated by way of example and not limitation, and like or corresponding reference numerals denote like or corresponding parts, wherein:
[0015] Figure 1 shows a converter for converting data types according to the first aspect of this disclosure.
[0016] Figure 2 shows a flowchart of a method for converting data types according to another aspect of this disclosure.
[0017] Figure 3 shows a schematic block diagram of a converter according to one embodiment of the present disclosure.
[0018] Figure 4 shows a schematic block diagram of a first conversion stage L1 according to one embodiment of the present disclosure.
[0019] Figure 5 shows a schematic block diagram of a first extraction unit E1 according to one embodiment of the present disclosure.
[0020] Figure 6 shows a schematic block diagram of a second conversion stage L2 according to one embodiment of the present disclosure.
[0021] Figure 7a shows a schematic block diagram of a second calculation unit C2 according to another embodiment of the present disclosure.
[0022] Figure 7b shows a schematic block diagram of a second calculation unit C2 according to one embodiment of the present disclosure.
[0023] Figure 8a shows a schematic block diagram of an absolute value calculation circuit C21 according to one embodiment of the present disclosure.
[0024] Figure 8b shows a schematic block diagram of an absolute value calculation circuit C21 according to another embodiment of the present disclosure.
[0025] Figure 9a shows a schematic block diagram of a second pre-output parsing unit P2 according to one embodiment of the present disclosure.
[0026] Figure 9b shows a schematic block diagram of a second pre-output parsing unit P2 according to another embodiment of the present disclosure.
[0027] Figure 10 shows a schematic diagram of the structure of a second data recovery unit R2 according to one embodiment of the present disclosure.
[0028] Figure 11a shows a schematic block diagram of a pre-output processing circuit R21 according to one embodiment of the present disclosure.
[0029] Figure 11b shows a schematic block diagram of a pre-output processing circuit R21 according to another embodiment of the present disclosure.
[0030] Figure 12 is a structural diagram illustrating a combined processing apparatus according to an embodiment of the present disclosure.
[0031] Figure 13 is a schematic diagram illustrating the structure of a circuit board according to an embodiment of the present disclosure.
Detailed Implementation
[0032] The technical solutions in the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this disclosure, not all of them. Based on the embodiments in this disclosure, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this disclosure.
[0033] It should be understood that the terms "first," "second," "third," and "fourth," etc., in the claims, specification, and drawings of this disclosure are used to distinguish different objects, not to describe a specific order. The terms "comprising" and "including" as used in the specification and claims of this disclosure indicate the presence of the described features, integers, steps, operations, elements, and / or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and / or collections thereof.
[0034] It should also be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of this disclosure. As used in this disclosure and claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and / or” as used in this disclosure and claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes such combinations.
[0035] As used in this specification and claims, the term "if" may be interpreted, depending on the context, as "when," "once," "in response to determination," or "in response to detection." Similarly, the phrase "if determined" or "if [described condition or event] is detected" may be interpreted, depending on the context, as "once determined," "in response to determination," "once [described condition or event] is detected," or "in response to detection of [described condition or event]."
[0036] Figure 1 shows a converter for converting data types according to the first aspect of this disclosure. Figure 2 shows a flowchart of a method for converting data types according to another aspect of this disclosure.
[0037] As shown in Figure 1, the converter includes: a first conversion stage L1 configured to receive first type data and first description information about the first type data, and convert the first type data into an intermediate result according to the first description information; and a second conversion stage L2 configured to receive second description information about second type data, and convert the intermediate result into second type data according to the second description information.
[0038] As shown in Figure 2, the method of this disclosure may include: a first operation S1, receiving first type data and first description information about the first type data, and converting the first type data into an intermediate result according to the first description information; and a second operation S2, receiving second description information about second type data, and converting the intermediate result into second type data according to the second description information.
[0039] It should be understood that although Figure 2 illustrates two operations, S1 and S2, the steps within operations S1 and S2 are not necessarily executed sequentially; they can also be executed simultaneously. For example, receiving the second description information about the second type data in operation S2 can occur before, simultaneously with, or after the first operation S1.
[0040] In this disclosure, when converting data types, an intermediate result is first generated that is applicable to all data types. This intermediate result can effectively represent the data being converted (the first type of data mentioned above) and can be converted to any desired type of data (the second type of data mentioned above). In other words, this intermediate result has common content and / or structure with respect to all types of data, and thus can be used to convert to other data types.
[0041] The beneficial effects of converting the first type of data into an intermediate result, and then converting the intermediate result into the second type of data, include, but are not limited to: In traditional hardware structures, if there are M types of input data and N types of output data, a separate circuit needs to be designed for each conversion, resulting in a circuit complexity of approximately M*N. This significantly increases the workload of circuit design, increases circuit area, and further leads to adverse effects such as increased power consumption and cost. In contrast, the technical solution provided in this disclosure, with the same number of data type conversions, has a circuit complexity of only approximately M+N. This greatly reduces the complexity of circuit design, reduces circuit area, and thus reduces circuit power consumption and saves costs.
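The M*N versus M+N comparison above can be checked with a few lines of arithmetic. The following is only a counting sketch; the function names `direct_paths` and `two_stage_paths` and the sample type list are illustrative assumptions, not part of the disclosure.

```python
from itertools import product


def direct_paths(inputs, outputs):
    """Count dedicated circuits: one per (input type, output type) pair."""
    return len(list(product(inputs, outputs)))


def two_stage_paths(inputs, outputs):
    """Count stage blocks: one front end per input type, one back end per output type."""
    return len(inputs) + len(outputs)


# Hypothetical selection of types taken from the list given in the description.
types = ["FIX8", "FIX16", "FIX32", "FP16", "FP32", "BFLOAT"]

print(direct_paths(types, types))     # 36
print(two_stage_paths(types, types))  # 12
```

Adding a seventh type raises the direct count from 36 to 49, while the two-stage count only rises from 12 to 14.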
[0042] Figure 3 shows a schematic block diagram of a converter according to one embodiment of the present disclosure.
[0043] As shown in Figure 3, the converter of this disclosure further includes a memory configured to store the intermediate results.
[0044] Since the intermediate result is generated based on the first type of data and is independent of the data type of the second type of data, the intermediate result generated based on the first type of data can be stored in advance. Regardless of the type of the second type of data, the second type of data can be obtained from the stored intermediate data. Therefore, it is not necessary to convert the first type of data in each conversion, which saves redundant calculations in the chip and has beneficial effects on reducing power consumption, reducing circuit area, and saving costs.
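The reuse of a stored intermediate result can be sketched in software as follows. This is a toy model under stated assumptions: the class `TwoStageConverter`, the two stage functions, the tuple layout of the intermediate result, and the convention that a value equals data times 2 to the power of the shift are all hypothetical and chosen only to make the caching behaviour visible.

```python
class TwoStageConverter:
    """Toy model: stage 1 runs once per input, stage 2 runs once per requested output."""

    def __init__(self, stage1, stage2):
        self.stage1 = stage1
        self.stage2 = stage2
        self.memory = {}       # stands in for the memory that stores intermediate results
        self.stage1_runs = 0   # counts how often stage 1 was actually executed

    def convert(self, data, src_info, dst_info):
        key = (data, src_info)
        if key not in self.memory:
            self.memory[key] = self.stage1(data, src_info)
            self.stage1_runs += 1
        return self.stage2(self.memory[key], dst_info)


def stage1(data, shift):
    # Assumed intermediate form: (sign, magnitude, shift), with value = magnitude * 2**shift.
    return (1 if data < 0 else 0, abs(data), shift)


def stage2(intermediate, dst_info):
    sign, magnitude, shift = intermediate
    value = (-1 if sign else 1) * magnitude * 2 ** shift
    if dst_info == "float":
        return float(value)
    kind, dst_shift = dst_info      # e.g. ("fix", 2): integer data with a target shift
    return int(value / 2 ** dst_shift)


conv = TwoStageConverter(stage1, stage2)
as_float = conv.convert(-6, 1, "float")
as_fix = conv.convert(-6, 1, ("fix", 2))
```

Both outputs are produced from the same stored intermediate result, so `conv.stage1_runs` stays at 1 after the two calls.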
[0045] Figure 4 shows a schematic block diagram of a first conversion stage L1 according to one embodiment of the present disclosure.
[0046] As shown in Figure 4, the first conversion stage L1 includes a first receiving unit Rx1 and a first extraction unit E1. The first description information includes a first data type of the first type data and a first exponent bit of the first type data. The first receiving unit Rx1 is configured to receive the first type data and the first description information; the first extraction unit E1 is configured to extract the intermediate sign bit Msign, the intermediate data bit Mdata, and the intermediate exponent bit Mshift from the first type data and the first description information as the intermediate result.
[0047] Data types can be varied, including but not limited to FIX4, FIX8, FIX16, FIX32, UFIX8, UFIX16, UFIX32, FP16, FP32, BFLOAT, and any other existing or custom data types. It's important to understand that this example only uses up to 32 bits; for 64-bit or higher, a much larger number of data types can be included.
[0048] The intermediate sign bit Msign indicates the sign of the first type data. For example, when the sign bit is 0, the data is non-negative, and when the sign bit is 1, the data is negative. The intermediate data bit Mdata indicates the actual valid data, and the intermediate exponent bit Mshift indicates the shift value of the data.
[0049] Figure 5 shows a schematic block diagram of a first extraction unit E1 according to one embodiment of the present disclosure.
[0050] As shown in Figure 5, the first extraction unit E1 may include: a sign bit calculation circuit E11, a valid bit calculation circuit E12, and an intermediate exponent bit calculation circuit E13; the sign bit calculation circuit E11 may be configured to extract the sign of the first type data from the first type data to serve as the intermediate sign bit Msign; the valid bit calculation circuit E12 may be configured to extract the valid data bits of the first type data from the first type data to serve as the intermediate data bit Mdata; and the intermediate exponent bit calculation circuit E13 may be configured to obtain the exponent information of the first type data based on the first type data or the first exponent bit to serve as the intermediate exponent bit Mshift.
[0051] It should be understood that expressions such as "as the intermediate sign bit Msign", "as the intermediate data bit Mdata", and "as the intermediate exponent bit Mshift" can indicate that the output of the corresponding circuit is identical to the corresponding part of the intermediate result, but more preferably indicate that the two are equivalent. The term "equivalent" here means substantially the same in content, although possibly different in form. For example, when the 8-bit number 0000 0001 is transformed into 0000 0000 0000 0001, the result is essentially another representation of the original 8 bits, even though the two are not identical. Furthermore, in addition to changes in the number of bits, different representations of a number, such as two's complement, offset binary, binary, decimal, and hexadecimal, are also within the scope of "equivalent" as described herein. In other words, as long as the valid information is not lost, any change of form can be considered equivalent.
[0052] Although the description information and the data are explained above as two different information carriers, it should be understood that in practice there may not be a clear boundary between them. For example, when the first type data is of type Fix, the shift value or exponent of the first type data can be specified in the separate description information, and the intermediate exponent bit Mshift can be obtained from that shift value. However, when the first type data is, for example, of type Float, the Float type data itself contains the shift value, so the first extraction unit E1 can extract the first shift value directly from the first type data. Therefore, the first type data and its first description information can be mixed together or kept separate. Thus, in the first extraction unit E1 shown in Figure 5, the intermediate exponent bit calculation circuit E13 can extract exponent information from the first description information (e.g., when the first type data is of type Fix) or from the first type data (e.g., when the first type data is of type Float).
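The two sources of exponent information described above can be illustrated with a small software model. It is a sketch under assumptions: the `Intermediate` record, the function names, and the choice to keep the raw Fix word as Mdata and the stored IEEE 754 single-precision fraction and exponent fields as Mdata and Mshift are illustrative and are not specified by the disclosure.

```python
import struct
from dataclasses import dataclass


@dataclass
class Intermediate:
    msign: int   # 0 = non-negative, 1 = negative
    mdata: int   # valid data bits
    mshift: int  # exponent / shift information


def extract_fix(raw: int, bits: int, shift: int) -> Intermediate:
    """Fix input: the shift value comes from the separate description information."""
    raw &= (1 << bits) - 1
    return Intermediate(msign=raw >> (bits - 1), mdata=raw, mshift=shift)


def extract_fp32(value: float) -> Intermediate:
    """Float input: the exponent is read out of the data word itself."""
    (word,) = struct.unpack("<I", struct.pack("<f", value))
    return Intermediate(
        msign=word >> 31,
        mdata=word & 0x7FFFFF,        # stored 23-bit fraction field
        mshift=(word >> 23) & 0xFF,   # stored (biased) 8-bit exponent field
    )


fix_result = extract_fix(0x81, bits=8, shift=3)
float_result = extract_fp32(-1.5)
```

For the Fix input the shift 3 is supplied by the caller, whereas for the Float input the exponent field 127 is recovered from the bit pattern of -1.5.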
[0053] It is important to understand that the term "first type data" in the above description can refer to the original first type data, or it can refer to the first type data after transformation, splicing, or splitting. In other words, the transformation of first type data at each stage is also included within the scope of first type data.
[0054] The first type data can have various bit widths, such as 1 bit, 2 bits, 4 bits, 8 bits, 16 bits, 32 bits, etc. In this disclosure, the processing bit width of the converter (e.g., the bit width of a register, memory, or bus) may be a different width, such as 32 bits. Therefore, according to one embodiment of this disclosure, the first conversion stage L1 is further configured to determine the number of first type data items to receive and to concatenate that number of first type data items to form first concatenated data. The first conversion stage L1 then converts the first concatenated data into an intermediate result according to the first description information.
[0055] According to one embodiment of this disclosure, the number of first type data items received can be determined by dividing the processing bit width of the converter by the bit width of the first type data.
[0056] For example, when the input data is 8 bits and the converter's processing bit width (e.g., the register's bit width) is 32 bits, it can receive 4 input data at the same time, that is, it can concatenate the 4 input data to form 32-bit data.
[0057] According to another embodiment of this disclosure, the quantity of the first type of data to be spliced can be determined by a preset first fixed value.
[0058] Taking the two 8-bit hexadecimal numbers 81 and 82 as an example, either four data items or two data items can be received at once. In this embodiment, the binary representations of the hexadecimal numbers 81 and 82 are "1000 0001" and "1000 0010" respectively, which can be extended into two 16-bit numbers, namely "xxxx xxxx 1000 0001" and "yyyy yyyy 1000 0010". The actual 8-bit data is placed in the lower eight bits of each 16-bit number, while the higher bits are padded with zeros or other specified values (represented here by x and y). The concatenated data can be 00008182, which is represented in binary as "xxxx xxxx yyyy yyyy 1000 0001 1000 0010". That is, in the 32-bit concatenated data, the first input data "81" occupies the lower 8 bits (0-7), while the second input data "82" occupies the middle 8 bits (8-15). The higher bits (16-31) of the 32-bit data are padded with x and y, where x and y are set according to the actual situation and can be the same or different. This will be explained in detail below.
[0059] It should be understood that the above concatenation method is merely an example. Those skilled in the art can set the format of the concatenated data according to their own needs. For instance, the first received data can be placed in the lower 16 bits of the 32-bit concatenated data, while the second received data can be placed in the higher 16 bits. Using the hexadecimal numbers 81 and 82 as an example again, the concatenated data format could also be, for example, "xxxx xxxx 1000 0001 yyyy yyyy 1000 0010", where x and y can be the same or different.
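The concatenation options described above can be modelled as follows. This is a minimal sketch: the function `concat`, its parameters, the use of zero as the padding value, and the convention that slot 0 is the least significant end of the word are all assumptions made for illustration, since the text leaves the layout to the designer.

```python
def concat(values, data_bits, word_bits=32, slot_bits=None, first_in_low=True):
    """Pack several narrow data items into one word of the converter's processing width.

    slot_bits lets each item be extended to a wider slot (padding bits are zero here);
    first_in_low selects whether the first item goes to the least significant slot.
    """
    slot_bits = slot_bits or data_bits
    capacity = word_bits // slot_bits          # items that fit into one word
    if len(values) > capacity:
        raise ValueError("too many items for one word")
    mask = (1 << data_bits) - 1
    ordered = list(values) if first_in_low else list(reversed(values))
    word = 0
    for slot, value in enumerate(ordered):
        word |= (value & mask) << (slot * slot_bits)
    return word


four_packed = concat([0x81, 0x82, 0x83, 0x84], data_bits=8)
two_adjacent = concat([0x81, 0x82], data_bits=8, first_in_low=False)
two_in_16bit_slots = concat([0x81, 0x82], data_bits=8, slot_bits=16)

print(f"{four_packed:08X}")         # 84838281
print(f"{two_adjacent:08X}")        # 00008182
print(f"{two_in_16bit_slots:08X}")  # 00820081
```

With 8-bit data and a 32-bit word the capacity is 32 // 8 = 4 items, which corresponds to determining the count by dividing the processing bit width by the data bit width.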
[0060] Of course, those skilled in the art will understand that the above-described data concatenation is not mandatory, but merely a preferred method. For example, other prescribed formats can also be used (e.g., using a method of marking valid bits, i.e., pre-defining which bits are valid and which are invalid). According to another implementation, concatenation can be omitted, and the number of bits in the input data can simply be extended. For example, in the case of 8-bit input data, the 8-bit input data can be directly extended to 32-bit data (e.g., by adding 0s to specific bits of the original 8-bit input data).
[0061] The above describes the case where the number of bits of the first type of data is shorter than the number of bits in the register. In another case, if the number of bits of the input data is greater than the number of bits the converter can process, for example, if the input data is 64 bits and the number of bits the converter can process is 32 bits, then the following processing can be performed.
[0062] One approach is to truncate the 64-bit data, keeping only the required 32 bits and discarding the other 32 bits, then process the remaining 32 bits. This approach may result in some data loss and errors.
[0063] According to another embodiment of this disclosure, the first conversion stage L1 is further configured to determine the number of pieces into which the received first type data is to be split, and to split the first type data into that number of pieces of split data, wherein the first conversion stage L1 converts the split data into intermediate results according to the first description information.
[0064] In this implementation, 64-bit data can be split into two 32-bit data, and the two split 32-bit data can be processed to generate intermediate results.
[0065] According to one embodiment of this disclosure, the number of pieces into which the received first type data is split can be determined by dividing the bit width of the first type data by the processing bit width of the converter.
[0066] According to another embodiment of this disclosure, the number of pieces can be determined by a preset second fixed value; for example, the fixed value can be set to 2 or another number.
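The two ways of choosing the number of pieces can be sketched as below. The function `split`, its parameters, and the low-piece-first ordering are illustrative assumptions; the disclosure does not fix an ordering.

```python
def split(value, data_bits, word_bits=32, pieces=None):
    """Split one wide data item into equal pieces, least significant piece first.

    If pieces is None, the count is data_bits / word_bits (rounded up);
    otherwise the preset fixed value is used.
    """
    if pieces is None:
        pieces = -(-data_bits // word_bits)    # ceiling division
    chunk_bits = -(-data_bits // pieces)
    mask = (1 << chunk_bits) - 1
    return [(value >> (i * chunk_bits)) & mask for i in range(pieces)]


by_width = split(0x1122334455667788, data_bits=64)            # 64 / 32 = 2 pieces
by_preset = split(0x1122334455667788, data_bits=64, pieces=4)  # preset fixed value 4

print([f"{p:08X}" for p in by_width])   # ['55667788', '11223344']
print([f"{p:04X}" for p in by_preset])  # ['7788', '5566', '3344', '1122']
```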
[0067] Splitting and concatenating data helps avoid or reduce the need for additional timing control design in circuits; furthermore, this implementation method facilitates parallel data processing, improving resource utilization and data throughput.
[0068] The corresponding splitting and concatenation functions can be added to the first conversion stage L1 described above. These functions can be implemented in software and / or hardware.
[0069] As can be seen, this disclosure does not limit the number of bits of inputs, outputs, and converters (e.g., registers). By splitting and concatenating data, this disclosure can process data of any number of bits.
[0070] The function and structure of the second conversion stage L2 are described in detail below.
[0071] Figure 6 shows a schematic block diagram of a second conversion stage L2 according to one embodiment of the present disclosure.
[0072] As shown in Figure 6, the second conversion stage L2 may include a second calculation unit C2, a second pre-output parsing unit P2, and a second data recovery unit R2; the second calculation unit C2 is configured to receive intermediate results and second description information, and calculate a second intermediate result based on the intermediate results and the second description information; the second pre-output parsing unit P2 is configured to calculate the pre-output data bit Pdata and the pre-output sign bit Psign based on the second intermediate result; and the second data recovery unit R2 is configured to generate second type data based on the pre-output data bit Pdata and the pre-output sign bit Psign.
[0073] The second calculation unit C2 can receive intermediate results from the first conversion stage L1 or from memory, and further receive second description information describing the second type data.
[0074] The second description information is similar to the first description information and may include information such as the data type of the second type data. For example, the data type of the second type data may include, but is not limited to, FIX4, FIX8, FIX16, FIX32, UFIX8, UFIX16, UFIX32, FP16, FP32, BFLOAT, and any other existing or custom data type. The second description information may also include the shift value of the second type data, etc. The second description information can be input manually, or input into the second calculation unit C2 via a file or signal.
[0075] According to one embodiment of this disclosure, the first description information and / or the second description information may further include a rounding type, which includes at least one of the following: TO_ZERO, OFF_ZERO, UP, DOWN, ROUNDING_OFF_ZERO, ROUNDING_TO_EVEN, and random rounding.
[0076] TO_ZERO indicates rounding toward zero; in other words, rounding toward the smaller absolute value.
[0077] OFF_ZERO indicates rounding away from zero; in other words, rounding toward the larger absolute value.
[0078] UP indicates rounding toward positive infinity.
[0079] DOWN indicates rounding toward negative infinity.
[0080] ROUNDING_OFF_ZERO indicates rounding to the nearest value, with exact halves rounded away from zero (conventional rounding).
[0081] ROUNDING_TO_EVEN indicates rounding to the nearest value, with exact halves rounded to the nearest even value.
[0082] It should be understood that the rounding types mentioned above are merely examples, and those skilled in the art can set various desired rounding methods.
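The deterministic rounding types listed above can be compared with a small model that drops low-order bits from a sign-and-magnitude value. This is a sketch only: the function `round_shift`, the sign-plus-magnitude input format, and the reading of ROUNDING_OFF_ZERO as round-half-away-from-zero are assumptions; random rounding is omitted.

```python
def round_shift(sign, magnitude, shift, mode):
    """Drop `shift` low bits of `magnitude` and return the rounded magnitude.

    sign is 0 for non-negative and 1 for negative values.
    """
    if shift <= 0:
        return magnitude << -shift                 # nothing is dropped
    kept = magnitude >> shift
    rem = magnitude & ((1 << shift) - 1)           # the dropped bits
    half = 1 << (shift - 1)
    if mode == "TO_ZERO":
        bump = False
    elif mode == "OFF_ZERO":
        bump = rem != 0
    elif mode == "UP":                             # toward positive infinity
        bump = rem != 0 and sign == 0
    elif mode == "DOWN":                           # toward negative infinity
        bump = rem != 0 and sign == 1
    elif mode == "ROUNDING_OFF_ZERO":              # nearest, halves away from zero
        bump = rem >= half
    elif mode == "ROUNDING_TO_EVEN":               # nearest, halves to even
        bump = rem > half or (rem == half and kept & 1 == 1)
    else:
        raise ValueError(f"unknown rounding type: {mode}")
    return kept + int(bump)


# Magnitude 0b1010 with two bits dropped represents 2.5.
modes = ["TO_ZERO", "OFF_ZERO", "UP", "DOWN", "ROUNDING_OFF_ZERO", "ROUNDING_TO_EVEN"]
positive = {m: round_shift(0, 0b1010, 2, m) for m in modes}
negative = {m: round_shift(1, 0b1010, 2, m) for m in modes}
```

For +2.5 the modes give 2, 3, 3, 2, 3, 2 in the order listed; for -2.5 only UP and DOWN swap, giving magnitudes 2 and 3.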
[0083] Figure 7a shows a schematic block diagram of a second calculation unit C2 according to another embodiment of the present disclosure.
[0084] As shown in Figure 7a, according to one embodiment of this disclosure, the second description information may include a second data type of the second type data and a second exponent bit Sshift of the second type data. The second calculation unit C2 may include: an absolute value calculation circuit C21 configured to calculate a second intermediate data bit ABS based on the intermediate data bit Mdata; a sign bit calculation circuit C22 configured to calculate a second intermediate sign bit Sign based on the intermediate sign bit Msign; and a differential exponent calculation circuit C23 configured to calculate the differential exponent Dshift between the intermediate exponent bit Mshift and the second exponent bit Sshift as the second intermediate exponent bit EXP.
[0085] It should be noted that the "difference" mentioned above indicates not only the magnitude of the shift but also the direction of the shift. The difference described in this disclosure can be the first exponent minus the second exponent, or the second exponent minus the first exponent. This is clear to those skilled in the art, and therefore will not be elaborated upon here.
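The signed nature of the difference can be illustrated as follows. The sketch assumes, purely for illustration, that a value equals its data bits times 2 to the power of its shift and that the difference is taken as Mshift minus Sshift; the function names are hypothetical.

```python
def differential_exponent(mshift: int, sshift: int) -> int:
    """Signed difference: the magnitude is the shift amount, the sign is the direction."""
    return mshift - sshift


def apply_shift(abs_bits: int, dshift: int) -> int:
    """Left shift for a positive difference, truncating right shift for a negative one."""
    return abs_bits << dshift if dshift >= 0 else abs_bits >> -dshift


# 5 * 2**3 re-expressed with a target shift of 1 becomes 20 * 2**1.
left = apply_shift(5, differential_exponent(3, 1))
# 12 * 2**0 re-expressed with a target shift of 2 becomes 3 * 2**2.
right = apply_shift(12, differential_exponent(0, 2))
```

Taking the difference the other way round (Sshift minus Mshift) only swaps which sign means a left shift.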
[0086] Figure 7b shows a schematic block diagram of a second calculation unit C2 according to one embodiment of the present disclosure.
[0087] As shown in Figure 7b, according to one embodiment of this disclosure, the second calculation unit C2 may further include: a rounding bit calculation circuit C24, configured to calculate a second intermediate rounding bit STK based on the second intermediate data bit ABS and the second intermediate sign bit Sign.
[0088] According to another embodiment of this disclosure, the second calculation unit C2 may further include: a rounding bit calculation circuit C24, configured to calculate a second intermediate rounding bit STK based on the second intermediate data bit ABS, the second intermediate exponent bit EXP, and the second intermediate sign bit Sign.
[0089] In the two implementations above for calculating the second intermediate rounding bit STK, the differential exponent may or may not be used. For example, when the second intermediate rounding bit STK takes the form of an array (for example, when all rounding content needs to be retained), the second intermediate exponent bit EXP can be omitted; whereas if the rounding bit needs to designate one or several specific bits, the second intermediate exponent bit EXP can be used.
[0090] According to one embodiment of this disclosure, the rounding bit calculation circuit C24 can be implemented using AND-OR logic. For example, STK = ABS for rounding to the nearest integer, and STK[n] = |ABS[n:x1]&&~SIGN for rounding to positive infinity, etc.
[0091] As shown in Figure 7a, through the converter and method described above, all intermediate results can be converted into a second intermediate result with the same content. That is, according to one embodiment of this disclosure, the second intermediate result may include a second intermediate sign bit Sign, a second intermediate exponent bit EXP, and a second intermediate data bit ABS.
[0092] As shown in Figure 7b, according to another embodiment of this disclosure, the second intermediate result may include a second intermediate sign bit Sign, a second intermediate exponent bit EXP, a second intermediate data bit ABS, and a second intermediate rounding bit STK.
[0093] The rounding bit calculation circuit C24 described with reference to Figure 7a and Figure 7b can also be placed in the second pre-output parsing unit P2. That is, the second pre-output parsing unit P2 can receive the second intermediate result including the second intermediate sign bit Sign, the second intermediate exponent bit EXP, and the second intermediate data bit ABS, and calculate the second intermediate rounding bit STK based on the second intermediate result.
[0094] Figure 8a shows a schematic block diagram of an absolute value calculation circuit C21 according to one embodiment of the present disclosure.
[0095] As shown in Figure 8a, the absolute value calculation circuit C21 may include: a second selector configured to determine whether the intermediate data bit Mdata is less than zero; and a first two's complement calculator configured to calculate the two's complement of the intermediate data bit Mdata as the second intermediate data bit ABS if the intermediate data bit Mdata is less than zero; otherwise, the intermediate data bit Mdata is used as the second intermediate data bit ABS. Finding the two's complement amounts to inverting all bits except the sign bit and adding 1. Therefore, the first two's complement calculator may include a first inverter and a first adder. If the intermediate data bit Mdata is greater than or equal to zero (non-negative), the second intermediate data bit ABS is equivalent to the intermediate data bit Mdata.
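The select-then-negate behaviour of the absolute value path can be modelled in a few lines. This is a sketch under assumptions: the function name and the fixed-width masking are illustrative, and the sketch inverts the whole word before adding 1, so it returns the plain magnitude with the sign bit cleared, whereas the variant described in the text leaves the sign bit untouched.

```python
def abs_twos_complement(mdata: int, bits: int) -> int:
    """Return the magnitude of a `bits`-wide two's complement word."""
    mask = (1 << bits) - 1
    mdata &= mask
    is_negative = mdata >> (bits - 1)       # the selector: test the sign bit
    if is_negative:
        return ((~mdata) + 1) & mask        # the inverter followed by the adder
    return mdata                            # non-negative data passes through unchanged


print(abs_twos_complement(0x81, 8))  # 127, since 0x81 encodes -127
print(abs_twos_complement(0xFF, 8))  # 1, since 0xFF encodes -1
print(abs_twos_complement(0x05, 8))  # 5
```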
[0096] Figure 8b A schematic block diagram of an absolute value calculation circuit C21 according to another embodiment of the present disclosure is shown.
[0097] like Figure 8b As shown, the absolute value calculation circuit C21 may further include a first selector and a first normalizer, wherein the first selector is configured to determine whether the data type of the intermediate data bit Mdata is type 1 or type 2.
[0098] The first type mentioned above could be, for example, a Fix type, and the second type could be, for example, a Float type. In the following description and accompanying figures, Fix will be used as an example of the first type, and Float as an example of the second type. It should be understood that the first and second types of data can also be any other suitable data type.
[0099] If the data type of the intermediate data bit Mdata is Fix, the second selector is selected for processing and determines whether the intermediate data bit Mdata is less than zero. If Mdata is less than zero (negative), the two's complement of Mdata is calculated in the first two's complement calculator and used as the second intermediate data bit ABS. Calculating the two's complement involves inverting all bits except the sign bit and adding 1; therefore, the first two's complement calculator may include a first inverter and a first adder. If Mdata is greater than or equal to zero (not negative), the second intermediate data bit ABS is equal to Mdata.
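The Fix path of the absolute value calculation circuit C21 can be illustrated in software. The following Python sketch is a minimal model and not the disclosed circuit: the function name `abs_fix` and the 16-bit lane width are assumptions made for illustration. Because the sign is carried separately in Sign, the sketch inverts the whole lane before adding 1, so that the result is a pure magnitude.

```python
def abs_fix(mdata: int, width: int = 16) -> int:
    """Return the magnitude of one width-bit two's complement lane (hypothetical helper)."""
    mask = (1 << width) - 1
    mdata &= mask
    is_negative = (mdata >> (width - 1)) & 1   # second selector: is Mdata < 0?
    if is_negative:
        inverted = ~mdata & mask               # first inverter
        return (inverted + 1) & mask           # first adder
    return mdata                               # non-negative: ABS equals Mdata


print(hex(abs_fix(0xFF81)), hex(abs_fix(0xFF82)))
```

With the sign-extended lanes of Example 1 (ff81 and ff82), this model returns 007f and 007e, which agrees with the ABS value given there.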
[0100] If the data type of the intermediate data bit Mdata is Float, then the first normalizer is selected for processing. The first normalizer is configured to normalize the intermediate data bit Mdata when the data type of the intermediate data bit Mdata is Float, so as to use it as the second intermediate data bit ABS.
[0101] Normalization is an operation performed on Float type numbers. The IEEE 754 standard classifies Float type numbers as normalized numbers, denormalized numbers, zero, positive and negative infinity, and NaN (not a number). In this operation, a 1 is prepended to every normalized number and a 0 is appended to every denormalized number, yielding the actual sign-magnitude (true form) representation of the number. This result has one more bit than the normalized / denormalized representation of the Float type.
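The effect of this operation on a Float16 mantissa can be sketched as follows. This is an illustrative model only: the function name `expand_mantissa` and its parameters are hypothetical, a 10-bit fraction field is assumed, and zero, infinity and NaN are deliberately not handled.

```python
def expand_mantissa(exp_field: int, frac: int, frac_bits: int = 10) -> int:
    """Return the (frac_bits + 1)-bit magnitude described in the text (hypothetical helper)."""
    if exp_field != 0:
        # Normalized number: prepend the implicit leading 1.
        return (1 << frac_bits) | frac
    # Denormalized number: append a 0 at the least significant end.
    return frac << 1


print(hex(expand_mantissa(16, 0b0000000001)))
```

For a Float16 input with exponent field 16 and fraction 0000000001, the sketch yields 0x401, the per-number Mdata value used in Example 2.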
[0102] As further shown in Figures 7a and 7b, the differential exponent calculation circuit C23 is configured to calculate the differential exponent Dshift based on the intermediate exponent Mshift and the second exponent Sshift, and thereby obtain the second intermediate exponent EXP. According to one embodiment of this disclosure, the second intermediate exponent EXP is equal to the differential exponent Dshift.
[0103] As further shown in Figures 7a and 7b, the sign bit calculation circuit C22 can be configured to calculate the second intermediate sign bit Sign based on the intermediate sign bit Msign. It should be understood that the sign does not change; therefore, the second intermediate sign bit Sign can be obtained from the intermediate sign bit Msign via a direct connection.
[0104] The rounding calculation circuit C24 described above is implemented using AND-OR logic.
[0105] Figure 9a shows a schematic block diagram of a second pre-output calculation unit P2 according to one embodiment of the present disclosure.
[0106] As shown in Figure 9a, the second pre-output calculation unit P2 can be configured to calculate the pre-output data bit Pdata and the pre-output sign bit Psign based on the second intermediate data bit ABS, the second intermediate sign bit Sign, the second intermediate exponent bit EXP, and the second intermediate rounding bit STK.
[0107] As further shown in Figure 9a, the second pre-output calculation unit P2 may include a shift operator P21 and an adder P22, configured to generate a temporary output data bit ABS' and a pre-output sign bit Psign. The shift operator P21 is configured to shift the second intermediate data bit ABS by the second intermediate exponent bit EXP to obtain a shift result; the adder P22 is configured to generate the temporary output data bit ABS' based on the shift result and the second intermediate rounding bit STK; the pre-output sign bit Psign is equal to the second intermediate sign bit Sign.
[0108] First, in the pre-output calculation unit P2, the received second intermediate data bit ABS is shifted, with the amount and direction of the shift determined by the second intermediate exponent bit EXP. The shift result is then fed to the adder.
[0109] The adder's output is ABS' = the shifter's output + STK[-EXP-1]. If the STK index is out of range, the added value is set to zero. Note that STK is an array, for example a 32-bit array STK[31:0], where STK[0] is the least significant element and STK[31] is the most significant element. When -EXP-1 is calculated, if the value is between 0 and 31, the corresponding element is taken; if it is less than 0, the value is 0; if it is greater than 31, special handling is applied (taking 0 or 31 depending on the STK type).
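The shift-then-add step can be modelled per data lane as in the sketch below. The helper names `shift_by_exp` and `round_add` are hypothetical. Two simplifications are assumed: a positive EXP is taken to mean a left shift (the examples in this disclosure only show negative EXP, which shifts right), and an index above the top of STK is treated as 0 rather than applying the type-dependent special handling.

```python
def shift_by_exp(abs_bits: int, exp: int) -> int:
    """Shift operator P21: negative EXP shifts right, positive EXP shifts left (assumed)."""
    return abs_bits >> -exp if exp < 0 else abs_bits << exp


def round_add(shifted: int, stk: int, exp: int, stk_width: int = 32) -> int:
    """Adder P22: add the rounding bit STK[-EXP-1], or 0 when the index is out of range."""
    index = -exp - 1
    if 0 <= index < stk_width:
        rounding_bit = (stk >> index) & 1
    else:
        rounding_bit = 0  # simplification: no special handling above the top index
    return shifted + rounding_bit


# One 16-bit lane at a time, using the ABS, STK and EXP values of Example 1.
high = round_add(shift_by_exp(0x007F, -1), 0x007F, -1, stk_width=16)
low = round_add(shift_by_exp(0x007E, -1), 0x007E, -1, stk_width=16)
print(hex(high), hex(low))
```

Treating each 16-bit lane separately, so that the lane-local index 0 corresponds to STK[16] and STK[0] of the 32-bit array, gives 0x40 and 0x3f, the adder output of Example 1.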
[0110] In certain situations, such as when ABS' does not overflow, ABS' can be directly used as the output of the pre-output calculation unit P2.
[0111] Figure 9b shows a schematic block diagram of a second pre-output calculation unit P2 according to another embodiment of the present disclosure.
[0112] As shown in Figure 9b, the pre-output calculation unit P2 further includes a selector P23, which determines whether the generated ABS' overflows. If it overflows, ABS' is saturated; if it does not overflow, Pdata = ABS'.
[0113] Saturation handling is a special-case handling method present in various arithmetic units. During computation, including data type conversion, the value range of the input data may differ from that of the output data: if the absolute value of the expected result is greater than the upper limit of the absolute value of the output data range, overflow occurs; if the absolute value of the expected result is less than the lower limit of the absolute value of the output data range, underflow occurs. Overflow is generally handled in one of several ways: using a saturation value, truncating high-order bits, using infinity, or using a special value. This disclosure allows saturation handling in any of these ways.
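As an illustration of the first of these options (using a saturation value), the sketch below clamps a magnitude to the largest value representable in a given number of bits and reports whether overflow occurred. The function name `saturate_magnitude` and the choice of an unsigned magnitude range are assumptions made for this example; the other options listed above are not modelled.

```python
def saturate_magnitude(value: int, width: int):
    """Clamp a non-negative magnitude to width bits; return (result, overflowed)."""
    limit = (1 << width) - 1       # largest representable magnitude
    if value > limit:
        return limit, True         # overflow: replace with the saturation value
    return value, False            # no overflow: pass the value through unchanged


print(saturate_magnitude(0x1FF, 8), saturate_magnitude(0x40, 8))
```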
[0114] In addition, the second intermediate sign bit Sign is output as Psign via a straight-through line, meaning the sign remains unchanged.
[0115] In addition, the pre-output exponent Pshift is not shown in Figures 9a and 9b. If all data shifts have been completed, Pshift = 0.
[0116] In Figures 9a and 9b, in certain specific cases (e.g., both input and output are of type Fix and both have positive signs), the output data, such as the temporary output data bit ABS', the pre-output data bit Pdata, and the pre-output sign bit Psign, can directly become the second type of data without further processing.
[0117] Figures 9a and 9b show embodiments of the second pre-output calculation unit P2 of this disclosure; the outputs Pdata and Psign in Figures 9a and 9b can be sent onward for further processing.
[0118] Figure 10 shows a schematic diagram of the structure of a second data recovery unit R2 according to one embodiment of the present disclosure.
[0119] As shown in Figure 10, the second data recovery unit R2 may include a pre-output processing circuit R21, and preferably may also include a data assembly circuit R22. The pre-output processing circuit R21 is configured to receive the pre-output data bit Pdata and the pre-output sign bit Psign to generate an output data bit representation Data_out; the data assembly circuit R22 is configured to generate the second type of data based on the output data bit representation Data_out and the pre-output sign bit Psign.
[0120] Data assembly can be the inverse of the data splicing described above, restoring the spliced data to the desired second type of data. Those skilled in the art can determine whether the assembly circuit is needed based on the actual data type. For example, for unspliced data, the data assembly circuit R22 may not be necessary; therefore, the data assembly circuit R22 is preferred but not required.
[0121] For example, if the input is a 32-bit Float type number and the output is a 32-bit Fix type number, no splicing or splitting occurs during input, so in terms of length, the data assembly circuit R22 is not needed.
[0122] As shown in Figure 10, the pre-output processing circuit R21 in the second data recovery unit R2 receives the temporary output data bit ABS' and the pre-output sign bit Psign of Figures 9a and 9b in order to obtain the output data bit representation Data_out.
[0123] For data of a specific type, such as non-negative Fix type data, the output data bit representation is equal to the pre-output data bit Pdata, without the need for special transformation or processing.
[0124] Considering that there are other data types such as Float, the pre-output processing circuit R21 in this disclosure is further configured to generate the floating-point decimal point representation SHIFT_FP.
[0125] As further shown in Figure 10, the data assembly circuit R22 obtains the final second type of data based on the output data bit representation Data_out, the floating-point decimal point representation SHIFT_FP, and the pre-output sign bit Psign. It should be understood that in Figure 8, the floating-point decimal point representation SHIFT_FP is shown as a dashed line, indicating that SHIFT_FP may not exist in certain situations. In this case, the data assembly circuit R22 can be configured to obtain the second type of data based on the output data bit representation Data_out and the pre-output sign bit Psign.
[0126] Figure 11a shows a schematic block diagram of a pre-output processing circuit R21 according to one embodiment of the present disclosure.
[0127] As shown in Figure 11a, the pre-output processing circuit R21 of this disclosure includes a fourth selector and a second two's complement calculator.
[0128] In Figure 11a, the fourth selector receives the pre-output data bit Pdata and the pre-output sign bit Psign. It then determines whether Psign is negative or non-negative, i.e., whether Psign equals 1 or 0.
[0129] If Psign = 1, the second two's complement calculator is invoked. This calculator includes a second inverter and a second adder. The second inverter first inverts all bits except the sign bit, and then the second adder adds 1. The result from the second two's complement calculator is then used as the output data bit representation Data_out.
[0130] If Psign = 0, then Pdata is directly output as the output data bit representation Data_out.
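The two branches of the fourth selector can be sketched for a single lane as follows. This is a minimal model under stated assumptions: the function name `pre_output_to_data_out` and the 16-bit lane width are hypothetical, and the sketch inverts the whole lane and adds 1, which for a non-zero magnitude produces the same bit pattern as inverting the non-sign bits, adding 1 and setting the sign bit.

```python
def pre_output_to_data_out(pdata: int, psign: int, width: int = 16) -> int:
    """Model of the fourth selector and second two's complement calculator (one lane)."""
    mask = (1 << width) - 1
    if psign == 1:
        inverted = ~pdata & mask          # second inverter
        return (inverted + 1) & mask      # second adder
    return pdata & mask                   # Psign = 0: Pdata passes straight through


print(hex(pre_output_to_data_out(0x0008, 1)), hex(pre_output_to_data_out(0x0011, 0)))
```

With the lanes of Example 2 (Pdata = 0008 with sign 1, and 0011 with sign 0), the model returns fff8 and 0011, in line with DATA_out = 32'h fff8 0011 given there.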
[0131] Considering that there are multiple types of data, the pre-output data bit Pdata can be examined in advance to determine how to proceed with further processing.
[0132] Figure 11b shows a schematic block diagram of a pre-output processing circuit R21 according to another embodiment of the present disclosure.
[0133] As shown in Figure 11b, the pre-output processing circuit R21 further includes: a third selector, a second normalizer, and a floating-point decimal point position determiner.
[0134] The third selector receives the pre-output data bit Pdata, determines whether the data type of the pre-output data bit Pdata is Fix or Float. If the data type of the pre-output data bit Pdata is Fix, it sends the pre-output data bit Pdata to the fourth selector. If the data type of the pre-output data bit Pdata is Float, it sends the pre-output data bit Pdata to the second normalizer.
[0135] The second normalizer can normalize the pre-output data bit Pdata and output it as a data output bit representation Data_out.
[0136] In the definition of normalized numbers, normalization and denormalization are distinguished by a simple size comparison. If the absolute value is greater than the maximum representable absolute value (positive or negative saturation value), it cannot be represented, overflows, and is saturated; if the absolute value is less than the saturation value but greater than the normalization threshold, normalization is performed; if the absolute value is less than the normalization threshold but greater than the representable minimum value, denormalization is performed; if it is less than the representable minimum value, it underflows and is saturated (set to 0, the representable minimum value, or a special value). In the second transformation stage L2, normalization is simply removing the leading 1, and denormalization is shifting one bit to the right, which is the inverse of the normalization operation in the first transformation stage L1.
[0137] The floating-point decimal point position determiner can determine the number of decimal places represented by SHIFT_FP based on the output of the second normalizer.
[0138] It should be noted that the data in each stage mentioned above can maintain consistency in the number of bits across all stages. For example, if the first type of data is concatenated (e.g., two 16-bit data bits are concatenated into one 32-bit data bit), then the intermediate data bit Mdata can also be two concatenated data bits. Similarly, the second intermediate result (e.g., Sign, ABS, EXP, STK), the pre-output data (e.g., pre-output data bit Pdata, pre-output sign bit Psign), the output data bit representation Data_out, and the floating-point decimal point representation SHIFT_FP can all be two concatenated data bits. The concatenation method can be set according to the user's needs.
[0139] For the data assembly circuit R22, there may be several possibilities.
[0140] For example, for a 32-bit converter, if the input is a 16-bit Fix type number and the output is a 32-bit Fix type number, the 16-bit input can be converted to a 32-bit number simply by adding zeros to the high bits. The final output can then be a 32-bit number directly without any data assembly.
[0141] For example, for a 32-bit converter, if the input is a 32-bit Fix type number and the output is a 16-bit Fix type number, the input is converted normally in the first conversion stage, and the data obtained after conversion can be truncated by removing the high 16 bits to obtain the final 16-bit Fix type number.
[0142] It can be seen that the data assembly circuit R22 described above may not function in some cases, and is therefore not essential to this disclosure.
[0143] Furthermore, since the output data bit representation Data_out and the floating-point decimal point representation SHIFT_FP produced by the pre-output processing circuit R21 may each consist of multiple data items concatenated together, the data assembly circuit R22 can be used to transform or assemble the data into the final required form. For example, the concatenated data can be split, and the various parts of the data (such as the valid data part and the sign part) can be assembled.
[0144] For example, the data in Data_out might be {0000 0000 0000 0000 0101 0011 0001 1010}, with the sign bits being {0001}. If the desired output is Fix8, the data assembly circuit R22 can extract two final second-type data items from this data: {0101 0011} and {0001 1010}, with signs of 0 and 1 respectively. Thus, the data assembly circuit can extract the final data from Data_out.
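The extraction in this example can be written out as a short sketch. The function name `split_fix8` is hypothetical, and the sketch assumes that lane i of Data_out occupies bits [8i+7:8i] and that bit i of the sign field belongs to lane i.

```python
def split_fix8(data_out: int, sign_bits: int, lanes: int = 2):
    """Split Data_out into (value, sign) pairs, most significant lane first."""
    results = []
    for lane in reversed(range(lanes)):
        value = (data_out >> (8 * lane)) & 0xFF   # 8 data bits of this lane
        sign = (sign_bits >> lane) & 1            # matching bit of the sign field
        results.append((value, sign))
    return results


print(split_fix8(0x0000531A, 0b0001))
```

For the values above this yields the pairs (0x53, 0) and (0x1a, 1), that is, {0101 0011} with sign 0 and {0001 1010} with sign 1.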
[0145] The first conversion stage L1 of this disclosure may also receive constraint information, which can be used to indicate whether the converter supports a specific standard and / or whether it supports compiler optimizations. The specific standard may be any known or unknown standard suitable for this disclosure, such as IEEE 754; the compiler optimization may be, for example, support for compiler behaviors -o0, -o1, etc.
[0146] It should be understood that the above description is only for specific examples, and these examples are merely for illustrative purposes and do not constitute any limitation on the scope of protection of this disclosure. The data types of the first and second types of data in this disclosure, as well as the content of the constraint information, can be extended in any way, and any existing or newly developed data types can be implemented using the technical solutions of this disclosure.
[0147] In the above text, when the intermediate data passes through the second transformation stage L2, it may exist in multiple states, for example the adder output ABS' in Figure 9a, the selector output Pdata in Figure 9b, and the output Data_out of the pre-output processing circuit R21 in Figures 10, 11a and 11b. Each of these can be considered equivalent to second-type data (optionally, along with other auxiliary data). For example, ABS' can be equivalent to second-type data, and ABS'+Pdata can also be equivalent to second-type data; similarly, Pdata can be equivalent to second-type data, and Pdata+Psign can also be equivalent to second-type data, the only difference being the sign bit; as another example, Data_out can be equivalent to second-type data, and Data_out+SHIFT_FP can also be equivalent to second-type data. It is important to understand that although the data at these different stages are denoted by different symbols, they may be the same or equivalent data for some purposes. In other words, the "second-type data" referred to in this text could be any of the above data, only represented differently in different figures. For example, when the input number is a positive number of type Fix16 that is extended to a 32-bit number, and the output is Fix32, then Pdata, after passing through the fourth selector (as shown in Figure 11a), is assigned to Data_out for direct output. The data in Data_out itself conforms to the Fix32 format, so no further processing is required, and it can be directly output as second-type data.
[0148] The following will use specific examples to explain the various units, circuits, and components mentioned above.
[0149] Example 1
[0150] Example 1 provides an example of converting a Fix8 to a Float16.
[0151] Given the input numbers 81 and 82 (hexadecimal) with data type Fix8, concatenating the two Fix8 numbers gives DATA = 32'h 00008182 (0000 0000 0000 0000 1000 0001 1000 0010). In this notation, 32' indicates 32 bits, and h indicates hexadecimal. The first exponent is, for example, 2.
[0152] After concatenation, a 32-bit number is formed. The output of the first extraction unit E1 (the first data parsing unit) is as follows: the intermediate data bit Mdata is 32'h ff81ff82; the intermediate exponent bit Mshift is 2, identical to the original input; the intermediate sign bit Msign is 0011, where only two bits are valid (i.e., 11, the signs of 82 and 81 respectively) and the invalid bits are 0. Because both valid numbers are negative, both of their sign bits are 1.
[0153] It is important to understand that the above description is based on the concatenated data. If a single data item is taken as the object (e.g., 81) and described with its actual value (the data before concatenation), then the intermediate data bit Mdata is 81, the intermediate exponent bit Mshift is 2, and the intermediate sign bit Msign is 1.
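The extraction step for this input can be sketched in a few lines. The function name `extract_fix8_pair` is hypothetical; the sketch assumes two Fix8 values packed in the low 16 bits of DATA, each sign-extended into its own 16-bit lane of Mdata, with one Msign bit per lane.

```python
def extract_fix8_pair(data: int):
    """Return (Mdata, Msign) for two Fix8 values packed in the low 16 bits of data."""
    mdata = 0
    msign = 0
    for lane in range(2):
        byte = (data >> (8 * lane)) & 0xFF
        sign = byte >> 7                           # top bit of the Fix8 value
        extended = byte | (0xFF00 if sign else 0)  # sign-extend to 16 bits
        mdata |= extended << (16 * lane)
        msign |= sign << lane
    return mdata, msign


mdata, msign = extract_fix8_pair(0x00008182)
print(hex(mdata), format(msign, "04b"))
```

For DATA = 32'h 00008182 this gives Mdata = ff81ff82 and Msign = 0011, the values stated above.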
[0154] Here, let the second exponent, Sshift, be 3. The differential exponent Dshift is then the first exponent minus the second exponent, i.e., 2 - 3 = -1. In 9 bits, this can be represented as 1 1111 1111.
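The subtraction and its 9-bit two's complement encoding can be checked with a short sketch. The helper name `to_twos_complement` is hypothetical and is used only to print the bit pattern.

```python
def to_twos_complement(value: int, bits: int) -> str:
    """Return the bits-wide two's complement bit string of a signed integer."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")


mshift, sshift = 2, 3          # first exponent and second exponent of this example
dshift = mshift - sshift       # differential exponent Dshift
print(dshift, to_twos_complement(dshift, 9))
```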
[0155] As shown in Figures 6, 7a and 7b, after calculation, that is, after the second calculation unit C2, we obtain:
[0156] The second intermediate data bit ABS = 32'h 007f 007e. The input is of type Fix, so the two's complement is taken after passing through the selector.
[0157] The second intermediate exponent EXP = -1 (1 1111 1111), which is equal to the differential exponent Dshift.
[0158] The second intermediate sign bit SIGN = 0011 (directly equal).
[0159] The second intermediate rounding bit STK = 32'h 007f 007e (when rounding, STK = ABS).
[0160] Next, the second intermediate results ABS, EXP, SIGN, and STK are input into the second pre-output parsing unit P2 (as shown in Figures 6-9b):
[0161] Using shift operator P21, since EXP = -1, the data is shifted one bit to the right, resulting in the shifted value = 32'h 003f003f;
[0162] Using adder P22, the added value is, for example, STK[-EXP-1] = STK[0]. When there are two numbers, this corresponds to STK[16] = 1 and STK[0] = 0. The high 16 bits of the adder output [31:16] = 16'h 003f + STK[16] = 16'h 0040; the low 16 bits of the adder output [15:0] = 16'h 003f + STK[0] = 16'h 003f. Therefore, the adder output = 32'h 0040 003f.
[0163] Through selector P23, it is clear that the output of adder P22 is relatively small, there is no overflow, and no exception is reported. And Pdata = adder output = 32'h 0040 003f = 0000 0000 0100 0000 0000 0000 0011 1111.
[0164] Next, the data enters the pre-output processing circuit R21, as shown in Figure 10.
[0165] The output type is Float16, so Pdata is normalized, DATA_out = 32'h 0000 001f.
[0166] SHIFT_FP={6-15,5-15}={-9,-10}={10111,10110}
[0167] Next, the data is processed through the data assembly circuit R22, as shown in Figure 10.
[0168] Assemble SIGN, SHIFT_FP and DATA_out into two Float16 type data.
[0169] Second type of data = {1,10111,0000000000,1,10110,0000011111}
[0170] =32'h dc00d81f
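The assembly of the two 16-bit words can be reproduced with the sketch below. The function name `pack_fields` is hypothetical. The sketch simply concatenates the sign, the 5-bit SHIFT_FP field and the 10-bit data field exactly as listed in this example; it does not attempt to model every IEEE 754 encoding rule.

```python
def pack_fields(sign: int, shift_fp: int, data10: int) -> int:
    """Concatenate {1-bit sign, 5-bit SHIFT_FP, 10-bit data} into one 16-bit word."""
    return ((sign & 0x1) << 15) | ((shift_fp & 0x1F) << 10) | (data10 & 0x3FF)


high = pack_fields(1, -9, 0b0000000000)    # fields {1, 10111, 0000000000}
low = pack_fields(1, -10, 0b0000011111)    # fields {1, 10110, 0000011111}
word = (high << 16) | low
print(hex(word))
```

Masking -9 and -10 to five bits gives 10111 and 10110, and the packed result is dc00d81f, as stated in this example.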
[0171] Example 2
[0172] Example 2 provides an example of converting Float16 to Fix8 with a first shift value of {0,0}.
[0173] Given the input DATA = 32'h c001 4401(1100 0000 0000 0001 0100 0100 0000 0001), the rounding mode is round to positive infinity.
[0174] As shown in Figure 4, Mdata = 32'h 0401 0401 (0000 0100 0000 0001 0000 0100 0000 0001). (Only two numbers are valid, each with 11 significant bits, and the remaining high bits of each are filled by sign extension. Since fp itself is represented in sign-magnitude form, the extension bits are padded with 0s.)
[0175] Mshift = {16, 17} (10000 10001). The input is Float, so the intermediate exponent bits are directly equal to the exponent fields of the input.
[0176] Msign = 0010 (only two bits are valid and the invalid bits are set to 0; the valid numbers are one negative number and one positive number, so the valid bits are 10).
[0177] As shown in Figure 6, the second shift value is {3, 3}. After calculation, that is, after passing through the second calculation unit C2, we obtain:
[0178] ABS = 32'h 0401 0401. The input is Float, so the sign-magnitude value is output directly: ABS = Mdata.
[0179] EXP = {16-15-(3), 17-15-(3)} = {-2, -1} = {11110 11111} (the input is of type Float, so the bias 15 is subtracted first, and then the second shift value is subtracted).
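The EXP calculation for a Float16 input can be checked with the following sketch. The names `float_exp`, `encode5` and `FLOAT16_BIAS` are hypothetical and serve only to restate the arithmetic above.

```python
FLOAT16_BIAS = 15  # exponent bias of the 16-bit floating-point format


def float_exp(exp_field: int, sshift: int, bias: int = FLOAT16_BIAS) -> int:
    """EXP = exponent field - bias - second shift value."""
    return exp_field - bias - sshift


def encode5(value: int) -> str:
    """5-bit two's complement bit string of a signed integer."""
    return format(value & 0x1F, "05b")


exps = [float_exp(16, 3), float_exp(17, 3)]
print(exps, [encode5(e) for e in exps])
```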
[0180] SIGN = 0010 (directly equal)
[0181] STK = 32'h 0000ffff. When rounding toward positive infinity, with the data occupying ABS[31:16] and ABS[15:0] as in this example, STK[n] = (|ABS[n:x1]) && ~SIGN, where x2 >= n >= x1. For the high 16 bits of the 32-bit number, x2 = 31 and x1 = 16; for the low 16 bits, x2 = 15 and x1 = 0.
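The STK expression for rounding toward positive infinity can be evaluated lane by lane as in the sketch below. The function name `stk_round_up` is hypothetical; the running OR stands in for the reduction |ABS[n:x1], with the lane's own bit 0 playing the role of x1.

```python
def stk_round_up(abs_lane: int, sign: int, width: int = 16) -> int:
    """STK[n] = (OR of ABS[n:0]) AND NOT SIGN, for one lane of the given width."""
    stk = 0
    seen_one = 0
    for n in range(width):
        seen_one |= (abs_lane >> n) & 1   # OR-reduction of bits n down to 0
        if seen_one and not sign:
            stk |= 1 << n
    return stk


high = stk_round_up(0x0401, sign=1)   # negative lane: every STK bit is 0
low = stk_round_up(0x0401, sign=0)    # positive lane with bit 0 set: every STK bit is 1
print(hex((high << 16) | low))
```

With ABS = 32'h 0401 0401 and SIGN = 0010, the high lane gives 0000 and the low lane gives ffff, which matches STK = 32'h 0000ffff.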
[0182] Next, the intermediate results ABS, EXP, SIGN, and STK are input into the second pre-output parsing unit P2 (as shown in Figures 6-9b).
[0183] Using shift operator P21, since EXP = {-2, -1}, shifting right by 2 and 1 bits respectively, we obtain the shifted result = 32'h0008 0010
[0184] Using adder P22, the added value is, for example, STK[-EXP-1] = STK[2], STK[1]. When there are two numbers, this corresponds to STK[18] = 0, STK[1] = 1. The high 16 bits of the adder output [31:16] = 16'h 0008 + STK[18] = 16'h 0008; the low 16 bits of the adder output [15:0] = 16'h 0010 + STK[1] = 16'h 0011. Therefore, the adder output = 32'h 0008 0011.
[0185] Using selector P23, the adder output is clearly small, with no overflow and no exception reported. Furthermore, Pdata = adder output = 32'h 0008 0011 = 0000 0000 0000 1000 0000 0000 0001 0001.
[0186] Next, the data enters the pre-output processing circuit R21, as shown in Figure 10.
[0187] The output type is Fix, so Pdata is represented in two's complement form, DATA_out = 32'h fff8 0011.
[0188] Next, the data is processed through the data assembly circuit R22, as shown in Figure 10.
[0189] The obtained DATA_out is converted into two Fix8 type data, placed in the low bits, and the high 16 bits of invalid numbers are set to zero.
[0190] The second type of data is obtained as 32'h 0000f811.
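The final packing step can be sketched as follows. The function name `pack_fix8_pair` is hypothetical; the sketch assumes that the low 8 bits of each 16-bit lane of DATA_out are kept and that the two results are placed side by side in the low 16 bits of the output.

```python
def pack_fix8_pair(data_out: int) -> int:
    """Keep the low 8 bits of each 16-bit lane and pack them into the low 16 bits."""
    high_lane = (data_out >> 16) & 0xFF
    low_lane = data_out & 0xFF
    return (high_lane << 8) | low_lane    # the high 16 bits of the result stay zero


print(hex(pack_fix8_pair(0xFFF80011)))
```

For DATA_out = 32'h fff8 0011 the sketch returns f811, i.e., 32'h 0000f811.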
[0191] This disclosure also provides a method based on the above-mentioned equipment, as shown in Figure 2. For simplicity, other operations and steps of the disclosed method are not shown in the figures. The operation of the method of this disclosure may be based on the specific devices, units and circuits described in this disclosure, but may also be based on other software, hardware, firmware, etc., and is not limited to the specific structures described above.
[0192] According to another aspect of this disclosure, an electronic device is also provided, comprising: one or more processors; and a memory storing computer-executable instructions that, when executed by the one or more processors, cause the electronic device to perform the method as described above.
[0193] According to another aspect of this disclosure, a computer-readable storage medium is also provided, including computer-executable instructions that, when executed by one or more processors, perform the method described above.
[0194] In traditional practical computing, data type conversion involves few conversion types and few constraints. Most of these conversions can be completed in a few clock cycles using simple software behaviors and instructions. More importantly, data type conversion instructions occur very infrequently.
[0195] In AI chips, due to varying precision requirements, data type conversion is likely to occur at each step of the calculation. Once this happens, it's not just a small number of calculations, but rather very intensive, large-scale computations with highly structured data. Using traditional data type conversion methods would result in significant memory access latency due to these large-scale, intensive calculations. Because data type conversion instructions occur frequently, this bottleneck can impact the overall computational performance of the processor core.
[0196] Furthermore, simply stacking conversion instructions can lead to significant logic redundancy in the conversion module, resulting in excessively large local area and overly dense wiring, which degrades local processor performance. The following example illustrates this problem. In the data type conversion from Fix4 to fp16, the Fix4 input needs to be converted to absolute value form, and rounding bits are calculated based on this absolute value form. In the final stage of the data conversion, the numerical data is converted from fixed-point representation to a 10-bit floating-point mantissa representation in normal or denormal form. Finally, the sign bit, exponent, and mantissa are concatenated to form the output number. In fact, the Fix4 to fp16 conversion process also involves the exact same first half of logic: converting the Fix4 input to absolute value form and calculating rounding bits based on this absolute value form. Similarly, the Fix8 to fp16 conversion process involves the exact same second half of logic: the numerical data is converted from fixed-point representation to a 10-bit floating-point mantissa representation in normal or denormal form, and the sign bit, exponent, and mantissa are concatenated to form the output number. Simply expanding the instruction set will therefore produce a large amount of hardware with repetitive logic and calculations (if this part of the logic is instead computed in software under compiler control, the redundant calculation does not disappear but is repeated in the software implementation), which will affect processor performance.
[0197] The primary purpose of this intermediate result structure design is to reduce repetitive computational logic, decrease memory access latency compared to software implementation, and offer better scalability and portability. For example, as long as an intermediate result capable of representing any data type is obtained, it can be flexibly processed without necessarily employing the specific circuits and structures described in this disclosure. The content of this disclosure is also easily portable to other processing units, such as traditional CPUs and GPUs.
[0198] In the above embodiments disclosed herein, the descriptions of each embodiment have their own emphasis. For parts not described in detail in a particular embodiment, please refer to the relevant descriptions in other embodiments. The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments have been described. However, as long as these combinations of technical features do not contradict each other, they should be considered within the scope of this specification.
[0199] The foregoing can be better understood in accordance with the following clauses:
[0200] Clause A1. A converter for converting data types, comprising:
[0201] A first conversion stage (L1) is configured to receive first type data and first description information about the first type data, and convert the first type data into an intermediate result based on the first description information; and
[0202] The second conversion stage (L2) is configured to receive second description information about the second type of data and convert the intermediate result into the second type of data according to the second description information.
[0203] Clause A2. The converter according to Clause A1, wherein the first conversion stage (L1) includes a first receiving unit (Rx1) and a first extraction unit (E1), and the first description information includes a first data type of the first type of data and a first exponent of the first type of data.
[0204] The first receiving unit (Rx1) is configured to receive the first type of data and the first description information;
[0205] The first extraction unit (E1) is configured to extract the intermediate sign bit (Msign), intermediate data bit (Mdata), and intermediate exponent bit (Mshift) from the first type of data and the first description information as the intermediate result.
[0206] Clause A3. The converter according to Clause A1 or A2, wherein the first extraction unit (E1) includes: a sign bit calculation circuit (E11), a valid bit calculation circuit (E12), and an intermediate exponent bit calculation circuit (E13);
[0207] The sign bit calculation circuit (E11) is configured to extract the sign of the first type of data from the first type of data, as the intermediate sign bit (Msign);
[0208] The valid bit calculation circuit (E12) is configured to extract the valid data bits of the first type of data from the first type of data, as the intermediate data bits (Mdata); and
[0209] The intermediate exponent calculation circuit (E13) is configured to obtain the exponent information of the first type of data based on the first type of data or the first exponent, so as to serve as the intermediate exponent (Mshift).
[0210] Clause A4. The converter according to any one of Clauses A1-A3 further includes a memory configured to store the intermediate results.
[0211] Clause A5. The converter according to any one of Clauses A1-A4, wherein the second conversion stage (L2) includes a second computing unit (C2), a second pre-output parsing unit (P2), and a second data recovery unit (R2);
[0212] The second calculation unit (C2) is configured to receive the intermediate result and the second description information, and to calculate the second intermediate result based on the intermediate result and the second description information;
[0213] The second pre-output parsing unit (P2) is configured to calculate the pre-output data bits (Pdata) and the pre-output sign bits (Psign) based on the second intermediate result; and
[0214] The second data recovery unit (R2) is configured to generate a second type of data based on the pre-output data bits (Pdata) and the pre-output sign bits (Psign).
[0215] Clause A6. The converter according to any one of Clauses A1-A4, wherein the second description information includes a second data type of the second type of data and a second exponent bit (Sshift) of the second type of data, and the second computing unit (C2) includes:
[0216] The absolute value calculation circuit (C21) is configured to calculate the second intermediate data bit (ABS) based on the intermediate data bit (Mdata);
[0217] The sign bit calculation circuit (C22) is configured to calculate the second intermediate sign bit (Sign) based on the intermediate sign bit (Msign);
[0218] The differential exponent bit calculation circuit (C23) is configured to calculate the differential exponent bit (Dshift) between the intermediate exponent bit (Mshift) and the second exponent bit (Sshift) as the second intermediate exponent bit (EXP).
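The role of the differential exponent can be seen from a short derivation. It assumes, as an illustration only, that both representations scale their data bits by a power of two and that the difference is taken as Mshift minus Sshift; the clauses do not fix the sign convention.

```latex
\begin{aligned}
x &= (-1)^{\mathrm{Sign}} \cdot \mathrm{ABS} \cdot 2^{\mathrm{Mshift}}
   \approx (-1)^{\mathrm{Psign}} \cdot \mathrm{Pdata} \cdot 2^{\mathrm{Sshift}} \\
\Longrightarrow\quad
\mathrm{Pdata} &\approx \mathrm{ABS} \cdot 2^{\mathrm{Dshift}},
\qquad \mathrm{Dshift} = \mathrm{Mshift} - \mathrm{Sshift}
\end{aligned}
```

Under this assumption, Mshift = -8 and Sshift = -4 give Dshift = -4, so the data bits are shifted right by four positions and the discarded bits determine the rounding.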
[0219] Clause A7. The converter according to any one of clauses A1-A6, wherein the second computing unit (C2) further comprises:
[0220] The rounding bit calculation circuit (C24) is configured to calculate the second intermediate rounding bit (STK) based on the second intermediate data bit (ABS) and the second intermediate sign bit (Sign).
[0221] Clause A8. The converter according to any one of clauses A1-A7, wherein the second computing unit (C2) further comprises:
[0222] The rounding bit calculation circuit (C24) is configured to calculate the second intermediate rounding bit (STK) based on the second intermediate data bit (ABS), the second intermediate exponent bit (EXP), and the second intermediate sign bit (Sign).
[0223] Clause A9. The converter according to any one of Clauses A1-A8, wherein the absolute value calculation circuit (C21) comprises:
[0224] The second selector is configured to determine whether the intermediate data bit (Mdata) is less than zero;
[0225] The first complement calculator is configured to calculate the complement of the intermediate data bit (Mdata) if it is less than zero, and to use the result as the second intermediate data bit (ABS); otherwise,
[0226] the intermediate data bit (Mdata) is used as the second intermediate data bit (ABS).
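The selector and complement calculator of Clause A9 can be sketched in software as follows. The function name `abs_twos_complement` and the explicit `width` parameter are assumptions made for illustration; the sketch treats Mdata as a fixed-width two's-complement bit pattern.

```python
def abs_twos_complement(mdata: int, width: int) -> int:
    """Return the magnitude (ABS) of a `width`-bit two's-complement pattern."""
    mask = (1 << width) - 1
    mdata &= mask
    if mdata >> (width - 1):        # selector: sign bit set means "less than zero"
        return (~mdata + 1) & mask  # complement calculator: invert and add one
    return mdata                    # non-negative: pass through unchanged
```

Note that the most negative pattern (for example `0x80` at 8 bits) maps to itself, which is the correct magnitude 128 when ABS is read as unsigned.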
[0227] Clause A10. The converter according to any one of clauses A1-A9, wherein the absolute value calculation circuit (C21) further includes a first selector and a first normalizer.
[0228] The first selector is configured to determine whether the data type of the intermediate data bit (Mdata) is type 1 or type 2;
[0229] If the data type of the intermediate data bit (Mdata) is type 1, then the second selector is selected for processing;
[0230] If the data type of the intermediate data bit (Mdata) is type 2, then the first normalizer is selected for processing;
[0231] The first normalizer is configured to normalize the intermediate data bit (Mdata) as a second intermediate data bit (ABS) when the data type of the intermediate data bit (Mdata) is type 2.
[0232] Clause A11. The converter according to any one of Clauses A1-A10, wherein,
[0233] The sign bit calculation circuit (C22) is a direct connection.
[0234] Clause A12. The converter according to any one of clauses A1-A11, wherein the second pre-output parsing unit (P2) comprises:
[0235] The rounding bit calculation circuit (C24) is configured to calculate the second intermediate rounding bit (STK) based on the second intermediate data bit (ABS), the second intermediate exponent bit (EXP), and the second intermediate sign bit (Sign).
[0236] Clause A13. The converter according to any one of Clauses A1-A12, wherein the rounding bit calculation circuit (C24) is implemented by AND-OR logic.
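One way such AND-OR logic might look is sketched below. The rounding modes are assumptions, since the clauses do not name one: `nearest_even` uses only ABS and EXP, while `floor` (rounding toward negative infinity) shows why the sign bit can also be an input. The name `rounding_bit` and the convention that a negative EXP means a right shift are hypothetical.

```python
def rounding_bit(abs_bits: int, exp: int, sign: int, mode: str = "nearest_even") -> int:
    """Return the increment (STK) to add to the magnitude after shifting."""
    if exp >= 0:
        return 0                    # a left shift discards no bits
    k = -exp                        # number of bits shifted out
    guard = (abs_bits >> (k - 1)) & 1                       # highest discarded bit
    sticky = int((abs_bits & ((1 << (k - 1)) - 1)) != 0)    # OR of the remaining discarded bits
    lsb = (abs_bits >> k) & 1                               # lowest retained bit
    if mode == "nearest_even":
        return guard & (sticky | lsb)
    if mode == "floor":
        # Rounding toward negative infinity enlarges the magnitude of negatives only.
        return sign & (guard | sticky)
    raise ValueError(f"unsupported mode: {mode}")
```

For instance, 11 shifted right by 2 (2.75) rounds up, while 10 shifted right by 2 (2.5) ties to the even value 2 and needs no increment.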
[0237] Clause A14. The converter according to any one of Clauses A1-A13, wherein the second pre-output parsing unit (P2) is configured to calculate the pre-output data bit (Pdata) and the pre-output sign bit (Psign) based on the second intermediate data bit (ABS), the second intermediate sign bit (Sign), the second intermediate exponent bit (EXP), and the second intermediate rounding bit (STK).
[0238] Clause A15. The converter according to any one of Clauses A1-A14, wherein the second pre-output parsing unit (P2) comprises: a shift operator (P21) and an adder (P22), configured to generate a temporary data bit (ABS') and a pre-output sign bit (Psign), wherein
[0239] The shift operator (P21) is configured to shift the second intermediate data bit (ABS) by the second intermediate exponent bit (EXP) to obtain the shift result;
[0240] The adder (P22) is configured to generate a temporary data bit (ABS') based on the shift result and the second intermediate rounding bit (STK);
[0241] The pre-output sign bit (Psign) is equivalent to the second intermediate sign bit (Sign).
[0242] Clause A16. The converter according to any one of Clauses A1-A15, wherein the second pre-output parsing unit (P2) further includes a selector (P23) configured to detect whether the temporary data bit (ABS') is greater than a saturation value;
[0243] If it is greater, the temporary data bit (ABS') is saturated to obtain the pre-output data bit (Pdata);
[0244] If it is not greater, the temporary data bit (ABS') is output as the pre-output data bit (Pdata).
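The shift operator (P21), adder (P22), and selector (P23) of Clauses A15 and A16 can be sketched together. The function name `pre_output_parse`, the convention that a negative EXP means a right shift, and the choice of the largest positive signed value as the saturation value are assumptions for illustration only.

```python
def pre_output_parse(abs_bits: int, sign: int, exp: int, stk: int, out_width: int):
    """Return (Pdata, Psign) from ABS, Sign, EXP and the rounding bit STK."""
    # P21: shift the magnitude by the differential exponent.
    shifted = abs_bits << exp if exp >= 0 else abs_bits >> -exp
    # P22: add the rounding bit to obtain the temporary data bits ABS'.
    temp = shifted + stk
    # P23: clamp to the assumed saturation value of a signed out_width-bit type.
    saturation = (1 << (out_width - 1)) - 1
    pdata = saturation if temp > saturation else temp
    # The pre-output sign bit is the second intermediate sign bit, unchanged.
    return pdata, sign
```

With an 8-bit output, a magnitude of 200 shifted left by one bit would be 400 and is therefore clamped to 127.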
[0245] Clause A17. The converter according to any one of Clauses A1-A16, wherein the second data recovery unit (R2) includes a pre-output processing circuit (R21) and a data assembly circuit (R22):
[0246] The pre-output processing circuit (R21) is configured to receive the pre-output data bit (Pdata) and the pre-output sign bit (Psign) to generate an output data bit representation (Data_out);
[0247] The data assembly circuit (R22) is configured to generate a second type of data based on the output data bit representation (Data_out) and the pre-output sign bit (Psign).
[0248] Clause A18. The converter according to any one of Clauses A1-A17, wherein the pre-output processing circuit (R21) is further configured to generate a floating-point fractional bit representation (SHIFT_FP), and the data assembly circuit (R22) is configured to generate the second type of data based on the output data bit representation (Data_out), the floating-point fractional bit representation (SHIFT_FP), and the pre-output sign bit (Psign).
[0249] Clause A19. The converter according to any one of Clauses A1-A18, wherein the pre-output processing circuit (R21) comprises: a fourth selector and a second complement calculator.
[0250] The fourth selector is configured to receive the pre-output data bit (Pdata) and the pre-output sign bit (Psign). If the pre-output sign bit (Psign) indicates a negative value, the pre-output data bit (Pdata) is output to the second complement calculator. If the pre-output sign bit (Psign) indicates a non-negative value, the pre-output data bit (Pdata) is output as the output data bit representation (Data_out).
[0251] The second complement calculator is configured to calculate the complement of the pre-output data bit (Pdata).
[0252] Clause A20. The converter according to any one of Clauses A1-A19, wherein the pre-output processing circuit (R21) further comprises: a third selector, a second normalizer, and a floating-point decimal point position determiner, wherein
[0253] The third selector is configured to receive the pre-output data bit (Pdata), determine whether the data type of the pre-output data bit (Pdata) is type 1 or type 2, and if the data type of the pre-output data bit (Pdata) is type 1, then send the pre-output data bit (Pdata) to the fourth selector; if the data type of the pre-output data bit (Pdata) is type 2, then send the pre-output data bit (Pdata) to the second normalizer.
[0254] The second normalizer is configured to normalize the pre-output data bits (Pdata) and output them as the output data bit representation (Data_out);
[0255] The floating-point decimal point position determiner is configured to determine the number of floating-point decimal places (SHIFT_FP) based on the output of the second normalizer.
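The two paths through the pre-output processing circuit (R21) can be sketched as below. The sketch assumes that "type 1" denotes a fixed-point target and "type 2" a floating-point target, which the clauses do not state explicitly; the function name `pre_output_process` and the string tags are hypothetical, and normalization is taken to mean placing the leading one at the top bit of a `width`-bit significand.

```python
def pre_output_process(pdata: int, psign: int, kind: str, width: int):
    """Return (Data_out, SHIFT_FP); SHIFT_FP is None on the fixed-point path."""
    if kind == "fixed":                     # assumed meaning of "type 1"
        mask = (1 << width) - 1
        if psign:                           # negative: second complement calculator
            return (~pdata + 1) & mask, None
        return pdata & mask, None           # non-negative: pass through
    if kind == "float":                     # assumed meaning of "type 2"
        if pdata == 0:
            return 0, 0
        # Second normalizer: move the leading one to bit position width - 1.
        shift_fp = pdata.bit_length() - width
        if shift_fp >= 0:
            data_out = pdata >> shift_fp    # low bits are truncated in this sketch
        else:
            data_out = pdata << -shift_fp
        return data_out, shift_fp
    raise ValueError(f"unknown kind: {kind}")
```

On the floating-point path the returned shift satisfies Pdata ≈ Data_out × 2^SHIFT_FP, which is the information a data assembly step would need to build the exponent field.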
[0256] Clause A21. The converter according to any one of Clauses A1-A20, wherein the first conversion stage (L1) is further configured to determine the number of received first type data to be concatenated and to concatenate that number of first type data to form first concatenated data, and the first conversion stage (L1) converts the first concatenated data into an intermediate result according to the first description information.
[0257] Clause A22. The converter according to any one of Clauses A1-A21, wherein the number of received intermediate results is determined by:
[0258] Dividing the number of bits processed by the converter by the number of bits of the first type of data; or
[0259] A preset first fixed value.
[0260] Clause A23. The converter according to any one of Clauses A1-A22, wherein the first conversion stage (L1) is further configured to determine the number of pieces into which the received first type data is to be split, and to split the first type data into that number of split data, the first conversion stage (L1) converting the split data into an intermediate result according to the first description information.
[0261] Clause A24. The converter according to any one of Clauses A1-A23, wherein the number of intermediate results to be split is determined by:
[0262] Dividing the number of bits of the first type of data by the number of bits processed by the converter; or
[0263] A preset second fixed value.
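The concatenation and splitting counts of Clauses A21-A24 amount to simple divisions, illustrated below. The helper names, the little-endian lane order, and the rounding up of the split count when the widths do not divide evenly are assumptions for illustration.

```python
def concat_count(converter_bits: int, data_bits: int) -> int:
    """How many narrow items fit in one converter pass (Clause A22)."""
    return converter_bits // data_bits


def split_count(converter_bits: int, data_bits: int) -> int:
    """How many passes a wide item needs (Clause A24), rounded up."""
    return -(-data_bits // converter_bits)


def concatenate(values: list[int], data_bits: int) -> int:
    """Pack items into one word, first item in the lowest lane (assumed order)."""
    mask = (1 << data_bits) - 1
    word = 0
    for lane, value in enumerate(values):
        word |= (value & mask) << (lane * data_bits)
    return word


def split(value: int, data_bits: int, converter_bits: int) -> list[int]:
    """Cut a wide item into converter-sized pieces, lowest piece first."""
    mask = (1 << converter_bits) - 1
    pieces = split_count(converter_bits, data_bits)
    return [(value >> (i * converter_bits)) & mask for i in range(pieces)]
```

For a 64-bit converter, four 16-bit items are processed per pass, and one 128-bit item is processed in two passes.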
[0264] Clause A25. The converter according to any one of Clauses A1-A24, wherein the first conversion stage (L1) and / or the second conversion stage (L2) are further configured to receive constraint information indicating whether a specific standard is supported and / or whether compilation optimization is supported.
[0265] Clause A26. The converter according to any one of Clauses A1-A25, wherein the data types of the first type of data and the second type of data are extensible.
[0266] Clause A27. A chip comprising a converter as described in any one of Clauses A1-A26.
[0267] Clause A28. A computing device comprising a converter as described in any one of Clauses A1-A26 or a chip as described in Clause A27.
[0268] Clause A29. A method for converting data types, comprising:
[0269] Receive first type data and first description information about the first type data, and convert the first type data into an intermediate result based on the first description information; and
[0270] Receive second description information about the second type of data, and convert the intermediate result into the second type of data according to the second description information.
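The two-step method can be sketched end to end for the simplest case, fixed-point to fixed-point. This is an illustration under stated assumptions, not the disclosed implementation: each type is described by a hypothetical `(width, exponent)` pair, rounding is plain truncation, and saturation clamps to the largest positive signed value. Because every source type produces the same intermediate form, supporting a new type needs one new function per stage rather than one converter per source and target pair.

```python
from typing import NamedTuple


class Intermediate(NamedTuple):
    msign: int   # 1 if negative, else 0
    mdata: int   # signed data bits
    mshift: int  # power-of-two exponent


def to_intermediate(raw: int, width: int, exponent: int) -> Intermediate:
    """Step 1: first type data plus description -> intermediate result."""
    msign = (raw >> (width - 1)) & 1
    signed = raw - (1 << width) if msign else raw
    return Intermediate(msign, signed, exponent)


def from_intermediate(mid: Intermediate, width: int, exponent: int) -> int:
    """Step 2: intermediate result plus description -> second type data."""
    dshift = mid.mshift - exponent                    # differential exponent
    magnitude = abs(mid.mdata)
    if dshift >= 0:
        magnitude <<= dshift
    else:
        magnitude >>= -dshift                         # truncation (assumed rounding)
    magnitude = min(magnitude, (1 << (width - 1)) - 1)  # assumed saturation value
    signed = -magnitude if mid.msign else magnitude
    return signed & ((1 << width) - 1)                # two's-complement bit pattern


def convert(raw: int, src: tuple[int, int], dst: tuple[int, int]) -> int:
    return from_intermediate(to_intermediate(raw, *src), *dst)
```

For example, the 8-bit pattern `0xFD` with exponent -4 (the value -3/16) converts to the 16-bit pattern `0xFFD0` with exponent -8 (the value -48/256), which is the same number.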
[0271] Clause A30. The method according to Clause A29, wherein the first description information includes a first data type of the first type of data and a first exponent of the first type of data, and receiving the first type of data and the first description information about the first type of data, and converting the first type of data into an intermediate result based on the first description information, includes:
[0272] Receive the first type of data and the first description information;
[0273] The intermediate sign bit (Msign), intermediate data bit (Mdata), and intermediate exponent bit (Mshift) are extracted from the first type of data and the first description information to serve as the intermediate result.
[0274] Clause A31. The method according to Clause A29 or A30, wherein extracting the intermediate sign bit (Msign), intermediate data bit (Mdata), and intermediate exponent bit (Mshift) from the first type of data and the first description information as the intermediate result includes:
[0275] Extract the sign of the first type of data from the first type of data to serve as the intermediate sign bit (Msign);
[0276] Extract the valid data bits of the first type of data from the first type of data to serve as the intermediate data bits (Mdata); and
[0277] The exponent information of the first type of data is obtained based on the first type of data or the first exponent, and is used as the intermediate exponent bit (Mshift).
[0278] Clause A32. The method according to any one of Clauses A29-A31, further comprising storing the intermediate result in a memory.
[0279] Clause A33. The method according to Clause A31, wherein receiving second description information about the second type of data and converting the intermediate result into the second type of data based on the second description information comprises:
[0280] Receive the intermediate result and the second description information, and calculate the second intermediate result based on the intermediate result and the second description information;
[0281] Calculate the pre-output data bits (Pdata) and pre-output sign bits (Psign) based on the second intermediate result; and
[0282] The second type of data is generated based on the pre-output data bits (Pdata) and the pre-output sign bits (Psign).
[0283] Clause A34. The method according to Clause A33, wherein the second description information includes a second data type of the second type of data and a second exponent bit (Sshift) of the second type of data, and calculating the second intermediate result based on the intermediate result and the second description information includes:
[0284] The second intermediate data bit (ABS) is calculated based on the intermediate data bit (Mdata);
[0285] The second intermediate sign bit (Sign) is calculated based on the intermediate sign bit (Msign);
[0286] The differential exponent bit (Dshift) between the intermediate exponent bit (Mshift) and the second exponent bit (Sshift) is calculated as the second intermediate exponent bit (EXP).
[0287] Clause A35. The method according to Clause A34, wherein calculating the second intermediate result based on the intermediate result and the second description information further comprises:
[0288] The second intermediate rounding bit (STK) is calculated based on the second intermediate data bit (ABS) and the second intermediate sign bit (Sign).
[0289] Clause A36. The method according to Clause A34, wherein calculating the second intermediate result based on the intermediate result and the second description information further comprises:
[0290] The second intermediate rounding bit (STK) is calculated based on the second intermediate data bit (ABS), the second intermediate exponent bit (EXP), and the second intermediate sign bit (Sign).
[0291] Clause A37. The method according to any one of Clauses A34-A36, wherein calculating the second intermediate data bit (ABS) based on the intermediate data bit (Mdata) comprises:
[0292] Determine whether the intermediate data bit (Mdata) is less than zero;
[0293] If the intermediate data bit (Mdata) is less than zero, then the two's complement of the intermediate data bit (Mdata) is calculated and used as the second intermediate data bit (ABS); otherwise,
[0294] the intermediate data bit (Mdata) is used as the second intermediate data bit (ABS).
[0295] Clause A38. The method according to Clause A37, wherein calculating the second intermediate data bit (ABS) based on the intermediate data bit (Mdata) further comprises:
[0296] Determine whether the data type of the intermediate data bit (Mdata) is type 1 or type 2;
[0297] If the data type of the intermediate data bit (Mdata) is type 1, then
[0298] Determine whether the intermediate data bit (Mdata) is less than zero;
[0299]-[0300] If the intermediate data bit (Mdata) is less than zero, then the complement of the intermediate data bit (Mdata) is calculated and used as the second intermediate data bit (ABS); otherwise,
[0301] the intermediate data bit (Mdata) is used as the second intermediate data bit (ABS).
[0302] If the data type of the intermediate data bit (Mdata) is type 2, then
[0303] The intermediate data bits (Mdata) are normalized to serve as the second intermediate data bits (ABS).
[0304] Clause A39. The method according to Clause A38, wherein calculating the pre-output data bit (Pdata) and the pre-output sign bit (Psign) based on the second intermediate result comprises:
[0305] The second intermediate rounding bit (STK) is calculated based on the second intermediate data bit (ABS), the second intermediate exponent bit (EXP), and the second intermediate sign bit (Sign).
[0306] Clause A40. The method according to Clause A35, A36, or A39, wherein the calculation of the second intermediate rounding bit (STK) is implemented by AND-OR logic.
[0307] Clause A41. The method according to Clause A35, A36, or A39, wherein calculating the pre-output data bit (Pdata) and the pre-output sign bit (Psign) based on the second intermediate result comprises:
[0308] The pre-output data bit (Pdata) and the pre-output sign bit (Psign) are calculated based on the second intermediate data bit (ABS), the second intermediate sign bit (Sign), the second intermediate exponent bit (EXP), and the second intermediate rounding bit (STK).
[0309] Clause A42. The method according to Clause A41, wherein calculating the pre-output data bit (Pdata) and the pre-output sign bit (Psign) based on the second intermediate result comprises:
[0310] The second intermediate data bit (ABS) is shifted by the second intermediate exponent bit (EXP) to obtain the shift result;
[0311] A temporary data bit (ABS') is generated based on the shift result and the second intermediate rounding bit (STK);
[0312] The pre-output sign bit (Psign) is equivalent to the second intermediate sign bit (Sign).
[0313] Clause A43. The method according to Clause A42, wherein calculating the pre-output data bit (Pdata) and the pre-output sign bit (Psign) based on the second intermediate result further comprises:
[0314] Check whether the temporary data bit (ABS') is greater than the saturation value;
[0315] If it is greater, the temporary data bit (ABS') is saturated to obtain the pre-output data bit (Pdata);
[0316] If it is not greater, the temporary data bit (ABS') is output as the pre-output data bit (Pdata).
[0317] Clause A44. The method according to any one of Clauses A41-A43, wherein generating the second type of data based on the pre-output data bit (Pdata) and the pre-output sign bit (Psign) comprises:
[0318] Receive the pre-output data bit (Pdata) and the pre-output sign bit (Psign) to generate the output data bit representation (Data_out);
[0319] The second type of data is generated based on the output data bit representation (Data_out) and the pre-output sign bit (Psign).
[0320] Clause A45. The method according to Clause A44, wherein receiving the pre-output data bit (Pdata) and the pre-output sign bit (Psign) to generate an output data bit representation (Data_out) further comprises generating a floating-point decimal point representation (SHIFT_FP); and generating the second type of data based on the output data bit representation (Data_out) and the pre-output sign bit (Psign) comprises generating the second type of data based on the output data bit representation (Data_out), the floating-point decimal point representation (SHIFT_FP), and the pre-output sign bit (Psign).
[0321] Clause A46. The method according to Clause A44 or A45, wherein receiving the pre-output data bits (Pdata) and the pre-output sign bits (Psign) to generate the output data bit representation (Data_out) comprises:
[0322] Receive the pre-output data bit (Pdata) and the pre-output sign bit (Psign). If the pre-output sign bit (Psign) indicates a negative value, then take the two's complement of the pre-output data bit (Pdata). If the pre-output sign bit (Psign) indicates a non-negative value, then output the pre-output data bit (Pdata) as the output data bit representation (Data_out).
[0323] Clause A47. The method according to Clause A46, wherein receiving the pre-output data bits (Pdata) and the pre-output sign bits (Psign) to generate the output data bit representation (Data_out) further comprises:
[0324] Receive the pre-output data bit (Pdata) and determine whether the data type of the pre-output data bit (Pdata) is type 1 or type 2; if the data type of the pre-output data bit (Pdata) is type 1, then
[0325]-[0327] If the pre-output sign bit (Psign) indicates a negative value, calculate the complement of the pre-output data bit (Pdata); if the pre-output sign bit (Psign) indicates a non-negative value, output the pre-output data bit (Pdata) as the output data bit representation (Data_out);
[0328] If the data type of the pre-output data bit (Pdata) is type 2, then
[0329]-[0330] The pre-output data bits (Pdata) are normalized and output as the output data bit representation (Data_out);
[0331] The number of floating-point decimal places (SHIFT_FP) is determined based on the output of the normalization.
[0332] Clause A48. The method according to any one of Clauses A29-A47, wherein receiving first type data and first description information about the first type data, and converting the first type data into an intermediate result based on the first description information, further comprises:
[0333] The number of received first type data to be concatenated is determined, and that number of first type data is concatenated to form first concatenated data. Based on the first description information, the first concatenated data is converted into an intermediate result.
[0334] Clause A49. The method according to Clause A48, wherein the number of received intermediate results is determined by:
[0335] Dividing the number of bits processed by the converter used in this invention by the number of bits of the first type of data; or
[0336] A preset first fixed value.
[0337] Clause A50. The method according to any one of Clauses A29-A47, wherein receiving first type data and first description information about the first type data, and converting the first type data into an intermediate result based on the first description information, further comprises:
[0338] Determine the number of pieces into which the first type of data is to be split, and split the first type of data into that number of split data. Based on the first description information, convert the split data into intermediate results.
[0339] Clause A51. The method according to Clause A50, wherein the number of intermediate results to be split is determined by:
[0340] Dividing the number of bits of the first type of data by the number of bits processed by the converter used in this invention; or
[0341] A preset second fixed value.
[0342] Clause A52. The method according to any one of Clauses A29-A51, further comprising receiving constraint information indicating whether a specific standard is supported and / or whether compilation optimization is supported.
[0343] Clause A53. The method according to any one of Clauses A29-A52, wherein the data types of the first type of data and the second type of data are extensible.
[0344] Clause A54. An electronic device comprising: one or more processors; and a memory storing computer-executable instructions that, when executed by the one or more processors, cause the electronic device to perform the method as described in any one of Clauses A29-A53.
[0345] Clause A55. A computer-readable storage medium comprising computer-executable instructions that, when executed by one or more processors, perform the method described in any one of Clauses A29-A53. This disclosure also discloses a combined processing apparatus 1200 comprising the aforementioned computing device 1202, a general interconnect interface 1204, and other processing devices 1206. The computing device according to this disclosure interacts with the other processing devices to jointly perform user-specified operations. Figure 12 is a schematic diagram of the combined processing apparatus.
[0346] Other processing devices include one or more processor types such as central processing unit (CPU), graphics processing unit (GPU), and neural network processor. There is no limit to the number of processors included in other processing devices. These other processing devices serve as interfaces between the machine learning computing device and external data and control, including data transfer and basic control such as starting and stopping the machine learning computing device. Other processing devices can also collaborate with the machine learning computing device to complete computational tasks.
[0347] A general interconnect interface is used to transfer data and control commands between a computing device (including, for example, a machine learning computing device) and other processing devices. The computing device can obtain required input data from other processing devices and write it to on-chip storage; it can obtain control commands from other processing devices and write them to on-chip control caches; and it can read data from the computing device's storage modules and transmit it to other processing devices.
[0348] Optionally, the structure may further include a storage device 1208, which is connected to both the computing device and the other processing devices. The storage device is used to store data of the computing device and the other processing devices, and is particularly suitable for data required for computation that cannot be fully held in the internal storage of the computing device or the other processing devices.
[0349] This combined processing device can serve as a System-on-a-Chip (SoC) for devices such as mobile phones, robots, drones, and video surveillance equipment, effectively reducing the core area of the control unit, increasing processing speed, and lowering overall power consumption. In this case, the general interconnect interface of the combined processing device connects to certain components of the device, such as cameras, monitors, mice, keyboards, network cards, and Wi-Fi interfaces.
[0350] In some embodiments, this disclosure also discloses a chip that includes the aforementioned computing device or combined processing device.
[0351] In some embodiments, this disclosure also discloses a chip package structure that includes the aforementioned chip.
[0352] In some embodiments, this disclosure also discloses a circuit board that includes the above-described chip package structure. Referring to Figure 13, an exemplary board is provided which, in addition to the chip 1302, may also include other supporting components, including but not limited to: a storage device 1304, an interface device 1306, and a controller 1308.
[0353] The storage device is connected to the chip within the chip package structure via a bus and is used for storing data. The storage device may include multiple sets of storage cells 1310. Each set of storage cells is connected to the chip via a bus. It is understood that each set of storage cells may be DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory).
[0354] DDR can double the speed of SDRAM without increasing the clock frequency, because it allows data to be transferred on both the rising and falling edges of the clock pulse, that is, twice per clock cycle. In one embodiment, the storage device may include four groups of storage cells. Each group of storage cells may include multiple DDR4 chips. In one embodiment, the chip may internally include four 72-bit DDR4 controllers, of which 64 bits are used for data transmission and 8 bits are used for ECC verification. In one embodiment, each group of storage cells includes multiple DDR SDRAM units connected in parallel. A controller for controlling DDR is provided in the chip for controlling data transmission and data storage in each storage cell.
[0355] The interface device is electrically connected to the chip within the chip package structure. The interface device is used to realize data transmission between the chip and an external device 1312 (e.g., a server or computer). For example, in one embodiment, the interface device can be a standard PCIe interface, through which data to be processed is transferred from the server to the chip. In another embodiment, the interface device can also be another interface; this disclosure does not limit the specific form of such other interfaces, as long as the interface unit can realize the transfer function. Furthermore, the calculation results of the chip are transmitted back to the external device (e.g., the server) by the interface device.
[0356] The controller is electrically connected to the chip. The controller monitors the state of the chip. Specifically, the chip and the controller can be electrically connected via an SPI interface. The controller may include a microcontroller (MCU). The chip may include multiple processing chips, multiple processing cores, or multiple processing circuits, capable of driving multiple loads. Therefore, the chip can operate in different states, such as high load and low load. The controller can regulate the operating states of multiple processing chips, multiple processing cores, and / or multiple processing circuits within the chip.
[0357] In some embodiments, this disclosure also discloses an electronic device or apparatus that includes the aforementioned board.
[0358] Electronic devices or apparatuses include data processing devices, robots, computers, printers, scanners, tablet computers, smart terminals, mobile phones, dashcams, navigators, sensors, cameras, servers, cloud servers, camcorders, projectors, watches, headphones, mobile storage, wearable devices, means of transportation, household appliances, and / or medical equipment.
[0359] The means of transportation include airplanes, ships and / or vehicles; the household appliances include televisions, air conditioners, microwave ovens, refrigerators, rice cookers, humidifiers, washing machines, lights, gas stoves, and range hoods; the medical equipment includes MRI scanners, ultrasound scanners and / or electrocardiographs.
[0360] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that this disclosure is not limited to the described order of actions, as some steps may be performed in other orders or simultaneously according to this disclosure. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are optional embodiments, and the actions and modules involved are not necessarily essential to this disclosure.
[0361] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions in other embodiments.
[0362] In the embodiments provided in this disclosure, it should be understood that the disclosed apparatus can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the displayed or discussed mutual couplings, direct couplings, or communication connections may be through some interfaces; indirect couplings or communication connections between devices or units may be electrical, optical, acoustic, magnetic, or other forms.
[0363] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0364] Furthermore, the functional units in the various embodiments disclosed herein can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software program module.
[0365] If the integrated unit is implemented as a software program module and sold or used as an independent product, it can be stored in a computer-readable storage device. Based on this understanding, the technical solution disclosed herein can be embodied in the form of a software product; the computer software product is stored in a storage device and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this disclosure. The aforementioned storage device includes various media capable of storing program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), portable hard drive, magnetic disk, or optical disk.
[0366] The embodiments of this disclosure have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this disclosure. The descriptions of the above embodiments are only for the purpose of helping to understand the methods and core ideas of this disclosure. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this disclosure. Therefore, the content of this specification should not be construed as a limitation of this disclosure.
Claims
1. A converter for converting data types, comprising: a first conversion stage (L1) configured to receive first type data and first description information about the first type data, and to convert the first type data into an intermediate result based on the first description information; and a second conversion stage (L2) configured to receive second description information about second type data and to convert the intermediate result into the second type data according to the second description information; wherein the intermediate result includes intermediate data bits (Mdata); the first conversion stage (L1) includes a first extraction unit (E1) configured to extract the intermediate data bits (Mdata) from the first type data and the first description information; the second conversion stage (L2) includes a second calculation unit (C2) configured to receive the intermediate result and the second description information, and to calculate a second intermediate result based on the intermediate result and the second description information; the second intermediate result includes second intermediate data bits (ABS); the second calculation unit (C2) includes an absolute value calculation circuit (C21) configured to calculate the second intermediate data bits (ABS) based on the intermediate data bits (Mdata); and the second conversion stage (L2) generates the second type data based on the second intermediate result.
2. The converter according to claim 1, wherein, The first conversion stage (L1) includes a first receiving unit (Rx1), and the first description information includes a first data type of the first type of data and a first exponent bit of the first type of data. The first receiving unit (Rx1) is configured to receive the first type of data and the first description information; The first extraction unit (E1) is configured to extract the intermediate sign bit (Msign), intermediate data bit (Mdata), and intermediate exponent bit (Mshift) from the first type of data and the first description information as the intermediate result.
3. The converter according to claim 2, wherein the first extraction unit (E1) includes a sign bit calculation circuit (E11), a valid bit calculation circuit (E12), and an intermediate exponent bit calculation circuit (E13); the sign bit calculation circuit (E11) is configured to extract the sign of the first type of data from the first type of data as the intermediate sign bit (Msign); the valid bit calculation circuit (E12) is configured to extract the valid data bits of the first type of data from the first type of data as the intermediate data bits (Mdata); and the intermediate exponent bit calculation circuit (E13) is configured to obtain the exponent information of the first type of data based on the first type of data or the first exponent bit, so as to serve as the intermediate exponent bit (Mshift).
4. The converter according to any one of claims 1-3, further comprising a memory configured to store the intermediate results.
5. The converter according to claim 2, wherein, The second conversion stage (L2) also includes a second pre-output parsing unit (P2) and a second data recovery unit (R2); The second pre-output parsing unit (P2) is configured to calculate the pre-output data bits (Pdata) and the pre-output sign bits (Psign) based on the second intermediate result; and The second data recovery unit (R2) is configured to generate a second type of data based on the pre-output data bits (Pdata) and the pre-output sign bits (Psign).
6. The converter according to claim 5, wherein the second description information includes the second data type of the second type of data and the second exponent bit (Sshift) of the second type of data; the second intermediate result also includes the second intermediate sign bit (Sign) and the second intermediate exponent bit (EXP); and the second calculation unit (C2) also includes: a sign bit calculation circuit (C22) configured to calculate the second intermediate sign bit (Sign) based on the intermediate sign bit (Msign); and a differential exponent bit calculation circuit (C23) configured to calculate the differential exponent bit (Dshift) between the intermediate exponent bit (Mshift) and the second exponent bit (Sshift) as the second intermediate exponent bit (EXP).
7. The converter according to claim 6, wherein the second calculation unit (C2) further includes a rounding bit calculation circuit (C24) configured to calculate the second intermediate rounding bit (STK) based on the second intermediate data bit (ABS) and the second intermediate sign bit (Sign).
8. The converter according to claim 6, wherein the second calculation unit (C2) further includes a rounding bit calculation circuit (C24) configured to calculate the second intermediate rounding bit (STK) based on the second intermediate data bit (ABS), the second intermediate exponent bit (EXP), and the second intermediate sign bit (Sign).
9. The converter according to any one of claims 6-8, wherein the absolute value calculation circuit (C21) includes: a second selector configured to determine whether the intermediate data bit (Mdata) is less than zero; and a first complement calculator configured to calculate the two's complement of the intermediate data bit (Mdata) if the intermediate data bit (Mdata) is less than zero, and use it as the second intermediate data bit (ABS); otherwise, the intermediate data bit (Mdata) is used as the second intermediate data bit (ABS).
10. The converter according to claim 9, wherein the absolute value calculation circuit (C21) further includes a first selector and a first normalizer; the first selector is configured to determine whether the data type of the intermediate data bit (Mdata) is type 1 or type 2; if the data type of the intermediate data bit (Mdata) is type 1, the second selector is selected for processing; if the data type of the intermediate data bit (Mdata) is type 2, the first normalizer is selected for processing; and the first normalizer is configured to normalize the intermediate data bit (Mdata) as the second intermediate data bit (ABS) when the data type of the intermediate data bit (Mdata) is type 2.
11. The converter according to any one of claims 6-8, wherein the sign bit calculation circuit (C22) is a direct connection.
12. The converter according to claim 6, wherein, The second pre-output parsing unit (P2) includes: The rounding bit calculation circuit (C24) is configured to calculate the second intermediate rounding bit (STK) based on the second intermediate data bit (ABS), the second intermediate exponent bit (EXP), and the second intermediate sign bit (Sign).
13. The converter according to claim 7, 8 or 12, wherein, The rounding bit calculation circuit (C24) is implemented using AND-OR logic.
14. The converter according to claim 7 or 8, wherein, The second pre-output parsing unit (P2) is configured to calculate the pre-output data bit (Pdata) and the pre-output sign bit (Psign) based on the second intermediate data bit (ABS), the second intermediate sign bit (Sign), the second intermediate exponent bit (EXP), and the second intermediate rounding bit (STK).
15. The converter according to claim 14, wherein the second pre-output parsing unit (P2) includes a shift operator (P21) and an adder (P22), configured to generate a temporary data bit (ABS') and the pre-output sign bit (Psign), wherein: the shift operator (P21) is configured to shift the second intermediate data bit (ABS) by the second intermediate exponent bit (EXP) to obtain a shift result; the adder (P22) is configured to generate the temporary data bit (ABS') based on the shift result and the second intermediate rounding bit (STK); and the pre-output sign bit (Psign) is equivalent to the second intermediate sign bit (Sign).
16. The converter of claim 15, wherein the second pre-output parsing unit (P2) further includes a selector (P23) configured to detect whether the temporary data bit (ABS') is greater than a saturation value; if it is greater than the saturation value, the temporary data bit (ABS') is saturated to obtain the pre-output data bit (Pdata); if it is not greater than the saturation value, the temporary data bit (ABS') is output as the pre-output data bit (Pdata).
17. The converter according to claim 14, wherein, The second data recovery unit (R2) includes a pre-output processing circuit (R21) and a data assembly circuit (R22): The pre-output processing circuit (R21) is configured to receive the pre-output data bit (Pdata) and the pre-output sign bit (Psign) to generate an output data bit representation (Data_out). The data assembly circuit (R22) is configured to generate a second type of data based on the output data bit representation (Data_out) and the pre-output sign bit (Psign).
18. The converter according to claim 17, wherein the pre-output processing circuit (R21) is further configured to generate a floating-point decimal point representation (SHIFT_FP), and the data assembly circuit (R22) is configured to generate the second type of data based on the output data bit representation (Data_out), the floating-point decimal point representation (SHIFT_FP), and the pre-output sign bit (Psign).
19. The converter according to claim 17 or 18, wherein the pre-output processing circuit (R21) includes a fourth selector and a second complement calculator; the fourth selector is configured to receive the pre-output data bit (Pdata) and the pre-output sign bit (Psign), to output the pre-output data bit (Pdata) to the second complement calculator if the pre-output sign bit (Psign) is negative, and to output the pre-output data bit (Pdata) as the output data bit representation (Data_out) if the pre-output sign bit (Psign) is non-negative; and the second complement calculator is configured to calculate the two's complement of the pre-output data bit (Pdata).
20. The converter according to claim 19, wherein the pre-output processing circuit (R21) further includes a third selector, a second normalizer, and a floating-point decimal point position determiner, wherein: the third selector is configured to receive the pre-output data bit (Pdata), determine whether the data type of the pre-output data bit (Pdata) is type 1 or type 2, send the pre-output data bit (Pdata) to the fourth selector if the data type is type 1, and send the pre-output data bit (Pdata) to the second normalizer if the data type is type 2; the second normalizer is configured to normalize the pre-output data bit (Pdata) and output it as the output data bit representation (Data_out); and the floating-point decimal point position determiner is configured to determine the floating-point decimal point representation (SHIFT_FP) based on the output of the second normalizer.
21. The converter according to any one of claims 1-3, wherein the first conversion stage (L1) is further configured to determine the number of received first type data and to concatenate that number of first type data to form first concatenated data; and the first conversion stage (L1) converts the first concatenated data into the intermediate result according to the first description information.
22. The converter of claim 20, wherein the number of intermediate results received is determined as: the number of bits processed by the converter divided by the number of bits of the first type of data; or a preset first fixed value.
23. The converter according to any one of claims 1-3, wherein the first conversion stage (L1) is further configured to determine the number of pieces into which the first type data is to be split and to split the first type data into that number of pieces of split data; and the first conversion stage (L1) converts the split data into intermediate results according to the first description information.
24. The converter according to claim 23, wherein the number of intermediate results to be split is determined as: the number of bits of the first type of data divided by the number of bits processed by the converter; or a preset second fixed value.
25. The converter according to any one of claims 1-3, wherein the first conversion stage (L1) and / or the second conversion stage (L2) are further configured to receive constraint information, the constraint information being used to indicate whether a specific standard is supported, and / or whether compilation optimization is supported.
26. The converter according to any one of claims 1-3, wherein the data types of both the first type data and the second type data are extensible.
27. A chip comprising the converter as described in any one of claims 1-26.
28. A computing device comprising a converter as claimed in any one of claims 1-26 or a chip as claimed in claim 27.
29. A method for converting data types, comprising: receiving first type data and first description information about the first type data, and converting the first type data into an intermediate result based on the first description information; and receiving second description information about second type data, and converting the intermediate result into the second type data according to the second description information; wherein the method further includes: calculating a second intermediate result based on the intermediate result and the second description information, wherein the intermediate result includes intermediate data bits (Mdata) and the second intermediate result includes second intermediate data bits (ABS); generating the second type data based on the second intermediate result; extracting the intermediate data bits (Mdata) from the first type data and the first description information; and calculating the second intermediate data bits (ABS) based on the intermediate data bits (Mdata).
30. The method according to claim 29, wherein the first description information includes a first data type of the first type data and a first exponent bit of the first type data; and receiving the first type data and the first description information about the first type data, and converting the first type data into an intermediate result based on the first description information, includes: receiving the first type data and the first description information; and extracting the intermediate sign bit (Msign), intermediate data bit (Mdata), and intermediate exponent bit (Mshift) from the first type data and the first description information to serve as the intermediate result.
31. The method according to claim 30, wherein extracting the intermediate sign bit (Msign), intermediate data bit (Mdata), and intermediate exponent bit (Mshift) from the first type of data and the first description information as the intermediate result includes: extracting the sign of the first type of data from the first type of data to serve as the intermediate sign bit (Msign); extracting the valid data bits of the first type of data from the first type of data to serve as the intermediate data bits (Mdata); and obtaining the exponent information of the first type of data based on the first type of data or the first exponent bit, to serve as the intermediate exponent bit (Mshift).
32. The method according to any one of claims 30-31, further comprising storing the intermediate results in a memory.
33. The method according to claim 31, wherein, Receiving second description information about the second type of data, and converting the intermediate result into the second type of data based on the second description information, includes: Calculate the pre-output data bits (Pdata) and pre-output sign bits (Psign) based on the second intermediate result; and The second type of data is generated based on the pre-output data bits (Pdata) and the pre-output sign bits (Psign).
34. The method according to claim 33, wherein the second intermediate result also includes a second intermediate sign bit (Sign) and a second intermediate exponent bit (EXP); the second description information includes a second data type of the second type of data and a second exponent bit (Sshift) of the second type of data; and calculating the second intermediate result based on the intermediate result and the second description information includes: calculating the second intermediate sign bit (Sign) based on the intermediate sign bit (Msign); and calculating the differential exponent bit (Dshift) between the intermediate exponent bit (Mshift) and the second exponent bit (Sshift) as the second intermediate exponent bit (EXP).
35. The method according to claim 34, wherein calculating the second intermediate result based on the intermediate result and the second description information further includes: calculating the second intermediate rounding bit (STK) based on the second intermediate data bit (ABS) and the second intermediate sign bit (Sign).
36. The method according to claim 34, wherein calculating the second intermediate result based on the intermediate result and the second description information further includes: calculating the second intermediate rounding bit (STK) based on the second intermediate data bit (ABS), the second intermediate exponent bit (EXP), and the second intermediate sign bit (Sign).
37. The method according to any one of claims 34-36, wherein calculating the second intermediate data bit (ABS) based on the intermediate data bit (Mdata) includes: determining whether the intermediate data bit (Mdata) is less than zero; if the intermediate data bit (Mdata) is less than zero, calculating the two's complement of the intermediate data bit (Mdata) and using it as the second intermediate data bit (ABS); otherwise, using the intermediate data bit (Mdata) as the second intermediate data bit (ABS).
38. The method according to claim 37, wherein calculating the second intermediate data bit (ABS) based on the intermediate data bit (Mdata) further includes: determining whether the data type of the intermediate data bit (Mdata) is type 1 or type 2; if the data type of the intermediate data bit (Mdata) is type 1, determining whether the intermediate data bit (Mdata) is less than zero, and, if so, calculating the two's complement of the intermediate data bit (Mdata) and using it as the second intermediate data bit (ABS), otherwise using the intermediate data bit (Mdata) as the second intermediate data bit (ABS); and if the data type of the intermediate data bit (Mdata) is type 2, normalizing the intermediate data bit (Mdata) to serve as the second intermediate data bit (ABS).
39. The method according to claim 38, wherein, The calculation of the pre-output data bits (Pdata) and pre-output sign bits (Psign) based on the second intermediate result includes: The second intermediate rounding bit (STK) is calculated based on the second intermediate data bit (ABS), the second intermediate exponent bit (EXP), and the second intermediate sign bit (Sign).
40. The method according to claim 35, 36 or 39, wherein, The second intermediate rounding bit (STK) is calculated using AND-OR logic.
41. The method according to claim 35, 36 or 39, wherein, The calculation of the pre-output data bits (Pdata) and pre-output sign bits (Psign) based on the second intermediate result includes: The pre-output data bit (Pdata) and the pre-output sign bit (Psign) are calculated based on the second intermediate data bit (ABS), the second intermediate sign bit (Sign), the second intermediate exponent bit (EXP), and the second intermediate rounding bit (STK).
42. The method according to claim 41, wherein, The calculation of the pre-output data bits (Pdata) and pre-output sign bits (Psign) based on the second intermediate result includes: Shift the second intermediate data bit (ABS) by the second intermediate exponent bit (EXP) to obtain the shift result; A temporary data bit (ABS') is generated based on the shift result and the second intermediate rounding bit (STK); The pre-output sign bit (Psign) is equivalent to the second intermediate sign bit (Sign).
43. The method according to claim 42, wherein calculating the pre-output data bits (Pdata) and pre-output sign bits (Psign) based on the second intermediate result further includes: checking whether the temporary data bit (ABS') is greater than the saturation value; if it is greater than the saturation value, saturating the temporary data bit (ABS') to obtain the pre-output data bit (Pdata); and if it is not greater than the saturation value, outputting the temporary data bit (ABS') as the pre-output data bit (Pdata).
44. The method according to claim 41, wherein, Generating the second type of data based on the pre-output data bits (Pdata) and the pre-output sign bits (Psign) includes: Receive the pre-output data bit (Pdata) and the pre-output sign bit (Psign) to generate the output data bit representation (Data_out). The second type of data is generated based on the output data bit representation (Data_out) and the pre-output sign bit (Psign).
45. The method according to claim 44, wherein receiving the pre-output data bit (Pdata) and the pre-output sign bit (Psign) to generate an output data bit representation (Data_out) further includes generating a floating-point decimal point representation (SHIFT_FP); and generating the second type of data based on the output data bit representation (Data_out) and the pre-output sign bit (Psign) includes generating the second type of data based on the output data bit representation (Data_out), the floating-point decimal point representation (SHIFT_FP), and the pre-output sign bit (Psign).
46. The method according to claim 44 or 45, wherein receiving the pre-output data bits (Pdata) and the pre-output sign bits (Psign) to generate the output data bit representation (Data_out) includes: receiving the pre-output data bit (Pdata) and the pre-output sign bit (Psign); if the pre-output sign bit (Psign) is negative, taking the two's complement of the pre-output data bit (Pdata); and if the pre-output sign bit (Psign) is non-negative, outputting the pre-output data bit (Pdata) as the output data bit representation (Data_out).
47. The method of claim 46, wherein receiving the pre-output data bits (Pdata) and the pre-output sign bits (Psign) to generate the output data bit representation (Data_out) further includes: receiving the pre-output data bit (Pdata) and determining whether the data type of the pre-output data bit (Pdata) is type 1 or type 2; if the data type of the pre-output data bit (Pdata) is type 1, taking the two's complement of the pre-output data bit (Pdata) when the pre-output sign bit (Psign) is negative, and outputting the pre-output data bit (Pdata) as the output data bit representation (Data_out) when the pre-output sign bit (Psign) is non-negative; and if the data type of the pre-output data bit (Pdata) is type 2, normalizing the pre-output data bit (Pdata) and outputting it as the output data bit representation (Data_out), and determining the floating-point decimal point representation (SHIFT_FP) based on the output of the normalization.
48. The method according to any one of claims 29-31, wherein receiving first type data and first description information about the first type data, and converting the first type data into an intermediate result based on the first description information, further includes: determining the number of received first type data and concatenating that number of first type data to form first concatenated data; and converting the first concatenated data into the intermediate result based on the first description information.
49. The method of claim 48, wherein the number of intermediate results received is determined as: the number of bits processed by the converter used in this invention divided by the number of bits of the first type of data; or a preset first fixed value.
50. The method according to any one of claims 29-31, wherein receiving first type data and first description information about the first type data, and converting the first type data into an intermediate result based on the first description information, further includes: determining the number of pieces into which the first type data is to be split and splitting the first type data into that number of pieces of split data; and converting the split data into intermediate results based on the first description information.
51. The method of claim 50, wherein the number of intermediate results to be split is determined as: the number of bits of the first type of data divided by the number of bits processed by the converter used in this invention; or a preset second fixed value.
52. The method according to any one of claims 29-31, further comprising receiving constraint information, the constraint information being used to indicate whether a specific standard is supported and / or whether compilation optimization is supported.
53. The method according to any one of claims 29-31, wherein the data types of both the first type data and the second type data are extensible.
54. An electronic device, comprising: one or more processors; and a memory storing computer-executable instructions that, when executed by the one or more processors, cause the electronic device to perform the method as described in any one of claims 29-53.
55. A computer-readable storage medium comprising computer-executable instructions that, when executed by one or more processors, perform the method as described in any one of claims 29-53.
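Illustrative sketch (not part of the claims). The two-stage flow recited above can be modeled in software for one narrow case: signed fixed-point words whose value is taken to be raw * 2**shift. The names FixDesc, Intermediate, stage1, stage2 and convert are hypothetical and do not appear in this disclosure. The rounding rule (round half up on the magnitude) and the symmetric saturation value are assumptions chosen for the example, since the claims do not fix them. The floating-point branch (normalizer, SHIFT_FP) is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixDesc:
    """Hypothetical description information for a signed fixed-point word."""
    width: int  # total number of bits, including the sign bit
    shift: int  # exponent: value = raw * 2**shift (assumed convention)


@dataclass(frozen=True)
class Intermediate:
    """Intermediate result shared by all source and target types."""
    msign: int   # Msign: 1 if negative, else 0
    mdata: int   # Mdata: signed data bits
    mshift: int  # Mshift: exponent of the source data


def stage1(raw: int, desc: FixDesc) -> Intermediate:
    """First stage: extract (Msign, Mdata, Mshift) from a two's complement word."""
    raw &= (1 << desc.width) - 1
    msign = raw >> (desc.width - 1)
    mdata = raw - (1 << desc.width) if msign else raw
    return Intermediate(msign, mdata, desc.shift)


def stage2(mid: Intermediate, desc: FixDesc) -> int:
    """Second stage: ABS/Sign/EXP, shift and round, saturate, reassemble."""
    abs_ = -mid.mdata if mid.mdata < 0 else mid.mdata  # ABS: absolute value
    sign = mid.msign                                   # Sign: passed through
    exp = mid.mshift - desc.shift                      # EXP: Mshift - Sshift
    if exp >= 0:
        shifted, stk = abs_ << exp, 0                  # left shift loses no bits
    else:
        stk = (abs_ >> (-exp - 1)) & 1                 # STK: assumed round half up
        shifted = abs_ >> -exp
    tmp = shifted + stk                                # ABS': shift result + STK
    sat = (1 << (desc.width - 1)) - 1                  # assumed symmetric saturation value
    pdata = sat if tmp > sat else tmp                  # Pdata after saturation
    mask = (1 << desc.width) - 1
    # Data recovery: two's complement for negative values, then fit to width.
    return (-pdata) & mask if sign else pdata


def convert(raw: int, src: FixDesc, dst: FixDesc) -> int:
    """Chain both stages; the intermediate result could also be stored and reused."""
    return stage2(stage1(raw, src), dst)


if __name__ == "__main__":
    src = FixDesc(width=16, shift=-4)
    dst = FixDesc(width=8, shift=-2)
    print(hex(convert(0x0050, src, dst)))  # 5.0 in the source format
    print(hex(convert(0xFFB0, src, dst)))  # -5.0 in the source format
```

Because stage1 depends only on the source description and stage2 only on the target description, supporting a new type in this sketch means adding one extraction path or one recovery path rather than one path per source/target pair.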