Normalization exponential function calculation method and device for text image processing

By transforming the feature vector sequence in image processing into a fixed-point number form and performing maximum value search and translation processing, combined with an approximation lookup table and iterative algorithm, the problems of high computational complexity and numerical overflow of the normalization function are solved, achieving efficient and low-power text image processing.

CN122244876A · Pending · Publication Date: 2026-06-19 · GUANGZHOU ZHONO ELECTRONICS TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
GUANGZHOU ZHONO ELECTRONICS TECH CO LTD
Filing Date
2026-03-26
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing technologies, the calculation of normalization functions in image processing relies on floating-point operations, which leads to high computational complexity, long processing time, high power consumption, and a tendency for numerical overflow, affecting computational accuracy and stability.

Method used

The feature vector sequence of the text image to be processed is transformed into a fixed-point number form. Then, through maximum value search and translation processing, combined with a preset approximation lookup table and iterative algorithm, exponential approximation calculation, accumulation and reciprocal calculation are performed to avoid division operations and improve calculation accuracy and stability.

Benefits of technology

It reduces computational complexity and power consumption, avoids numerical overflow problems, improves the output accuracy and stability of the normalized exponential function, and enhances the precision and stability of text and image processing.


Figure CN122244876A_ABST

Abstract

This invention discloses a method for calculating a normalized exponential function for text image processing, comprising: obtaining a first feature vector sequence corresponding to the text image to be processed, and converting it into a fixed-point number form to obtain a second feature vector sequence; performing maximum value search and translation processing on the second feature vector sequence, and then performing exponential approximation calculation to obtain an exponential approximation value; accumulating the exponential approximation values to obtain an exponential sum; performing reciprocal calculation on the exponential sum to obtain a reciprocal value; and determining the output value of the normalized exponential function based on the exponential approximation value and the reciprocal value. This invention, by converting the feature vector sequence into a fixed-point number form, reduces the dependence on floating-point arithmetic units during normalization calculation, avoids floating-point exponentiation and division operations, reduces computational complexity, latency, and power consumption, and improves computational accuracy. Simultaneously, the maximum value search and translation processing effectively avoids numerical overflow problems, thereby ensuring the accuracy of the normalized output result.

Description

Technical Field

[0001] This invention relates to the field of image processing, and in particular to a method and apparatus for calculating a normalized exponential function for text image processing.

Background Technology

[0002] In the field of text and image processing, with the increase in the scale (such as large-size image feature maps) and complexity of deep learning models, normalization operations have appeared extensively in tasks such as image segmentation, and have become a key computational unit in intermediate layers such as attention mechanisms, making them increasingly important for the accuracy and stability of image processing.

[0003] In existing technologies, normalization functions used in image processing often rely on floating-point operations and floating-point units, and frequently employ division operations. This results in high computational complexity and latency, leads to significant errors in the normalized output, and makes the functions difficult to implement efficiently on low-power / low-cost hardware. Additionally, traditional normalization operations are prone to numerical overflow during exponential operations, resulting in a loss of computational precision.

Summary of the Invention

[0004] The present invention aims to solve at least one of the technical problems existing in the prior art.

[0005] Therefore, one objective of this invention is to propose a method for calculating a normalized exponential function in text image processing. This method transforms the first feature vector sequence corresponding to the text image to be processed into a fixed-point number form, which reduces the dependence on floating-point arithmetic units during normalization calculation, avoids floating-point exponentiation and division operations, reduces computational complexity, latency, and power consumption, and improves computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, accumulation calculation of exponential approximation values, and reciprocal calculation, thus improving the accuracy and stability of the normalized exponential function output value, and thus contributing to improving the accuracy and stability of text image processing.

[0006] Therefore, a second objective of the present invention is to provide a normalized exponential function calculation device for text image processing.

[0007] Therefore, a third objective of the present invention is to provide an electronic device.

[0008] Therefore, a fourth object of the present invention is to provide a computer-readable storage medium.

[0009] To achieve the above objectives, a first aspect of the present invention provides a method for calculating a normalized exponential function for text image processing. The method includes: obtaining a first feature vector sequence corresponding to a text image to be processed; converting the first feature vector sequence into a fixed-point number form to obtain a second feature vector sequence; performing maximum value search and translation processing on the second feature vector sequence to obtain a third feature vector sequence; performing exponential approximation calculation on the third feature vector sequence to obtain an exponential approximation value; performing cumulative calculation on the exponential approximation value to obtain an exponential sum; performing reciprocal calculation on the exponential sum to obtain a reciprocal value; and determining the output of the normalized exponential function based on the exponential approximation value and the reciprocal value.

[0010] The normalized exponential function calculation method for text image processing according to embodiments of the present invention reduces the dependence on floating-point arithmetic units during normalization calculation by converting the first feature vector sequence corresponding to the text image to be processed into fixed-point number form, avoiding floating-point exponentiation and division operations, reducing computational complexity, latency, and power consumption, and improving computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation, and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, and thus contributing to improving the accuracy and stability of text image processing.

[0011] In addition, the normalized exponential function calculation method for text image processing according to embodiments of the present invention may also have the following additional technical features: In some examples, obtaining the first feature vector sequence corresponding to the text image to be processed includes: inputting the text image to be processed into a preset network encoder to extract a multi-scale feature map corresponding to the text image to be processed; and performing a one-dimensional operation expansion based on the multi-scale feature map to obtain the first feature vector sequence.

[0012] In some examples, the step of performing maximum value search and translation on the second feature vector sequence to obtain the third feature vector sequence includes: determining the maximum value among the elements of the second feature vector sequence; and subtracting the maximum value from each element in the second feature vector sequence to obtain the third feature vector sequence.

[0013] In some examples, the process of performing exponential approximation calculation on the third feature vector sequence to obtain an exponential approximation value includes: querying a preset approximation value lookup table based on each element in the third feature vector sequence to obtain the exponential approximation value, wherein different elements correspond to different preset numerical ranges, and different preset numerical ranges correspond to different preset approximation value lookup tables.

[0014] In some examples, the preset approximation lookup table includes: a preset exponential index lookup table and a preset exponential coefficient lookup table; the step of querying the preset approximation lookup table based on each element in the third feature vector sequence to obtain the exponential approximation value includes: splitting each element into an index part and an exponential part; querying the preset exponential index lookup table based on the index part to determine the base exponential value and the correction exponential value corresponding to each element; querying the preset exponential coefficient lookup table based on the exponential part to obtain the correction coefficient corresponding to each element; and correcting the base exponential value based on the correction exponential value and the correction coefficient to obtain the exponential approximation value.

[0015] In some examples, the step of performing reciprocal calculation on the exponential sum to obtain the reciprocal value includes: performing mantissa normalization on the exponential sum to determine the most significant bit of the exponential sum; querying a preset reciprocal lookup table based on the most significant bit to determine the initial reciprocal value; and correcting the initial reciprocal value based on a preset iterative algorithm to obtain the reciprocal value.

[0016] In some examples, the process of accumulating the exponential approximations to obtain the exponential sum also includes: obtaining the most significant bit of the first exponential sum of the exponential approximations participating in the accumulation calculation in real time; when the most significant bit exceeds a preset significant bit threshold, scaling the first exponential sum and the exponential approximations not participating in the accumulation based on a preset scaling factor; and determining the exponential sum based on the accumulated result of the scaled first exponential sum and the scaled exponential approximations.
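
The dynamic scaling described in [0016] can be illustrated with a minimal Python sketch. It is only one possible reading of the paragraph: the names `accumulate_with_rescale`, `msb_threshold`, and `shift` are hypothetical, and the "preset scaling factor" is assumed here to be a right shift by a fixed number of bits.

```python
def accumulate_with_rescale(values, msb_threshold, shift=1):
    """Accumulate non-negative fixed-point values with on-the-fly rescaling.

    Assumption: whenever the running sum needs more than `msb_threshold` bits,
    the running sum is shifted right by `shift` bits, and every value that has
    not yet been accumulated is shifted by the same total amount.
    Returns (scaled_sum, total_shift); the unscaled sum is roughly
    scaled_sum << total_shift.
    """
    total = 0
    total_shift = 0
    for v in values:
        total += v >> total_shift           # remaining values use the current scale
        if total.bit_length() > msb_threshold:
            total >>= shift                 # rescale the running sum
            total_shift += shift
    return total, total_shift


ONE = 1 << 16                               # 1.0 in Q16.16
scaled_sum, total_shift = accumulate_with_rescale([ONE] * 8, msb_threshold=18)
```

In this sketch the exponential approximation values would later need to be brought to the same scale (shifted by `total_shift`) before being multiplied by the reciprocal, so that the ratio between each value and the sum is preserved.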

[0017] To achieve the above objectives, a second aspect of the present invention provides a normalized exponential function calculation device for text image processing, the device comprising: an acquisition module, configured to acquire a first feature vector sequence corresponding to the text image to be processed; a conversion module, configured to convert the first feature vector sequence into a fixed-point number form to obtain a second feature vector sequence; a processing module, configured to perform maximum value search and translation processing on the second feature vector sequence to obtain a third feature vector sequence; a first calculation module, configured to perform exponential approximation calculation on the third feature vector sequence to obtain an exponential approximation value; a second calculation module, configured to perform cumulative calculation on the exponential approximation value to obtain an exponential sum; a third calculation module, configured to perform reciprocal calculation on the exponential sum to obtain a reciprocal value; and a determination module, configured to determine the output value of the normalized exponential function based on the exponential approximation value and the reciprocal value.

[0018] In some examples, the acquisition module, the conversion module, the processing module, the first calculation module, the second calculation module, the third calculation module, and the determination module are configured to be cascaded in a pipeline manner. The pipeline cascading method includes: when the current level module outputs the result data of the current cycle to the next level module, the current level module receives and processes the input data of the next cycle.

[0019] The normalized exponential function calculation device for text image processing according to the present invention reduces the dependence on floating-point arithmetic units during normalization calculation by converting the first feature vector sequence corresponding to the text image to be processed into fixed-point number form, avoiding floating-point exponentiation and division operations, reducing computational complexity, latency, and power consumption, and improving computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation, and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, and thus contributing to improving the accuracy and stability of text image processing.

[0020] To achieve the above objectives, a third aspect of the present invention provides an electronic device comprising: a normalized exponential function calculation apparatus for text image processing as described in the second aspect of the present invention, or a processor, a memory, and a normalized exponential function calculation program for text image processing stored in the memory and executable on the processor, wherein the normalized exponential function calculation program for text image processing, when executed by the processor, implements the normalized exponential function calculation method for text image processing as described in any of the first aspect of the present invention.

[0021] According to the electronic device of the present invention, by converting the first feature vector sequence corresponding to the text image to be processed into a fixed-point number form, the reliance on floating-point arithmetic units can be reduced during normalization calculation, avoiding floating-point exponentiation and division operations, reducing computational complexity, latency and power consumption, and improving computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, and thus helping to improve the accuracy and stability of text image processing.

[0022] A further embodiment of the present invention discloses a computer-readable storage medium storing a normalized exponential function calculation program for text image processing. When executed by a processor, the normalized exponential function calculation program for text image processing implements the normalized exponential function calculation method for text image processing as described in any of the above embodiments of the present invention.

[0023] According to an embodiment of the present invention, when the normalized exponential function calculation program for text image processing stored thereon is executed by a processor, it reduces the dependence on floating-point arithmetic units during normalization calculation by converting the first feature vector sequence corresponding to the text image to be processed into fixed-point number form, avoiding floating-point exponentiation and division operations, reducing computational complexity, latency, and power consumption, and improving computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation, and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, and thus contributing to improving the accuracy and stability of text image processing.

[0024] Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Description of the Drawings

[0025] The above and / or additional aspects and advantages of the present invention will become apparent and readily understood from the description of the embodiments taken in conjunction with the following drawings, in which: Figure 1 is a flowchart illustrating a method for calculating a normalized exponential function for text image processing according to an embodiment of the present invention; and Figure 2 is a schematic diagram of a normalized exponential function calculation device for text image processing according to an embodiment of the present invention.

[0026] Reference numerals: normalized exponential function calculation device for text image processing - 100; acquisition module - 110; conversion module - 120; processing module - 130; first calculation module - 140; second calculation module - 150; third calculation module - 160; determination module - 170.

Detailed Implementation

[0027] To provide a more detailed understanding of the features and technical content of the embodiments of the present invention, the implementation of the embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The accompanying drawings are for illustrative purposes only and are not intended to limit the embodiments of the present invention. In the following technical description, for ease of explanation, several details are used to provide a full understanding of the disclosed embodiments. However, one or more embodiments may still be implemented without these details.

[0028] A method and apparatus for calculating a normalized exponential function for text image processing according to embodiments of the present invention are described below with reference to Figures 1-2.

[0029] Figure 1 is a flowchart illustrating a method for calculating a normalized exponential function for text image processing according to an embodiment of the present invention. As shown in Figure 1, the method includes steps S1-S7:

Step S1: Obtain the first feature vector sequence corresponding to the text image to be processed.

[0030] Specifically, in the process of text image processing, to facilitate operations such as text image recognition, segmentation, and erasure, normalization processing can be performed on the text image. First, in the input preprocessing stage, the vector sequence corresponding to the text image to be processed can be obtained. The acquisition method includes, but is not limited to, extracting visual features from the text image to be processed (such as printed text images, document images containing handwriting, invoice text images, etc.) through the encoder of a deep learning model (such as SegFormer, a network suitable for text image processing), forming a high-dimensional feature vector sequence, i.e., the first feature vector sequence. This sequence is the original input when calculating the normalized exponential function and usually contains feature information such as the texture, regions, and contours of the text image to be processed.

Step S2: Convert the first feature vector sequence into fixed-point number form to obtain the second feature vector sequence.

[0031] Specifically, the first feature vector sequence (e.g., X={x1,x2,…,xN}) is usually represented as floating-point numbers (e.g., xi=2.5 or xi=0.123). Normalizing such values often relies on floating-point units and frequently employs division operations, resulting in high computational complexity and latency. This leads to significant errors in the normalized output, makes efficient implementation difficult on low-power / low-cost hardware, and causes a loss of computational precision. Therefore, after the first feature vector sequence is obtained, it can be converted into a fixed-point form, including but not limited to Q16.16 fixed-point format (32 bits in total, with 16 bits for the integer part and 16 bits for the fractional part), thus obtaining a new feature vector sequence (i.e., the second feature vector sequence). For example, when the input value is 3.625, its fixed-point representation can be understood as an integer part of 3 and a fractional part of 0.625. In the actual implementation, the integer part can be obtained by taking the high-order bits of the fixed-point number, and the fractional part can be obtained by extracting the low 16 bits.
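
The Q16.16 conversion and the integer / fractional split described in [0031] can be sketched in Python as follows. This is an illustrative sketch only: the helper names `to_q16_16` and `split_q16_16` are hypothetical, and round-to-nearest with saturation to a signed 32-bit range is an assumption, since the text does not specify rounding or saturation behavior.

```python
FRAC_BITS = 16                 # Q16.16: 16 fractional bits
ONE = 1 << FRAC_BITS           # fixed-point representation of 1.0


def to_q16_16(x: float) -> int:
    """Convert a float to a signed Q16.16 integer (assumed: round, then saturate)."""
    raw = int(round(x * ONE))
    lo, hi = -(1 << 31), (1 << 31) - 1
    return max(lo, min(hi, raw))


def split_q16_16(q: int):
    """Return (integer part from the high bits, raw fractional part from the low 16 bits)."""
    return q >> FRAC_BITS, q & (ONE - 1)


q = to_q16_16(3.625)
int_part, frac_raw = split_q16_16(q)
frac_value = frac_raw / ONE    # back to a float, for inspection only
```

For the input 3.625 used in the text, the integer part is 3 and the raw fractional field is 40960, which corresponds to 40960 / 65536 = 0.625.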

[0032] Step S3: Perform maximum value search and translation on the second feature vector sequence to obtain the third feature vector sequence.

[0033] Specifically, after obtaining the second feature vector sequence in fixed-point format, maximum value search and translation processing can be performed on it based on the translation invariance of the normalized exponential function (i.e., shifting all inputs by the same amount does not change the final calculation result of the normalized exponential function). This includes, but is not limited to, determining the maximum value among all elements of the second feature vector sequence and shifting the sequence by that maximum value. This ensures that the input values during the normalized exponential function calculation are all non-positive numbers, thereby avoiding numerical overflow or underflow problems during the calculation process. This improves the numerical stability of the normalized exponential function calculation and solves the precision loss problem caused by numerical overflow during the normalized exponential function calculation in traditional techniques.

[0034] Step S4: Perform exponential approximation calculation on the third feature vector sequence to obtain the exponential approximation value.

[0035] Specifically, after determining the third feature vector sequence, exponential approximation calculation can be performed on it. That is, an approximate result of the natural exponential function can be calculated for each element (fixed-point feature value) in the third feature vector sequence. The calculation method includes, but is not limited to, combining an exponential lookup table with fixed-point interpolation. This method does not require floating-point operations, high-order polynomial approximation, or a large number of multiplication and addition operations, which can significantly reduce the computational complexity and hardware resource consumption while ensuring the accuracy of the normalized exponential function calculation.

[0036] In a specific embodiment, to avoid overflow of the accumulated value, after the exponential approximation value $E_i = e^{d_i}$ is obtained, it can also be directly scaled in a uniform fixed-point manner, for example: $E_i = E_i \gg s$, where $s$ is the preset scaling factor and ">>" denotes a fixed-point right shift operation (i.e., division by $2^s$). It can be understood that this operation reduces the magnitude of the exponential approximation values, ensuring that multiple exponential approximation values do not exceed the representation range of the accumulator register during accumulation. Furthermore, in practical implementation, the scaling factor $s$ can be preset based on the maximum length of the input vector of the normalized exponential function, the fixed-point data bit width, and the maximum value range of the normalized exponential function.
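
One way the scaling factor could be preset from the maximum vector length and the accumulator width is sketched below in Python. The selection rule and the names `choose_shift`, `max_len`, and `acc_bits` are assumptions made for illustration; the text only states which quantities the factor depends on. The sketch relies on the fact that, after the maximum has been subtracted, every exponential value is at most 1.0.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS           # 1.0 in Q16.16, the largest possible value of e^d for d <= 0


def choose_shift(max_len: int, acc_bits: int) -> int:
    """Smallest s such that max_len values of at most (ONE >> s) fit in an
    unsigned accumulator of acc_bits bits (assumed selection rule)."""
    s = 0
    while max_len * (ONE >> s) >= (1 << acc_bits):
        s += 1
    return s


s = choose_shift(max_len=4096, acc_bits=24)
exps = [ONE, ONE >> 1, ONE >> 2]          # 1.0, 0.5, 0.25 in Q16.16
scaled = [e >> s for e in exps]           # uniform right shift by s bits
```

Because every value and therefore their sum is scaled by the same factor, the ratio of each value to the sum is unchanged apart from truncation error.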

[0037] Step S5: Accumulate the exponential approximation values to obtain the exponential sum.

[0038] Specifically, after determining the exponential approximation values, they can be accumulated to obtain the exponential sum. For example, the exponential approximation values corresponding to all elements in the third feature vector sequence can be accumulated one by one. During the accumulation process, fixed-point accumulation methods such as expanding the bit width are used to avoid numerical overflow, so as to obtain the arithmetic sum of all exponential approximation values, that is, the exponential sum, which serves as the denominator in the calculation of the normalized exponential function.
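
The bit-width expansion mentioned in [0038] can be sized with a simple rule: summing `count` values of `value_bits` bits each needs at most `value_bits + ceil(log2(count))` bits. The Python sketch below applies that rule; the function names are hypothetical, and the overflow check merely simulates a fixed-width hardware register.

```python
def required_acc_bits(value_bits: int, count: int) -> int:
    """Accumulator width that cannot overflow when summing `count` values
    that each fit in `value_bits` bits."""
    return value_bits + (count - 1).bit_length()   # (count - 1).bit_length() == ceil(log2(count))


def accumulate(values, acc_bits: int) -> int:
    """Sum non-negative fixed-point values, checking a simulated accumulator width."""
    limit = (1 << acc_bits) - 1
    total = 0
    for v in values:
        total += v
        if total > limit:
            raise OverflowError("accumulator too narrow")
    return total


ONE = 1 << 16                                      # 1.0 in Q16.16 needs 17 bits
exps = [ONE] * 1000                                # worst case: every element equals the maximum
acc_bits = required_acc_bits(ONE.bit_length(), len(exps))
exp_sum = accumulate(exps, acc_bits)
```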

[0039] Step S6: Perform reciprocal calculation on the exponential sum to obtain the reciprocal value.

[0040] Specifically, after determining the sum of exponents, the reciprocal of the sum can be calculated to obtain the reciprocal value. For example, for the accumulated sum of exponents, a fixed-point reciprocal calculation method (such as a reciprocal lookup table combined with Newton's iteration) can be used to calculate the reciprocal of the sum of exponents. This transforms the division operation in traditional normalization into a reciprocal operation. It is understandable that calculating the reciprocal does not involve division; therefore, complex division circuits are unnecessary, significantly reducing computational complexity, hardware resource consumption, and computational latency.
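
A reciprocal lookup table combined with Newton's iteration can be sketched in Python as below. This is a sketch under stated assumptions, not the patented circuit: the 16-entry seed table, the 30-bit working precision, the two iterations, and all names (`RECIP_LUT`, `reciprocal_q16_16`, `NORM_BITS`, `LUT_BITS`) are illustrative choices. The seed table is built offline, so the run-time path uses only shifts, multiplications, and subtractions.

```python
FRAC_BITS = 16      # Q16.16 input and output
NORM_BITS = 30      # working precision of the normalized mantissa (assumption)
LUT_BITS = 4        # 16-entry seed table (assumption)

# Offline table: seed for 1/m at the midpoint of each mantissa sub-interval of [1, 2).
RECIP_LUT = [
    int(round((1 << NORM_BITS) / (1 + (k + 0.5) / (1 << LUT_BITS))))
    for k in range(1 << LUT_BITS)
]


def reciprocal_q16_16(s: int, iterations: int = 2) -> int:
    """Approximate 1/s for a positive Q16.16 value s, returned in Q16.16."""
    if s <= 0:
        raise ValueError("s must be positive")
    p = s.bit_length() - 1                          # position of the most significant bit
    # Mantissa normalization: m represents a value in [1, 2) with NORM_BITS fractional bits.
    m = s << (NORM_BITS - p) if p <= NORM_BITS else s >> (p - NORM_BITS)
    k = (m >> (NORM_BITS - LUT_BITS)) & ((1 << LUT_BITS) - 1)
    r = RECIP_LUT[k]                                # initial reciprocal value
    two = 2 << NORM_BITS
    for _ in range(iterations):                     # Newton step: r <- r * (2 - m * r)
        r = (r * (two - ((m * r) >> NORM_BITS))) >> NORM_BITS
    shift = p + NORM_BITS - 2 * FRAC_BITS           # undo the normalization
    return r >> shift if shift >= 0 else r << -shift


recip = reciprocal_q16_16(4 << FRAC_BITS)           # approximately 0.25 in Q16.16
```

Each Newton step roughly squares the relative error of the seed, so with a seed accurate to a few percent, two steps are enough to reach the resolution of a Q16.16 result in this sketch; truncation can still leave the result one unit below the exact value.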

[0041] Step S7: Determine the output value of the normalized exponential function based on the exponential approximation and the reciprocal value.

[0042] Specifically, according to the definition of the normalized exponential function, its formula is as follows: $y_i = \frac{e^{d_i}}{\sum_{j=1}^{N} e^{d_j}}$. To avoid division operations during the normalization process, it can be converted to obtain $y_i = E_i \times R$, where $E_i$ represents the exponential approximation value obtained through exponential approximation calculation (fixed-point format, such as Q16.16 format), and $R$ represents the reciprocal value of the exponential sum (fixed-point format, such as Q16.16 format). It can be understood that the output value of the normalized exponential function can be determined by multiplying the exponential approximation value by the reciprocal value.
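
The final multiplication can be sketched in Python as follows, assuming both operands are in Q16.16. The function name `normalize_outputs` is hypothetical, and the reciprocal is supplied by hand here (0.5 for a sum of 2.0) rather than computed, so that the example isolates the multiply-and-shift step.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS


def normalize_outputs(exps, recip):
    """Multiply each Q16.16 exponential value by the Q16.16 reciprocal.

    The raw product has 32 fractional bits, so it is shifted right by
    FRAC_BITS to return to Q16.16.
    """
    return [(e * recip) >> FRAC_BITS for e in exps]


exps = [ONE, ONE >> 1, ONE >> 1]        # 1.0, 0.5, 0.5 -> exponential sum is 2.0
recip = ONE >> 1                        # 1 / 2.0 = 0.5 in Q16.16 (chosen by hand)
outputs = normalize_outputs(exps, recip)
```

The three outputs correspond to 0.5, 0.25, and 0.25, which sum to 1.0 as expected of a normalized exponential function.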

[0043] In a specific embodiment, the output value of the normalized exponential function can be directly used as the attention weight of the preset network decoder to guide the erasure of the handwriting area in the text image to be processed, thereby accurately identifying and erasing the handwriting while preserving the printed text in the image.

[0044] Therefore, the above-described method for calculating the normalized exponential function for text image processing, by converting the first feature vector sequence corresponding to the text image to be processed into a fixed-point number form, can reduce the dependence on floating-point arithmetic units during normalization calculation, avoid floating-point exponentiation and division operations, reduce computational complexity, latency and power consumption, and improve computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, the numerical overflow problem can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, which in turn helps to improve the accuracy and stability of text image processing.

[0045] In one embodiment of the present invention, obtaining the first feature vector sequence corresponding to the text image to be processed includes: inputting the text image to be processed into a preset network encoder to extract the multi-scale feature map corresponding to the text image to be processed; and performing one-dimensional operation expansion based on the multi-scale feature map to obtain the first feature vector sequence.

[0046] Specifically, in the process of obtaining the first feature vector sequence corresponding to the text image to be processed, the text image to be processed (usually a grayscale image or a binary image) can be input into a preset network encoder. In the network encoder, multi-scale features corresponding to the text image to be processed are extracted through multiple convolutions, that is, feature maps of different resolutions output by the network encoder at different levels. Furthermore, one-dimensional operations can be performed on the multi-scale feature maps (usually two-dimensional feature maps) to obtain a one-dimensional vector sequence representing the pixel positions in the feature maps, that is, the first feature vector sequence.

[0047] In one embodiment of the present invention, performing maximum value search and translation processing on the second feature vector sequence to obtain the third feature vector sequence includes: determining the maximum value among the elements of the second feature vector sequence; and subtracting the maximum value from each element in the second feature vector sequence to obtain the third feature vector sequence.

[0048] Specifically, when performing maximum value search and translation on the second feature vector sequence, the entire sequence can be traversed, and the maximum value among all elements can be found through multi-level pairwise comparisons. The maximum value is then subtracted from each element in the second feature vector sequence, i.e., a translation operation is performed, resulting in a new sequence (the third feature vector sequence) in which all elements are non-positive. It can be understood that all elements in the second feature vector sequence are less than or equal to the maximum value, so subtracting the maximum value yields non-positive numbers. This restricts the input of the normalized exponential function to the non-positive interval without changing the final calculation result, fundamentally avoiding numerical overflow and significantly improving the stability and accuracy of normalization calculations in text image processing.
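
The multi-level pairwise comparison and the subsequent translation can be sketched in Python as follows. The function names `tree_max` and `shift_by_max` are hypothetical; the loop mimics a comparator tree in which each level halves the number of candidates, which is one plausible reading of "multi-level pairwise comparisons".

```python
def tree_max(values):
    """Find the maximum through levels of pairwise comparisons."""
    level = list(values)
    while len(level) > 1:
        nxt = [max(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:                 # an unpaired element passes to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


def shift_by_max(values):
    """Subtract the maximum from every element, giving non-positive values."""
    m = tree_max(values)
    return [v - m for v in values]


second_seq = [163840, 8061, -65536]        # raw Q16.16 integers: 2.5, about 0.123, -1.0
third_seq = shift_by_max(second_seq)
```

The largest element always maps to 0, so its exponential is exactly 1.0, and every other exponential lies in the interval (0, 1].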

[0049] In one embodiment of the present invention, the exponential approximation calculation of the third feature vector sequence to obtain the exponential approximation value includes: querying a preset approximation value lookup table based on each element in the third feature vector sequence to obtain the exponential approximation value, wherein different elements correspond to different preset numerical ranges, and different preset numerical ranges correspond to different preset approximation value lookup tables.

[0050] Specifically, when performing exponential approximation calculation on the third feature vector sequence, each element in the third feature vector sequence can be assigned to one of several preset numerical ranges according to its value, and each preset numerical range corresponds to an independent preset exponential approximation lookup table. Based on the numerical range to which each element belongs, the corresponding lookup table can then be queried to obtain the exponential approximation value for that element. It is understandable that implementing the exponential approximation with piecewise lookup tables reduces the size of each individual table and the storage resources consumed, and eliminates floating-point operations, high-order polynomial approximation, and a large number of multiply-add operations. Therefore, while preserving the accuracy of the normalized exponential function calculation, hardware resource consumption and computational complexity are significantly reduced and computational efficiency and throughput are improved, which addresses the complex hardware implementation and accuracy loss of traditional normalized exponential function calculations.

[0051] In a specific embodiment, a preset approximation lookup table covers a certain range of exponent input values, for example [...], which is sampled discretely at a fixed step size. Each entry in the preset approximation lookup table stores, in advance, the value of the exponential function at the corresponding sampling point, i.e.: EXP_LUT[k] = exp(xk), where xk is the exponent input value corresponding to the k-th sampling point. The values of all preset approximation lookup tables are calculated during system initialization or in an offline phase, converted into fixed-point format, and stored in memory.
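The offline table generation can be sketched as follows. The interval bounds, step size, and Q16.16 output format used here (`X_MIN`, `STEP`, `FRAC_BITS`) are hypothetical values chosen for illustration; the text does not specify them.

```python
import math

FRAC_BITS = 16   # assumed Q16.16 storage format
X_MIN = -8.0     # hypothetical lower bound of the covered interval
STEP = 0.125     # hypothetical fixed sampling step


def build_exp_lut(x_min: float, step: float, frac_bits: int):
    """Precompute EXP_LUT[k] = exp(x_k) in fixed point, with x_k = x_min + k*step."""
    n = int(round(-x_min / step)) + 1  # sample points from x_min up to 0
    scale = 1 << frac_bits
    return [int(round(math.exp(x_min + k * step) * scale)) for k in range(n)]


EXP_LUT = build_exp_lut(X_MIN, STEP, FRAC_BITS)
```

Floating-point `math.exp` is used only in this offline phase; at run time the stored integers are read back without any floating-point operation.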

[0052] In one embodiment of the present invention, the preset approximation lookup table includes a preset exponent index lookup table and a preset exponent coefficient lookup table. Querying the preset approximation lookup table based on each element in the third feature vector sequence to obtain the exponential approximation includes: splitting each element into an index part and an exponent part; querying the preset exponent index lookup table based on the index part to determine the base exponent value and the correction exponent value for each element; querying the preset exponent coefficient lookup table based on the exponent part to obtain the correction coefficient for each element; and correcting the base exponent value based on the correction exponent value and the correction coefficient to obtain the exponential approximation.

[0053] Specifically, the preset approximation lookup table includes a preset exponent index lookup table and a preset exponent coefficient lookup table, meaning that a two-table structure can be used to determine the exponential approximation. For example, each element in the third feature vector sequence can first be split, by binary bits, into a high-order index part and a low-order exponent part. The high-order index part serves as the lookup table index, and the low-order exponent part is used to correct the value obtained from the lookup.

[0054] Furthermore, the preset exponent index lookup table can be queried based on the index part to determine the base exponent value and the correction exponent value corresponding to each element, and the preset exponent coefficient lookup table can be queried based on the exponent part to obtain the correction coefficient corresponding to each element. The base exponent value can then be corrected based on the correction exponent value and the correction coefficient to obtain the exponential approximation.

[0055] In a specific embodiment, any input value di in the third feature vector sequence is first split into an index part k and an exponent part (i.e., an interpolation part). The index part determines the position of the reference sampling point, and thereby the base (reference) exponent value E0 = EXP_LUT[k] and the correction exponent value E1 = EXP_LUT[k+1]. The base exponent value is then corrected by interpolation according to: Exp(di) ≈ E0 + α × (E1 − E0), where α is the interpolation coefficient obtained from the fractional part of the input, with a value ranging from 0 to 1. This formula means that when the input value lies between two sampling points, weighting the exponential values of the two sampling points yields a result closer to the true exponential function. The interpolation coefficient α is also expressed in a fixed-point format, such as the Q16.16 format.

[0056] In summary, the entire calculation process can be broken down into the following basic steps: 1. Calculate the difference ΔE = E1 − E0; 2. Calculate the interpolation product M = α × ΔE; 3. Calculate the final exponential approximation Exp(di) = E0 + M. As can be seen, the calculation process only includes subtraction, multiplication, and addition operations, and subtraction can be implemented using an adder. Therefore, the core calculation only involves multiplication and addition, without complex division or floating-point operations, resulting in high hardware implementation efficiency.
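The three steps above can be sketched in integer arithmetic as follows. This is an illustration under stated assumptions: inputs are Q16.16, a single table covers the hypothetical interval [-8, 0] with a step of 2^-3, the index part is the high-order bits of the offset from the lower bound, and α is taken from the remaining low-order bits. The names `exp_approx`, `STEP_BITS`, `X_MIN`, and `ALPHA_BITS` are not from the text.

```python
import math

FRAC_BITS = 16                      # assumed Q16.16 input and output
STEP_BITS = 3                       # hypothetical table step of 2**-3
X_MIN = -8                          # hypothetical lower bound of the table
ALPHA_BITS = FRAC_BITS - STEP_BITS  # low-order bits that form alpha

# Offline table: EXP_LUT[k] = exp(X_MIN + k * 2**-STEP_BITS) in Q16.16.
EXP_LUT = [
    int(round(math.exp(X_MIN + k / (1 << STEP_BITS)) * (1 << FRAC_BITS)))
    for k in range((-X_MIN << STEP_BITS) + 1)
]


def exp_approx(d: int) -> int:
    """Approximate exp(d) for a non-positive Q16.16 input d."""
    if d > 0:
        raise ValueError("input must be non-positive after the max shift")
    offset = max(d - (X_MIN << FRAC_BITS), 0)  # clamp inputs below the table
    k = offset >> ALPHA_BITS                   # index part (high-order bits)
    alpha = offset & ((1 << ALPHA_BITS) - 1)   # interpolation part, 0 <= alpha < 1
    e0 = EXP_LUT[k]
    if k + 1 >= len(EXP_LUT):                  # d == 0: last sample, nothing to blend
        return e0
    e1 = EXP_LUT[k + 1]
    delta = e1 - e0                            # step 1: difference
    m = (alpha * delta) >> ALPHA_BITS          # step 2: interpolation product
    return e0 + m                              # step 3: corrected value


d = int(round(-1.3 * (1 << FRAC_BITS)))
approx = exp_approx(d) / (1 << FRAC_BITS)
```

With this step size the linear interpolation error stays within roughly 2e-3 over the covered interval; a finer step or piecewise tables would tighten it at the cost of storage.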

[0057] In one embodiment of the present invention, calculating the reciprocal of the exponent sum to obtain the reciprocal value includes: performing mantissa normalization on the exponent sum to determine the most significant bit of the exponent sum; querying a preset reciprocal lookup table based on the most significant bit to determine an initial reciprocal value; and correcting the initial reciprocal value based on a preset iterative algorithm to obtain the reciprocal value.

[0058] Specifically, during the reciprocal calculation of the exponent sum, mantissa normalization can be performed on the accumulated exponent sum, including but not limited to normalizing the exponent sum to a fixed, smaller interval through data shifting (e.g., ...), and its most significant bit, that is, the position of the first non-zero bit, can be determined.

[0059] Furthermore, the most significant bit can be used as an index to query the preset reciprocal lookup table and determine the initial reciprocal value.

[0060] Furthermore, the initial reciprocal value can be corrected based on a preset iterative algorithm, including but not limited to correcting the initial reciprocal value by using two rounds of Newton iteration, so that the initial reciprocal value approximates the true reciprocal value, thereby obtaining the reciprocal value.

[0061] It is understandable that by reducing the size of the lookup table through normalization, improving the calculation speed through table lookup, and ensuring the calculation accuracy through iteration, division operations can be eliminated, thereby significantly reducing hardware resource consumption and calculation latency.
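A division-free reciprocal along these lines can be sketched as follows. Several details are assumptions for illustration: the sum is Q16.16, the mantissa is normalized to [1, 2) with 30 fractional bits, and the seed table is indexed by the four mantissa bits below the leading one (the text describes indexing by the most significant bit). The correction uses the standard Newton step for a reciprocal, x ← x × (2 − m × x), applied twice.

```python
FRAC_BITS = 16  # assumed Q16.16 input and output
M_BITS = 30     # fractional bits of the normalized mantissa (assumption)
IDX_BITS = 4    # seed table size of 2**4 entries (assumption)

# Offline seed table: 1 / (midpoint of each mantissa bin), scaled by 2**M_BITS.
RECIP_LUT = [
    int(round((1 << M_BITS) / (1 + (i + 0.5) / (1 << IDX_BITS))))
    for i in range(1 << IDX_BITS)
]


def reciprocal(s: int) -> int:
    """Return an approximation of 1/s in Q16.16 for a positive Q16.16 sum s."""
    if s <= 0:
        raise ValueError("sum must be positive")
    p = s.bit_length() - 1  # position of the most significant bit
    # Normalize by shifting so that the mantissa m represents a value in [1, 2).
    m = s << (M_BITS - p) if p <= M_BITS else s >> (p - M_BITS)
    idx = (m >> (M_BITS - IDX_BITS)) & ((1 << IDX_BITS) - 1)
    x = RECIP_LUT[idx]      # initial reciprocal value
    for _ in range(2):      # two Newton iterations, multiply and subtract only
        t = (m * x) >> M_BITS
        x = (x * ((2 << M_BITS) - t)) >> M_BITS
    # Undo the normalization shift and convert back to Q16.16.
    shift = p + M_BITS - 2 * FRAC_BITS
    return x >> shift if shift >= 0 else x << -shift


r = reciprocal(int(3.7 * (1 << FRAC_BITS)))
```

The seed is accurate to about 3%, and each Newton step squares the relative error, so two steps bring it to the order of 1e-6 before the final truncation to Q16.16.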

[0062] In one embodiment of the present invention, accumulating the exponential approximations to obtain the exponent sum further includes: obtaining in real time, during the accumulation, the most significant bit of a first exponent sum, i.e., the running sum of the exponential approximations accumulated so far; when the most significant bit exceeds a preset significant bit threshold, scaling the first exponent sum and the exponential approximations not yet accumulated based on a preset scaling factor; and determining the exponent sum based on the accumulated result of the scaled first exponent sum and the scaled exponential approximations.

[0063] Specifically, during the process of accumulating the exponent approximations to obtain the exponent sum, a dynamic overflow scaling protection mechanism is also set up to avoid the problem of overflow of the accumulated value. For example, firstly, the most significant bit of the first exponent sum of the exponent approximations involved in the accumulation calculation can be obtained in real time, that is, after each accumulation is completed, the most significant bit of the current accumulated value is checked.

[0064] Furthermore, if the highest significant bit detected exceeds the preset significant bit threshold, it can be determined that the current first exponent sum has a risk of numerical overflow. At this time, the first exponent sum can be scaled based on the preset scaling factor. At the same time, in order to ensure that all data participating in the accumulation are at the same scale, the exponent approximations that do not participate in the accumulation can be synchronously scaled. Thus, the exponent sum is determined based on the accumulated result of the scaled first exponent sum and the scaled exponent approximations.
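The dynamic overflow scaling mechanism can be sketched as follows. The threshold of bit 30, the scaling factor of 1/2 (a right shift by one), and the function name `accumulate_with_scaling` are hypothetical choices; the text only states that a preset threshold and a preset scaling factor are used.

```python
def accumulate_with_scaling(values, msb_threshold=30, scale_shift=1):
    """Accumulate non-negative fixed-point values with overflow protection.

    Returns (acc, total_shift); the true sum is approximately acc << total_shift.
    """
    acc = 0
    total_shift = 0
    for v in values:
        # Values not yet accumulated are brought to the current scale first.
        acc += v >> total_shift
        # After each accumulation, check the most significant bit.
        if acc.bit_length() - 1 > msb_threshold:
            acc >>= scale_shift           # scale the running sum
            total_shift += scale_shift    # remember the scale for later values
    return acc, total_shift


# Eight values of 1.0 in Q16.16 with a deliberately low threshold.
acc, shift = accumulate_with_scaling([1 << 16] * 8, msb_threshold=17)
```

Because the running sum and the remaining terms share the same scale, the ratio of each term to the sum is preserved as long as the same shift is applied to the terms used in the final normalization.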

[0065] In specific embodiments, an extended bit-width accumulation method can also be used when accumulating the exponential approximations. For example, if the module outputting the exponential approximation has an output bit width of 16 bits, the accumulation register can be extended to 32 bits or 40 bits. With a 16-bit exponent value and a 32-bit accumulation register, the accumulation is: S = S + Ei′, where S is the extended bit-width accumulation register and Ei′ is Ei sign-extended or zero-extended before entering the accumulation register. It can be understood that with extended bit-width accumulation, numerical overflow during accumulation can be avoided even when the number of values to be accumulated is large.
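The headroom gained by extending the register can be checked with a short sketch. It assumes unsigned (zero-extended) 16-bit values and models the register by masking; the helper names are illustrative.

```python
VAL_BITS = 16  # width of each exponential approximation
ACC_BITS = 32  # width of the extended accumulation register


def max_terms_without_overflow(val_bits: int, acc_bits: int) -> int:
    """Number of maximal val_bits-wide values that fit in an acc_bits register."""
    return ((1 << acc_bits) - 1) // ((1 << val_bits) - 1)


def accumulate(values, acc_bits: int = ACC_BITS) -> int:
    """S = S + Ei' with Ei' zero-extended; the mask models the register width."""
    mask = (1 << acc_bits) - 1
    s = 0
    for e in values:
        s = (s + e) & mask
    return s


limit = max_terms_without_overflow(VAL_BITS, ACC_BITS)
total = accumulate([0xFFFF] * limit)
```

Under these assumptions a 32-bit register absorbs 65537 worst-case 16-bit terms before it can wrap, which is why widening the accumulator is often sufficient on its own for moderate sequence lengths.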

[0066] In summary, the normalized exponential function calculation method for text image processing according to embodiments of the present invention, by converting the first feature vector sequence corresponding to the text image to be processed into fixed-point number form, can reduce the dependence on floating-point arithmetic units during normalization calculation, avoid floating-point exponentiation and division operations, reduce computational complexity, latency and power consumption, and improve computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, and thus helping to improve the accuracy and stability of text image processing.

[0067] A further embodiment of the present invention provides a normalized exponential function calculation device 100 for text image processing. As shown in Figure 2, the normalized exponential function calculation device 100 for text image processing includes: an acquisition module 110, a conversion module 120, a processing module 130, a first calculation module 140, a second calculation module 150, a third calculation module 160, and a determination module 170.

[0068] Specifically, the acquisition module 110 is used to acquire the first feature vector sequence corresponding to the text image to be processed; the conversion module 120 is used to convert the first feature vector sequence into a fixed-point number form to obtain the second feature vector sequence; the processing module 130 is used to perform maximum value search and translation processing on the second feature vector sequence to obtain the third feature vector sequence; the first calculation module 140 is used to perform exponential approximation calculation on the third feature vector sequence to obtain an exponential approximation value; the second calculation module 150 is used to accumulate the exponential approximations to obtain the exponential sum; the third calculation module 160 is used to perform reciprocal calculation on the exponential sum to obtain the reciprocal value; and the determination module 170 is used to determine the output value of the normalized exponential function based on the exponential approximation and the reciprocal value.

[0069] In some embodiments, the acquisition module 110, conversion module 120, processing module 130, first calculation module 140, second calculation module 150, third calculation module 160, and determination module 170 are configured to be cascaded in a pipeline manner. Pipeline cascading includes: when the current level module outputs the result data of the current cycle to the next level module, the current level module receives and processes the input data of the next cycle.

[0070] Specifically, in the normalized exponential function calculation process, the acquisition module 110, conversion module 120, processing module 130, first calculation module 140, second calculation module 150, third calculation module 160, and determination module 170 are cascaded in a pipeline manner. That is, while the current-level module outputs the result data of the current cycle to the next-level module, it can already receive and process the input data of the next cycle. For example, after the acquisition module 110 acquires the first feature vector sequence corresponding to the text image to be processed in the current cycle, it sends the first feature vector sequence to the conversion module 120 for processing, and at the same time receives the text image to be processed in the next cycle. This can significantly improve the throughput of the normalized exponential function operation and reduce the average latency per calculation. Moreover, the module combination formed by this pipelined cascade is well suited to integration into neural network inference accelerators or dedicated computing chips, and has strong versatility.
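The effect of pipelined cascading on throughput can be illustrated with a small cycle-level simulation. It is a software model only: each stage is a plain function taking one cycle, `None` marks an empty pipeline register, and the three arithmetic stages are placeholders rather than the modules 110 to 170.

```python
def run_pipeline(stages, inputs):
    """Simulate a linear pipeline; return (outputs, number of cycles used)."""
    regs = [None] * len(stages)  # output register of each stage
    pending = list(inputs)
    outputs, cycles = [], 0
    while pending or any(r is not None for r in regs[:-1]):
        # Update from the last stage backwards so that each stage consumes
        # the value its predecessor produced in the previous cycle.
        for i in range(len(stages) - 1, 0, -1):
            regs[i] = None if regs[i - 1] is None else stages[i](regs[i - 1])
        # The first stage accepts a new input while later stages are busy.
        regs[0] = stages[0](pending.pop(0)) if pending else None
        if regs[-1] is not None:
            outputs.append(regs[-1])
        cycles += 1
    return outputs, cycles


# Three placeholder stages and three inputs.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
outputs, cycles = run_pipeline(stages, [1, 2, 3])
```

With K stages and N inputs the model finishes in N + K − 1 cycles, compared with N × K cycles if each input had to leave the last stage before the next one could enter the first.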

[0071] In some embodiments, when obtaining the first feature vector sequence corresponding to the text image to be processed, the acquisition module 110 is specifically used to: input the text image to be processed into a preset network encoder to extract the multi-scale feature map corresponding to the text image to be processed; and flatten the multi-scale feature map into one dimension to obtain the first feature vector sequence.

[0072] In some embodiments, when performing maximum value search and translation processing on the second feature vector sequence to obtain the third feature vector sequence, the processing module 130 is used to: determine the maximum value among the elements of the second feature vector sequence; and subtract the maximum value from each element in the second feature vector sequence to obtain the third feature vector sequence.

[0073] In some embodiments, when performing exponential approximation calculation on the third feature vector sequence to obtain an exponential approximation value, the first calculation module 140 is used to: query a preset approximation value lookup table based on each element in the third feature vector sequence to obtain the exponential approximation value, wherein different elements correspond to different preset numerical ranges, and different preset numerical ranges correspond to different preset approximation value lookup tables.

[0074] In some embodiments, the preset approximation lookup table includes a preset exponent index lookup table and a preset exponent coefficient lookup table. When querying the preset approximation lookup table based on each element in the third feature vector sequence to obtain an exponential approximation, the first calculation module 140 is configured to: split each element into an index part and an exponent part; query the preset exponent index lookup table based on the index part to determine the base exponent value and the correction exponent value corresponding to each element; query the preset exponent coefficient lookup table based on the exponent part to obtain the correction coefficient corresponding to each element; and correct the base exponent value based on the correction exponent value and the correction coefficient to obtain the exponential approximation.

[0075] In some embodiments, when performing reciprocal calculation on the exponent sum to obtain the reciprocal value, the third calculation module 160 is used to: perform mantissa normalization on the exponent sum to determine the most significant bit of the exponent sum; query a preset reciprocal lookup table based on the most significant bit to determine the initial reciprocal value; and correct the initial reciprocal value based on a preset iterative algorithm to obtain the reciprocal value.

[0076] In some embodiments, during the process of accumulating the exponential approximations to obtain the exponential sum, the second calculation module 150 is further configured to: obtain in real time the most significant bit of the first exponential sum of the exponential approximations participating in the accumulation calculation; when the most significant bit exceeds a preset significant bit threshold, scale the first exponential sum and the exponential approximations not participating in the accumulation based on a preset scaling factor; and determine the exponential sum based on the accumulated result of the scaled first exponential sum and the scaled exponential approximations.

[0077] The normalized exponential function calculation device 100 for text image processing according to the present invention reduces the dependence on floating-point arithmetic units during normalization calculation by converting the first feature vector sequence corresponding to the text image to be processed into fixed-point number form, avoiding floating-point exponentiation and division operations, reducing computational complexity, latency and power consumption, and improving computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, and thus helping to improve the accuracy and stability of text image processing.

[0078] Further embodiments of the present invention disclose an electronic device, comprising: a normalized exponential function calculation device for text image processing as described in the second aspect embodiment above, or a processor, a memory, and a normalized exponential function calculation program for text image processing stored in the memory and executable on the processor, wherein the normalized exponential function calculation program for text image processing, when executed by the processor, implements the normalized exponential function calculation method for text image processing as described in any of the first aspect embodiments above.

[0079] According to the electronic device of the present invention, by converting the first feature vector sequence corresponding to the text image to be processed into a fixed-point number form, the reliance on floating-point arithmetic units can be reduced during normalization calculation, avoiding floating-point exponentiation and division operations, reducing computational complexity, latency and power consumption, and improving computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, and thus helping to improve the accuracy and stability of text image processing.

[0080] A further embodiment of the present invention discloses a computer-readable storage medium storing a normalized exponential function calculation program for text image processing. When executed by a processor, the normalized exponential function calculation program for text image processing implements the normalized exponential function calculation method for text image processing as described in any of the above embodiments of the present invention.

[0081] According to an embodiment of the present invention, when the normalized exponential function calculation program for text image processing stored thereon is executed by a processor, it reduces the dependence on floating-point arithmetic units during normalization calculation by converting the first feature vector sequence corresponding to the text image to be processed into fixed-point number form, avoiding floating-point exponentiation and division operations, reducing computational complexity, latency, and power consumption, and improving computational accuracy. At the same time, by performing maximum value search and translation processing on the second feature vector sequence, numerical overflow problems can be effectively avoided, thereby ensuring the accuracy of exponential approximation calculation, exponential approximation accumulation calculation, and reciprocal calculation, thereby improving the accuracy and stability of the normalized exponential function output value, and thus contributing to improving the accuracy and stability of text image processing.

[0082] In the description of this specification, references to terms such as "one embodiment," "some embodiments," "illustrative embodiment," "example," "specific example," or "some examples," etc., refer to specific features, structures, materials, or characteristics described in connection with that embodiment or example, which are included in at least one embodiment or example of the present invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example.

[0083] Although embodiments of the invention have been shown and described, those skilled in the art will understand that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. A method for calculating a normalized exponential function for text image processing, characterized by comprising: obtaining a first feature vector sequence corresponding to the text image to be processed; converting the first feature vector sequence into a fixed-point number form to obtain a second feature vector sequence; performing maximum value search and translation processing on the second feature vector sequence to obtain a third feature vector sequence; performing exponential approximation calculation on the third feature vector sequence to obtain an exponential approximation value; accumulating the exponential approximation values to obtain an exponential sum; performing reciprocal calculation on the exponential sum to obtain a reciprocal value; and determining the output value of the normalized exponential function based on the exponential approximation value and the reciprocal value.

2. The method for calculating a normalized exponential function for text image processing according to claim 1, wherein the step of obtaining the first feature vector sequence corresponding to the text image to be processed includes: inputting the text image to be processed into a preset network encoder to extract the multi-scale feature map corresponding to the text image to be processed; and flattening the multi-scale feature map into one dimension to obtain the first feature vector sequence.

3. The method for calculating a normalized exponential function for text image processing according to claim 1, wherein the step of performing maximum value search and translation processing on the second feature vector sequence to obtain the third feature vector sequence includes: determining the maximum value among the elements of the second feature vector sequence; and subtracting the maximum value from each element in the second feature vector sequence to obtain the third feature vector sequence.

4. The method for calculating a normalized exponential function for text image processing according to claim 1, wherein the step of performing exponential approximation calculation on the third feature vector sequence to obtain an exponential approximation value includes: querying a preset approximation lookup table based on each element in the third feature vector sequence to obtain the exponential approximation value, wherein different elements correspond to different preset numerical ranges, and different preset numerical ranges correspond to different preset approximation lookup tables.

5. The method for calculating a normalized exponential function for text image processing according to claim 4, wherein the preset approximation lookup table includes a preset exponent index lookup table and a preset exponent coefficient lookup table; and the step of querying a preset approximation lookup table based on each element in the third feature vector sequence to obtain the exponential approximation value includes: splitting each element into an index part and an exponent part; querying the preset exponent index lookup table based on the index part to determine the base exponent value and the correction exponent value corresponding to each element; querying the preset exponent coefficient lookup table based on the exponent part to obtain the correction coefficient corresponding to each element; and correcting the base exponent value based on the correction exponent value and the correction coefficient to obtain the exponential approximation value.

6. The method for calculating a normalized exponential function for text image processing according to claim 1, wherein the step of performing reciprocal calculation on the exponential sum to obtain the reciprocal value includes: performing mantissa normalization on the exponential sum to determine the most significant bit of the exponential sum; querying a preset reciprocal lookup table based on the most significant bit to determine an initial reciprocal value; and correcting the initial reciprocal value based on a preset iterative algorithm to obtain the reciprocal value.

7. The method for calculating a normalized exponential function for text image processing according to claim 1, wherein the process of accumulating the exponential approximation values to obtain the exponential sum further includes: obtaining in real time, during the accumulation calculation, the most significant bit of a first exponential sum of the exponential approximation values participating in the accumulation; when the most significant bit exceeds a preset significant bit threshold, scaling the first exponential sum and the exponential approximation values not yet included in the accumulation based on a preset scaling factor; and determining the exponential sum based on the accumulated result of the scaled first exponential sum and the scaled exponential approximation values.

8. A normalized exponential function calculation device for text image processing, characterized by comprising: an acquisition module, used to acquire a first feature vector sequence corresponding to the text image to be processed; a conversion module, used to convert the first feature vector sequence into a fixed-point number form to obtain a second feature vector sequence; a processing module, used to perform maximum value search and translation processing on the second feature vector sequence to obtain a third feature vector sequence; a first calculation module, used to perform exponential approximation calculation on the third feature vector sequence to obtain an exponential approximation value; a second calculation module, used to accumulate the exponential approximation values to obtain an exponential sum; a third calculation module, used to perform reciprocal calculation on the exponential sum to obtain a reciprocal value; and a determination module, used to determine the output value of the normalized exponential function based on the exponential approximation value and the reciprocal value.

9. The normalized exponential function computing apparatus for text image processing according to claim 8, wherein, The acquisition module, the conversion module, the processing module, the first calculation module, the second calculation module, the third calculation module, and the determination module are configured to be cascaded in a pipeline manner; The pipeline cascading method includes: when the current level module outputs the result data of the current cycle to the next level module, the current level module receives and processes the input data of the next cycle.

10. An electronic device, characterized by comprising: the normalized exponential function calculation device for text image processing according to claim 8 or 9; or a processor, a memory, and a normalized exponential function calculation program for text image processing stored in the memory and executable on the processor, wherein the normalized exponential function calculation program for text image processing, when executed by the processor, implements the method for calculating a normalized exponential function for text image processing according to any one of claims 1-7.