Image processing method and device, electronic equipment and readable storage medium
By using average pooling and compensation operators in the neural network model for hardware acceleration, the problem of low processor efficiency caused by the lack of hardware acceleration for the average operator is solved, and faster image processing speed is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ARM TECH CHINA CO LTD
- Filing Date
- 2023-03-27
- Publication Date
- 2026-06-30
AI Technical Summary
In existing technologies, the average operator in neural network models suffers from slow image processing speed due to the lack of hardware acceleration, and cannot effectively improve the processor's operating efficiency.
The average operator is replaced by an average pooling operator and a compensation operator. Hardware acceleration is achieved through multiple parallel processing units. The average pooling operator is used to calculate the mean, and the compensation operator is used to perform numerical compensation to ensure consistent results.
It significantly improves the inference speed of neural network models and achieves faster image processing speed through hardware acceleration.
Figure CN116402101B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, specifically to an image processing method, apparatus, electronic device, and computer-readable storage medium.
Background Technology
[0002] Artificial intelligence (AI) technology is being applied more and more widely in people's production and daily life, for example, it can be applied to image recognition and processing. In the process of image recognition, the image can be input into a neural network model to obtain the corresponding image feature data, and then the image feature data can be used for cumulative calculation, mean calculation, cumulative multiplication calculation, etc., to obtain the required target features, and then subsequent recognition processing can be performed based on the target features.
[0003] The averaging operator is a function used in neural network models to calculate the average of input feature data along a specified dimension during image processing. Essentially, when a neural network model includes an averaging operator, it can quantize the feature map data and then calculate the average of the quantized data. However, this averaging process generates a large amount of data flow, resulting in significant data transfer on the processor running the neural network and slowing down image processing. Therefore, ensuring sufficient image processing speed for the processor while running a neural network incorporating an averaging operator is a crucial problem that needs to be addressed.
Summary of the Invention
[0004] This application provides an image processing method, apparatus, electronic device, and computer-readable storage medium, which solves the technical problem that the lack of hardware acceleration in the averaging operator leads to low inference speed in neural network models.
[0005] In a first aspect, some embodiments of this application provide an image processing method applied to an electronic device running a neural network model, where the electronic device includes multiple processing units. The method includes: acquiring an image to be processed; acquiring, by the neural network model, a feature map during the processing of the image to be processed; detecting, by the neural network model, that an averaging operator calculation is needed during the processing of the feature map; and invoking an average pooling operator and a compensation operator to perform the averaging operator operation to obtain a target result. Invoking the average pooling operator and the compensation operator to perform the averaging operator operation includes: performing the mean calculation of the average pooling operator through multiple parallel processing units. Thus, the algorithmic function of the averaging operator is implemented through an average pooling operator capable of hardware-accelerated computation, thereby effectively improving the inference speed of a neural network model that includes the averaging operator.
[0006] It is understandable that during the image processing process, neural network models can calculate feature maps to represent the features of the image, facilitating subsequent processing. For example, using feature maps to represent features in image data makes it easier to compare similarity with other images later.
[0007] For example, if it is detected that the feature map needs to be calculated using the averaging operator, the average pooling operator and the compensation operator can be invoked to execute the operation logic of the averaging operator, obtaining the target result consistent with the averaging operator. Since the mean calculation of the average pooling operator can be performed in parallel by invoking multiple hardware processing units that execute in parallel, compared to the averaging operator which cannot invoke multiple hardware processing units that execute in parallel, the average pooling operator combined with the compensation operator can effectively accelerate the processor's inference speed for neural network models.
[0008] In some possible implementations of the first aspect above, the above-mentioned invocation of the average pooling operator and the compensation operator to perform averaging operations to obtain the target result includes: invoking the average pooling operator to determine the fixed-point data and quantization parameters of the feature map based on the feature map, wherein the quantization parameters are used to indicate the mapping relationship between the floating-point data and the fixed-point data of the feature map; invoking the average pooling operator to determine the corresponding mean result based on the fixed-point data of the feature map; and invoking the compensation operator to perform numerical compensation processing on the mean result based on the quantization parameters to obtain the target result.
[0009] For example, the average pooling operator described above can map the feature map from the floating-point domain to the fixed-point domain, converting the floating-point data of the feature map into fixed-point data. During this process, the average pooling operator can obtain quantization parameters, such as the zero-point offset of the input feature map data, the linear mapping coefficients representing the scaling ratio, and the zero-point offset of the output feature map data. However, the average pooling operator does not use the quantization parameters for mean calculation. Instead, a compensation operator obtains the quantization parameters from the average pooling operator and performs numerical compensation on the mean result calculated by the average pooling operator based on the quantization parameters. This ensures that combining the average pooling operator and the compensation operator directly yields a calculation result consistent with the average operator. Since the average pooling operator can invoke parallel processing units, it can effectively improve the processor's inference speed for neural network models compared to the average operator, which cannot invoke parallel processing units.
[0010] In some possible implementations of the first aspect above, the above-mentioned invocation of the average pooling operator to determine the feature map fixed-point data and quantization parameters based on the feature map includes: invoking the average pooling operator to group the feature map fixed-point data based on preset conditions to obtain grouped feature map fixed-point data; and performing mean calculation based on the grouped feature map fixed-point data to determine the corresponding mean result.
[0011] Understandably, to reduce the computational load, the average pooling operator can group the feature map fixed-point data based on preset conditions to obtain grouped feature map fixed-point data. For example, the data can be grouped by a specified dimension, so that the mean calculation applies only to the feature map fixed-point data within that dimension. The average pooling operator then performs the mean calculation on the grouped feature map fixed-point data to determine the corresponding mean result, for example as (1/n) * Σ_{i=1}^{n} x_i^q, where n is the number of fixed-point values in the feature map and x_i^q is the i-th fixed-point value in the feature map.
[0012] In some possible implementations of the first aspect above, the mean calculation of the average pooling operator includes addition and division calculations. Furthermore, the mean calculation based on the grouped feature map fixed-point data to determine the corresponding mean result includes: performing addition calculations on the grouped feature map fixed-point data through multiple parallel first processing units to determine the corresponding first result; and performing division calculations on the grouped feature map fixed-point data through multiple parallel second processing units to determine the corresponding mean result.
[0013] It's understandable that to calculate the mean, all the fixed-point data from the feature maps in each group can be added together, and then divided by the number of fixed-point data from the feature maps in that group to obtain the mean. Therefore, the mean calculation of the average pooling operator mentioned above includes both addition and division calculations, and both addition and division calculations can call the corresponding parallel execution hardware processing units to achieve batch calculation of addition and division, i.e., batch calculation of the mean. Therefore, compared to average operators that cannot call parallel execution processing units, the average pooling operator can effectively improve the processor's inference speed for neural network models.
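The two-stage mean described above (a batched addition followed by a batched division) can be sketched as follows. This is only an illustration: a thread pool stands in for the parallel hardware processing units, and the function name `grouped_mean` and the `workers` parameter are hypothetical, not taken from this application.

```python
from concurrent.futures import ThreadPoolExecutor


def grouped_mean(groups, workers=4):
    """Return the mean of each group: a batched addition, then a batched division."""
    sizes = [len(group) for group in groups]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Addition step: each group is summed independently of the others.
        sums = list(pool.map(sum, groups))
        # Division step: each sum is divided by the size of its own group.
        means = list(pool.map(lambda pair: pair[0] / pair[1], zip(sums, sizes)))
    return means


groups = [[1, 2, 3, 4], [10, 20, 30, 40], [5, 5, 5, 5]]
print(grouped_mean(groups))  # [2.5, 25.0, 5.0]
```

Because every group is independent, neither stage needs results from another group, which is what makes batch execution possible.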
[0014] In some possible implementations of the first aspect above, the above-mentioned invocation of the average pooling operator to group the feature map fixed-point data based on preset conditions to obtain grouped feature map fixed-point data includes: invoking the average pooling operator to group the feature map fixed-point data based on a specified dimension to obtain grouped feature map fixed-point data.
[0015] It is understandable that the average pooling operator can determine the range of feature map data for which mean calculation needs to be performed based on a specified dimension of the input, thereby achieving the desired mean calculation purpose. For example, when the specified dimension is one of the dimensions in a multidimensional matrix, since the feature map data within the specified dimension range corresponds to only one mean result, the mean calculation of the specified dimension can reduce the dimensionality of the feature map data by the specified dimension.
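As a small illustration of how averaging over a specified dimension removes that dimension, the sketch below averages a two-dimensional feature map along either axis. The helper `mean_along_axis` and the sample data are assumptions made for this example only.

```python
def mean_along_axis(matrix, axis):
    """Average a 2-D list along `axis`, removing that dimension from the result."""
    if axis == 1:
        # One mean per row: the column dimension disappears.
        return [sum(row) / len(row) for row in matrix]
    if axis == 0:
        # One mean per column: the row dimension disappears.
        return [sum(col) / len(col) for col in zip(*matrix)]
    raise ValueError("axis must be 0 or 1 for a 2-D input")


feature_map = [[1, 2, 3],
               [4, 5, 6]]
print(mean_along_axis(feature_map, axis=1))  # [2.0, 5.0]
print(mean_along_axis(feature_map, axis=0))  # [2.5, 3.5, 4.5]
```

In both calls the 2-by-3 input is reduced to a one-dimensional result, since all values within the specified dimension collapse to a single mean.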
[0016] It is understood that the calculation process for the specified dimensions will be explained in detail below, and will not be repeated here.
[0017] In some possible implementations of the first aspect described above, the compensation operator includes an accumulation operator, and the step of calling the compensation operator to perform numerical compensation processing on the mean result according to the quantization parameters to obtain the target result includes: calling the accumulation operator to obtain a preset tensor from a preset storage location, wherein the value in the preset tensor is a constant 0; performing an accumulation calculation on the preset tensor and the mean result to obtain an accumulation result consistent with the mean result; and performing numerical compensation processing on the accumulation result according to the quantization parameters to obtain the target result.
[0018] It is understood that in some embodiments, the compensation operator can be an accumulation operator, such as Eltwise Add. In the Eltwise Add operator, the input consists of two weight tensors, i.e., two high-dimensional matrices. The dimensions remain unchanged after adding the two weight tensors; only the values are added one by one, increasing the information content in each dimension. Since the inherent computational logic of the Eltwise Add operator requires two inputs, a preset tensor with a constant value of 0 can be stored in a preset storage location to prevent the inherent computational logic of the Eltwise Add operator from affecting the value and dimension of the mean result. Furthermore, the Eltwise Add operator can be used to compensate for the quantization parameters in the mean result, ensuring that the calculation result of the average pooling operator is consistent with the calculation result of the average operator. This allows the average pooling operator combined with the accumulation operator to completely replace the computational logic of the average operator.
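The role of the all-zero preset tensor can be checked with a minimal sketch. The function `eltwise_add` below is a plain-Python stand-in for an element-wise addition operator, not the actual Eltwise Add implementation, and the sample values are made up.

```python
def eltwise_add(a, b):
    """Element-wise addition of two equally shaped 1-D tensors."""
    if len(a) != len(b):
        raise ValueError("both inputs must have the same shape")
    return [x + y for x, y in zip(a, b)]


mean_result = [12, 7, 30]
# Preset tensor: same shape as the mean result, every value is the constant 0.
zero_tensor = [0] * len(mean_result)

accumulated = eltwise_add(mean_result, zero_tensor)
print(accumulated == mean_result)  # True
```

Adding zero satisfies the operator's two-input requirement while leaving both the values and the shape of the mean result unchanged.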
[0019] Furthermore, since the average pooling operator can invoke parallel processing units, it can effectively improve the processor's inference speed for neural network models compared to the average operator, which cannot invoke parallel processing units.
[0020] In some possible implementations of the first aspect above, the above-mentioned call to the accumulation operator to accumulate the preset tensor and the mean result includes: performing the accumulation calculation of the preset tensor and the mean result through multiple parallel execution third processing units.
[0021] It is understandable that the aforementioned accumulation operator can also be hardware-accelerated, meaning it can invoke the parallel processing unit corresponding to the accumulation operator to perform the accumulation calculation of the preset tensor and the mean result. Therefore, using the average pooling operator, which is hardware-accelerated, in combination with the accumulation operator to calculate feature map data is still faster than using the average operator, which cannot invoke the parallel processing unit. This effectively improves the processor's inference speed for neural network models.
[0022] In some possible implementations of the first aspect described above, the quantization parameters include the zero-point offset of the mean result, the linear mapping coefficient, and the zero-point offset of the fixed-point data of the output feature map. Furthermore, the numerical compensation processing of the accumulated result based on the quantization parameters to obtain the target result includes: adding the accumulated result and the zero-point offset of the mean result to obtain a first compensation result; multiplying the first compensation result and the linear mapping coefficient to obtain a second compensation result; and subtracting the zero-point offset of the fixed-point data of the output feature map from the second compensation result to obtain the target result.
[0023] It is understandable that, compared to the average operator, the average pooling operator only lacks the following three quantization parameters:
[0024] (1) The zero-point offset of the mean result;
[0025] (2) The linear mapping coefficient;
[0026] (3) The zero-point offset of the fixed-point data of the output feature map.
[0027] Therefore, the three quantization parameters mentioned above can be compensated into the mean result using the accumulation operator. Since the accumulation result is consistent with the mean result, the zero-point offset of the mean result can be added to the accumulation result, then multiplied by the linear mapping coefficient, and finally the zero-point offset of the fixed-point data of the output feature map can be subtracted to obtain the target result consistent with the calculation result of the averaging operator. Furthermore, by combining the average pooling operator and the accumulation operator, the calculation logic of the averaging operator is completely replaced.
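The three compensation steps can be written out as a short sketch, under the assumption that they are applied element by element and in exactly the order stated above. The function name `compensate`, its parameter names, and the sample numbers are hypothetical; no rounding or clamping is applied because the text does not specify any.

```python
def compensate(values, mean_zero_point, linear_scale, output_zero_point):
    """Apply the three compensation steps to each value of the accumulated result."""
    compensated = []
    for value in values:
        first = value + mean_zero_point      # add the zero-point offset of the mean result
        second = first * linear_scale        # multiply by the linear mapping coefficient
        target = second - output_zero_point  # subtract the output zero-point offset
        compensated.append(target)
    return compensated


accumulated = [10, 20]
print(compensate(accumulated, mean_zero_point=2, linear_scale=0.5, output_zero_point=1))
# [5.0, 10.0]
```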
[0028] In some possible implementations of the first aspect above, the compensation operator includes a product operator, and the above-mentioned invoking the compensation operator to perform numerical compensation processing on the mean result according to the quantization parameters to obtain the target result includes: invoking the product operator to perform product calculation on the mean result according to a preset linear component to obtain a product result consistent with the mean result, wherein the value in the preset linear component is a constant 1; and performing numerical compensation processing on the product result according to the quantization parameters to obtain the target result.
[0029] It is understood that in some embodiments, the compensation operator can be a product operator, such as an operator containing the Leaky ReLU activation function. Since the inherent computational logic of the product operator multiplies the negative values in the input by a preset linear component, the value of the preset linear component is set to a constant 1, thereby preventing the inherent computational logic of the Leaky ReLU activation function from affecting the value and dimension of the mean result. Furthermore, the product operator can be used to compensate for the quantization parameters in the mean result, ensuring that the calculation result of the average pooling operator is consistent with the calculation result of the average operator. This allows the average pooling operator combined with the product operator to completely replace the computational logic of the average operator.
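A minimal sketch shows why a linear component of 1 turns Leaky ReLU into a pass-through. The function `leaky_relu` is a plain-Python stand-in written for this example, not the operator used on the hardware.

```python
def leaky_relu(values, negative_slope):
    """Leaky ReLU: non-negative values pass through, negative values are scaled."""
    return [v if v >= 0 else v * negative_slope for v in values]


mean_result = [-8, 0, 15]

# With the linear component fixed at 1, negative values are multiplied by 1.
print(leaky_relu(mean_result, negative_slope=1))    # [-8, 0, 15]
# For contrast, a typical slope below 1 would alter the negative entry.
print(leaky_relu(mean_result, negative_slope=0.5))  # [-4.0, 0, 15]
```

Unlike the accumulation variant, no second input tensor has to be fetched; the constant is a parameter of the operator itself.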
[0030] Furthermore, since the product operator does not need to retrieve data from a preset storage path, it avoids the waiting time that the accumulation operator experiences when retrieving data. Therefore, compared to the average pooling operator combined with the accumulation operator mentioned above, the average pooling operator combined with the product operator can complete data compensation processing faster, further improving the processor's inference speed for neural network models.
[0031] In some possible implementations of the first aspect above, the above-mentioned invocation of the product operator to perform product calculation based on the preset linear component and the mean result includes: performing the product calculation of the preset linear component and the mean result through multiple parallel fourth processing units.
[0032] It is understandable that the aforementioned product operator can also be hardware-accelerated, meaning it can invoke the parallel processing unit corresponding to the product operator to perform the product calculation of the preset linear component and the mean result. Therefore, using the average pooling operator, which is hardware-accelerated, in combination with the product operator to calculate feature map data is still faster than using the average operator, which cannot invoke the parallel processing unit. This effectively improves the processor's inference speed for neural network models.
[0033] In some possible implementations of the first aspect described above, the quantization parameters include the zero-point offset of the mean result, the linear mapping coefficient, and the zero-point offset of the fixed-point data of the output feature map. Furthermore, the numerical compensation processing of the product result based on the quantization parameters to obtain the target result includes: adding the product result and the zero-point offset of the mean result to obtain a third compensation result; multiplying the third compensation result and the linear mapping coefficient to obtain a fourth compensation result; and subtracting the zero-point offset of the fixed-point data of the output feature map from the fourth compensation result to obtain the target result.
[0034] It is understandable that, compared to the average operator, the average pooling operator only lacks the following three quantization parameters:
[0035] (1) The zero-point offset of the mean result;
[0036] (2) The linear mapping coefficient;
[0037] (3) The zero-point offset of the fixed-point data of the output feature map.
[0038] Therefore, the three quantization parameters mentioned above can be compensated into the mean result using the product operator. Since the product result is consistent with the mean result, the zero-point offset of the mean result can be added to the product result, then multiplied by the linear mapping coefficient, and finally the zero-point offset of the fixed-point data of the output feature map can be subtracted to obtain the target result consistent with the calculation result of the averaging operator. Furthermore, by combining the average pooling operator and the product operator, the calculation logic of the averaging operator is completely replaced.
[0039] In a second aspect, some embodiments of this application also provide an image processing apparatus applied to an electronic device running a neural network model. The electronic device includes multiple processing units, and the apparatus includes: an image acquisition module for acquiring an image to be processed; a data processing module for acquiring feature maps during the processing of the image to be processed by the neural network model; an operator detection module for detecting the need for averaging operator calculation during the processing of the feature maps by the neural network model; and a computation module for calling an average pooling operator and a compensation operator to perform averaging operator calculations to obtain a target result. The calling of the average pooling operator and the compensation operator to perform averaging operator calculations includes: performing the mean calculation of the average pooling operator through multiple parallel processing units. Thus, by implementing the algorithmic function of the averaging operator through an average pooling operator capable of hardware-accelerated computation, the inference speed of the neural network model including the averaging operator can be effectively improved.
[0040] In a third aspect, some embodiments of this application also provide an electronic device, including: one or more processors; and one or more memories; wherein the one or more memories store one or more programs, and when the one or more programs are executed by the one or more processors, the electronic device performs the image processing method provided in the first aspect and its various possible implementations.
[0041] In a fourth aspect, some embodiments of this application also provide a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the image processing method provided in the first aspect and its various possible implementations.
[0042] In a fifth aspect, some embodiments of this application also provide a computer program product, characterized in that it includes a computer program / instructions that, when executed by a processor, implement the image processing method provided in the first aspect and its various possible implementations described above.
Attached Figure Description
[0043] Figure 1 shows a schematic diagram of a scenario in which a neural network model is used for face recognition;
[0044] Figure 2A shows a schematic flowchart of an image processing method according to some embodiments of this application;
[0045] Figure 2B shows schematic diagrams illustrating operator substitutions according to some embodiments of this application;
[0046] Figure 3 shows a schematic flowchart of a method for image processing using an average pooling operator combined with a compensation operator according to some embodiments of this application;
[0047] Figure 4 shows a schematic diagram of a data structure for feature map data according to some embodiments of this application;
[0048] Figure 5 shows a schematic diagram of a specific implementation process for numerical compensation of mean results using an accumulation operator according to an embodiment of this application;
[0049] Figure 6 shows a schematic diagram of a specific implementation process for numerical compensation of mean results using a product operator according to an embodiment of this application;
[0050] Figure 7 shows a block diagram of an electronic device according to some embodiments of this application.
Detailed Implementation
[0051] The illustrative embodiments of this application include, but are not limited to, image processing methods, electronic devices, and computer-readable storage media.
[0052] To facilitate understanding of the solutions in the embodiments of this application by those skilled in the art, some concepts and terms involved in the embodiments of this application will be explained below.
[0053] (1) Floating point
[0054] Floating-point numbers are a way of representing numbers in computers, using scientific notation. For example, the decimal fraction 8.345 can be represented in scientific notation in any of the following ways: 8.345 = 8.345 * 10^0; or 8.345 = 83.45 * 10^-1; or 8.345 = 834.5 * 10^-2. Floating-point numbers include 32-bit single-precision floating-point numbers, 64-bit double-precision floating-point numbers, and 32-bit standard floating-point numbers.
[0055] (2) Fixed-point number
[0056] Fixed-point numbers are a way of representing numbers in computers, used to represent both integers and decimals. The result of a fixed-point number can be expressed in binary as both the integer and fractional parts. In fixed-point numbers, the position of the decimal point and the number of bits used for the integer and fractional parts can be set as needed. It's understood that fixed-point numbers are typically used to represent integers, while floating-point numbers are usually used for high-precision decimals. For the same number, fixed-point numbers occupy fewer bits compared to floating-point numbers. For example, the fixed-point number 1.5 (D) = 00001 100 (B) occupies only 8 bits.
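The 8-bit example above can be reproduced with a short sketch, assuming an unsigned format with 5 integer bits and 3 fractional bits (the split implied by 00001 100). The helper names are hypothetical.

```python
def to_fixed_point(value, fractional_bits=3, total_bits=8):
    """Encode a non-negative number as an unsigned fixed-point bit string."""
    raw = round(value * (1 << fractional_bits))  # shift the binary point right
    if not 0 <= raw < (1 << total_bits):
        raise OverflowError("value does not fit in the chosen format")
    return format(raw, f"0{total_bits}b")


def from_fixed_point(bits, fractional_bits=3):
    """Decode an unsigned fixed-point bit string back to a number."""
    return int(bits, 2) / (1 << fractional_bits)


print(to_fixed_point(1.5))           # 00001100
print(from_fixed_point("00001100"))  # 1.5
```

Moving the binary point changes the trade-off: more fractional bits give finer resolution, fewer give a larger representable range.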
[0057] (3) Neural network quantization
[0058] Neural network quantization is used to convert floating-point operations in a neural network model into fixed-point operations, thereby reducing the size of the neural network and the amount of memory it requires. Quantization maps the weights or activation functions of a deep learning model from 32-bit floating-point numbers to a lower bit depth data representation. For example, most parameters in a neural network model are float32; by converting floating-point operations to fixed-point operations, they can be mapped to int8 or even fewer bits. This effectively reduces the model size and significantly improves instruction processing efficiency, accelerating the model's inference speed.
[0059] (4) Symmetric quantization
[0060] The principle of symmetric quantization: a floating-point number X in the range [X_min, X_max] is mapped, through the quantization parameters, to a fixed-point number Q in the range [Q_min, Q_max], as shown in the following formula (1.1):
[0061] Q = X * S (Formula 1.1)
[0062] The quantization parameters include the scaling factor S (scale), also known as the linear mapping coefficient. Understandably, for symmetric quantization, once the scaling factor S in the quantization parameters is determined, the quantized fixed-point number Q can be obtained from the input floating-point number X in combination with the quantization parameters.
[0063] It can be understood that the range of the fixed-point number Q corresponds to the quantization data type (i.e., the data type corresponding to the fixed-point number Q), therefore the zero-point offset is 0. The quantization data types include: int32, int16, int8, int4, uint32, uint16, uint8, or uint4, etc.
[0064] In the case of symmetric quantization, where the range of the floating-point number X is [X_min, X_max], the following formula (1.2) shows one way of obtaining the scaling factor S in the quantization parameters:
[0065] S = (2^(n-1) - 1) / max(|X_min|, |X_max|) (Formula 1.2)
[0066] In formula (1.2), the max() function is the maximum value function. |*| represents taking the absolute value. It can be understood that the above formula (1.2) is only one way to obtain the scaling factor S corresponding to the quantization parameter, and it can also be obtained from other modified formulas, which are not restricted here.
[0067] For example, if the quantized data type is int8, meaning the fixed-point number Q is of type int8, then 2^(n-1) - 1 takes the value 127 (where n = 8).
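A worked sketch of symmetric quantization to int8 follows. It assumes the scaling factor is the largest positive fixed-point value, 2^(n-1) - 1, divided by the largest absolute floating-point value, which is one form consistent with the description of formula (1.2). Rounding to an integer is also an assumption added here so that Q is a valid int8 value; the function names and the sample range are hypothetical.

```python
def symmetric_scale(x_min, x_max, bits=8):
    """Scaling factor S: largest fixed-point value over largest absolute float value."""
    return (2 ** (bits - 1) - 1) / max(abs(x_min), abs(x_max))


def quantize_symmetric(x, scale):
    """Formula (1.1), Q = X * S, followed by rounding to an integer (an assumption)."""
    return round(x * scale)


scale = symmetric_scale(-1.0, 0.25)
print(scale)                             # 127.0
print(quantize_symmetric(-1.0, scale))   # -127
print(quantize_symmetric(0.0, scale))    # 0
print(quantize_symmetric(0.25, scale))   # 32
```

The floating-point value 0 maps to the fixed-point value 0, which is why no zero-point offset is needed in the symmetric case.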
[0068] (5) Asymmetric quantization
[0069] The principle of asymmetric quantization: a floating-point number X in the range [X_min, X_max] is mapped, through the quantization parameters, to a fixed-point number Q in the range [Q_min, Q_max], as shown in the following formula (2.1):
[0070] Q = X * S' + Z' (Formula 2.1)
[0071] For asymmetric quantization, the quantization parameters include the scaling factor S' and the zero-point offset Z'.
[0072] Z' can be understood as the zero-point offset, which is the fixed-point value Q corresponding to a floating-point number X being 0. S' is the scaling factor. Once S' and Z' are determined, the quantized fixed-point number Q can be determined based on the input floating-point number X. The quantization data types are the same as those mentioned in the symmetric quantization section, including: int32, int16, int8, int4, uint32, uint16, uint8, or uint4, etc.
[0073] In some embodiments, the fixed-point number Q is an unsigned integer type (UINT), whose range [Q_min, Q_max] is [0, 2^n - 1], where n is the number of quantization bits. For example, if the fixed-point number Q is of type uint8, then n = 8 and [Q_min, Q_max] is [0, 255].
[0074] For asymmetric quantization, equations (2.2) and (2.3) below show a formula for obtaining the scaling factor S' and zero offset Z' corresponding to the quantization parameters. Alternatively, the scaling factor S' and zero offset Z' for asymmetric quantization can also be obtained in other ways, which are not limited here.
[0075] S' = (Q_max - Q_min) / (X_max - X_min) (Formula 2.2)
[0076] Z' = Q_min - X_min * S' (Formula 2.3)
[0077] It is understandable that the difference between asymmetric quantization and symmetric quantization lies in whether the quantization value range restricts the correspondence between zero points before and after quantization.
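The asymmetric mapping can be checked numerically with the sketch below. It assumes the scaling factor is the width of the fixed-point range divided by the width of the floating-point range, and takes the zero-point offset from formula (2.3). The function names and the sample range [-1.0, 3.0] are hypothetical; results are left unrounded, as in formula (2.1).

```python
def asymmetric_params(x_min, x_max, q_min, q_max):
    """Return (S', Z') for mapping [x_min, x_max] onto [q_min, q_max]."""
    scale = (q_max - q_min) / (x_max - x_min)  # assumed form of the scaling factor
    zero_point = q_min - x_min * scale         # formula (2.3)
    return scale, zero_point


def quantize_asymmetric(x, scale, zero_point):
    """Formula (2.1): Q = X * S' + Z'."""
    return x * scale + zero_point


scale, zero_point = asymmetric_params(-1.0, 3.0, 0, 255)
print(scale, zero_point)                             # 63.75 63.75
print(quantize_asymmetric(-1.0, scale, zero_point))  # 0.0
print(quantize_asymmetric(0.0, scale, zero_point))   # 63.75
print(quantize_asymmetric(3.0, scale, zero_point))   # 255.0
```

The two ends of the floating-point range land exactly on the two ends of the uint8 range, and the floating-point value 0 lands on Z', matching the description of the zero-point offset.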
[0078] (6) Weight Tensor
[0079] The weight tensor is a multidimensional array, a higher-dimensional extension of scalars, vectors, and matrices. Its purpose is to enable the creation of higher-dimensional matrices and vectors.
[0080] (7) Average operator
[0081] For example, the averaging operator can be used to calculate the average value of the input feature map data in the corresponding dimension.
[0082] The calculation process of the average operator is explained in detail below with reference to relevant formulas.
[0083] In the floating-point domain, the average value y of n values x_i can be expressed by the following formula (3.1):
[0084] y = (1/n) * Σ_{i=1}^{n} x_i (Formula 3.1)
[0085] Where n is a positive integer. To accelerate the computation process of the neural network, quantization can be performed before the feature map data is input into the averaging operator. In some embodiments, the feature map data values in the floating-point domain can be mapped to the fixed-point domain, and the average value of the specified dimension can be calculated using the feature map data in the fixed-point domain. Therefore, when calculating the average value of the feature map data in a specified dimension using the averaging operator, the average value of the fixed-point values corresponding to the feature map data in the specified dimension can be calculated.
[0086] The following section provides a detailed explanation of the quantization processing logic for feature map data, using relevant formulas.
[0087] For example, suppose the value of the feature map data in the floating-point domain is $x_f$ and the corresponding value in the fixed-point domain is $x_q$. The feature map data can then be mapped from the floating-point domain to the fixed-point domain using the following formula (3.2):
[0088]
[0089] Here, `round(*)` is a function that rounds a number to the nearest integer, and `linear_scale` refers to the linear mapping coefficient, also known as the scaling factor.
[0090] The linear mapping coefficients, linear_scale, can be determined by the following formula (3.3):
[0091]
[0092] The zero-point offset can be determined using the following formula (3.4):
[0093] $\mathrm{zero\_point} = \mathrm{round}(-\mathrm{min}_{x_f} \cdot \mathrm{linear\_scale})$ (Formula 3.4)
[0094] Substituting the above formula (3.4) and formula (3.3) into formula (3.2), we can determine the following formula (3.5):
[0095] $x_q = \mathrm{round}(x_f \cdot \mathrm{linear\_scale} + \mathrm{zero\_point})$ (Formula 3.5)
[0096] It can be understood that the aforementioned linear mapping coefficients are scaling factors. Therefore, linear mapping coefficients can be used to scale feature map data in the floating-point domain, and the sum of the scaled value and the zero-point offset can be used as the corresponding value $x_q$ of the feature map data in the fixed-point domain. This improves the mapping accuracy of floating-point numbers in the fixed-point domain. The quantization of each matrix element in the feature map data in the floating-point domain can be completed using the above formula (3.5).
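The mapping of formulas (3.4) and (3.5) can be sketched in Python as follows. This is a minimal illustration, not the implementation of this application: the function name `quantize` is hypothetical, the expression used for `linear_scale` is an assumption (formula (3.3) is not legible in this text), and the final clipping to the uint range is an added safeguard.

```python
import numpy as np

def quantize(x_f, n_bits=8):
    """Map floating-point values to the unsigned fixed-point domain."""
    min_xf = float(x_f.min())
    max_xf = float(x_f.max())
    # ASSUMPTION: spread the observed float range over the uint range.
    # Formula (3.3) is not legible in the text, so this is a stand-in.
    linear_scale = (2 ** n_bits - 1) / (max_xf - min_xf)
    # Formula (3.4): zero-point offset.
    zero_point = round(-min_xf * linear_scale)
    # Formula (3.5): scale, shift by the zero point, round.
    x_q = np.round(x_f * linear_scale + zero_point).astype(np.int64)
    # Added safeguard (not in the text): keep values in [0, 2^n - 1].
    x_q = np.clip(x_q, 0, 2 ** n_bits - 1)
    return x_q, linear_scale, zero_point

x_f = np.array([-2.0, 0.0, 3.0])
x_q, linear_scale, zero_point = quantize(x_f)
```

With this input the smallest float maps to 0 and the largest to 255, which matches the uint8 range [0, 255] described earlier.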
[0097] In some embodiments, the averaging operator can perform averaging based on the fixed-point value $x_q$ of the feature map data given by formula (3.5) above. Furthermore, based on formula (3.5), both the output feature map data y and the input feature map data x can be quantized. Substituting the floating-point to fixed-point mapping relationship of formula (3.5) into the averaging formula (3.1), the following formula (3.6) can be obtained:
[0098]
[0099] Where zp_y represents the zero-point offset of the output feature map data, and zp_x represents the zero-point offset of the input feature map data. ys represents the linear mapping coefficients of the output feature map data, xs represents the linear mapping coefficients of the input feature map data, and yq and xiq represent the fixed-point values of the quantized feature map data. n represents the number of elements.
[0100] It is understandable that if the above quantization formula is implemented as symmetric quantization, then the value of zp_x will be 0.
[0101] It is understandable that simplifying formula (3.6) yields the following quantization formula (3.7) for the averaging operator:
[0102]
[0103] It is understandable that the scaling coefficient in the above formula is a floating-point number. It can therefore be expressed using fixed-point numbers, for example by the following formula (3.8):
[0104] $P = A \cdot 2^{B}$ (Formula 3.8)
[0105] Where P is the floating-point coefficient being represented, and A and B are fixed-point numbers. For example, when the coefficient is 0.5, it can be represented as $1 \cdot 2^{-1}$; in this case, the value of A is 1 and the value of B is -1. Since both A and B can be expressed as fixed-point numbers, the coefficient itself can be expressed in fixed-point form. Therefore, the averaging operator first quantizes the feature map data, and then calculates the mean of the elements in the fixed-point domain corresponding to the quantized feature map data.
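One possible way to obtain A and B for formula (3.8) is sketched below. The function name `to_fixed_point_pair` and the `mantissa_bits` parameter are hypothetical, and the decomposition via `math.frexp` is only one of several ways to choose A and B; the text does not specify how they are derived.

```python
import math

def to_fixed_point_pair(p, mantissa_bits=15):
    """Return integers (A, B) such that p is approximately A * 2**B."""
    if p == 0:
        return 0, 0
    # frexp gives p = m * 2**e with 0.5 <= |m| < 1.
    m, e = math.frexp(p)
    a = round(m * (1 << mantissa_bits))
    b = e - mantissa_bits
    # Strip common factors of two so that simple values stay small,
    # e.g. 0.5 becomes (1, -1) as in the example in the text.
    while a % 2 == 0:
        a //= 2
        b += 1
    return a, b

A, B = to_fixed_point_pair(0.5)
A3, B3 = to_fixed_point_pair(0.3)
```

Values that are exact multiples of a power of two, such as 0.5, are represented exactly; other values, such as 0.3, are approximated to within the precision set by `mantissa_bits`.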
[0106] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments of this application will be described in detail below with reference to the accompanying drawings and specific implementation methods.
[0107] The image processing methods mentioned in this application can be used in any neural network model that includes an averaging operator, such as a neural network model for face recognition.
[0108] Figure 1 shows a schematic diagram of a scenario in which a neural network model 20 is used for face recognition.
[0109] Referring to Figure 1, the terminal 10 contains a pre-trained neural network model 20. In an implementation for face recognition, after the terminal 10 acquires a user's face image, it can perform face recognition on the acquired face image using the aforementioned neural network model 20 to obtain the face recognition result. It can be understood that the aforementioned neural network model 20 can be a face recognition model trained using a face image training dataset.
[0110] The neural network model 20 includes an input layer, hidden layers, and an output layer. Each layer can include various operators for processing the image data input to the face recognition model, thereby identifying facial features in the image data. The mean operator is an operator that can be applied to various layers of the neural network model to process feature data. For example, in the face recognition model described above, the mean operator in the activation layer can average the feature data associated with the face image input from the previous layer according to a specified dimension, thereby obtaining new, processed feature data associated with the face image.
[0111] In some embodiments, the processor running the neural network model, such as a neural network processing unit (NPU), is provided with a large number of hardware computing units. These hardware computing units can be used to perform batch data operations on the data input to the neural network model. However, only preset operators in a preset operator library can invoke these hardware computing units to perform batch data operations.
[0112] Because the averaging operator is not a commonly used operator, it is not included in the preset operator library. Therefore, the averaging operator cannot call upon the hardware computing unit used for batch calculations, resulting in the image processing speed of the averaging operator not being further improved by hardware; in other words, it cannot achieve or lacks hardware acceleration. Preset operators stored in the preset operator library, however, can call upon the hardware computing unit used for batch calculations, and can be considered to have hardware acceleration.
[0113] For example, let's label the operator without hardware acceleration as operator A, and the operator with hardware acceleration as operator B, where both operator A and operator B are used for addition operations. Then, operator A needs to perform calculations on the input feature map data one by one on the processor, while operator B can perform batch calculations on the input feature map data on the processor.
[0114] For example, consider a processor that has 250 hardware computing units available to operator B, but none that operator A can invoke for batch calculation. Adding two matrices of 250 elements each requires 250 element-wise additions. If operator A is used, the 250 additions must be executed one after another. If operator B is used, all 250 hardware computing units can be invoked simultaneously, so the computation can be up to 250 times faster than with operator A.
[0115] Therefore, since the averaging operator does not have hardware acceleration, the inference speed of the neural network model on the terminal 10 will be low.
[0116] To address the issue of slow inference speed in neural network models due to the lack of hardware acceleration in the averaging operator, this application provides an image processing method that replaces the averaging operator with an average pooling operator and a compensation operator that are hardware-accelerated. The compensation operator is used to compensate for the differences between the averaging operator and the average pooling operator, thereby enabling the averaging operator's algorithmic function to be realized through the average pooling operator that can perform hardware-accelerated calculations. This effectively improves the inference speed of neural network models that include the averaging operator.
[0117] Specifically, for example, in some embodiments, the averaging operator can be replaced with a hardware-accelerated average pooling operator and a compensation operator, wherein the compensation operator is used to numerically compensate the quantization parameters in the mean result calculated by the average pooling operator. The specific method is as follows:
[0118] According to the calculation formula (3.7) of the averaging operator above, the calculation logic of the averaging operator is similar to that of the average pooling operator. The calculation logic of the average pooling operator is shown in the following formula (4.1):
[0119]
[0120] Where zp_y represents the zero-point offset of the output feature map data, and zp_x represents the zero-point offset of the input feature map data. ys are the linear mapping coefficients of the output feature map data, xs are the linear mapping coefficients of the input feature map data, and yq and xiq are the fixed-point values of the quantized feature map data.
[0121] It can be seen that the difference between the average pooling operator and the average operator is that the average operator has additional linear mapping coefficients. Furthermore, the above formula (4.1) can be simplified to the following formula (4.2):
[0122]
[0123] Simplifying the above formula (4.2), we obtain the following formula (4.3):
[0124]
[0125] To reduce the computational cost of neural network models, we can typically set zp_x = zp_y in the above formula (4.3). This leads to the following formula (4.4) for calculating the average pooling operator:
[0126]
[0127] The formula for calculating the average operator (3.7) above can be simplified to obtain the following formula (3.9):
[0128]
[0129] From the calculation formulas (4.4) for the average pooling operator and (3.9) for the averaging operator, it can be seen that, compared with the average pooling operator, the averaging operator additionally includes the linear mapping coefficients, the zero-point offset zp_x of the input feature map data, and the zero-point offset zp_y of the output feature map data.
[0130] Therefore, the difference between the average pooling operator and the average operator is as follows:
[0131] (1) The average pooling operator lacks the zero-point offset zp_x of the input feature map data;
[0132] (2) The average pooling operator lacks the linear mapping coefficients;
[0133] (3) The average pooling operator lacks the zero-point offset zp_y of the output feature map data.
[0134] Therefore, this application proposes a compensation operator that, when the average pooling operator is used in place of the averaging operator, numerically compensates the mean result obtained by the average pooling operator. Specifically, it compensates for the terms that the average pooling operator lacks relative to the averaging operator: the zero-point offset zp_x of the input feature map data, the linear mapping coefficients, and the zero-point offset zp_y of the output feature map data. For example, after the average pooling operator calculates the mean result, the compensation operator can first add the zero-point offset zp_x to the mean result, then scale the sum by the linear mapping coefficients, and finally subtract the zero-point offset zp_y of the output feature map data. This ensures that the compensated output of the average pooling operator is consistent with the output of the averaging operator, allowing the hardware-accelerated average pooling operator combined with the compensation operator to completely replace the averaging operator. Since the average pooling operator has hardware acceleration, using the average pooling operator combined with the compensation operator to replace the averaging operator can effectively improve the inference speed of the neural network model.
[0135] In some embodiments, during the quantization process the average pooling operator can obtain three quantization parameters: the zero-point offset zp_x of the input feature map data, the linear mapping coefficients, and the zero-point offset zp_y of the output feature map data. However, the average pooling operator does not use these three quantization parameters in its own calculation. The compensation operator can obtain the three quantization parameters from the average pooling operator and, based on them, numerically compensate the mean result calculated by the average pooling operator, so that the compensated output is consistent with the output result of the averaging operator.
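The compensation order described above (add zp_x, scale, subtract zp_y) can be checked numerically with a short sketch. Several points here are assumptions rather than statements of this application: the scaling factor is taken to be ys / xs, because the factor itself is not legible in this text; the sign convention, in which a floating-point value equals (fixed-point value + zero point) / scale, is inferred from the described compensation order; rounding is omitted so that the two paths can be compared exactly; and all function names are hypothetical.

```python
import numpy as np

def average_pool(x_q):
    # Plain mean of fixed-point values over the first two axes.
    # No quantization parameters are used here.
    return x_q.mean(axis=(0, 1))

def compensate(mean_q, zp_x, xs, zp_y, ys):
    # Add zp_x, scale, then subtract zp_y (order taken from the text).
    # ASSUMPTION: the scaling factor is ys / xs.
    return (mean_q + zp_x) * (ys / xs) - zp_y

def reference_average(x_q, zp_x, xs, zp_y, ys):
    # Averaging operator modelled directly: dequantize the input,
    # average in the floating-point domain, requantize the output.
    # ASSUMPTION: float = (fixed + zero_point) / scale.
    x_f = (x_q + zp_x) / xs
    y_f = x_f.mean(axis=(0, 1))
    return y_f * ys - zp_y

rng = np.random.default_rng(0)
x_q = rng.integers(0, 256, size=(4, 4, 9)).astype(np.float64)
params = dict(zp_x=3.0, xs=0.5, zp_y=7.0, ys=0.25)

replaced = compensate(average_pool(x_q), **params)
expected = reference_average(x_q, **params)
```

Under these assumptions the two paths agree up to floating-point error, which is the consistency property the compensation operator is meant to provide.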
[0136] In some embodiments, the above method can be used in quantized neural network models or in unquantized neural network models.
[0137] In some embodiments of this application, the processor described above includes, but is not limited to, a neural network processing unit (NPU).
[0138] The method and process for calculating feature map data using the average pooling operator and the compensation operator are described in detail below with reference to Figure 2A and Figure 2B.
[0139] Figure 2A shows a flowchart of an image processing method according to some embodiments of this application. The specific process flow shown in Figure 2A is as follows:
[0140] 201. Obtain the image to be processed.
[0141] For example, the image to be processed can be obtained through the neural network model 20. The image to be processed is the image input to the neural network model 20, such as a face image that requires face recognition, etc., without specific limitations.
[0142] 202. The neural network model 20 acquires a feature map while processing the image.
[0143] For example, the neural network model 20 includes many layers, each capable of performing different computational processing on the image to be processed. For instance, the neural network model 20 may include an input layer, hidden layers, and an output layer. Each layer may include various operators for image processing of the image data input to the face recognition model, thereby identifying feature maps in the image data, which facilitates subsequent processing. For example, feature maps characterize the features in the image data, making it easier to compare similarity with other images later.
[0144] 203. During the processing of the feature map, neural network model 20 detects that the averaging operator 400 needs to be calculated.
[0145] For example, during the processing of the feature map, the neural network model 20 may call the averaging operator 400 for calculation. For instance, the averaging operator 400 can be called to calculate the average value of the feature map data input to the averaging operator 400 in the corresponding dimension.
[0146] As is understandable, the specific calculation process of the average operator 400 has been explained in detail above, and will not be repeated here.
[0147] 204. The average pooling operator 401 and the compensation operator 402 are invoked to perform the operation of the averaging operator 400 and obtain the target result. Invoking the average pooling operator and the compensation operator to perform the averaging operation includes performing the mean calculation of the average pooling operator through multiple processing units executing in parallel.
[0148] For example, if the neural network model 20 detects that the averaging operator 400 needs to be called, then, since the averaging operator 400 cannot call multiple hardware processing units that execute in parallel, the average pooling operator 401 and the compensation operator 402 can be called instead to execute the calculation logic of the averaging operator 400.
[0149] It can be understood that the aforementioned processing units are hardware processing units, such as the large number of hardware computing units provided on a processor. Since the mean calculation of the average pooling operator 401 can call multiple processing units that execute in parallel, the mean calculation over multiple pieces of feature map data can be performed in parallel, effectively accelerating the mean calculation process.
[0150] For example, suppose 250 hardware computing units on the processor are available to the mean calculation of the average pooling operator 401, so that the matrix elements input to the average pooling operator 401 can be processed by these 250 hardware computing units simultaneously. If a total of 1001 matrix elements are input, 1000 additions are required to accumulate them. With the averaging operator 400, which has no hardware acceleration, the 1000 additions are performed one at a time. With the hardware-accelerated average pooling operator 401, the work can be divided into 4 rounds of 250 simultaneous additions each, which completes the accumulation of the 1001 matrix elements. Therefore, for the same feature map input to a neural network model, replacing the averaging operator 400, which lacks hardware acceleration, with the hardware-accelerated average pooling operator 401 can significantly improve computational efficiency, thereby increasing the processor's inference speed for the neural network model.
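The round counts in the two examples above follow from a simple counting model, sketched here. The function name `rounds_needed` is hypothetical, and the model is a simplification taken from the text: it only divides the number of additions by the number of units and ignores data dependencies between partial sums, which a real reduction on hardware would have to respect.

```python
import math

def rounds_needed(num_additions, num_units):
    """Number of sequential rounds when up to num_units additions
    can be executed at the same time (simplified counting model)."""
    return math.ceil(num_additions / num_units)

# Accumulating 1001 elements takes 1000 additions.
sequential_rounds = rounds_needed(1000, 1)    # one addition per round
parallel_rounds = rounds_needed(1000, 250)    # 250 additions per round
```

The same function reproduces the earlier matrix-addition example: 250 element-wise additions take 250 rounds on one unit and a single round on 250 units.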
[0151] It is understandable that, through steps 201 to 204 above, during the image processing process of the neural network model 20, feature maps are acquired. If it is detected that the feature map needs to be calculated using the averaging operator 400, the average pooling operator 401 and the compensation operator 402 can be invoked to execute the operation logic of the averaging operator 400, obtaining the target result consistent with the averaging operator 400. Since the mean calculation of the average pooling operator 401 can be performed in parallel by invoking multiple hardware processing units, compared to the average operator 400 which cannot invoke multiple hardware processing units in parallel, the average pooling operator 401 combined with the compensation operator 402 can effectively accelerate the processor's inference speed for the neural network model 20.
[0152] It is understandable that steps 201 to 204 described above can be performed by the neural network model 20 shown in Figure 2B below, which includes the average pooling operator and the compensation operator.
[0153] The neural network model 20, which includes the average pooling operator 401 and the compensation operator 402, is described in detail below with reference to Figure 2B.
[0154] Figure 2B shows a schematic diagram of operator substitution according to some embodiments of this application.
[0155] Referring to Figure 2B, the neural network model 20 includes an averaging operator 400. Here, the averaging operator 400 is replaced with an average pooling operator 401 and a compensation operator 402, so that the feature map data originally input to the averaging operator 400 is input to the average pooling operator 401. The average pooling operator 401 calculates the average value of the feature map data, and then the compensation operator 402 compensates the average value for the quantization parameters to obtain the target result. It is understandable that the target result calculated by the average pooling operator 401 combined with the compensation operator 402 is consistent with the target result calculated by the averaging operator 400. Since at least one of the average pooling operator 401 and the compensation operator 402 has hardware acceleration, the computational efficiency of the processor is effectively improved.
[0156] Specifically, the average pooling operator 401 can treat each value in the high-dimensional matrix of feature map data as a single matrix element and group the feature map data according to preset conditions. Then, the average pooling operator 401 can average the grouped matrix elements to obtain the mean result.
[0157] Continuing to refer to Figure 2B, the average pooling operator 401 can input the mean result obtained above into the compensation operator 402, which then performs numerical compensation on the mean result for the quantization parameters. This allows the average pooling operator 401 and the compensation operator 402 to completely replace the averaging operator 400, obtaining the same target result as when the averaging operator 400 processes the feature map data.
[0158] The method of replacing the averaging operator 400 with the average pooling operator 401 and the compensation operator 402 is described in detail below with reference to Figure 3.
[0159] Figure 3 shows a schematic flowchart of a method for image processing combining an average pooling operator 401 and a compensation operator 402 according to some embodiments of this application. The specific process flow shown in Figure 3 is as follows:
[0160] 301, the average pooling operator 401 acquires feature map data, performs quantization processing on the feature map data, and obtains feature map data and quantization parameters in the fixed-point domain.
[0161] For example, the average pooling operator 401 can obtain feature map data corresponding to the input image from a preset neural network. This feature map data is typically a high-dimensional matrix, such as a four-dimensional matrix, where each value is treated as a single matrix element. Each matrix element in the feature map data is converted from a floating-point number to a fixed-point number, thereby obtaining the feature map data and quantization parameters in the fixed-point domain.
[0162] As can be understood, based on formula (3.5) above, the average pooling operator can calculate the input feature map data to obtain the feature map data and quantization parameters in the fixed-point domain. These quantization parameters include, but are not limited to, zero-point offset and linear mapping coefficients (also known as scaling factors).
[0163] 302, the average pooling operator 401 groups the feature map data in the fixed-point domain based on preset conditions to obtain the grouped matrix elements.
[0164] For example, the preset condition can be a specified dimension, which facilitates the grouping and calculation of feature map data in the fixed-point domain based on actual needs, and obtains the grouped matrix elements.
[0165] It is understandable that by averaging across a specified dimension, the average value of all elements in that dimension can be calculated, thereby achieving effective dimensionality reduction of the feature map data in that specified dimension.
[0166] The mean calculation over feature map data of a specified dimension in some embodiments of this application is described in detail below with reference to Figure 4.
[0167] Figure 4 shows a schematic diagram of a data structure for feature map data according to some embodiments of this application.
[0168] As shown in Figure 4, the feature map data 00 can be three-dimensional matrix data with three dimensions: Z1, Z2, and Z3. Each value within the matrix data can be treated as a single matrix element. It is understood that the specified dimensions can also be a single dimension or two dimensions; these are just examples. Users can specify any dimensions based on specific needs, and the maximum number of dimensions that can be specified is the actual number of dimensions of the feature map data. For example, if the feature map data is a four-dimensional matrix, then at most four dimensions can be specified.
[0169] The process of averaging the feature map data shown in Figure 4 is introduced below.
[0170] First, the average pooling operator 401 can determine the range of feature map data over which the mean needs to be calculated based on the specified dimensions. When the specified dimensions are Z1 and Z2, Z1 and Z2 together define one plane for each position along the Z3 dimension.
[0171] Second, the average pooling operator 401 calculates the mean of the selected feature map data based on formula (4.1) above. The matrix elements within a single plane are first summed to obtain a cumulative sum, and the quotient of this cumulative sum and the number of matrix elements within the plane is then taken as the mean. In this way, one mean result is obtained for each plane along the Z3 dimension. For example, the figure shows a 4*4 matrix in dimensions Z1 and Z2, with 9 such planes along dimension Z3. When the specified dimensions are Z1 and Z2, each plane yields one mean result, so a total of 9 mean results are obtained. Furthermore, the mean calculation over the specified dimensions reduces the dimensionality of the feature map data in dimensions Z1 and Z2.
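The 4*4*9 example can be reproduced with a small NumPy sketch. The array contents and the function name `mean_over_plane` are illustrative only; the axis order (Z1, Z2, Z3) is an assumption about how the data in Figure 4 would be laid out in memory.

```python
import numpy as np

# Illustrative feature map with axes (Z1, Z2, Z3) = (4, 4, 9).
feature_map = np.arange(4 * 4 * 9, dtype=np.float64).reshape(4, 4, 9)

def mean_over_plane(fm):
    """Average over Z1 and Z2, giving one value per position on Z3."""
    # Sum the 16 elements of each Z1-Z2 plane ...
    plane_sum = fm.sum(axis=(0, 1))
    # ... then divide by the number of elements in the plane.
    count = fm.shape[0] * fm.shape[1]
    return plane_sum / count

means = mean_over_plane(feature_map)
```

The result has 9 entries, one per plane, so the Z1 and Z2 dimensions have been reduced away while the Z3 dimension is preserved.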
[0172] 303, the average pooling operator 401 calculates the average value of the elements of the grouped matrix to obtain the mean result.
[0173] For example, the average pooling operator 401 can calculate the average value of the matrix elements of a specified dimension using formula (4.1) above to obtain the mean result. It can be understood that the above mean result can be a multidimensional matrix composed of the calculated average values.
[0174] In some embodiments, the average calculation can be performed on the matrix elements determined by the user-specified dimensions. For example, the matrix elements along one dimension, in a two-dimensional plane, in a three-dimensional cube, or in a four-dimensional stack of cubes can be specified. The average value is then calculated by summing all matrix elements in the specified dimensions and dividing by their number, achieving effective dimensionality reduction of the feature map data in the specified dimensions.
[0175] It is understood that the dimensions specified above, such as one-dimensional, two-dimensional, three-dimensional, and four-dimensional, are all examples. The dimensions that can be specified can be up to the maximum range of the actual dimensions of the feature map data, and no specific restrictions are imposed here.
[0176] 304, the compensation operator 402 performs numerical compensation on the mean result based on the quantization parameters to obtain the target result.
[0177] For example, the compensation operator 402 can numerically compensate the mean result based on the quantization parameters determined in the average pooling operator 401. For instance, all output elements of the average pooling operator can be scaled by the linear mapping coefficients, and the zero-point offset zp_y of the output feature map data can be compensated. This allows the hardware-accelerated average pooling operator, combined with the compensation operator, to completely replace the averaging operator. Since the average pooling operator is hardware-accelerated, replacing the averaging operator with the average pooling operator combined with the compensation operator can effectively improve the inference speed of the neural network model.
[0178] In some embodiments, the compensation operator 402 can be the accumulation operator 021, such as Eltwise Add.
[0179] In other embodiments, the compensation operator 402 may also be a product operator 022. This product operator 022 may be an operator that includes the activation function Leaky ReLU.
[0180] It is understandable that, in steps 301 to 304 above, the average pooling operator 401 quantizes the feature map data to obtain the feature map data and quantization parameters in the fixed-point domain, groups the matrix elements, and calculates the average value of the grouped matrix elements to obtain the mean result. The compensation operator 402 then performs scaling and offset compensation on the mean result according to the quantization parameters to obtain the target result. Thus, the average pooling operator 401 combined with the compensation operator 402 completely replaces the averaging operator 400. Since the average pooling operator 401 has hardware acceleration, and the accumulation operator 021 can also have hardware acceleration, replacing the averaging operator 400 with the average pooling operator 401 combined with the accumulation operator 021 can effectively improve the inference speed of the neural network model.
[0181] The specific implementation process of step 304 above is explained in detail below with reference to Figure 5 and Figure 6.
[0182] Figure 5 shows a schematic diagram of the specific implementation process of numerical compensation of the mean result by the accumulation operator 021 according to an embodiment of this application.
[0183] It is understandable that the execution entity for each step of the process shown in Figure 5 can be the aforementioned accumulation operator 021; the execution entity will not be restated for each individual step.
[0184] The specific process flow shown in Figure 5 is as follows:
[0185] 501. Obtain the mean result and retrieve the preset tensor from the preset storage path.
[0186] For example, the mean result obtained by the average pooling operator 401 based on the feature map data is obtained, and a preset tensor is obtained from a preset storage path.
[0187] It is understandable that, since the calculation logic of the accumulation operator 021 is the addition of two numbers, a preset tensor is needed to meet the input requirements of the accumulation operator 021.
[0188] In some embodiments, the accumulation operator 021 can be the Eltwise Add operator. The calculation logic of the Eltwise Add operator is explained in detail below with reference to relevant formulas.
[0189] For example, in the Eltwise Add operator, the input consists of two weight tensors, i.e., two high-dimensional matrices. The dimensions remain unchanged after adding the two weight tensors; only the numerical values are added one by one, thus increasing the information content in each dimension. The computational logic of the Eltwise Add operator can be expressed as the following formula (6.1):
[0190] R3 = R1 + R2 (Formula 6.1)
[0191] R1 and R2 are both weight tensors input to the Eltwise Add operator, and the sum of R1 and R2 is R3.
[0192] 502, sum the mean result with the preset tensor to obtain a summation result that is consistent with the mean result.
[0193] It is understandable that the accumulation operator 021 requires two input values, which is a limitation of the accumulation operator 021 algorithm itself. Therefore, in order to avoid the limitations of the accumulation operator 021 algorithm affecting the calculation results, a preset tensor can be stored in a preset storage path.
[0194] In some embodiments, all elements in the preset tensor can be set to a constant 0, making the accumulated result completely consistent with the mean result. This is to avoid changing the element values and dimensions in the mean result before compensating for the mean result based on the quantization parameters, and also to reduce the amount of computation.
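The effect of an all-zero preset tensor can be sketched as follows. The function `eltwise_add` is a hypothetical stand-in for the Eltwise Add operator of formula (6.1), and the sample values are illustrative only.

```python
import numpy as np

def eltwise_add(r1, r2):
    """Element-wise addition R3 = R1 + R2 of two same-shaped tensors."""
    if r1.shape != r2.shape:
        raise ValueError("Eltwise Add expects tensors of the same shape")
    return r1 + r2

# Illustrative mean result produced by the average pooling operator.
mean_result = np.array([[12, 40], [7, 255]], dtype=np.int32)
# Preset tensor: same shape as the mean result, every element 0.
preset_tensor = np.zeros_like(mean_result)

accumulated = eltwise_add(mean_result, preset_tensor)
```

Because every element of the preset tensor is 0, the accumulated result has the same values and the same shape as the mean result, so the second input only serves to satisfy the two-input requirement of the operator.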
[0195] 503. The accumulated result is scaled and offset compensated according to the quantization parameters to obtain the target result.
[0196] For example, the accumulation operator 021 can obtain the target result by scaling and offset compensation of the accumulation result according to the quantization parameters.
[0197] The specific calculation logic of the accumulation operator 021 is explained below in conjunction with formulas (6.2) and (6.3):
[0198] y_eltwise_q = (x1q + zp_x1) × x1s + (x2q + zp_x2) × x2s - zp_y (Formula 6.2)
[0199] Where y_eltwise_q is a single element value in the target result output by the Eltwise Add operator, x1q is a single matrix element in the mean result, zp_x1 is the zero offset of the mean result, x1s is the linear mapping coefficient of the mean result, x2q is a single matrix element in the preset tensor, zp_x2 is the zero offset of the preset tensor, zp_y is the zero offset of the target result, and x2s is the linear mapping coefficient of the preset tensor.
[0200] Since all elements in the preset tensor are constants of 0, making x2q = 0 and zp_x2 = 0, the above formula (6.2) can be simplified to the following formula (6.3):
[0201] y_eltwise_q = (x1q + zp_x1) × x1s - zp_y (Formula 6.3)
[0202] As shown in formula (6.3), the accumulation operator first adds the zero offset zp_x1 of the mean result to the mean result x1q output by the average pooling operator, completing the first mean compensation and obtaining the first compensation result (x1q + zp_x1). The first compensation result is then multiplied by the linear mapping coefficient x1s of the mean result, so that it is scaled by x1s, and the zero-point offset zp_y of the target result is subtracted, completing the scaling and offset compensation of the accumulated result. This ensures that the target result obtained by combining the average pooling operator 401 with the accumulation operator 021 is consistent with the result obtained using the averaging operator 400. Furthermore, since the Eltwise Add operator is also hardware accelerated, it can call the hardware computing units to process the input mean results in batches. Therefore, even when both the average pooling operator and the Eltwise Add operator are used to calculate the mean of the feature map data, the computation is still more efficient than with the averaging operator alone, effectively improving the processor's computing efficiency.
[0203] It is understandable that steps 501 to 503 above involve retrieving a preset tensor from a preset storage path using the accumulation operator 021, accumulating the mean result with the preset tensor to obtain an accumulated result consistent with the mean result, and then scaling and offset compensation are applied to the accumulated result according to the quantization parameters to obtain the target result. This ensures that the target result is consistent with the result calculated by the averaging operator 400, achieving numerical compensation for the mean result. Since the accumulation operator 021 can include hardware acceleration, even when using the average pooling operator 401 in conjunction with the accumulation operator 021 to calculate the feature map data, it can still effectively improve the inference speed of the neural network model compared to using the averaging operator without hardware acceleration.
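Steps 501 to 503 can be sketched as follows. This is a minimal illustration, not the hardware implementation: it assumes a flat list of fixed-point mean values, assumes that the scaling factor is the linear mapping coefficient x1s of the mean result, and omits any rounding or clamping to the fixed-point range. The function name and the sample quantization parameters are hypothetical.

```python
def accumulate_and_compensate(mean_q, zp_x1, x1s, zp_y):
    """Compensate a fixed-point mean result via an Eltwise-Add-style accumulation.

    mean_q : list of fixed-point mean values (x1q)
    zp_x1  : zero offset of the mean result
    x1s    : linear mapping coefficient of the mean result (assumed scaling factor)
    zp_y   : zero offset of the target result
    """
    # Step 501: obtain the preset tensor (all elements are the constant 0).
    preset = [0] * len(mean_q)
    # Step 502: accumulate; the result is identical to the mean result.
    accumulated = [m + p for m, p in zip(mean_q, preset)]
    # Step 503: offset, scale, then subtract the output zero offset.
    first = [v + zp_x1 for v in accumulated]   # first compensation result
    second = [v * x1s for v in first]          # second compensation result
    return [v - zp_y for v in second]          # target result


# Hypothetical values chosen so the arithmetic is exact in floating point.
target = accumulate_and_compensate([10, 20, 30], zp_x1=2, x1s=0.5, zp_y=1)
```

With these sample values each element follows (x1q + 2) × 0.5 - 1, so the list `[10, 20, 30]` maps to `[5.0, 10.0, 15.0]`.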
[0204] Figure 6 is a schematic diagram, according to an embodiment of this application, of a specific implementation process in which the product operator 022 performs numerical compensation on the mean result.
[0205] It is understood that each step of the process shown in Figure 6 can be executed by the aforementioned product operator 022; the executing entity will not be restated for each individual step.
[0206] The specific process shown in Figure 6 is as follows:
[0207] 601. Multiply the mean result by the preset linear component to obtain a product result with the same value as the mean result.
[0208] For example, the product operator 022 can be an operator that includes the activation function Leaky ReLU. The calculation logic of the activation function Leaky ReLU is explained in detail below with reference to relevant formulas:
[0209] f(x) = x, if x > 0; f(x) = a × x, if x ≤ 0 (Formula 8.1)
[0210] Based on the above formula (8.1), it can be seen that when the input x is greater than 0, Leaky ReLU directly outputs the original value; when the input x is less than or equal to 0, Leaky ReLU multiplies the original value with the preset linear component a.
[0211] It is understandable that the product operator 022 multiplies the negative input by a preset linear component, which is defined by the algorithm of the product operator 022 itself. Therefore, to avoid the preset linear component 'a' affecting the calculation result, in some embodiments, all elements in the preset linear component 'a' can be set to 1. This avoids changing the element values and dimensions in the mean result before performing numerical compensation based on the quantization parameters. Furthermore, compared to the accumulation operator 021 mentioned above, it does not need to retrieve data from a preset storage path, avoiding data retrieval waiting time and effectively improving computational efficiency.
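The behavior described above can be checked with a short sketch of the Leaky ReLU rule. It assumes a scalar preset linear component a applied to every element; the function name and sample inputs are hypothetical.

```python
def leaky_relu(x, a):
    """Leaky ReLU: pass positive inputs through, multiply the others by a."""
    return x if x > 0 else a * x


samples = [-3, 0, 4]  # hypothetical mean-result elements

# With a = 1 the operator is an identity mapping for every input.
identity_out = [leaky_relu(v, 1) for v in samples]

# With any other slope (here 0.5) the non-positive inputs would be altered.
scaled_out = [leaky_relu(v, 0.5) for v in samples]
```

Setting a to 1 therefore leaves the mean result unchanged, which is why no extra tensor has to be fetched from storage for this variant.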
[0212] 602. The product result is scaled and offset compensated according to the quantization parameters to obtain the target result.
[0213] The specific calculation logic of the product operator 022 will be explained in detail below with reference to formulas (8.2) and (8.3):
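[0214] y_leakyrelu_q = (x3q + zp_x3) × x3s - zp_y, if (x3q + zp_x3) > 0; y_leakyrelu_q = a × (x3q + zp_x3) × x3s - zp_y, if (x3q + zp_x3) ≤ 0 (Formula 8.2)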
[0215] Where y_leakyrelu_q is a single element value in the target result output by the product operator 022, x3q is a single matrix element in the mean result, zp_x3 is the zero offset of the mean result, x3s is the linear mapping coefficient of the mean result, and zp_y is the zero offset of the target result.
[0216] Since all elements in the preset linear component a are constants of 1, formula (8.2) can be simplified to the following formula (8.3):
[0217] y_leakyrelu_q = (x3q + zp_x3) × x3s - zp_y (Formula 8.3)
[0218] As shown in formula (8.3), the mean result output by the average pooling operator 401 is x3q. First, the zero-point offset zp_x3 is added to the mean result to obtain the first compensation result (x3q + zp_x3). The first compensation result is then multiplied by the linear mapping coefficient x3s of the mean result, so that it is scaled by x3s, and the zero-point offset zp_y of the target result is subtracted. This completes the scaling and offset compensation of the mean result, ensuring that the target result obtained by combining the average pooling operator 401 with the product operator 022 is consistent with the result obtained using the averaging operator 400. Furthermore, since the Leaky ReLU activation function is also hardware accelerated, even when the average pooling operator 401 and the product operator 022 containing the Leaky ReLU activation function are used together to calculate the mean of the feature map data, the computation is still more efficient than with the averaging operator 400, effectively improving the processor's inference speed for the neural network model.
[0219] It can be understood that steps 601 to 602 above multiply the mean result with a preset linear component using the product operator 022 to obtain a product result with the same value as the mean result. Then, the product result is scaled and offset compensated according to the quantization parameters to obtain the target result. This ensures that the target result is consistent with the result calculated by the averaging operator 400, achieving numerical compensation for the mean result. Since the product operator 022 can include hardware acceleration, replacing the averaging operator 400 with the average pooling operator 401 combined with the product operator 022 can effectively improve the inference speed of the neural network model.
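Steps 601 to 602 can be sketched in the same spirit. The sketch assumes a flat list of fixed-point mean values, applies the Leaky ReLU rule directly to those values, assumes the scaling factor is the linear mapping coefficient x3s, and omits rounding or clamping. The function name and sample parameters are hypothetical.

```python
def product_and_compensate(mean_q, zp_x3, x3s, zp_y, a=1):
    """Compensate a fixed-point mean result via a Leaky-ReLU-style product operator.

    mean_q : list of fixed-point mean values (x3q)
    zp_x3  : zero offset of the mean result
    x3s    : linear mapping coefficient of the mean result (assumed scaling factor)
    zp_y   : zero offset of the target result
    a      : preset linear component; 1 keeps the mean result unchanged
    """
    # Step 601: product with the preset linear component (Leaky ReLU rule).
    product = [m if m > 0 else a * m for m in mean_q]
    # Step 602: offset, scale, then subtract the output zero offset.
    return [(p + zp_x3) * x3s - zp_y for p in product]


# Hypothetical values, including a negative element to exercise the slope.
with_unit_slope = product_and_compensate([-4, 0, 6], zp_x3=2, x3s=0.5, zp_y=1)
with_other_slope = product_and_compensate([-4, 0, 6], zp_x3=2, x3s=0.5, zp_y=1, a=0.5)
```

Only the call with a = 1 reproduces formula (8.3) for the negative element; a different slope shifts that element, which illustrates why the preset linear component is fixed to 1.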
[0220] Some embodiments of this application also provide an image processing apparatus applied to an electronic device running a neural network model. The electronic device includes multiple processing units, and the apparatus includes: an image acquisition module for acquiring an image to be processed; a data processing module for acquiring feature maps during the processing of the image by the neural network model; an operator detection module for detecting the need for averaging operator calculation during the processing of the feature maps by the neural network model; and a computation module for calling an average pooling operator and a compensation operator to perform averaging operator calculations to obtain a target result. The calling of the average pooling operator and the compensation operator to perform averaging operator calculations includes: performing the mean calculation of the average pooling operator through multiple parallel processing units. Thus, the algorithmic function of the averaging operator is implemented through an average pooling operator capable of hardware-accelerated computation, thereby effectively improving the inference speed of the neural network model including the averaging operator.
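The parallel mean calculation mentioned above can be illustrated in software. In this sketch a thread pool merely stands in for the multiple hardware processing units; the grouping of the feature map data, the worker count, and the function names are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def group_mean(group):
    """Addition followed by division for one group of feature map values."""
    return sum(group) / len(group)


def parallel_average_pool(groups, workers=4):
    """Compute the mean of each group concurrently, preserving group order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(group_mean, groups))


# Hypothetical grouped fixed-point feature map data.
means = parallel_average_pool([[1, 2, 3], [4, 4], [10]])
```

Each group is independent of the others, which is the property that lets the average pooling operator be spread across several processing units.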
[0221] Some embodiments of this application also provide an electronic device, which includes: a memory for storing instructions executed by one or more processors of the electronic device, and a processor, which is one of the one or more processors of the electronic device, for executing the image processing method.
[0222] It is understood that the electronic devices to which the image processing methods described above in the embodiments of this application are applicable may include, but are not limited to, mobile phones, foldable screen phones, tablet computers, desktop computers, laptop computers, handheld computers, netbooks, as well as augmented reality (AR) / virtual reality (VR) devices, smart TVs, smartwatches, and other electronic devices that obtain user operations through a screen, and no limitation is imposed here.
[0223] Figure 7 shows a block diagram of an electronic device according to some embodiments of this application. In one embodiment, the electronic device 800 may include one or more processors 804, system control logic 808 connected to at least one of the processors 804, system memory 812 connected to the system control logic 808, non-volatile memory (NVM) 816 connected to the system control logic 808, and network interface 820 connected to the system control logic 808.
[0224] In some embodiments, processor 804 may include one or more single-core or multi-core processors. In some embodiments, processor 804 may include any combination of general-purpose processors and special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In embodiments where electronic device 800 employs an evolved node B (eNB) 101 or a radio access network (RAN) controller 102, processor 804 may be configured to perform various corresponding embodiments.
[0225] In some embodiments, system control logic 808 may include any suitable interface controller to provide any suitable interface to at least one of the processors 804 and / or any suitable device or component communicating with system control logic 808.
[0226] In some embodiments, system control logic 808 may include one or more memory controllers to provide an interface to system memory 812. System memory 812 may be used to load and store data and / or instructions. In some embodiments, the memory 812 of electronic device 800 may include any suitable volatile memory, such as suitable dynamic random access memory (DRAM).
[0227] NVM / memory 816 may include one or more tangible, non-transitory computer-readable media for storing data and / or instructions. In some embodiments, NVM / memory 816 may include any suitable non-volatile memory such as flash memory and / or any suitable non-volatile storage device, such as at least one of a hard disk drive (HDD), a compact disc (CD) drive, and a digital versatile disc (DVD) drive.
[0228] NVM / memory 816 may include a portion of the storage resources of the apparatus on which the electronic device 800 is installed, or it may be accessible by the device without necessarily being part of the device. For example, NVM / memory 816 may be accessed over a network via network interface 820.
[0229] Specifically, system memory 812 and NVM / memory 816 may include a temporary copy and a permanent copy of instructions 824, respectively. Instructions 824 may include instructions that, when executed by at least one of processors 804, cause electronic device 800 to implement the above-described image processing method. In some embodiments, instructions 824, or hardware, firmware, and / or software components thereof, may additionally / alternatively be located in system control logic 808, network interface 820, and / or processor 804.
[0230] Network interface 820 may include a transceiver for providing a radio interface to electronic device 800, thereby enabling communication with any other suitable device (such as a front-end module, antenna, etc.) via one or more networks. In some embodiments, network interface 820 may be integrated into other components of electronic device 800. For example, network interface 820 may be integrated into at least one of processor 804, system memory 812, NVM / memory 816, and a firmware device (not shown) holding instructions, wherein electronic device 800 implements the above-described image processing method when at least one of processors 804 executes those instructions.
[0231] The network interface 820 may further include any suitable hardware and / or firmware to provide a multiple-input multiple-output radio interface. For example, the network interface 820 may be a network adapter, a wireless network adapter, a telephone modem, and / or a wireless modem.
[0232] In one embodiment, at least one of the processors 804 may be packaged together with the logic of one or more controllers for system control logic 808 to form a system-in-package (SiP). In another embodiment, at least one of the processors 804 may be integrated on the same die with the logic of one or more controllers for system control logic 808 to form a system-on-a-chip (SoC).
[0233] The electronic device 800 may further include an input / output (I / O) device 832. The I / O device 832 may include a user interface enabling a user to interact with the electronic device 800; the peripheral component interface is designed to allow peripheral components to also interact with the electronic device 800. In some embodiments, the electronic device 800 may also include sensors for determining at least one type of environmental condition and location information related to the electronic device 800.
[0234] In some embodiments, the user interface may include, but is not limited to, a display (e.g., a liquid crystal display, a touch screen display, etc.), a speaker, a microphone, one or more cameras (e.g., a still image camera and / or a video camera), a flashlight (e.g., a light-emitting diode flash), and a keyboard.
[0235] In some embodiments, the peripheral component interface may include, but is not limited to, a non-volatile memory port, an audio jack, and a power interface.
[0236] In some embodiments, the sensor may include, but is not limited to, a gyroscope sensor, an accelerometer, a proximity sensor, an ambient light sensor, and a positioning unit. The positioning unit may also be part of or interact with the network interface 820 to communicate with components of the positioning network (e.g., Global Positioning System (GPS) satellites).
[0237] The embodiments disclosed in this application can be implemented in hardware, software, firmware, or a combination of these implementation methods. Embodiments of this application can be implemented as computer programs or program code executable on a programmable system, the programmable system including at least one processor, a storage system (including volatile and non-volatile memory and / or storage elements), at least one input device, and at least one output device.
[0238] Program code can be applied to input instructions to execute the functions described in this application and generate output information. The output information can be applied to one or more output devices in a known manner. For the purposes of this application, the processing system includes any system having a processor such as, for example, a digital signal processor (DSP), a microcontroller, an application-specific integrated circuit (ASIC), or a microprocessor.
[0239] The program code can be implemented using a high-level procedural language or an object-oriented programming language to communicate with the processing system. Assembly language or machine language can also be used when needed. In fact, the mechanisms described in this application are not limited to any particular programming language. In either case, the language can be a compiled language or an interpreted language.
[0240] In some cases, the disclosed embodiments may be implemented in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors. For example, the instructions may be distributed via a network or through other computer-readable media. Therefore, machine-readable media may include any mechanism for storing or transmitting information in a machine-readable (e.g., computer-readable) form, including but not limited to floppy disks, optical disks, CD-ROMs, magneto-optical disks, read-only memory (ROM), random access memory (RAM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), magnetic cards or optical cards, flash memory, or tangible machine-readable storage used to transmit information over the Internet in the form of electrical, optical, acoustic, or other propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Therefore, machine-readable media include any type of machine-readable medium suitable for storing or transmitting electronic instructions or information in a machine-readable (e.g., computer-readable) form.
[0241] In the accompanying drawings, some structural or methodological features may be shown in a specific arrangement and / or order. However, it should be understood that such a specific arrangement and / or order may not be necessary. Rather, in some embodiments, these features may be arranged in a manner and / or order different from that shown in the illustrative drawings. Furthermore, the inclusion of structural or methodological features in a particular figure does not imply that such features are required in all embodiments, and in some embodiments, these features may be omitted or may be combined with other features.
[0242] It should be noted that all units / modules mentioned in the device embodiments of this application are logical units / modules. Physically, a logical unit / module can be a physical unit / module, a part of a physical unit / module, or a combination of multiple physical units / modules. The physical implementation of these logical units / modules themselves is not the most important factor; the combination of functions implemented by these logical units / modules is the key to solving the technical problems proposed in this application. Furthermore, to highlight the innovative aspects of this application, the above-described device embodiments of this application have not introduced units / modules that are not closely related to solving the technical problems proposed in this application. This does not mean that the above-described device embodiments do not contain other units / modules.
[0243] It should be noted that in the examples and description of this patent, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one" does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the aforementioned element.
[0244] Although this application has been illustrated and described with reference to certain preferred embodiments thereof, those skilled in the art will understand that various changes in form and detail may be made thereto without departing from the scope of this application.
[0245] In this specification, a reference to "one embodiment" or "an embodiment" means that a specific feature, structure, or characteristic described in connection with the embodiment is included in at least one exemplary implementation or technology disclosed in this application. The phrase "in an embodiment" appearing in various places in the specification does not necessarily refer to the same embodiment.
[0246] This application also discloses apparatus for performing the operations described herein. Such apparatus may be specifically constructed for the required purpose or may include a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored on a computer-readable medium, such as, but not limited to, any type of disk, including floppy disks, optical disks, CD-ROMs, magneto-optical disks, read-only memory (ROM), random access memory (RAM), EPROM, EEPROM, magnetic or optical cards, application-specific integrated circuits (ASICs), or any type of medium suitable for storing electronic instructions, each of which may be coupled to a computer system bus. Furthermore, the computer mentioned in the specification may include a single processor or may employ an architecture involving multiple processors for increased computing power.
[0247] The processes and displays presented herein do not inherently relate to any specific computer or other device. Various general-purpose systems may also be used with the programs taught herein, or it may prove convenient to construct more specialized devices to perform one or more method steps. Structures for various such systems are discussed in the following description. Furthermore, any specific programming language sufficient to implement the techniques and embodiments disclosed herein can be used. Various programming languages can be used to implement this disclosure, such as the image processing methods discussed herein.
[0248] Furthermore, the language used in this specification has been primarily chosen for readability and instructional purposes and may not have been chosen to depict or limit the subject matter disclosed. Therefore, this application disclosure is intended to illustrate, rather than limit, the scope of the concepts discussed herein.
Claims
1. An image processing method, characterized in that the method is applied to an electronic device running a neural network model, the electronic device including multiple processing units, and the method includes: obtaining an image to be processed; obtaining a feature map while the neural network model processes the image; detecting, while the neural network model processes the feature map, that an averaging operator needs to be calculated; and calling an average pooling operator and a compensation operator to perform the averaging operator calculation to obtain a target result, wherein calling the average pooling operator and the compensation operator to perform the averaging operator calculation includes performing the mean calculation of the average pooling operator through multiple parallel processing units. Calling the average pooling operator and the compensation operator to perform the averaging operator calculation to obtain the target result includes: calling the average pooling operator to determine the fixed-point data and quantization parameters of the feature map based on the feature map, wherein the quantization parameters indicate the mapping relationship between the floating-point data and the fixed-point data of the feature map, and include the zero-point offset of the mean result, the linear mapping coefficient, and the zero-point offset of the output fixed-point data of the feature map; calling the average pooling operator to determine the corresponding mean result based on the fixed-point data of the feature map; and calling the compensation operator to perform numerical compensation on the mean result according to the quantization parameters to obtain the target result. The compensation operator includes an accumulation operator.
Calling the compensation operator to perform numerical compensation on the mean result according to the quantization parameters to obtain the target result includes: calling the accumulation operator to obtain a preset tensor from a preset storage location, wherein the values in the preset tensor are the constant 0; accumulating, by multiple third processing units executing in parallel, the preset tensor and the mean result to obtain an accumulated result consistent with the mean result; adding the accumulated result and the zero-point offset of the mean result to obtain a first compensation result; multiplying the first compensation result by the linear mapping coefficient to obtain a second compensation result; and subtracting the zero-point offset of the output fixed-point data of the feature map from the second compensation result to obtain the target result.
2. The method according to claim 1, characterized in that calling the average pooling operator to determine the fixed-point data and quantization parameters of the feature map based on the feature map includes: calling the average pooling operator to group the feature map fixed-point data based on preset conditions to obtain grouped feature map fixed-point data; and calculating the mean based on the grouped feature map fixed-point data to determine the corresponding mean result.
3. The method according to claim 2, characterized in that the mean calculation of the average pooling operator includes addition and division calculations, and calculating the mean based on the grouped feature map fixed-point data to determine the corresponding mean result includes: performing, by multiple parallel first processing units, addition calculations on the grouped feature map fixed-point data to determine the corresponding first result; and performing, by multiple parallel second processing units, division calculations on the grouped feature map fixed-point data to determine the corresponding mean result.
4. The method according to claim 2, characterized in that calling the average pooling operator to group the feature map fixed-point data based on preset conditions to obtain grouped feature map fixed-point data includes: calling the average pooling operator to group the feature map fixed-point data based on a specified dimension to obtain the grouped feature map fixed-point data.
5. The method according to claim 1, characterized in that the compensation operator includes a product operator, and calling the compensation operator to perform numerical compensation on the mean result according to the quantization parameters to obtain the target result further includes: calling the product operator to perform a product calculation based on a preset linear component and the mean result to obtain a product result consistent with the mean result, wherein the value of the preset linear component is the constant 1; and performing numerical compensation on the product result according to the quantization parameters to obtain the target result.
6. The method according to claim 5, characterized in that calling the product operator to perform the product calculation based on the preset linear component and the mean result includes: calculating the product of the preset linear component and the mean result by multiple parallel fourth processing units.
7. The method according to claim 5, characterized in that performing numerical compensation on the product result according to the quantization parameters to obtain the target result includes: adding the product result and the zero-point offset of the mean result to obtain a third compensation result; multiplying the third compensation result by the linear mapping coefficient to obtain a fourth compensation result; and subtracting the zero-point offset of the output fixed-point data of the feature map from the fourth compensation result to obtain the target result.
8. An image processing apparatus, characterized in that the apparatus is applied to an electronic device running a neural network model, the electronic device including multiple processing units, and the apparatus includes: an image acquisition module for acquiring an image to be processed; a data processing module for obtaining a feature map while the neural network model processes the image; an operator detection module for detecting, while the neural network model processes the feature map, that an averaging operator needs to be calculated; and a calculation module for calling an average pooling operator and a compensation operator to perform the averaging operator calculation to obtain a target result, wherein calling the average pooling operator and the compensation operator to perform the averaging operator calculation includes performing the mean calculation of the average pooling operator through multiple parallel processing units. Calling the average pooling operator and the compensation operator to perform the averaging operator calculation to obtain the target result includes: calling the average pooling operator to determine the fixed-point data and quantization parameters of the feature map based on the feature map, wherein the quantization parameters indicate the mapping relationship between the floating-point data and the fixed-point data of the feature map, and include the zero-point offset of the mean result, the linear mapping coefficient, and the zero-point offset of the output fixed-point data of the feature map; calling the average pooling operator to determine the corresponding mean result based on the fixed-point data of the feature map; and calling the compensation operator to perform numerical compensation on the mean result according to the quantization parameters to obtain the target result. The compensation operator includes an accumulation operator.
Calling the compensation operator to perform numerical compensation on the mean result according to the quantization parameters to obtain the target result includes: calling the accumulation operator to obtain a preset tensor from a preset storage location, wherein the values in the preset tensor are the constant 0; accumulating, by multiple third processing units executing in parallel, the preset tensor and the mean result to obtain an accumulated result consistent with the mean result; adding the accumulated result and the zero-point offset of the mean result to obtain a first compensation result; multiplying the first compensation result by the linear mapping coefficient to obtain a second compensation result; and subtracting the zero-point offset of the output fixed-point data of the feature map from the second compensation result to obtain the target result.
9. An electronic device, characterized in that it includes: one or more processors; and one or more memories, the one or more memories storing one or more programs which, when executed by the one or more processors, cause the electronic device to perform the image processing method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium stores instructions that, when executed on a computer, cause the computer to perform the image processing method according to any one of claims 1 to 7.