Face recognition model training method and apparatus, face recognition method, device, and medium
By mapping the initial training image with a differentiable gated spatial transformation matrix and optimizing that matrix by gradient descent, the method addresses the drop in face recognition accuracy in non-cooperative scenarios and improves the recognition of low-quality images.
Patent Information
- Authority / Receiving Office: WO
- Patent Type: Applications
- Current Assignee / Owner: CHINA TELECOM ARTIFICIAL INTELLIGENCE TECHNOLOGY (BEIJING) CO LTD
- Filing Date: 2025-11-25
- Publication Date: 2026-06-18
AI Technical Summary
In non-cooperative scenarios, the accuracy of facial recognition technology decreases due to the lack of image features, and existing methods are unable to effectively improve the robustness of the model to low-quality images.
By mapping the initial training image using a differentiable gated space transformation matrix, face alignment error samples are generated. The matrix is then optimized using a gradient descent algorithm. Combined with backpropagation and a gating mechanism, the face recognition network is updated to generate the target face recognition model.
It improves the recognition accuracy of the face recognition model in non-cooperative scenarios, enhances its robustness to low-quality images, and improves the accuracy of face recognition.
Description
Face recognition model training methods, face recognition methods, devices, equipment and media
[0001] Related applications
[0002] This application claims priority to the Chinese patent application filed on December 9, 2024, with application number 202411805619X, entitled "Training method, application method, apparatus, device and medium for face recognition model", the entire contents of which are incorporated herein by reference.
Technical Field
[0003] This application relates to the field of face recognition technology, and in particular to a training method for a face recognition model, a face recognition method, a device, equipment, and a medium.
Background Technology
[0004] Facial recognition technology is widely used in non-cooperative scenarios (such as security deployment, visitor monitoring, and contactless attendance). However, because the image acquisition device usually does not face the face directly and the person being identified does not cooperate, the collected images often suffer from problems such as missing features, which reduces the accuracy of facial feature point localization and thus the accuracy of facial recognition.
Summary of the Invention
[0005] This application provides a training method, application method, device, equipment, and medium for a face recognition model to solve the technical problem that the lack of features in images acquired in non-cooperative scenarios affects the accuracy of face recognition.
[0006] In a first aspect, embodiments of this application provide a method for training a face recognition model, the method comprising:
[0007] Obtain the initial training image;
[0008] Based on the preset initial differentiable gated space transformation matrix, the initial training image is mapped to obtain the initial face alignment error sample. The initial differentiable gated space transformation matrix is a transformation matrix obtained based on differentiability and gated mechanism.
[0009] Based on the initial face alignment error samples, backpropagation is performed on the preset initial face recognition network to obtain the gradient descent change value of the initial differentiable gated space transformation matrix.
[0010] The initial differentiable gated space transformation matrix is updated based on the gradient descent change value using the gradient descent algorithm to generate target face alignment error samples.
[0011] The initial face recognition network is updated using the target face alignment error samples to obtain the target face recognition model.
[0012] In some embodiments, mapping the initial training image based on a preset initial differentiable gated spatial transformation matrix to obtain initial face alignment error samples includes:
[0013] The first image is obtained by remapping the pixel positions in the initial training image using an affine transformation matrix.
[0014] The first image is inversely transformed using the initial differentiable gated spatial transformation matrix to obtain the second image. The initial differentiable gated spatial transformation matrix is obtained by setting gate coefficients on the affine transformation matrix.
[0015] The second image is compared with the standard face template image to obtain the initial face alignment error sample.
[0016] In some embodiments, comparing the second image with the standard face template image to obtain the initial face alignment error sample includes:
[0017] The third image is obtained by interpolating the pixels in the second image using a bilinear interpolation algorithm.
[0018] The third image is compared with the standard face template image to obtain the initial face alignment error sample.
[0019] In some embodiments, the step of backpropagating a preset initial face recognition network based on the initial face alignment error samples to obtain the gradient descent change value of the initial differentiable gated space transformation matrix includes:
[0020] The initial face alignment error samples are input into the initial face recognition network to obtain the initial loss function;
[0021] The gradient descent change value is obtained by backpropagating the initial differentiable gated space transformation matrix through the initial loss function.
[0022] In some embodiments, updating the initial differentiable gated space transformation matrix based on the gradient descent change value using a gradient descent algorithm to generate target face alignment error samples includes:
[0023] The initial differentiable gated space transformation matrix is updated using the gradient descent algorithm based on the gradient descent change value, the preset step size, and the preset adversarial perturbation range, to obtain the target differentiable gated space transformation matrix.
[0024] The initial training image is mapped using the target differentiable gated space transformation matrix to obtain the target face alignment error sample.
[0025] In some embodiments, updating the initial differentiable gated space transformation matrix using a gradient descent algorithm based on the gradient descent change value, a preset step size, and a preset adversarial perturbation range to obtain the target differentiable gated space transformation matrix includes:
[0026] According to the preset adversarial perturbation range, the initial differentiable gated space transformation matrix is projected to obtain the first differentiable gated space transformation matrix;
[0027] Based on the gradient descent change value and the preset step size, the first differentiable gated space transformation matrix is iteratively updated to obtain the target differentiable gated space transformation matrix.
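The projection and iterative update described in the two steps above can be sketched as follows. This is only a minimal illustration under stated assumptions: the parameters of the matrix are treated as a flat list, the perturbation range is assumed to be a symmetric interval [-limit, limit], the result is re-projected after every step, and `project`, `projected_update`, and `grad_fn` are hypothetical names that do not appear in the application.

```python
def project(theta, limit):
    """Clip every parameter into the assumed range [-limit, limit]."""
    return [min(max(p, -limit), limit) for p in theta]


def projected_update(theta, grad_fn, step, limit, iterations):
    """Project first, then iterate gradient steps, re-projecting after each one."""
    theta = project(theta, limit)
    for _ in range(iterations):
        grads = grad_fn(theta)  # gradient w.r.t. the transformation parameters
        theta = project([p - step * g for p, g in zip(theta, grads)], limit)
    return theta


# Hypothetical gradient that pulls every parameter toward 1.0.
def grad_fn(theta):
    return [2.0 * (p - 1.0) for p in theta]


theta = projected_update([0.8, -0.05, 0.0], grad_fn,
                         step=0.1, limit=0.3, iterations=50)
print(theta)
```

However strongly the gradient pushes, the parameters never leave the preset range, which is the role the application assigns to the perturbation range.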
[0028] In some embodiments, updating the initial face recognition network using the target face alignment error samples to obtain the target face recognition model includes:
[0029] The target face alignment error sample is input into the initial face recognition network to obtain the target loss function;
[0030] The initial face recognition network is updated using the gradient descent algorithm based on the target loss function to obtain the target face recognition model.
[0031] In a second aspect, embodiments of this application provide a face recognition method, the method comprising:
[0032] Acquire the low-quality image to be identified;
[0033] The low-quality image to be identified is input into the target face recognition model to obtain the output face recognition result. The target face recognition model is trained by the face recognition model training method.
[0034] In a third aspect, embodiments of this application provide a training device for a face recognition model, the device comprising:
[0035] The first acquisition module is used to acquire the initial training image;
[0036] The image mapping module is used to map the initial training image based on a preset initial differentiable gated space transformation matrix to obtain initial face alignment error samples. The initial differentiable gated space transformation matrix is a transformation matrix obtained based on differentiability and gate mechanism.
[0037] The backpropagation module is used to backpropagate the preset initial face recognition network based on the initial face alignment error samples to obtain the gradient descent change value of the initial differentiable gated space transformation matrix.
[0038] The matrix update module is used to update the initial differentiable gated space transformation matrix based on the gradient descent change value using the gradient descent algorithm, thereby generating target face alignment error samples.
[0039] The model update module is used to update the initial face recognition network using the target face alignment error samples to obtain the target face recognition model.
[0040] In a fourth aspect, embodiments of this application provide an application device for a face recognition model, the device comprising:
[0041] The second acquisition module is used to acquire the low-quality image to be identified;
[0042] The model application module is used to input the low-quality image to be recognized into the target face recognition model and obtain the output face recognition result. The target face recognition model is trained by the face recognition model training method.
[0043] In a fifth aspect, embodiments of this application provide an electronic device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the above training method of the face recognition model or the application method of the face recognition model.
[0044] In a sixth aspect, embodiments of this application provide a readable storage medium, wherein, when instructions in the readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the training method or the application method of the face recognition model described above.
[0045] In summary, in the embodiments of this application, an initial training image is obtained; the initial training image is mapped based on a preset initial differentiable gated space transformation matrix to obtain initial face alignment error samples, where the initial differentiable gated space transformation matrix is a transformation matrix obtained based on differentiability and a gating mechanism; backpropagation is performed on a preset initial face recognition network based on the initial face alignment error samples to obtain the gradient descent change value of the initial differentiable gated space transformation matrix; the initial differentiable gated space transformation matrix is updated by the gradient descent algorithm based on the gradient descent change value to generate target face alignment error samples; and the initial face recognition network is updated using the target face alignment error samples to obtain the target face recognition model. A low-quality image to be recognized is then obtained and input into the target face recognition model to obtain the output face recognition result. In this scheme, the differentiable gated space transformation matrix is updated and optimized, and the gating mechanism adjusts the gradient to avoid excessive perturbation. Furthermore, the parameters of the matrix are optimized using the gradient backpropagated from the face recognition network. In addition, face alignment error samples are generated in a targeted manner according to the network's learning progress, thereby improving the face recognition model's accuracy on samples with alignment errors in non-cooperative scenarios.
[0046] Details of one or more embodiments of this application are set forth in the following drawings and description. Other features, objects, and advantages of this application will become apparent from the specification, drawings, and claims.
Attached Figure Description
[0047] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below show only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0048] Figure 1 is a flowchart of the steps of a face recognition model training method provided in an embodiment of this application;
[0049] Figure 2 is a flowchart of another face recognition model training method provided in an embodiment of this application;
[0050] Figure 3 is a flowchart of another face recognition model training method provided in an embodiment of this application;
[0051] Figure 4 is a flowchart of the steps of an application method for a face recognition model provided in an embodiment of this application;
[0052] Figure 5 is a structural diagram of a face recognition model training device provided in an embodiment of this application;
[0053] Figure 6 is a structural diagram of an application device for a face recognition model provided in an embodiment of this application;
[0054] Figure 7 is a structural diagram of an electronic device provided in an embodiment of this application;
[0055] Figure 8 is a structural diagram of another electronic device provided in an embodiment of this application.
Detailed Implementation
[0056] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of this application.
[0057] With the widespread application of facial recognition technology in non-cooperative scenarios (such as security deployment, visitor monitoring, and contactless attendance), higher demands are placed on the accuracy and robustness of facial recognition algorithms. In non-cooperative scenarios, because the acquisition device is usually not directly facing the face, and the person being recognized does not intentionally cooperate, the performance of the facial recognition system is often affected by image quality, such as large facial pose angles, blurriness, poor lighting conditions, or occlusion of facial features. In traditional facial recognition systems, facial alignment is a crucial step. First, key feature points in the facial image, such as the positions of the eyes, nose, and mouth, are detected. Then, based on this positional information, the facial image is aligned to a standard template. Accurate facial alignment is essential for the performance of facial recognition. However, in low-quality images, key facial information is often missing, inevitably affecting the localization of key feature points and facial alignment. Because traditional facial recognition models have high requirements for facial alignment accuracy, inaccurate localization of key feature points often leads to misidentification, thus affecting the effectiveness of facial recognition.
[0058] There are generally two solutions in related technologies. One is to build a large-scale dataset containing a sufficient number of low-quality samples, but due to privacy issues and acquisition costs of facial data, it is difficult to obtain enough data. The second is to augment existing data; however, many studies have shown that random data augmentation has no positive effect on face recognition tasks. Therefore, designing a face recognition method that is robust to low-quality aligned faces is very important.
[0059] Therefore, in order to solve the above problems, it is necessary to improve the robustness of the model to face alignment errors through algorithmic improvements without introducing additional data or reducing the original recognition accuracy. This would enhance the performance of traditional face recognition models in non-cooperative scenarios and improve their overall competitiveness.
[0060] The training method and application method of the face recognition model provided in the embodiments of this application will be described in detail below.
[0061] Figure 1 is a flowchart of a face recognition model training method provided in an embodiment of this application. Referring to Figure 1, the face recognition model training method may include steps 101-105.
[0062] Step 101: Obtain the initial training image.
[0063] In the embodiments of this application, a large number of images are usually required as samples for training the model. These samples can be low-quality images that have already undergone face recognition. In this way, the model can effectively learn face recognition for low-quality images through these samples.
[0064] It should be noted that the initial training image can be an image pre-stored in the database, an image uploaded by the user in real time, or an image obtained from the cloud via the network. This application embodiment does not impose any specific limitations.
[0065] Step 102: Based on the preset initial differentiable gated space transformation matrix, the initial training image is mapped to obtain the initial face alignment error sample.
[0066] In facial recognition technology, since users' postures are all different, facial alignment is a key step when using a unified model for recognition. It is necessary to detect key feature points in the facial image, such as the nose, eyes, and mouth, and then align these key feature points to a standard template. This way, the model can perform recognition based on the unified standard template.
[0067] In this embodiment of the application, it is also necessary to align the initial training image to the standard template. Therefore, the initial training image can be mapped by a preset initial differentiable gated space transformation matrix to obtain the initial face alignment error sample.
[0068] It should be noted that the initial differentiable gated space transformation matrix can be understood as a transformation matrix obtained based on differentiability and the gated mechanism.
[0069] Transformation matrices are an important concept in linear algebra, where linear transformations can be represented by matrices. If T is a linear transformation that maps $\mathbb{R}^n$ to $\mathbb{R}^m$, and x is a column vector with n elements, then the m×n matrix A satisfying T(x) = Ax is called the transformation matrix of T. Any linear transformation can be represented by a matrix in a consistent form that is convenient for computation, and multiple transformations can be easily composed by matrix multiplication.
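As a small illustration of the last point, the sketch below applies two transformations one after the other and then as a single composed matrix. The helper names `matvec` and `matmul` and the example matrices are assumptions made for this illustration only.

```python
def matvec(A, x):
    """Apply the linear transformation T(x) = A x to a column vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def matmul(A, B):
    """Compose two transformations so that (A B) x equals A (B x)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


# Hypothetical example: scale by 2, then rotate by 90 degrees counter-clockwise.
scale = [[2, 0], [0, 2]]
rotate90 = [[0, -1], [1, 0]]
x = [1, 0]

step_by_step = matvec(rotate90, matvec(scale, x))
composed = matvec(matmul(rotate90, scale), x)
print(step_by_step, composed)
```

Both routes give the same point, which is why a chain of spatial transformations can be stored and optimized as one matrix.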
[0070] Gating mechanisms are typically associated with gating units in neural networks, such as the forget gate, input gate, and output gate in Long Short-Term Memory (LSTM) networks, and the update gate and reset gate in gated recurrent units (GRUs). These gating units process and learn from input data by controlling the flow of information. Although traditional gating mechanisms do not directly involve transformation matrices, the concept of gating can be extended to broader mathematical and engineering fields to achieve flexible control over spatial transformations.
[0071] In machine learning and deep learning, differentiability is a key property that allows algorithms to update model parameters through backpropagation, thereby optimizing model performance. For a transformation matrix, if its elements are part of the model parameters and these parameters are differentiable, then these parameters can be updated using optimization algorithms such as gradient descent.
[0072] It's worth noting that in image processing, differentiable spatial transformation matrices can be used to perform operations such as image rotation, scaling, and translation, with gating mechanisms controlling the intensity and range of these transformations. In machine learning, such transformation matrices can be incorporated into the model, and their parameters can be optimized through backpropagation.
[0073] It should be noted that mapping the initial training image can be understood as transforming each pixel in the initial training image to another image. For example, this transformation can include translation, rotation, affine mapping, perspective mapping, scale transformation, etc.
[0074] It should be noted that the initial face alignment error sample can be understood as the error between the mapped image and the standard template image. Since the initial training image is a low-quality image, some pixels may not be mapped or may be mapped to the wrong place during the mapping process. Therefore, there will be an error between the generated image and the standard template. This error can be called the initial face alignment error sample. That is, during the face alignment process, due to problems with the algorithm or data, there is a large deviation between the predicted feature point position and the actual feature point position. These deviations may manifest as feature point offset, missing, or misidentification.
[0075] In some embodiments, the reasons for the generation of the initial face alignment error sample may include factors such as the user's pose, expression, ambient lighting, and model algorithm.
[0076] When the angle of a face deviates significantly from that of a frontal view, it can cause self-occlusion of the face, making feature point recognition difficult and resulting in erroneous samples. For example, when the face is significantly shifted to the right, the feature points on the right side may move towards the center and become more concentrated, while the feature points on the left side may be relatively dispersed.
[0077] In low-quality images obtained from non-cooperative scenes, people may display a variety of expressions. For example, exaggerated expressions (such as surprise or laughter) can cause significant changes in the shape of the face, making the actual face shape differ greatly from the initial shape (neutral expression). This increases the difficulty of the algorithm's fitting and thus generates error samples.
[0078] Environmental factors also play a significant role. For example, in photos taken at night in poor lighting conditions, the visual features of a face may change drastically, making it difficult to accurately identify feature points and thus generating error samples.
[0079] Limitations or deficiencies in the algorithm itself may also lead to the generation of error samples. For example, weak robustness of the algorithm or insufficient adaptability to complex scenarios may result in inaccurate feature point recognition.
[0080] Step 103: Backpropagate the preset initial face recognition network based on the initial face alignment error samples to obtain the gradient descent change value of the initial differentiable gated space transformation matrix.
[0081] In this embodiment of the application, in order to improve the accuracy of the face recognition model in recognizing faces in low-quality images, both the face recognition network and the differentiable gated space transformation matrix can be updated and optimized. Specifically, the initial face alignment error samples can be input into the initial face recognition network for backpropagation to perform adversarial data augmentation training, thereby obtaining the gradient descent change value of the initial differentiable gated space transformation matrix.
[0082] It should be noted that backpropagation is a method for training neural networks by adjusting network weights and biases based on the output error. Its basic principle is to propagate the error between the network's output and the expected value backward through the layers, and to adjust each weight according to its contribution to that error. Specifically, the algorithm applies the chain rule to compute the gradient of the loss with respect to every weight, and each weight is then adjusted in the direction opposite to its gradient, so that the updated weights are used in the next forward propagation. By continuously adjusting the weights, backpropagation helps the neural network gradually learn more complex features and patterns, thereby improving the model's accuracy and generalization ability. The algorithm is widely used in various types of neural networks, such as multilayer perceptrons (MLPs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs), and in application areas such as image classification, speech recognition, natural language processing, and recommender systems.
[0083] In this embodiment, the parameters of the preset initial face recognition network can be frozen for backpropagation, and then the gradient descent change value of the initial differentiable gated space transformation matrix can be obtained, so that the differentiable gated space transformation matrix can be updated and optimized subsequently through the gradient descent algorithm.
[0084] Step 104: The initial differentiable gated space transformation matrix is updated based on the gradient descent change value using the gradient descent algorithm to generate target face alignment error samples.
[0085] In this embodiment of the application, after obtaining the gradient descent change value of the initial differentiable gated space transformation matrix, the initial differentiable gated space transformation matrix can be updated by the gradient descent algorithm to obtain the optimized differentiable gated space transformation matrix. Based on the optimized differentiable gated space transformation matrix, the initial training image is mapped again to obtain the target face alignment error sample.
[0086] It should be noted that gradient descent is a first-order iterative optimization algorithm for minimizing an objective function, and it is widely used for parameter optimization in machine learning and artificial intelligence. It relies on the gradient of the function: the gradient is a vector that points in the direction in which the function increases fastest at a given point. In machine learning, to find the minimum value of the objective function, an iterative search can therefore be performed along the opposite direction of the gradient. In the embodiments of this application, updating the initial differentiable gated space transformation matrix using the gradient descent algorithm can also be understood as an iterative update. That is, each iteration takes the matrix produced by the previous iteration as its input, and the update continues until the result converges. At this point, the differentiable gated space transformation matrix can be considered the optimal differentiable gated space transformation matrix, and the iterative update is complete.
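The iterative search described above can be sketched in a few lines. The objective function, the step size, the stopping tolerance, and the name `gradient_descent` are all assumptions chosen for illustration; the application does not specify them.

```python
def gradient_descent(grad, x0, step=0.1, tol=1e-8, max_iters=10_000):
    """Iterate x <- x - step * grad(x) until the update is smaller than tol."""
    x = x0
    for _ in range(max_iters):
        new_x = x - step * grad(x)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(minimum)
```

Each iteration feeds the previous result back in, and the loop stops once successive values no longer change appreciably, which corresponds to the convergence criterion mentioned in the text.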
[0087] Step 105: Update the initial face recognition network using the target face alignment error samples to obtain the target face recognition model.
[0088] In this embodiment, after updating the initial differentiable gated space transformation matrix, the parameters of the differentiable gated space transformation matrix are fixed, and the initial face recognition network is iteratively updated using the obtained target face alignment error samples to obtain the target face recognition model. This face recognition model can be considered as a model that can effectively and accurately identify facial features in low-quality images.
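The alternation between steps 103 to 105 can be illustrated with a deliberately tiny toy problem. Everything in this sketch is an assumption made for illustration: the "network" is a single weight `w`, the "transformation" is a single gated shift `eps * tanh(t)`, the loss is a squared error, and the transformation parameter is moved in the direction that increases the loss, which is one reading of the "adversarial" wording (the text itself only says the matrix is updated by the gradient descent algorithm).

```python
import math


def perturb(x, t, eps):
    """Toy 'alignment error sample': shift x by the gated amount eps * tanh(t)."""
    return x + eps * math.tanh(t)


def loss(w, x_adv, y):
    """Toy recognition loss of a one-weight 'network' w."""
    return (w * x_adv - y) ** 2


def grad_t(w, x, y, t, eps):
    """Step 103: gradient of the loss w.r.t. t with the network weight frozen."""
    x_adv = perturb(x, t, eps)
    return 2.0 * (w * x_adv - y) * w * eps * (1.0 - math.tanh(t) ** 2)


def grad_w(w, x_adv, y):
    """Step 105: gradient of the loss w.r.t. w with the transformation fixed."""
    return 2.0 * (w * x_adv - y) * x_adv


x, y = 1.0, 2.0                     # hypothetical training sample and label
w, t = 1.5, 0.1                     # hypothetical initial weight and parameter
eps, step_t, step_w = 0.2, 0.1, 0.2

for _ in range(200):
    # Steps 103-104: w frozen, move t so the sample gets harder (assumed sign).
    t += step_t * grad_t(w, x, y, t, eps)
    x_adv = perturb(x, t, eps)
    # Step 105: t fixed, ordinary gradient descent on the network weight.
    w -= step_w * grad_w(w, x_adv, y)

print(w, t, loss(w, perturb(x, t, eps), y))
```

Because the shift is gated by tanh, the perturbed sample always stays within `eps` of the original, while the weight still learns to fit the perturbed sample.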
[0089] In summary, in the embodiments of this application, an initial training image is obtained; the initial training image is mapped based on a preset initial differentiable gated space transformation matrix to obtain initial face alignment error samples, where the initial differentiable gated space transformation matrix is a transformation matrix obtained based on differentiability and a gating mechanism; backpropagation is performed on a preset initial face recognition network based on the initial face alignment error samples to obtain the gradient descent change value of the initial differentiable gated space transformation matrix; the initial differentiable gated space transformation matrix is updated by the gradient descent algorithm based on the gradient descent change value to generate target face alignment error samples; and the initial face recognition network is updated using the target face alignment error samples to obtain the target face recognition model. A low-quality image to be recognized is then obtained and input into the target face recognition model to obtain the output face recognition result. In this scheme, the differentiable gated space transformation matrix is updated and optimized, and the gating mechanism adjusts the gradient to avoid excessive perturbation. Furthermore, the parameters of the matrix are optimized using the gradient backpropagated from the face recognition network. In addition, face alignment error samples are generated in a targeted manner according to the network's learning progress, thereby improving the face recognition model's accuracy on samples with alignment errors.
[0090] Figure 2 is a flowchart of another face recognition model training method provided in this application. Referring to Figure 2, the method may include the following steps 201-210.
[0091] Step 201: Obtain the initial training image.
[0092] In this embodiment, the description of step 201 is the same as the detailed description of step 101 in the above embodiments, and will not be repeated in this embodiment.
[0093] Step 202: Remap the positions of the pixels in the initial training image using an affine transformation matrix to obtain the first image.
[0094] In this embodiment, the mapping of the initial training image may include remapping, which can be implemented through an affine transformation matrix. An affine transformation is a mapping between two-dimensional coordinates that combines a linear transformation with a translation; it covers operations such as translation, rotation, scaling, and shearing, but not perspective distortion (i.e., straight lines remain straight and parallel lines remain parallel). An affine transformation can be represented by a 3×3 matrix whose last row is typically [0, 0, 1], which corresponds to the homogeneous coordinate representation.
[0095] In some embodiments, a typical affine transformation matrix takes the following form:
$$\begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}$$
[0096] Here, a and e control scaling and rotation, b and d control shearing, and c and f control translation.
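To make the roles of the six entries concrete, the sketch below applies such a matrix to a pixel position in homogeneous coordinates. It assumes the entries a to f are laid out row by row as [[a, b, c], [d, e, f], [0, 0, 1]], consistent with the last row [0, 0, 1] described above; the function names and the example parameters are hypothetical.

```python
import math


def affine_matrix(a, b, c, d, e, f):
    """Build the 3x3 affine matrix whose last row is [0, 0, 1]."""
    return [[a, b, c], [d, e, f], [0.0, 0.0, 1.0]]


def apply_affine(A, x, y):
    """Map the point (x, y) through A using homogeneous coordinates (x, y, 1)."""
    return (A[0][0] * x + A[0][1] * y + A[0][2],
            A[1][0] * x + A[1][1] * y + A[1][2])


# Hypothetical parameters: rotate by 90 degrees, then translate by (5, 1).
phi = math.pi / 2
A = affine_matrix(math.cos(phi), -math.sin(phi), 5.0,
                  math.sin(phi), math.cos(phi), 1.0)
px, py = apply_affine(A, 2.0, 0.0)
print(px, py)
```

The rotation and scaling live in the upper-left 2×2 block, while the third column carries the translation, which matches the description of the entries above.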
[0097] Step 203: Perform an inverse transformation on the first image using the initial differentiable gated space transformation matrix to obtain the second image.
[0098] In this embodiment, the initial differentiable gated space transformation matrix is obtained by setting gate coefficients on the affine transformation matrix.
[0099] In some embodiments, for any pixel (i, j) in the initial training image, the remapping can be expressed by the following formula:
[0100] Here, $P_v(i)$ represents the remapping result of the horizontal coordinate i, and $P_u(j)$ represents the remapping result of the vertical coordinate j.
[0101] In some embodiments, the initial differentiable space transformation matrix operation can be defined as $T_\theta$, and its parameter θ is defined as:
[0102] Where φ∈[0,2π] represents the rotation angle, Δu and Δv represent the horizontal and vertical displacements respectively, λ represents the scaling factor, and tanh(α), tanh(β), tanh(γ) and tanh(δ) are gating coefficients used to prevent the face image from changing too much during the transformation process and to ensure the stability of the optimization process.
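The bounding effect of the tanh gating coefficients can be illustrated with a short sketch. The helper names `gate` and `gate_grad` and the 5-pixel nominal shift are illustrative assumptions; how the gates actually enter θ is defined by the formula of the application, which this sketch does not attempt to reproduce.

```python
import numpy as np

def gate(raw):
    """Squash an unconstrained gate parameter into the open interval (-1, 1)."""
    return np.tanh(raw)

def gate_grad(raw):
    """Derivative of tanh; it shrinks toward 0 as |raw| grows."""
    return 1.0 - np.tanh(raw) ** 2

raw_values = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
coefficients = gate(raw_values)   # bounded regardless of how large raw gets
slopes = gate_grad(raw_values)    # largest at raw = 0, tiny at the tails

# Assumed usage: scale a nominal displacement of 5 pixels by its gate.
nominal_shift = 5.0
gated_shift = coefficients * nominal_shift
```

Because the gate output is bounded and its slope decays for large inputs, a gated quantity cannot grow without limit during optimization, which matches the stability role described above.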
[0103] In some embodiments, for each location $(P_v(i), P_u(j))$, the position after mapping by $T_\theta$ is expressed as follows:
[0104] Here, the inverse transformation of the initial differentiable space transformation matrix is applied, thus obtaining the position (m, n) in the second image.
[0105] Step 204: Compare the second image with the standard face template image to obtain the initial face alignment error sample.
[0106] In this embodiment of the application, the second image can be considered as an image obtained after mapping. By comparing the second image with the standard face template image, the error of the low-quality image can be obtained, that is, the initial face alignment error sample can be obtained.
[0107] In some embodiments, the second image and the standard face template image are compared to obtain an initial face alignment error sample, which may specifically include the following steps:
[0108] Sub-step 2041: Interpolate the pixels in the second image using a bilinear interpolation algorithm to obtain the third image.
[0109] Sub-step 2042: Compare the third image with the standard face template image to obtain the initial face alignment error sample.
[0110] It should be noted that since (m, n) may not coincide with integer pixel indices in the input image, bilinear interpolation B is introduced to calculate the specific pixel values. For the transformed image $\hat{I}$ obtained from the original image I, the value of each pixel can be represented as:
[0111] Here, $P_v(i)$ represents the remapping result of the horizontal coordinate i, and $P_u(j)$ represents the remapping result of the vertical coordinate j. The transformation applied is an inverse transform, B represents bilinear interpolation, and the pixel position refers to coordinates in the transformed image.
[0112] It should be noted that bilinear interpolation extends linear interpolation to functions of two variables. Its core idea is to perform linear interpolation in two directions in turn (usually X and Y). It assumes that pixel values change linearly between known points: the four pixels closest to the target point are found, and the target point's pixel value is calculated from their values through two successive one-dimensional linear interpolations (X then Y, or Y then X). In image processing, bilinear interpolation produces smoother results and avoids abrupt changes in pixel values. However, because it acts as a low-pass filter, it may blur the image somewhat.
[0113] Suppose the value of the function f is known at the four points $Q_{11} = (x_1, y_1)$, $Q_{12} = (x_1, y_2)$, $Q_{21} = (x_2, y_1)$, and $Q_{22} = (x_2, y_2)$. To find the value of f at the point P = (x, y), interpolation can first be performed in the x-direction to obtain the values at R1 and R2, and then linear interpolation can be performed in the y-direction to obtain the value at P.
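The two-pass procedure described above can be sketched in Python as follows. This is a minimal illustration for a single-channel image stored as a NumPy array; the function name `bilinear_sample` and the convention that x indexes columns and y indexes rows are assumptions made for the example.

```python
import numpy as np

def bilinear_sample(image, x, y):
    """Sample a 2-D array at the real-valued position (x, y).

    x indexes columns and y indexes rows; both must lie inside the image.
    """
    height, width = image.shape
    x1, y1 = int(np.floor(x)), int(np.floor(y))
    # Clamp the upper neighbours so positions on the last row/column work.
    x2, y2 = min(x1 + 1, width - 1), min(y1 + 1, height - 1)
    tx, ty = x - x1, y - y1
    # Interpolate along x on the two neighbouring rows (R1 and R2).
    r1 = (1 - tx) * image[y1, x1] + tx * image[y1, x2]
    r2 = (1 - tx) * image[y2, x1] + tx * image[y2, x2]
    # Interpolate along y between R1 and R2 to get the value at P.
    return (1 - ty) * r1 + ty * r2

grid = np.array([[0.0, 10.0],
                 [20.0, 30.0]])
center = bilinear_sample(grid, 0.5, 0.5)  # midpoint of the four pixels
```

At the midpoint of the four pixels the result is their average, and at integer positions the function returns the stored pixel value unchanged.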
[0114] Thus, the third image obtained after interpolation is compared with the standard face template image to obtain the initial face alignment error sample.
[0115] Step 205: Input the initial face alignment error samples into the initial face recognition network to obtain the initial loss function.
[0116] In the embodiments of this application, during adversarial data augmentation training, an initial loss function can be obtained first. This initial loss function can be a loss function commonly used in face recognition, such as CosFace or ArcFace.
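For orientation, the following sketch shows the general shape of the two margin-based losses named above for a single sample: CosFace subtracts a margin from the true-class cosine, while ArcFace adds a margin to the true-class angle. The function name, the toy class weights, and the scale and margin values are illustrative assumptions, and the sketch omits the handling of angles beyond π that practical ArcFace implementations include.

```python
import numpy as np

def margin_softmax_loss(embedding, class_weights, label,
                        scale=64.0, margin=0.35, kind="cosface"):
    """Softmax cross-entropy with a margin on the true-class cosine."""
    # Normalise so that dot products equal cosines of the angles.
    e = embedding / np.linalg.norm(embedding)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cosines = w @ e
    target = cosines[label]
    if kind == "cosface":
        target = target - margin                       # cos(theta) - m
    elif kind == "arcface":
        theta = np.arccos(np.clip(target, -1.0, 1.0))
        target = np.cos(theta + margin)                # cos(theta + m)
    else:
        raise ValueError(f"unknown kind: {kind}")
    logits = scale * cosines
    logits[label] = scale * target
    # Numerically stable log-softmax.
    logits = logits - logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]

# Toy setup: three identities in a 2-D embedding space.
weights = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
sample = np.array([0.8, 0.6])
loss_plain = margin_softmax_loss(sample, weights, 0, scale=10.0, margin=0.0)
loss_cos = margin_softmax_loss(sample, weights, 0, scale=10.0, margin=0.35)
loss_arc = margin_softmax_loss(sample, weights, 0, scale=10.0, margin=0.5,
                               kind="arcface")
```

In this toy setup both margins raise the loss relative to the plain softmax, which is the mechanism that pushes embeddings of the same identity closer together during training.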
[0117] Step 206: Backpropagate the initial differentiable gated space transformation matrix using the initial loss function to obtain the gradient descent change value.
[0118] It should be noted that the backpropagation algorithm can be understood as follows: input data is passed from the input layer through the hidden layers to the output layer; the network calculates the output of each node through the connections (i.e., weights) between layers, ultimately generating the network's prediction; the predicted output is then compared with the true value, and a loss function (such as mean squared error) is calculated; the error is then propagated backward through the layers to obtain the gradient of the loss with respect to each weight, and the weights are adjusted accordingly; these steps are repeated until a preset number of iterations is reached or another stopping condition is met. Backpropagation therefore drives an iterative process of updating parameters according to the loss function. In this embodiment, the initial loss function is backpropagated to the initial differentiable gated space transformation matrix. To ensure training stability, parameters are updated step by step. Optimization can thus be performed according to the gradient of the initial differentiable gated space transformation matrix, that is, by obtaining the gradient descent change value of the initial differentiable gated space transformation matrix.
[0119] In some embodiments, the optimization objective is to find a set of parameters θ that maximizes the learning objective L, i.e., to generate samples whose changes are barely noticeable to humans but can cause the machine learning model to make incorrect predictions: $\theta^* = \arg\max_\theta L(F_\psi; T_\theta(I), y)$
[0120] Here, y represents the identity information corresponding to face image I, and L represents a commonly used loss function for face recognition (such as CosFace or ArcFace). Finally, combining the parameters θ of the initial differentiable gated space transformation matrix $T_\theta$ with the parameters ψ of the face recognition network $F_\psi$, the overall optimization objective is:
[0121] In `argmax g(t)`, "arg" refers to the argument of the function: `argmax` returns the element or set of elements of the domain at which `g(t)` attains its maximum, rather than the maximum value itself. It is therefore used to find the parameters or variables that maximize the objective function. Similarly, `argmin g(t)` returns the element or set of elements of the domain at which `g(t)` attains its minimum, and is used to find the parameters or variables that minimize the objective function.
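The distinction between the maximizing argument and the maximum value can be shown on a small grid search. The function g and the grid below are arbitrary choices made only for this illustration.

```python
import numpy as np

# Evaluate g(t) = -(t - 0.5)^2 + 3 on a grid of candidate arguments.
t = np.linspace(-2.0, 2.0, 401)
g = -(t - 0.5) ** 2 + 3.0

best_t = t[np.argmax(g)]    # argmax: the argument that maximises g
best_value = g.max()        # max: the maximum value itself
worst_t = t[np.argmin(g)]   # argmin: the argument that minimises g
```

Here `best_t` is close to 0.5 while `best_value` is close to 3, which shows that `argmax` answers "where" rather than "how large".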
[0122] Step 207: The initial differentiable gated space transformation matrix is updated using the gradient descent algorithm based on the gradient descent change value, the preset step size, and the preset adversarial perturbation range, to obtain the target differentiable gated space transformation matrix.
[0123] It's important to note that gradient descent is an optimization algorithm based on the gradient information of a function. The gradient is a vector representing the direction in which the function changes most rapidly (i.e., the rate of change is greatest) at a given point. In machine learning, to find the minimum value of the objective function, an iterative search can be performed along the opposite direction of the gradient.
[0124] In machine learning, the objective function is typically the loss function, used to evaluate the model's performance. The gradient is obtained by differentiating the objective function. The gradient is a vector whose direction is the direction in which the function value increases most rapidly, and its magnitude is the rate of change of the function in that direction. Then, the parameters are updated in the opposite direction of the gradient; that is, the new values of the parameters are equal to the old values minus the gradient multiplied by a step size (also called the learning rate). These steps are repeated iteratively until a termination condition is met (such as the magnitude of the gradient vector approaching zero, or reaching the maximum number of iterations).
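The update rule described above (new value equals old value minus step size times gradient) can be sketched for a one-dimensional function. The function name `gradient_descent`, the quadratic objective, and the default step size and tolerance are illustrative assumptions.

```python
def gradient_descent(grad, start, step_size=0.1, max_iters=1000, tol=1e-8):
    """Minimise a 1-D function given its derivative `grad`."""
    x = start
    for _ in range(max_iters):
        g = grad(x)
        if abs(g) < tol:          # stop once the gradient is close to zero
            break
        x = x - step_size * g     # move against the gradient
    return x

# Objective f(x) = (x - 3)^2 with derivative f'(x) = 2 (x - 3).
minimum = gradient_descent(lambda x: 2.0 * (x - 3.0), start=0.0)
```

The loop ends either when the gradient magnitude falls below the tolerance or when the iteration limit is reached, which are the two termination conditions mentioned above.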
[0125] In some embodiments, the initial differentiable gated space transformation matrix is updated using a gradient descent algorithm based on the gradient descent change value, a preset step size, and a preset adversarial perturbation range to obtain the target differentiable gated space transformation matrix. Specifically, this may include the following steps 2071-2072.
[0126] Sub-step 2071: Project the initial differentiable gated space transformation matrix according to the preset adversarial perturbation range to obtain the first differentiable gated space transformation matrix.
[0127] Sub-step 2072: Based on the gradient descent change value and the preset step size, iteratively update the first differentiable gated space transformation matrix to obtain the target differentiable gated space transformation matrix.
[0128] In some embodiments, during the training process of updating the differentiable gated space transformation matrix, parameters are updated step by step to ensure training stability. The parameters of the initial face recognition network $F_\psi$ can be frozen first, backpropagation is then performed to obtain the gradient with respect to $T_\theta(I)$, and optimization is carried out by k-step projected gradient descent, specifically as follows: $D = \{\theta \mid \|T_\theta(I) - I\|_2 \le \rho\}$
[0129] Here, sgn represents the sign function, σ is the optimization step size, and D is the allowable adversarial perturbation range, which requires that the 2-norm of the difference between the face alignment error sample and the original sample not exceed the radius ρ; $\mathrm{proj}_D(d)$ denotes projecting d onto the set specified by D.
[0130] In other words, the adversarial perturbation range bounds the face alignment error samples that can be generated, and therefore bounds the differentiable gated space transformation matrix. The gradient descent calculation can thus be performed after projecting the differentiable gated space transformation matrix according to the adversarial perturbation range.
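A k-step projected update of this kind can be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the application: the constraint is applied directly to the parameter vector as an L2 ball of radius ρ around its starting value (rather than to the image difference), the projection is applied after every signed step, and the loss is a toy linear function. The names `project_l2_ball` and `k_step_pgd` are hypothetical.

```python
import numpy as np

def project_l2_ball(theta, center, radius):
    """Project theta onto the L2 ball of the given radius around center."""
    offset = theta - center
    norm = np.linalg.norm(offset)
    if norm <= radius:
        return theta
    return center + offset * (radius / norm)

def k_step_pgd(loss_grad, theta0, step_size, radius, k):
    """Take k signed steps that increase the loss, projecting after each."""
    theta = theta0.copy()
    for _ in range(k):
        # The step follows the gradient sign because the inner objective
        # is an argmax over theta.
        theta = theta + step_size * np.sign(loss_grad(theta))
        theta = project_l2_ball(theta, theta0, radius)
    return theta

# Toy loss L(theta) = w . theta, whose gradient is the constant vector w.
w = np.array([1.0, -2.0])
theta_start = np.zeros(2)
theta_adv = k_step_pgd(lambda th: w, theta_start,
                       step_size=0.1, radius=0.5, k=10)
```

After enough steps the iterate sits on the boundary of the allowed range, so the perturbation is as strong as the constraint permits but no stronger.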
[0131] Step 208: Map the initial training image through the target differentiable gated space transformation matrix to obtain the target face alignment error sample.
[0132] In this embodiment, after updating and optimizing the differentiable gated space transformation matrix, the parameters of the optimized target differentiable gated space transformation matrix can be fixed. The face recognition network can then be trained based on this optimized target differentiable gated space transformation matrix. Therefore, target face alignment error samples can be obtained first. Since the target differentiable gated space transformation matrix is an optimized and fixed matrix, the target face alignment error samples obtained by mapping the initial training image through the target differentiable gated space transformation matrix can be considered as samples that combine differentiable space transformation, adversarial data augmentation, and gating constraints, and can be used to train the face recognition model.
[0133] Step 209: Input the target face alignment error samples into the initial face recognition network to obtain the target loss function.
[0134] In this embodiment, the target loss function can be expressed by the following formula: $\text{loss} = L(F_\psi; I, y) + L(F_\psi; T_\theta(I), y) + l_2(F_\psi(T_\theta(I)), F_\psi(I))$
[0135] Here, $l_2$ is the squared loss function, which further enhances the network's anti-interference ability by narrowing the feature distance between the target face alignment error sample and the original sample.
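The three terms of this loss can be sketched for a single sample as follows. As assumptions made only for illustration, a plain softmax cross-entropy stands in for the face recognition loss L, the features of the original and transformed images are given directly as vectors, and the squared loss is taken as the sum of squared feature differences. The names `target_loss`, `classifier`, `feat_clean`, and `feat_adv` are hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Stand-in for the recognition loss L (not CosFace or ArcFace)."""
    shifted = logits - logits.max()
    return -(shifted[label] - np.log(np.exp(shifted).sum()))

def target_loss(feat_clean, feat_adv, classifier, label):
    """Clean loss + loss on the transformed sample + squared feature gap."""
    loss_clean = softmax_cross_entropy(classifier @ feat_clean, label)
    loss_adv = softmax_cross_entropy(classifier @ feat_adv, label)
    feature_gap = np.sum((feat_adv - feat_clean) ** 2)
    return loss_clean + loss_adv + feature_gap

classifier = np.array([[1.0, 0.0], [0.0, 1.0]])  # toy 2-class linear head
feat_clean = np.array([2.0, 0.0])                # features of I
feat_adv = np.array([1.5, 0.5])                  # features of T_theta(I)
total = target_loss(feat_clean, feat_adv, classifier, label=0)
```

When the two feature vectors coincide, the third term vanishes and the loss reduces to twice the clean recognition loss, so the extra term only penalizes feature drift caused by the transformation.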
[0136] Step 210: The initial face recognition network is updated based on the target loss function using the gradient descent algorithm to obtain the target face recognition model.
[0137] In this embodiment of the application, the process of updating the initial face recognition network is the same as the process of iteratively updating the differentiable gated space transformation matrix in the above steps. Both can be considered to be based on the loss function and implemented through the gradient descent algorithm, which will not be elaborated here.
[0138] In summary, in this embodiment, by updating and optimizing the differentiable gated space transformation matrix, a gating mechanism is introduced to adjust the gradient and avoid excessive perturbation. Furthermore, the parameters are optimized using the gradient returned by the face recognition network. Additionally, based on the network's learning progress, targeted face alignment error samples are generated, improving the face recognition model's accuracy for samples with alignment errors. Without introducing additional data or reducing the original recognition accuracy, algorithmic improvements specifically enhance the model's robustness to face alignment errors, thereby improving the recognition accuracy of existing models in non-cooperative scenarios.
[0139] Figure 3 is a flowchart of a face recognition model training method provided in an embodiment of this application. Referring to Figure 3, the face recognition model training method may include steps 301-306.
[0140] Step 301: Generate a differentiable gated spatial transformation matrix, define the mapping rules for each pixel, and initialize the differentiable gated spatial transformation matrix.
[0141] Step 302: Generate initial face alignment error samples.
[0142] Step 303: The face alignment error samples and the original samples are fed into the face recognition network to calculate the loss function.
[0143] Step 304: Freeze the face recognition network parameters and perform backpropagation to obtain the gradient of the differentiable gated space transformation matrix.
[0144] Step 305: Define the range of adversarial perturbation and optimize the differentiable gated space transformation matrix through k-step projected gradient descent.
[0145] Step 306: Fix the differentiable gated space transformation matrix, generate face alignment error samples, and feed them together with the original samples into the face recognition network to calculate the loss function and optimize the face recognition network.
[0146] As can be seen from Figure 3, the differentiable gated space transformation matrix and the face recognition network in this embodiment are updated and optimized iteratively: in each cycle, the loss function is recomputed from the newly generated error samples, and the matrix and the network are then updated and optimized again.
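The alternating cycle of steps 302 to 306 can be illustrated on a deliberately small toy problem. Everything in this sketch is an assumption made for illustration: the "network" is a single scalar weight w with prediction w·x, the spatial transformation is replaced by an additive shift δ on the input, the perturbation range is the interval [-ρ, ρ], and gradients are written out analytically.

```python
import numpy as np

def train(xs, ys, rho=0.1, sigma=0.05, k=3, lr=0.05, rounds=200):
    """Alternate between perturbation ascent and model descent."""
    w = 0.0
    for _ in range(rounds):
        # Inner phase: freeze w and take k signed ascent steps on delta,
        # clipping to [-rho, rho] as the projection onto the allowed range.
        delta = np.zeros_like(xs)
        for _ in range(k):
            grad_delta = 2.0 * (w * (xs + delta) - ys) * w
            delta = np.clip(delta + sigma * np.sign(grad_delta), -rho, rho)
        # Outer phase: fix delta and descend on the sum of the clean loss
        # and the loss on the perturbed samples with respect to w.
        x_adv = xs + delta
        grad_w = (np.mean(2.0 * (w * xs - ys) * xs)
                  + np.mean(2.0 * (w * x_adv - ys) * x_adv))
        w -= lr * grad_w
    return w

xs = np.array([1.0, 2.0, 3.0])
ys = 2.0 * xs                      # ground-truth weight is 2
w_final = train(xs, ys)
```

Because the perturbation is regenerated against the current weight in every round, the samples it produces track the model's learning progress, which is the behaviour the cycle in Figure 3 is meant to achieve.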
[0147] This application combines differentiable space transformation and adversarial data augmentation to dynamically generate alignment error samples that maximize training benefits, embedding differentiable space transformation and adversarial data augmentation into the existing face recognition model training process.
[0148] In the differentiable space transformation part, a differentiable gated space transformation matrix is designed, and a gating mechanism is introduced to adjust the gradient and avoid excessive perturbation. The parameters are optimized using the gradient returned by the face recognition network. During optimization, k-step projected gradient descent is introduced, and an adversarial range is designed as an optimization constraint to ensure that the optimization process is efficient and stable.
[0149] In the adversarial data augmentation part, an optimized adversarial training process is constructed for the face recognition problem. A step-by-step parameter update method is adopted to maximize the adversarial loss while minimizing the face recognition loss. Based on the network's learning progress at each time step, face alignment error samples are generated in a targeted manner. This improves the face recognition model's accuracy in recognizing samples with alignment errors without negatively affecting samples without alignment errors.
[0150] Figure 4 is a flowchart illustrating the steps of an application method for a face recognition model provided in an embodiment of this application. The application method for the face recognition model is also known as a face recognition method. Referring to Figure 4, the face recognition method may include:
[0151] Step 401: Obtain the low-quality image to be identified.
[0152] In this embodiment of the application, the low-quality image may be an image in which the user is not directly facing the camera, an image in which the user's facial expression is exaggerated, an image captured under extreme ambient lighting, and so on.
[0153] Step 402: Input the low-quality image to be recognized into the target face recognition model to obtain the output face recognition result.
[0154] The target face recognition model is trained using the face recognition model training method described in the above embodiments.
[0155] In summary, by inputting the low-quality image to be recognized into the target face recognition model, the desired face recognition result can be obtained, which can improve the recognition accuracy of the face recognition model for samples with alignment errors in non-cooperative scenarios.
[0156] Figure 5 is a schematic diagram of the structure of a face recognition model training device provided in an embodiment of this application. The face recognition model training device 500 may include:
[0157] The first acquisition module 501 is used to acquire the initial training image;
[0158] The image mapping module 502 is used to map the initial training image based on the preset initial differentiable gated space transformation matrix to obtain the initial face alignment error sample. The initial differentiable gated space transformation matrix is the transformation matrix obtained based on differentiability and a gating mechanism.
[0159] The backpropagation module 503 is used to backpropagate the preset initial face recognition network based on the initial face alignment error samples to obtain the gradient descent change value of the initial differentiable gated space transformation matrix.
[0160] The matrix update module 504 is used to update the initial differentiable gated space transformation matrix based on the gradient descent change value using the gradient descent algorithm, and generate target face alignment error samples.
[0161] The model update module 505 is used to update the initial face recognition network using the target face alignment error samples to obtain the target face recognition model.
[0162] In some embodiments, the image mapping module 502 may include:
[0163] The remapping submodule is used to remap the positions of pixels in the initial training image using an affine transformation matrix to obtain the first image.
[0164] The matrix inverse transformation submodule is used to perform an inverse transformation on the first image using an initial differentiable gated space transformation matrix to obtain the second image. The initial differentiable gated space transformation matrix is obtained by setting gate coefficients on the affine transformation matrix.
[0165] The image comparison submodule is used to compare the second image with the standard face template image to obtain the initial face alignment error sample.
[0166] In some embodiments, the image comparison submodule may include:
[0167] The interpolation unit is used to interpolate the pixels in the second image using a bilinear interpolation algorithm to obtain the third image.
[0168] The image comparison unit is used to compare the third image with the standard face template image to obtain the initial face alignment error sample.
[0169] In some embodiments, the backpropagation module 503 may include:
[0170] The first loss function generation submodule is used to input the initial face alignment error samples into the initial face recognition network to obtain the initial loss function;
[0171] The backpropagation submodule is used to backpropagate the initial differentiable gated space transformation matrix using the initial loss function to obtain the gradient descent change value.
[0172] In some embodiments, the matrix update module 504 may include:
[0173] The matrix update submodule is used to update the initial differentiable gated space transformation matrix using the gradient descent algorithm, based on the gradient descent change value, the preset step size, and the preset adversarial perturbation range, to obtain the target differentiable gated space transformation matrix.
[0174] The image mapping submodule is used to map the initial training image through the target differentiable gated space transformation matrix to obtain target face alignment error samples.
[0175] In some embodiments, the matrix update submodule may include:
[0176] The matrix projection unit is used to project the initial differentiable gated space transformation matrix according to the preset adversarial perturbation range to obtain the first differentiable gated space transformation matrix.
[0177] The matrix update unit is used to iteratively update the first differentiable gated space transformation matrix according to the gradient descent change value and the preset step size to obtain the target differentiable gated space transformation matrix.
[0178] In some embodiments, the model update module 505 may include:
[0179] The second loss function generation submodule is used to input the target face alignment error samples into the initial face recognition network to obtain the target loss function;
[0180] The model update submodule is used to update the initial face recognition network based on the target loss function using the gradient descent algorithm to obtain the target face recognition model.
[0181] As for the training device in the foregoing embodiments, since its principle is basically similar to that of the training method in the foregoing embodiments, the description is relatively brief. For relevant details, please refer to the description of the method embodiments.
[0182] Figure 6 is a schematic diagram of the structure of an application device for a face recognition model provided in an embodiment of this application. The application device 600 for the face recognition model may include:
[0183] The second acquisition module 601 is used to acquire the low-quality image to be identified;
[0184] The model application module 602 is used to input the low-quality image to be recognized into the target face recognition model and obtain the output face recognition result. The target face recognition model is trained by the face recognition model training method of the above embodiment.
[0185] As for the application device in the foregoing embodiments, since its principle is basically similar to that of the face recognition method in the foregoing embodiments, the description is relatively brief. For relevant details, please refer to the description of the method embodiments.
[0186] Referring to Figure 7, the electronic device 700 may include one or more of the following components: processing component 702, memory 704, power supply component 706, multimedia component 708, audio component 710, input / output (I / O) interface 712, sensor component 714, and communication component 716.
[0187] Processing component 702 typically controls the overall operation of electronic device 700, such as operations associated with display, telephone calls, data communication, camera operation, and recording operations. Processing component 702 may include one or more processors 720 to execute instructions to complete all or part of the steps of the methods described above. Furthermore, processing component 702 may include one or more modules to facilitate interaction between processing component 702 and other components. For example, processing component 702 may include a multimedia module to facilitate interaction between multimedia component 708 and processing component 702.
[0188] Memory 704 is used to store various types of data to support the operation of electronic device 700. Examples of this data include instructions for any application or method operating on electronic device 700, contact data, phonebook data, messages, pictures, multimedia, etc. Memory 704 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.
[0189] Power supply component 706 provides power to various components of electronic device 700. Power supply component 706 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to electronic device 700.
[0190] Multimedia component 708 includes a screen that provides an output interface between electronic device 700 and a user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touchscreen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundaries of touch or swipe actions but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, multimedia component 708 includes a front-facing camera and / or a rear-facing camera. When electronic device 700 is in an operating mode, such as a shooting mode or a multimedia mode, the front-facing camera and / or rear-facing camera may receive external multimedia data. Each front-facing camera and rear-facing camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
[0191] Audio component 710 is used to output and / or input audio signals. For example, audio component 710 includes a microphone (MIC) used to receive external audio signals when electronic device 700 is in an operating mode, such as call mode, recording mode, and voice recognition mode. The received audio signals may be further stored in memory 704 or transmitted via communication component 716. In some embodiments, audio component 710 also includes a speaker for outputting audio signals.
[0192] Input / output (I / O) interface 712 provides an interface between processing component 702 and peripheral interface modules, such as keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to, home buttons, volume buttons, power buttons, and lock buttons.
[0193] Sensor assembly 714 includes one or more sensors for providing state assessments of various aspects of electronic device 700. For example, sensor assembly 714 may detect the on / off state of electronic device 700, the relative positioning of components such as the display and keypad of electronic device 700, changes in position of electronic device 700 or a component of electronic device 700, the presence or absence of user contact with electronic device 700, orientation or acceleration / deceleration of electronic device 700, and temperature changes of electronic device 700. Sensor assembly 714 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor assembly 714 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, sensor assembly 714 may also include an accelerometer, gyroscope, magnetometer, pressure sensor, or temperature sensor.
[0194] Communication component 716 facilitates wired or wireless communication between electronic device 700 and other devices. Electronic device 700 can access wireless networks based on communication standards, such as WiFi, carrier networks (such as 2G, 3G, 4G, or 5G), or combinations thereof. In one exemplary embodiment, communication component 716 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, communication component 716 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
[0195] In an exemplary embodiment, the electronic device 700 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to implement the face recognition model training method or the face recognition method provided in the embodiments of this application.
[0196] In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as a memory 704 including instructions, which can be executed by a processor 720 of an electronic device 700 to perform the above-described method. For example, the non-transitory storage medium may be a ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, and optical data storage device, etc.
[0197] Figure 8 is a block diagram of an electronic device 800 according to another embodiment of this application. For example, the electronic device 800 may be provided as a server. Referring to Figure 8, the electronic device 800 includes a processing component 822, which further includes one or more processors, and memory resources represented by a memory 832 for storing instructions executable by the processing component 822, such as application programs. The application programs stored in the memory 832 may include one or more modules, each corresponding to a set of instructions. Furthermore, the processing component 822 is configured to execute instructions to perform the face recognition model training method or the face recognition method provided in the embodiments of this application.
[0198] Electronic device 800 may also include a power supply component 826 configured to perform power management of electronic device 800, a wired or wireless network interface 850 configured to connect electronic device 800 to a network, and an input / output (I / O) interface 858. Electronic device 800 may operate on an operating system stored in memory 832, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or similar.
[0199] In embodiments of this application, memory 832 can be used to store software programs and various data. Memory 832 may primarily include a first storage area for storing programs or instructions and a second storage area for storing data. The first storage area may store the operating system and the applications or instructions required for at least one function (such as sound playback, image playback, etc.). Furthermore, memory 832 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), or direct Rambus random access memory (DRRAM). The memory 832 in the embodiments of this application includes, but is not limited to, these and any other suitable types of memory.
[0200] The processor may include one or more processing units; optionally, the processor integrates an application processor and a modem processor, wherein the application processor mainly handles operations related to the operating system, user interface, and applications, while the modem processor, such as a baseband processor, mainly handles wireless communication signals. It is understood that the aforementioned modem processor need not be integrated into the processor.
[0201] This application also provides a readable storage medium storing a program or instructions. When the program or instructions are executed by a processor, they implement the various processes of the above-described face recognition model training method or face recognition method and achieve the same technical effect. To avoid repetition, these are not described again here.
[0202] The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer-readable storage media, such as computer read-only memory (ROM), random access memory (RAM), magnetic disk, or optical disk.
[0203] This application also provides a computer program product, which is stored in a storage medium and executed by at least one processor to implement the various processes of the embodiments of the face recognition model training method or face recognition method described above, and can achieve the same technical effect. To avoid repetition, these are not described again here.
[0204] Other embodiments of this application will readily occur to those skilled in the art upon consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of this application that follow the general principles of this application and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this application are indicated by the following claims.
[0205] It should be understood that this application is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this application is limited only by the appended claims.
Claims
1. A method for training a face recognition model, the method comprising: obtaining an initial training image; mapping the initial training image based on a preset initial differentiable gated space transformation matrix to obtain initial face alignment error samples, wherein the initial differentiable gated space transformation matrix is a transformation matrix obtained based on differentiability and a gating mechanism; performing backpropagation on a preset initial face recognition network based on the initial face alignment error samples to obtain a gradient descent change value of the initial differentiable gated space transformation matrix; updating the initial differentiable gated space transformation matrix based on the gradient descent change value using a gradient descent algorithm to generate target face alignment error samples; and updating the initial face recognition network using the target face alignment error samples to obtain a target face recognition model.
2. The method according to claim 1, wherein mapping the initial training image based on the preset initial differentiable gated space transformation matrix to obtain the initial face alignment error samples comprises: remapping pixel positions in the initial training image using an affine transformation matrix to obtain a first image; inversely transforming the first image using the initial differentiable gated space transformation matrix to obtain a second image, wherein the initial differentiable gated space transformation matrix is obtained by setting gate coefficients on the affine transformation matrix; and comparing the second image with a standard face template image to obtain the initial face alignment error samples.
3. The method according to claim 2, wherein comparing the second image with the standard face template image to obtain the initial face alignment error samples comprises: interpolating pixels in the second image using a bilinear interpolation algorithm to obtain a third image; and comparing the third image with the standard face template image to obtain the initial face alignment error samples.
4. The method according to claim 1, wherein performing backpropagation on the preset initial face recognition network based on the initial face alignment error samples to obtain the gradient descent change value of the initial differentiable gated space transformation matrix comprises: inputting the initial face alignment error samples into the initial face recognition network to obtain an initial loss function; and backpropagating the initial loss function to the initial differentiable gated space transformation matrix to obtain the gradient descent change value.
5. The method according to claim 1, wherein updating the initial differentiable gated space transformation matrix based on the gradient descent change value using the gradient descent algorithm to generate the target face alignment error samples comprises: updating the initial differentiable gated space transformation matrix using the gradient descent algorithm based on the gradient descent change value, a preset step size, and a preset adversarial perturbation range to obtain a target differentiable gated space transformation matrix; and mapping the initial training image using the target differentiable gated space transformation matrix to obtain the target face alignment error samples.
6. The method according to claim 5, wherein updating the initial differentiable gated space transformation matrix using the gradient descent algorithm based on the gradient descent change value, the preset step size, and the preset adversarial perturbation range to obtain the target differentiable gated space transformation matrix comprises: projecting the initial differentiable gated space transformation matrix according to the preset adversarial perturbation range to obtain a first differentiable gated space transformation matrix; and iteratively updating the first differentiable gated space transformation matrix based on the gradient descent change value and the preset step size to obtain the target differentiable gated space transformation matrix.
7. The method according to claim 1, wherein updating the initial face recognition network using the target face alignment error samples to obtain the target face recognition model comprises: inputting the target face alignment error samples into the initial face recognition network to obtain a target loss function; and updating the initial face recognition network using the gradient descent algorithm based on the target loss function to obtain the target face recognition model.
8. The method according to claim 4, wherein performing backpropagation on the preset initial face recognition network based on the initial face alignment error samples to obtain the gradient descent change value of the initial differentiable gated space transformation matrix further comprises: freezing the parameters of the initial face recognition network and performing backpropagation to obtain the gradient descent change value of the initial differentiable gated space transformation matrix.
9. A face recognition method, the method comprising: acquiring a low-quality image to be recognized; and inputting the low-quality image to be recognized into a target face recognition model to obtain an output face recognition result, wherein the target face recognition model is trained by the method for training a face recognition model according to any one of claims 1 to 8.
10. A training device for a face recognition model, the device comprising: a first acquisition module configured to acquire an initial training image; an image mapping module configured to map the initial training image based on a preset initial differentiable gated space transformation matrix to obtain initial face alignment error samples, wherein the initial differentiable gated space transformation matrix is a transformation matrix obtained based on differentiability and a gating mechanism; a backpropagation module configured to perform backpropagation on a preset initial face recognition network based on the initial face alignment error samples to obtain a gradient descent change value of the initial differentiable gated space transformation matrix; a matrix update module configured to update the initial differentiable gated space transformation matrix based on the gradient descent change value using a gradient descent algorithm, thereby generating target face alignment error samples; and a model update module configured to update the initial face recognition network using the target face alignment error samples to obtain a target face recognition model.
11. An application device for a face recognition model, the device comprising: a second acquisition module configured to acquire a low-quality image to be recognized; and a model application module configured to input the low-quality image to be recognized into a target face recognition model to obtain an output face recognition result, wherein the target face recognition model is trained by the method for training a face recognition model according to any one of claims 1 to 8.
12. An electronic device, comprising: a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the method according to any one of claims 1 to 9.
13. A readable storage medium, wherein instructions in the readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any one of claims 1 to 9.
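As an illustration of the mapping recited in claims 2 and 3, the following minimal Python sketch builds a gated 2x3 affine matrix, inverse-maps each output pixel through it, and reads the source image with bilinear interpolation. The gating formula M = I + g * (A - I), the function names `gated_affine`, `bilinear_sample` and `warp`, the border clamping, and the grayscale list-of-rows image layout are all assumptions made for illustration; the claims state only that gate coefficients are set on the affine transformation matrix and do not fix these details.

```python
from math import floor


def gated_affine(affine, gates):
    """Blend a 2x3 affine matrix with the identity, entry by entry.

    ASSUMPTION: each gate g scales that entry's deviation from the identity,
    M = I + g * (A - I). The claims do not specify the gating formula.
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    return [
        [identity[r][c] + gates[r][c] * (affine[r][c] - identity[r][c])
         for c in range(3)]
        for r in range(2)
    ]


def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows) at real-valued (x, y)."""
    h, w = len(image), len(image[0])
    # ASSUMPTION: coordinates outside the image are clamped to the border.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(floor(x)), int(floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(image, matrix):
    """Inverse-map every output pixel (u, v) to a source point and sample it.

    ASSUMPTION: the 2x3 matrix maps output coordinates to source coordinates.
    """
    h, w = len(image), len(image[0])
    out = []
    for v in range(h):
        row = []
        for u in range(w):
            src_x = matrix[0][0] * u + matrix[0][1] * v + matrix[0][2]
            src_y = matrix[1][0] * u + matrix[1][1] * v + matrix[1][2]
            row.append(bilinear_sample(image, src_x, src_y))
        out.append(row)
    return out
```

With every gate set to 0 the matrix reduces to the identity and `warp` returns the image unchanged; with every gate set to 1 the full affine misalignment is applied. Because both the gating and the interpolation are built from additions and multiplications, the output pixel values vary smoothly with the matrix entries, which is the property a gradient-based update of the matrix relies on.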
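The two-phase update recited in claims 4 to 8 can be sketched on a toy problem as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the "network" is a single scalar weight `w`, the transformation is reduced to two parameters `theta = [scale, offset]`, the loss is a squared error with hand-written gradients, and the names `project`, `update_matrix`, `loss_and_grads` and `train_step` are hypothetical. Two further assumptions are flagged in the comments: the `direction` argument (the claims refer only to a gradient descent algorithm, whereas producing harder samples usually means moving toward higher loss) and re-projection after every step.

```python
def project(theta, center, radius):
    """Clamp each parameter to [center - radius, center + radius]."""
    return [min(max(t, c - radius), c + radius) for t, c in zip(theta, center)]


def update_matrix(theta, center, grad_fn, step, radius, iters, direction=1.0):
    """Project first, then take `iters` gradient steps of size `step`.

    ASSUMPTIONS: direction=+1.0 moves toward higher loss (harder samples) and
    direction=-1.0 is plain descent; re-projecting after each step is a common
    choice that the claims do not state.
    """
    theta = project(theta, center, radius)
    for _ in range(iters):
        grad = grad_fn(theta)
        theta = [t + direction * step * g for t, g in zip(theta, grad)]
        theta = project(theta, center, radius)
    return theta


def loss_and_grads(w, theta, x, y):
    """Toy stand-in: sample = theta[0] * x + theta[1], prediction = w * sample."""
    sample = theta[0] * x + theta[1]
    err = w * sample - y
    loss = err * err
    grad_theta = [2 * err * w * x, 2 * err * w]  # gradient w.r.t. the transform
    grad_w = 2 * err * sample                    # gradient w.r.t. the network
    return loss, grad_theta, grad_w


def train_step(w, x, y, center, step, radius, iters, lr):
    """One alternation: update the transform with w frozen, then update w."""
    # Phase 1 (claims 4 and 8): w is held fixed; only theta is changed.
    def grad_fn(th):
        return loss_and_grads(w, th, x, y)[1]

    theta = update_matrix(list(center), center, grad_fn, step, radius, iters)
    # Phase 2 (claim 7): gradient descent on the loss of the perturbed sample.
    loss, _, grad_w = loss_and_grads(w, theta, x, y)
    return w - lr * grad_w, theta, loss
```

In a real system the hand-written gradients would be replaced by automatic differentiation through the recognition network and the image warp, and `radius` would play the role of the preset adversarial perturbation range that keeps the generated misalignment within plausible bounds.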