A low-resolution face recognition method
By combining edge detection and GAN networks, low-resolution face images are identified and repaired, solving the problem of low accuracy in low-resolution face recognition and achieving higher recognition accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- FUJIAN JOYUSING TECHNOLOGY CO LTD
- Filing Date
- 2022-12-26
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies have low accuracy in low-resolution face recognition, especially in scenarios involving identity verification and face recognition of surveillance data, where the reduced image resolution leads to low recognition accuracy.
Face region images are extracted using a face detection algorithm. Edge detection operators are used to calculate image blurriness to determine if the image is blurry. Blurry images are repaired using a pre-trained super-resolution inpainting network to generate high-resolution images, which are then used for face recognition. Non-blurry images are directly input into the face recognition network. The super-resolution inpainting network uses a Generative Adversarial Network (GAN) and is iteratively trained and optimized through generators and discriminators.
It improves the accuracy of low-resolution face recognition by combining image reconstruction and direct recognition, thereby enhancing the recognition effect.
Figure CN116259087B_ABST
Description
Technical Field
[0001] This invention relates to the fields of machine learning and image recognition technology, specifically a low-resolution face recognition method.
Background Technology
[0002] Current facial recognition technologies are all based on deep learning, using large amounts of facial data to train algorithm models. These models fall into categories such as classification-based, triplet-based, and contrastive learning. Because the training datasets consist mostly of clear, high-resolution images, accuracy is often low when comparing low-resolution images.
[0003] The low-resolution face recognition of this invention is mainly used in scenarios such as identity verification and face recognition on surveillance data. In these scenarios the resolution of the face image may be reduced for several reasons: the limited capacity of the ID card chip forces the ID photo to be compressed; a long monitoring distance makes the face small, so resolution drops after magnification; or the lens is defocused during shooting, or pedestrians moving relative to the lens cause motion blur. All of these degrade image quality. Existing face recognition methods achieve high accuracy on clear photos, but relatively low accuracy on low-resolution face images.
[0004] The prior art CN109886135A, a low-resolution face recognition method, device, and storage medium, first interpolates and downsamples a high-resolution face image to obtain a low-resolution face image. It then constructs a high-resolution face reconstruction network and a face recognition network to obtain the high-resolution face reconstruction image and the face recognition result, respectively. That method trains on low-resolution images derived from high-resolution ones, but it cannot determine the blur level of an image input into the face recognition model, and image reconstruction is not always necessary. It is therefore necessary to first determine the blur level of the input image and decide from the result whether image reconstruction is needed. Furthermore, the existing image reconstruction model and face recognition model are not optimal, so the recognition accuracy for low-resolution images is not high enough.
Summary of the Invention
[0005] To address the problems existing in the prior art, this invention proposes a low-resolution face recognition method.
[0006] The technical solution of the present invention is as follows:
[0007] In one aspect, this invention proposes a low-resolution face recognition method, the specific steps of which include:
[0008] The input image is processed using a face detection algorithm to detect faces and obtain the face region image.
[0009] The horizontal and vertical gradients of the face region image are extracted using an edge detection operator, and a first image blur is calculated using these gradients. Gaussian blur is then applied to the face region, and the horizontal and vertical gradients of the blurred image are extracted again using the edge detection operator. A second image blur is calculated using these gradients. The ratio of the first image blur to the second image blur is compared with a pre-set threshold to determine whether the face region image is a blurred image.
[0010] If the face region image is a blurry image, the pre-trained super-resolution inpainting network is used to inpaint the face region image to generate a corresponding high-definition face image, which is then input into the pre-trained face recognition network for face recognition. If the face region image is not a blurry image, it is directly input into the pre-trained face recognition network for face recognition.
[0011] In a preferred embodiment, the pre-trained super-resolution inpainting network is a GAN network, including a generator network and a discriminator network. The pre-training steps of the super-resolution inpainting network are as follows:
[0012] Several high-definition face images are randomly collected, and the high-definition faces are blurred to obtain the corresponding blurred face images;
[0013] Several sets of feature images are obtained, each set of feature images includes a high-resolution face image and a corresponding blurred face image, and high-resolution and blurred labels are added to the high-resolution face image and the corresponding blurred face image respectively to form a training sample set;
[0014] The blurred face image from each set of feature images is input into the generator network, which outputs a restored high-resolution face image. The restored image is then input into the discriminator network and compared with the original high-resolution face image to judge its authenticity. Based on the judgment result, the parameters of the generator network are adjusted by backpropagation until the discriminator network judges the restored high-resolution face image to be real. The parameters of the discriminator network are then adjusted to improve its judgment accuracy. Training is iterated with the goal of minimizing the difference between the restored high-resolution face image and the high-resolution face image in the feature images, alternating between the discriminator network and the generator network. When the iteration ends, the optimal super-resolution restoration network is output.
[0015] In a preferred embodiment, the step of blurring the high-definition face to obtain the corresponding blurred face image specifically includes:
[0016] Simulate a scenario of small image magnification: first shrink the acquired high-definition face image, and then enlarge it using a quadratic interpolation method;
[0017] Simulate a defocused and blurred scene: Apply Gaussian blur to the acquired high-resolution face image;
[0018] Simulate motion blur scene: Perform two-dimensional filtering on the acquired high-definition face image using a dynamic blur kernel.
[0019] In a preferred embodiment, the loss function of the GAN network includes the reconstruction loss function of the generator network and the adversarial loss function of the discriminator network:
[0020] The reconstruction loss function is the average loss error between the high-resolution face image restored by the generator and the high-resolution face image in the feature image. The specific formula is as follows:
[0021]
[0022] In the formula, $I^{SR}$ is the high-resolution face image in the feature image, $I^{LR}$ is the corresponding blurred face image, W is the image width, H is the image height, and G(·) is the high-resolution face image restored by the generator;
[0023] The adversarial loss function is the error between the ground truth label of the high-resolution face image restored by the generator and the high-resolution face image in the feature image, where the high-resolution face image in the feature image is labeled as "1". The specific formula is as follows:
[0024]
[0025] In the formula, N is the number of training samples in a single batch; $I^{SR}$ is the high-resolution face image in the feature image; $I^{LR}$ is the corresponding blurred face image; G is the generator model, and D is the discriminator model.
[0026] The reconstruction loss and the adversarial loss together constitute the overall loss function of the model, as shown in the following formula:
[0027]
[0028] In the formula, the terms are, in order, the reconstruction loss coefficient, the reconstruction loss function, the adversarial loss coefficient, and the adversarial loss function.
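The three loss formulas themselves are not reproduced in this text. One reading consistent with the symbol definitions above is sketched below. The mean-squared-error form of the reconstruction loss, the binary cross-entropy form of the adversarial loss, and the symbol names $L_{rec}$, $L_{adv}$, $\lambda_{rec}$ and $\lambda_{adv}$ are assumptions for illustration, not notation confirmed by the patent.

```latex
% Reconstruction loss: mean squared pixel error (assumed form)
L_{rec} = \frac{1}{WH} \sum_{x=1}^{W} \sum_{y=1}^{H}
          \left( I^{SR}_{x,y} - G\!\left(I^{LR}\right)_{x,y} \right)^{2}

% Adversarial loss: real images labeled 1, restored images labeled 0 (assumed form)
L_{adv} = -\frac{1}{N} \sum_{i=1}^{N}
          \left[ \log D\!\left(I^{SR}_{i}\right)
               + \log\!\left(1 - D\!\left(G\!\left(I^{LR}_{i}\right)\right)\right) \right]

% Overall loss: weighted sum of the two terms
L = \lambda_{rec} L_{rec} + \lambda_{adv} L_{adv}
```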
[0029] In a preferred embodiment, the pre-training step of the face recognition network specifically includes:
[0030] Several high-resolution face images are acquired and normalized. Then, the high-resolution face images are reduced in size and then enlarged, Gaussian blurred, and motion blurred. The processed images are then subjected to blur judgment. Images that are judged to be blurry are input into the trained super-resolution restoration model for restoration, and a simulated image is output.
[0031] Acquire feature images, which include simulated images and high-definition face images, and add generated image and actual image labels to the simulated images and high-definition face images respectively to form a training sample set;
[0032] A face recognition network is established, which is a convolutional neural network. The simulated image in the feature image is taken as input, and the output is a high-definition face image corresponding to the identity of the simulated image. Iterative training is performed with the goal of minimizing the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image. A loss function is constructed to measure the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image, and the optimal face recognition model is output.
[0033] The expression for the above loss function is as follows:
[0034]
[0035] In the formula, N is the number of samples in a batch, s is the scale factor, θ is the angle between the feature x and the weight W, m is the margin coefficient, and n is the number of classes.
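The recognition loss formula is likewise not reproduced in this text. The listed symbols (N, s, θ, m, n) match a margin-based softmax loss, and one candidate is the large-margin cosine form sketched below. This is an assumption: the patent describes its loss as newly constructed, and an additive angular margin $\cos(\theta + m)$ would fit the same symbols. The index $y_i$, denoting the true class of sample $i$, is also introduced here for illustration.

```latex
L = -\frac{1}{N} \sum_{i=1}^{N}
    \log \frac{e^{\,s\left(\cos\theta_{y_i} - m\right)}}
              {e^{\,s\left(\cos\theta_{y_i} - m\right)}
               + \sum_{j=1,\; j \neq y_i}^{n} e^{\,s\cos\theta_{j}}}
```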
[0036] In another aspect, this invention proposes a low-resolution face recognition system, which comprises the following modules:
[0037] Face capture module: Uses face detection algorithms to detect faces in the input image and obtains face region images;
[0038] Face blur determination module: Extracts the horizontal and vertical gradients of the face region image using an edge detection operator, and calculates the first image blur using these gradients; applies Gaussian blur to the face region, and extracts the horizontal and vertical gradients of the blurred image again using an edge detection operator, and calculates the second image blur using these gradients; determines whether the face region image is a blurred image by comparing the ratio of the first image blur to the second image blur with a preset threshold.
[0039] Face restoration module: If the face region image is a blurry image, the pre-trained super-resolution restoration network is used to restore the face region image and generate the corresponding high-definition face image.
[0040] Face recognition module: Input the face image that is not a blurred image or the corresponding high-definition face image generated after being repaired by a pre-trained super-resolution inpainting network into the pre-trained face recognition network for face recognition.
[0041] In a preferred embodiment, the pre-trained super-resolution inpainting network is a GAN network, including a generator network and a discriminator network. The pre-training steps of the super-resolution inpainting network are as follows:
[0042] Several high-definition face images are randomly collected, and the high-definition faces are blurred to obtain the corresponding blurred face images;
[0043] Several sets of feature images are obtained, each set of feature images includes a high-resolution face image and a corresponding blurred face image, and high-resolution and blurred labels are added to the high-resolution face image and the corresponding blurred face image respectively to form a training sample set;
[0044] The blurred face image from each set of feature images is input into the generator network, which outputs a restored high-resolution face image. The restored image is then input into the discriminator network and compared with the original high-resolution face image to judge its authenticity. Based on the judgment result, the parameters of the generator network are adjusted by backpropagation until the discriminator network judges the restored high-resolution face image to be real. The parameters of the discriminator network are then adjusted to improve its judgment accuracy. Training is iterated with the goal of minimizing the difference between the restored high-resolution face image and the high-resolution face image in the feature images, alternating between the discriminator network and the generator network. When the iteration ends, the optimal super-resolution restoration network is output.
[0045] In a preferred embodiment, the step of blurring the high-definition face to obtain the corresponding blurred face image specifically includes:
[0046] Simulate a scenario of small image magnification: first shrink the acquired high-definition face image, and then enlarge it using a quadratic interpolation method;
[0047] Simulate a defocused and blurred scene: Apply Gaussian blur to the acquired high-resolution face image;
[0048] Simulate motion blur scene: Perform two-dimensional filtering on the acquired high-definition face image using a dynamic blur kernel.
[0049] In a preferred embodiment, the loss function of the GAN network includes the reconstruction loss function of the generator network and the adversarial loss function of the discriminator network:
[0050] The reconstruction loss function is the average loss error between the high-resolution face image restored by the generator and the high-resolution face image in the feature image. The specific formula is as follows:
[0051]
[0052] In the formula, $I^{SR}$ is the high-resolution face image in the feature image, $I^{LR}$ is the corresponding blurred face image, W is the image width, H is the image height, and G(·) is the high-resolution face image restored by the generator;
[0053] The adversarial loss function is the error between the ground truth label of the high-resolution face image restored by the generator and the high-resolution face image in the feature image, where the high-resolution face image in the feature image is labeled as "1". The specific formula is as follows:
[0054]
[0055] In the formula, N is the number of training samples in a single batch; $I^{SR}$ is the high-resolution face image in the feature image; $I^{LR}$ is the corresponding blurred face image; G is the generator model, and D is the discriminator model.
[0056] The reconstruction loss and the adversarial loss together constitute the overall loss function of the model, as shown in the following formula:
[0057]
[0058] In the formula, the terms are, in order, the reconstruction loss coefficient, the reconstruction loss function, the adversarial loss coefficient, and the adversarial loss function.
[0059] In a preferred embodiment, the pre-training step of the face recognition network specifically includes:
[0060] Several high-resolution face images are acquired and normalized. Then, the high-resolution face images are reduced in size and then enlarged, Gaussian blurred, and motion blurred. The processed images are then subjected to blur judgment. Images that are judged to be blurry are input into the trained super-resolution restoration model for restoration, and a simulated image is output.
[0061] Acquire feature images, which include simulated images and high-definition face images, and add generated image and actual image labels to the simulated images and high-definition face images respectively to form a training sample set;
[0062] A face recognition network is established, which is a convolutional neural network. The simulated image in the feature image is taken as input, and the output is a high-definition face image corresponding to the identity of the simulated image. Iterative training is performed with the goal of minimizing the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image. A loss function is constructed to measure the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image, and the optimal face recognition model is output.
[0063] The expression for the above loss function is as follows:
[0064]
[0065] In the formula, N is the number of samples in a batch, s is the scale factor, θ is the angle between the feature x and the weight W, m is the margin coefficient, and n is the number of classes.
[0066] The present invention has the following beneficial effects:
[0067] 1. This invention assesses the blur of each acquired face image and, based on the degree of blur, determines whether image reconstruction is required. Face recognition is then performed on the reconstructed image, while images that do not require reconstruction are recognized directly.
[0068] 2. This invention uses a GAN network to reconstruct acquired images that are judged to be blurry. The generator network first generates a reconstructed image, which is then input into the discriminator network for authenticity judgment. The generator network is adjusted by backpropagating the judgment result. Through continuous iteration, the reconstructed image generated by the generator network is brought as close as possible to the real image.
[0069] 3. This invention provides a face recognition network model and constructs a new loss function, which can achieve higher recognition accuracy.
Attached Figure Description
[0070] Figure 1 is a flowchart of the present invention.
Detailed Implementation
[0071] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0072] It should be understood that the step numbers used in the text are for ease of description only and are not intended to limit the order in which the steps are performed.
[0073] It should be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms unless the context clearly indicates otherwise.
[0074] The terms “comprising” and “including” indicate the presence of the described feature, whole, step, operation, element and / or component, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components and / or collections thereof.
[0075] The term “and / or” refers to any combination of one or more of the associated listed items, as well as all possible combinations, and includes these combinations.
[0076] Example 1:
[0077] Referring to Figure 1, a low-resolution face recognition method comprises the following specific steps:
[0078] The input image is processed using a face detection algorithm to detect faces and obtain the face region image.
[0079] In this specific implementation, the Dlib face detection algorithm is used for face detection.
[0080] The horizontal and vertical gradients of the face region image are extracted using an edge detection operator, and a first image blur is calculated using these gradients. The image is then blurred using Gaussian blur, and the horizontal and vertical gradients of the blurred image are extracted again using the edge detection operator. A second image blur is calculated using these gradients. The ratio of the first image blur to the second image blur is compared with a pre-set threshold to determine whether the face region image is a blurred image.
[0081] In practice, this embodiment uses the Sobel operator to extract the horizontal and vertical gradients of the face region image.
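The blur check can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the patent does not say how the blur score is derived from the gradients, so the sketch assumes the mean squared Sobel gradient magnitude, a 3*3 Gaussian kernel, a threshold of 1.4, and the rule that a ratio below the threshold means blurry. All function names are hypothetical.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
GAUSS = ((1 / 16, 2 / 16, 1 / 16), (2 / 16, 4 / 16, 2 / 16), (1 / 16, 2 / 16, 1 / 16))


def convolve3x3(img, kernel):
    """Correlate a 2-D grayscale image with a 3x3 kernel, replicating borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out


def gradient_energy(img):
    """Assumed blur score: mean squared Sobel gradient magnitude."""
    gx = convolve3x3(img, SOBEL_X)  # horizontal gradient
    gy = convolve3x3(img, SOBEL_Y)  # vertical gradient
    h, w = len(img), len(img[0])
    total = sum(gx[y][x] ** 2 + gy[y][x] ** 2 for y in range(h) for x in range(w))
    return total / (h * w)


def blur_ratio(img):
    """Ratio of the score before Gaussian re-blurring to the score after it."""
    first = gradient_energy(img)
    second = gradient_energy(convolve3x3(img, GAUSS))
    return first / second if second > 0 else 1.0


def is_blurry(img, threshold=1.4):
    """Assumed rule: a ratio below the threshold marks the image as blurry."""
    return blur_ratio(img) < threshold
```

The intuition behind the assumed rule is that re-blurring a sharp image removes a large share of its gradient energy, so the ratio is high, whereas an already blurry image changes little and its ratio stays close to 1.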
[0082] If the face region image is a blurry image, the pre-trained super-resolution inpainting network is used to inpaint the face region image to generate a corresponding high-definition face image, which is then input into the pre-trained face recognition network for face recognition. If the face region image is not a blurry image, it is directly input into the pre-trained face recognition network for face recognition.
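The routing described in this step can be sketched as a small function. The three callables stand in for the blur judgment, the super-resolution restoration network, and the face recognition network; their names and signatures are hypothetical placeholders, not interfaces defined by the patent.

```python
from typing import Callable, TypeVar

Image = TypeVar("Image")
Identity = TypeVar("Identity")


def recognize_face(
    face: Image,
    is_blurry: Callable[[Image], bool],
    restore: Callable[[Image], Image],
    identify: Callable[[Image], Identity],
) -> Identity:
    """Route one face crop: restore it first only when it is judged blurry."""
    if is_blurry(face):
        face = restore(face)
    return identify(face)
```

Passing the three stages as arguments keeps the routing logic independent of any particular detector or network implementation.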
[0083] In a preferred embodiment of this invention, the pre-trained super-resolution inpainting network is a GAN network, comprising a generator network and a discriminator network. The pre-training steps of the super-resolution inpainting network are as follows:
[0084] Several high-definition face images are randomly collected, and the high-definition faces are blurred to obtain the corresponding blurred face images;
[0085] In specific implementation, this embodiment uses a high-definition face dataset to simulate three low-resolution image scenarios for data preprocessing: 1. Simulating a small image magnification scenario, the data is reduced to 56*56 and then magnified to 112*112 using quadratic interpolation; 2. Simulating a defocused blur scenario, the data is processed using Gaussian blur; 3. Simulating a motion blur scenario, the data is processed using a dynamic blur kernel for two-dimensional filtering.
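The three degradations can be sketched in plain Python on a grayscale image stored as a list of rows. Several details are assumptions: bilinear interpolation stands in for the interpolation method named above, the Gaussian kernel is 5*5 with sigma 1.0, and the motion kernel is a horizontal line of five equal weights. The patent does not give kernel sizes, and all function names are hypothetical.

```python
import math


def resize_bilinear(img, new_h, new_w):
    """Resize a 2-D image with bilinear interpolation (assumed method)."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(new_h):
        # Map the output pixel centre back into source coordinates.
        sy = min(max((i + 0.5) * h / new_h - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for j in range(new_w):
            sx = min(max((j + 0.5) * w / new_w - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def filter2d(img, kernel):
    """Correlate img with an odd-sized kernel, replicating border pixels."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    yy = min(max(y + a - oy, 0), h - 1)
                    xx = min(max(x + b - ox, 0), w - 1)
                    acc += kernel[a][b] * img[yy][xx]
            out[y][x] = acc
    return out


def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2-D Gaussian kernel (size and sigma are assumed values)."""
    c = size // 2
    k = [[math.exp(-((a - c) ** 2 + (b - c) ** 2) / (2 * sigma ** 2))
          for b in range(size)] for a in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def motion_kernel(length=5):
    """Horizontal linear motion: a single row of equal weights (assumed shape)."""
    return [[1.0 / length] * length]


def degrade(img, mode):
    """Apply one of the three simulated low-resolution scenarios."""
    h, w = len(img), len(img[0])
    if mode == "downscale":  # shrink to half size, then enlarge back
        return resize_bilinear(resize_bilinear(img, h // 2, w // 2), h, w)
    if mode == "defocus":
        return filter2d(img, gaussian_kernel())
    if mode == "motion":
        return filter2d(img, motion_kernel())
    raise ValueError(f"unknown mode: {mode}")
```

Halving and then restoring the size mirrors the 112*112 to 56*56 to 112*112 round trip of this embodiment.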
[0086] Several sets of feature images are obtained, each set of feature images includes a high-resolution face image and a corresponding blurred face image, and high-resolution and blurred labels are added to the high-resolution face image and the corresponding blurred face image respectively to form a training sample set;
[0087] In practice, the high-definition data and the processed blurred data are paired, and the data pairs are then divided into a training set, a validation set, and a test set in a ratio of 8:1:1.
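A minimal sketch of the 8:1:1 split follows. Shuffling before splitting and the fixed seed are assumptions added for reproducibility, and `split_pairs` is a hypothetical helper name.

```python
import random


def split_pairs(pairs, ratios=(8, 1, 1), seed=0):
    """Shuffle (high-definition, blurred) pairs and split them by the given ratio."""
    items = list(pairs)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test
```

Keeping each high-definition image and its blurred counterpart in one tuple ensures that a pair never straddles two subsets.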
[0088] The blurred face image from each set of feature images is input into the generator network, which outputs a restored high-resolution face image. The restored image is then input into the discriminator network and compared with the original high-resolution face image to judge its authenticity. Based on the judgment result, the parameters of the generator network are adjusted by backpropagation until the discriminator network judges the restored high-resolution face image to be real. The parameters of the discriminator network are then adjusted to improve its judgment accuracy. Training is iterated with the goal of minimizing the difference between the restored high-resolution face image and the high-resolution face image in the feature images, alternating between the discriminator network and the generator network. When the iteration ends, the optimal super-resolution restoration network is output.
[0089] In specific implementation, the generator model and discriminator model structures used in this embodiment are as follows:
[0090] The generator model structure is as follows:
[0091] INPUT->Conv1->Block1->Block2->Block3->Block4->OUTPUT
[0092] The input layer is INPUT, which contains blurred image data with dimensions of 112*112*3.
[0093] Convolutional layer Conv1, with dimensions of 9*9*64 and a stride of 1;
[0094] The residual module Block1 has dimensions of 3*3*96 and a step size of 1.
[0095] The residual module Block2 has dimensions of 3*3*96 and a step size of 1.
[0096] The residual module Block3 has dimensions of 3*3*96 and a step size of 1.
[0097] The residual module Block4 has dimensions of 3*3*96 and a step size of 1.
[0098] Output layer OUTPUT;
[0099] The residual module Block structure is as follows:
[0100] INPUT->Conv1->BatchNorm1->PReLU->Conv2->BatchNorm2->ElementwiseSum->OUTPUT
[0101] Input layer INPUT;
[0102] Convolutional layer Conv1, with dimensions of 3*3*96;
[0103] Batch normalization layer BatchNorm1;
[0104] Activation function PReLU;
[0105] Convolutional layer Conv2, with dimensions of 3*3*96;
[0106] Batch normalization layer BatchNorm2;
[0107] Element-wise addition operation ElementwiseSum(input, batchnorm2);
[0108] Output layer;
[0109] The discriminator model structure is as follows:
[0110] INPUT->Conv1->Conv2->Conv3->Conv4->Dense1->Dense2->OUTPUT
[0111] The input layer INPUT has a size of 112*112*3.
[0112] Convolutional layer Conv1 has a size of 11*11*96 and a stride of 1.
[0113] Convolutional layer Conv2, with dimensions of 5*5*128 and a stride of 1;
[0114] Convolutional layer Conv3, with dimensions of 3*3*128 and a stride of 1;
[0115] Convolutional layer Conv4, with dimensions of 3*3*64 and a stride of 1;
[0116] The fully connected layer Dense1 has a size of 1024*1.
[0117] The fully connected layer Dense2 has a size of 1*1.
[0118] Output layer OUTPUT.
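As a rough size check, the weight counts of the four discriminator convolutions can be worked out from the listed dimensions. The sketch assumes that "k*k*c" means a k*k kernel with c output channels, that each layer takes the previous layer's output channels as input, and that every layer has a bias term. None of these points is stated explicitly, and the helper names are hypothetical.

```python
def conv_params(kernel, in_channels, out_channels, bias=True):
    """Number of trainable weights in one square 2-D convolution layer."""
    return kernel * kernel * in_channels * out_channels + (out_channels if bias else 0)


# (name, kernel size, output channels) as listed for the discriminator
DISCRIMINATOR_CONVS = [
    ("Conv1", 11, 96),
    ("Conv2", 5, 128),
    ("Conv3", 3, 128),
    ("Conv4", 3, 64),
]


def count_discriminator_conv_params(in_channels=3):
    """Chain the layers so each one consumes the previous layer's channels."""
    counts = {}
    for name, kernel, out_channels in DISCRIMINATOR_CONVS:
        counts[name] = conv_params(kernel, in_channels, out_channels)
        in_channels = out_channels
    return counts
```

The fully connected layers are left out because their input size depends on padding and pooling choices that are not specified.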
[0119] In a preferred embodiment of this invention, the step of blurring the high-definition face to obtain the corresponding blurred face image specifically includes:
[0120] Simulate a scenario of small image magnification: first shrink the acquired high-definition face image, and then enlarge it using a quadratic interpolation method;
[0121] Simulate a defocused and blurred scene: Apply Gaussian blur to the acquired high-resolution face image;
[0122] Simulate motion blur scene: Perform two-dimensional filtering on the acquired high-definition face image using a dynamic blur kernel.
[0123] In a preferred embodiment of this invention, the loss function of the GAN network includes the reconstruction loss function of the generator network and the adversarial loss function of the discriminator network:
[0124] The reconstruction loss function is the average loss error between the high-resolution face image restored by the generator and the high-resolution face image in the feature image. The specific formula is as follows:
[0125]
[0126] In the formula, $I^{SR}$ is the high-resolution face image in the feature image, $I^{LR}$ is the corresponding blurred face image, W is the image width, H is the image height, and G(·) is the high-resolution face image restored by the generator;
[0127] The adversarial loss function is the error between the ground truth label of the high-resolution face image restored by the generator and the high-resolution face image in the feature image, where the high-resolution face image in the feature image is labeled as "1". The specific formula is as follows:
[0128]
[0129] In the formula, N is the number of training samples in a single batch; $I^{SR}$ is the high-resolution face image in the feature image; $I^{LR}$ is the corresponding blurred face image; G is the generator model, and D is the discriminator model.
[0130] The reconstruction loss and the adversarial loss together constitute the overall loss function of the model, as shown in the following formula:
[0131]
[0132] In the formula, the terms are, in order, the reconstruction loss coefficient, the reconstruction loss function, the adversarial loss coefficient, and the adversarial loss function.
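A numerical sketch of how the two losses could be evaluated and combined is given below. It assumes a mean-squared-error reconstruction loss and a binary cross-entropy adversarial loss with real images labeled 1 and restored images labeled 0. The default coefficients are placeholders, since the patent gives no values, and all function names are hypothetical.

```python
import math


def reconstruction_loss(restored, target):
    """Mean squared pixel error between two equally sized 2-D images (assumed form)."""
    h, w = len(target), len(target[0])
    total = sum((restored[y][x] - target[y][x]) ** 2 for y in range(h) for x in range(w))
    return total / (w * h)


def adversarial_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy over a batch of discriminator outputs in [0, 1].

    d_real holds D(I_SR) for real images, d_fake holds D(G(I_LR)).
    """
    n = len(d_real)
    total = 0.0
    for p_real, p_fake in zip(d_real, d_fake):
        total += math.log(p_real + eps) + math.log(1.0 - p_fake + eps)
    return -total / n


def total_loss(rec, adv, rec_weight=1.0, adv_weight=1e-3):
    """Weighted sum of both terms; the default weights are placeholders."""
    return rec_weight * rec + adv_weight * adv
```

The small `eps` only guards against taking the logarithm of zero when the discriminator output saturates.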
[0133] In a preferred embodiment of this invention, the pre-training step of the face recognition network specifically includes:
[0134] Several high-resolution face images are acquired and normalized. Then, the high-resolution face images are reduced in size and then enlarged, Gaussian blurred, and motion blurred. The processed images are then subjected to blur judgment. Images that are judged to be blurry are input into the trained super-resolution restoration model for restoration, and a simulated image is output.
[0135] In specific implementation, this embodiment normalizes the face image size to 112*112*3, where 30% of the data is reduced in size and then enlarged, Gaussian blurred, and motion blurred.
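Selecting the 30% of images to degrade can be sketched as follows. The text does not say whether each selected image receives all three degradations or only one, so this sketch assumes one randomly chosen mode per image. The mode names, the seed, and `plan_degradation` are hypothetical.

```python
import random

MODES = ("downscale", "defocus", "motion")


def plan_degradation(n_images, fraction=0.3, seed=0):
    """Map each selected image index to one randomly chosen degradation mode."""
    rng = random.Random(seed)
    chosen = rng.sample(range(n_images), round(n_images * fraction))
    return {index: rng.choice(MODES) for index in chosen}
```

Images whose index is absent from the returned mapping are left as clear, high-definition samples.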
[0136] Acquire feature images, which include simulated images and high-definition face images, and add generated image and actual image labels to the simulated images and high-definition face images respectively to form a training sample set;
[0137] In practice, the processed images are mixed and shuffled with other images, and divided into training set, test set and validation set in a ratio of 8:1:1.
[0138] A face recognition network is established, which is a convolutional neural network. The simulated image in the feature image is taken as input, and the output is a high-definition face image corresponding to the identity of the simulated image. Iterative training is performed with the goal of minimizing the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image. A loss function is constructed to measure the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image, and the optimal face recognition model is output.
[0139] The expression for the above loss function is as follows:
[0140]
[0141] In the formula, N is the number of samples in a batch, s is the scale factor, θ is the angle between the feature x and the weight W, m is the margin coefficient, and n is the number of classes.
[0142] In specific implementation, the face recognition model used in this embodiment employs a convolutional neural network, including an input layer, a hidden layer, and an output layer, with the following specific structure:
[0143] INPUT->ResNet18->Dense->OUTPUT
[0144] INPUT is the input layer, and the input image size is 112*112*3;
[0145] ResNet18 is the hidden layer, comprising 17 convolutional layers and 1 fully connected layer;
[0146] Dense is a fully connected layer with a size of 3*3*512;
[0147] The OUTPUT output layer has an output size of 1*1*85172, where 85172 represents the number of categories in the dataset.
[0148] Furthermore, in the loss function, N is the batch size, set to 64; s is the scale factor, set to 30; and m is the margin coefficient, set to 0.35.
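With s = 30 and m = 0.35, a margin-based softmax loss can be evaluated as in the sketch below. It assumes the large-margin cosine form, in which the margin is subtracted from the cosine of the true class only. The patent's own formula is not reproduced in this text, so this may differ from it. The function name and the input layout are hypothetical.

```python
import math


def margin_softmax_loss(cosines, labels, s=30.0, m=0.35):
    """Average margin-softmax loss over a batch.

    cosines[i][j] is cos(theta) between sample i's feature and class j's weight;
    labels[i] is the index of sample i's true class.
    """
    total = 0.0
    for cos_row, label in zip(cosines, labels):
        # Subtract the margin from the true-class cosine only, then scale by s.
        logits = [s * (c - m) if j == label else s * c for j, c in enumerate(cos_row)]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum - logits[label]
    return total / len(labels)
```

Because the margin lowers the true-class logit, a sample must be separated from the other classes by more than m in cosine terms before its loss approaches zero.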
[0149] Example 2:
[0150] A low-resolution face recognition system, comprising the following modules:
[0151] Face capture module: Uses face detection algorithms to detect faces in the input image and obtains face region images;
[0152] Face blur determination module: Extracts the horizontal and vertical gradients of the face region image using an edge detection operator, and calculates the first image blur using these gradients; applies Gaussian blur to the face region, and extracts the horizontal and vertical gradients of the blurred image again using an edge detection operator, and calculates the second image blur using these gradients; determines whether the face region image is a blurred image by comparing the ratio of the first image blur to the second image blur with a preset threshold.
[0153] Face restoration module: If the face region image is a blurry image, the pre-trained super-resolution restoration network is used to restore the face region image and generate the corresponding high-definition face image.
[0154] Face recognition module: Inputs either the face region image that is not a blurred image, or the corresponding high-definition face image generated by the pre-trained super-resolution inpainting network, into the pre-trained face recognition network for face recognition.
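The blur check performed by the face blur determination module can be sketched as follows. This is a minimal illustration under stated assumptions: the Sobel operator stands in for "an edge detection operator", the "image blur" value is taken to be the mean gradient magnitude, and the kernel size, sigma, threshold of 1.5, and the direction of the comparison are hypothetical choices rather than values from the specification.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def filter2d(img, kernel):
    """Correlate a 2-D image with a kernel, using edge padding to keep the shape."""
    kh, kw = kernel.shape
    padded = np.pad(img.astype(np.float64), ((kh // 2,) * 2, (kw // 2,) * 2), mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def gaussian_kernel(size=5, sigma=1.5):
    """Normalized 2-D Gaussian kernel (size and sigma are illustrative)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def blur_score(img):
    """Mean magnitude of the horizontal and vertical gradients (assumed measure)."""
    gx = filter2d(img, SOBEL_X)
    gy = filter2d(img, SOBEL_Y)
    return float(np.mean(np.hypot(gx, gy)))

def blur_ratio(img):
    """Ratio of the first score (original) to the second score (after Gaussian blur)."""
    first = blur_score(img)
    second = blur_score(filter2d(img, gaussian_kernel()))
    return first / (second + 1e-12)

def is_blurry(img, threshold=1.5):
    # A sharp image loses most of its gradient when blurred, so its ratio is large;
    # an already blurry image changes little, so its ratio stays close to 1.
    return blur_ratio(img) < threshold

rng = np.random.default_rng(0)
sharp = rng.random((64, 64))                        # stand-in for a detailed face crop
soft = filter2d(sharp, gaussian_kernel(13, 3.0))    # the same crop, heavily blurred
ratios = (blur_ratio(sharp), blur_ratio(soft))
```

Comparing the image with a re-blurred copy of itself makes the measure relative, so it depends less on the absolute contrast of the face crop than a raw gradient threshold would.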
[0155] In a preferred embodiment of this invention, the pre-trained super-resolution inpainting network is a GAN network, comprising a generator network and a discriminator network. The pre-training steps of the super-resolution inpainting network are as follows:
[0156] Several high-definition face images are randomly collected, and the high-definition faces are blurred to obtain the corresponding blurred face images;
[0157] Several sets of feature images are obtained, each set of feature images includes a high-resolution face image and a corresponding blurred face image, and high-resolution and blurred labels are added to the high-resolution face image and the corresponding blurred face image respectively to form a training sample set;
[0158] The corresponding blurred face image from the feature image is input into the generator network, which outputs a restored high-resolution face image. The restored high-resolution face image is then input into the discriminator network and compared with the original high-resolution face image to determine the authenticity of the image. Based on the judgment result, the parameters of the generator network are adjusted through backpropagation until the discriminator network judges the restored high-resolution face image to be true. Then, the parameters of the discriminator network are adjusted to tune its accuracy. The goal is to minimize the difference between the restored high-resolution face image and the high-resolution face image in the feature image. Iterative training is performed, and the discriminator network and generator network are trained repeatedly. When the iteration ends, the optimal super-resolution restoration network is output.
[0159] In a preferred embodiment of this invention, the step of blurring the high-definition face to obtain the corresponding blurred face image specifically includes:
[0160] Simulate the enlargement of a small image: first shrink the acquired high-definition face image, and then enlarge it using a quadratic interpolation method;
[0161] Simulate a defocus blur scene: Apply Gaussian blur to the acquired high-resolution face image;
[0162] Simulate a motion blur scene: Perform two-dimensional filtering on the acquired high-definition face image using a motion blur kernel.
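The three degradations above can be sketched in NumPy as follows. This is an illustration under assumptions, not the embodiment's implementation: separable linear interpolation is swapped in for the quadratic interpolation named in the text, the motion blur kernel is a simple horizontal streak, and the function names, scale factor, kernel sizes, and sigma are hypothetical.

```python
import numpy as np

def filter2d(img, kernel):
    """Correlate a 2-D image with a kernel, using edge padding to keep the shape."""
    kh, kw = kernel.shape
    padded = np.pad(img.astype(np.float64), ((kh // 2,) * 2, (kw // 2,) * 2), mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def resize_linear(img, out_h, out_w):
    """Resize with separable linear interpolation (np.interp along each axis)."""
    h, w = img.shape
    xs = np.linspace(0, w - 1, out_w)
    ys = np.linspace(0, h - 1, out_h)
    rows = np.stack([np.interp(xs, np.arange(w), r) for r in img])             # (h, out_w)
    return np.stack([np.interp(ys, np.arange(h), c) for c in rows.T], axis=1)  # (out_h, out_w)

def shrink_then_enlarge(img, factor=4):
    """Small-image enlargement: shrink by `factor`, then enlarge to the original size."""
    h, w = img.shape
    small = resize_linear(img, h // factor, w // factor)
    return resize_linear(small, h, w)

def gaussian_blur(img, size=5, sigma=1.5):
    """Defocus blur: filter with a normalized Gaussian kernel."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return filter2d(img, kernel / kernel.sum())

def motion_blur(img, length=7):
    """Motion blur: two-dimensional filtering with a horizontal streak kernel."""
    kernel = np.zeros((length, length))
    kernel[length // 2, :] = 1.0 / length
    return filter2d(img, kernel)

rng = np.random.default_rng(0)
face = rng.random((64, 64))     # stand-in for a grayscale high-definition face crop
degraded = {f.__name__: f(face) for f in (shrink_then_enlarge, gaussian_blur, motion_blur)}
```

Each function returns an image of the same size as its input, so a degraded image can be paired pixel for pixel with its high-definition original when the training sample set is built.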
[0163] In a preferred embodiment of this invention, the loss function of the GAN network includes the reconstruction loss function of the generator network and the adversarial loss function of the discriminator network:
[0164] The reconstruction loss function is the average loss error between the high-resolution face image restored by the generator and the high-resolution face image in the feature image. The specific formula is as follows:
[0165]
[0166] In the formula, I^SR is the high-resolution face image in the feature image, I^LR is the corresponding blurred face image, W is the image width, H is the image height, and G(·) is the high-resolution face image restored by the generator;
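The formula is not reproduced in this text. One form consistent with the definitions above, assuming a pixel-wise mean squared error (an average absolute error would fit the description equally well), is sketched below. The symbol l_rec and the pixel indices i and j are assumptions introduced for illustration.

```latex
l_{rec} = \frac{1}{W H} \sum_{i=1}^{W} \sum_{j=1}^{H}
          \left( I^{SR}_{i,j} - G\!\left(I^{LR}\right)_{i,j} \right)^{2}
```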
[0167] The adversarial loss function is the error between the ground truth label of the high-resolution face image restored by the generator and the high-resolution face image in the feature image, where the high-resolution face image in the feature image is labeled as "1". The specific formula is as follows:
[0168]
[0169] In the formula, N is the number of training samples in a single batch; I^SR is the high-resolution face image in the feature image; I^LR is the corresponding blurred face image; G is the generator model, and D is the discriminator model.
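The formula is not reproduced in this text. A form consistent with the listed symbols, assuming the standard binary cross-entropy GAN objective in which real images are labeled "1" and generated images "0", is sketched below. The symbol l_adv and the sample index k are assumptions introduced for illustration.

```latex
l_{adv} = -\frac{1}{N} \sum_{k=1}^{N}
          \left[ \log D\!\left(I^{SR}_{k}\right)
               + \log\!\left(1 - D\!\left(G\!\left(I^{LR}_{k}\right)\right)\right) \right]
```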
[0170] The reconstruction loss and the adversarial loss together constitute the overall loss function of the model, as shown in the following formula:
[0171]
[0172] In the formula, the terms are, respectively, the reconstruction loss coefficient, the reconstruction loss function, the adversarial loss coefficient, and the adversarial loss function.
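How the two losses combine can be sketched as a weighted sum. The sketch assumes a mean squared error for the reconstruction term and a binary cross-entropy for the adversarial term, neither of which is confirmed by the text. The names `lambda_rec` and `lambda_adv`, their default values, and the toy inputs are hypothetical.

```python
import numpy as np

def reconstruction_loss(i_sr, restored):
    """Mean squared pixel error between the reference and the restored image (assumed form)."""
    return float(np.mean((i_sr - restored) ** 2))

def adversarial_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy with real images labeled 1 and restored images 0 (assumed form).

    d_real: discriminator outputs D(I_SR) in (0, 1), one per sample in the batch
    d_fake: discriminator outputs D(G(I_LR)) in (0, 1), one per sample in the batch
    """
    d_real = np.clip(d_real, eps, 1.0 - eps)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return float(-np.mean(np.log(d_real) + np.log(1.0 - d_fake)))

def total_loss(i_sr, restored, d_real, d_fake, lambda_rec=1.0, lambda_adv=1e-3):
    """Weighted sum of the two losses; both coefficients are hypothetical values."""
    return (lambda_rec * reconstruction_loss(i_sr, restored)
            + lambda_adv * adversarial_loss(d_real, d_fake))

rng = np.random.default_rng(0)
i_sr = rng.random((4, 112, 112, 3))     # batch of reference face images
restored = i_sr + 0.1                   # pretend generator output, off by 0.1 per pixel
d_real = np.full(4, 0.9)                # pretend discriminator scores for real images
d_fake = np.full(4, 0.1)                # pretend discriminator scores for restored images

rec = reconstruction_loss(i_sr, restored)
adv = adversarial_loss(d_real, d_fake)
total = total_loss(i_sr, restored, d_real, d_fake)
```

The two coefficients set the balance between pixel fidelity and realism. A larger reconstruction coefficient keeps the restored face close to the reference, while a larger adversarial coefficient favors sharper, more realistic texture.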
[0173] In a preferred embodiment of this invention, the pre-training step of the face recognition network specifically includes:
[0174] Several high-resolution face images are acquired and normalized. Then, the high-resolution face images are reduced in size and then enlarged, Gaussian blurred, and motion blurred. The processed images are then subjected to blur judgment. Images that are judged to be blurry are input into the trained super-resolution restoration model for restoration, and a simulated image is output.
[0175] Acquire feature images, which include simulated images and high-definition face images, and add generated image and actual image labels to the simulated images and high-definition face images respectively to form a training sample set;
[0176] A face recognition network, which is a convolutional neural network, is established. The simulated image in the feature image is taken as input, and the output is the high-definition face image corresponding to the identity of the simulated image. A loss function is constructed to measure the difference between the simulated image and the high-definition face image corresponding to its identity. Iterative training is performed with the goal of minimizing this difference, and the optimal face recognition model is output.
[0177] The expression for the above loss function is as follows:
[0178]
[0179] In the formula, N is the batch size, s is the scale factor, θ is the angle between the feature x and the weight W, m is the margin coefficient, and n is the number of classification categories.
[0180] The above description is merely an embodiment of the present invention and does not limit the patent scope of the present invention. Any equivalent structural or procedural transformations made based on the content of the present invention's specification and drawings, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of the present invention.
Claims
1. A low-resolution face recognition method, characterized in that the specific steps include: A face detection algorithm is used to detect faces in the input image and obtain the face region image. The horizontal and vertical gradients of the face region image are extracted using an edge detection operator, and a first image blur is calculated using these gradients. Gaussian blur is then applied to the face region, and the horizontal and vertical gradients of the blurred image are extracted again using the edge detection operator. A second image blur is calculated using these gradients. The ratio of the first image blur to the second image blur is compared with a preset threshold to determine whether the face region image is a blurred image. If the face region image is a blurry image, the pre-trained super-resolution inpainting network is used to inpaint the face region image to generate a corresponding high-definition face image, and then the high-definition face image is input into the pre-trained face recognition network for face recognition; if the face region image is not a blurry image, it is directly input into the pre-trained face recognition network for face recognition. The pre-training steps of the face recognition network are as follows: Several high-resolution face images are acquired and normalized. Then, the high-resolution face images are reduced in size and then enlarged, Gaussian blurred, and motion blurred. The processed images are then subjected to blur judgment. Images that are judged to be blurry are input into the trained super-resolution restoration model for restoration, and a simulated image is output. Acquire feature images, which include simulated images and high-definition face images, and add generated image and actual image labels to the simulated images and high-definition face images respectively to form a training sample set; A face recognition network is established, which is a convolutional neural network. The simulated image in the feature image is taken as input, and the output is a high-definition face image corresponding to the identity of the simulated image. Iterative training is performed with the goal of minimizing the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image. A loss function is constructed to measure the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image, and the optimal face recognition model is output. The expression for the above loss function is as follows: In the formula, N is the batch size, s is the scale factor, θ is the angle between the feature x and the weight W; m is the margin coefficient, and n is the number of classification categories.
2. The low-resolution face recognition method of claim 1, wherein the pre-trained super-resolution inpainting network is a GAN network, including a generator network and a discriminator network. The specific pre-training steps of the super-resolution inpainting network are as follows: Several high-definition face images are randomly collected, and the high-definition faces are blurred to obtain the corresponding blurred face images; Several sets of feature images are obtained, each set of feature images includes a high-resolution face image and a corresponding blurred face image, and high-resolution and blurred labels are added to the high-resolution face image and the corresponding blurred face image respectively to form a training sample set; The corresponding blurred face image from the feature image is input into the generator network, which outputs a restored high-resolution face image. The restored high-resolution face image is then input into the discriminator network and compared with the original high-resolution face image to determine the authenticity of the image. Based on the judgment result, the parameters of the generator network are adjusted through backpropagation until the discriminator network judges the restored high-resolution face image to be true. Then, the parameters of the discriminator network are adjusted to tune its accuracy. The goal is to minimize the difference between the restored high-resolution face image and the high-resolution face image in the feature image. Iterative training is performed, and the discriminator network and generator network are trained repeatedly. When the iteration ends, the optimal super-resolution restoration network is output.
3. The low-resolution face recognition method according to claim 2, characterized in that the step of blurring a high-definition face to obtain a corresponding blurred face image specifically includes: Simulate the enlargement of a small image: first shrink the acquired high-definition face image, and then enlarge it using a quadratic interpolation method; Simulate a defocus blur scene: Apply Gaussian blur to the acquired high-resolution face image; Simulate a motion blur scene: Perform two-dimensional filtering on the acquired high-definition face image using a motion blur kernel.
4. The low-resolution face recognition method according to claim 2, characterized in that the loss function of the GAN network includes the reconstruction loss function of the generator network and the adversarial loss function of the discriminator network: The reconstruction loss function is the average loss error between the high-resolution face image restored by the generator and the high-resolution face image in the feature image. The specific formula is as follows: In the formula, I^SR is the high-resolution face image in the feature image, I^LR is the corresponding blurred face image; W is the image width, H is the image height, and G(·) is the high-resolution face image restored by the generator; The adversarial loss function is the error between the ground truth labeling of the high-resolution face image restored by the generator and the high-resolution face image in the feature image, where the high-resolution face image in the feature image is labeled as "1". The specific formula is as follows: In the formula, N is the number of training samples in a single batch; I^SR is the high-resolution face image in the feature image; I^LR is the corresponding blurred face image; G is the generator model, and D is the discriminator model. The reconstruction loss and the adversarial loss together constitute the overall loss function of the model, as shown in the following formula: In the formula, the terms are, respectively, the reconstruction loss coefficient, the reconstruction loss function, the adversarial loss coefficient, and the adversarial loss function.
5. A low-resolution face recognition system, characterized in that it comprises the following modules: Face capture module: Uses face detection algorithms to detect faces in the input image and obtains face region images; Face blur determination module: Extracts the horizontal and vertical gradients of the face region image using an edge detection operator, and calculates the first image blur using these gradients; applies Gaussian blur to the face region, and extracts the horizontal and vertical gradients of the blurred image again using an edge detection operator, and calculates the second image blur using these gradients; determines whether the face region image is a blurred image by comparing the ratio of the first image blur to the second image blur with a preset threshold. Face restoration module: If the face region image is a blurry image, the pre-trained super-resolution restoration network is used to restore the face region image and generate the corresponding high-definition face image. Face recognition module: Inputs either the face region image that is not a blurred image, or the corresponding high-definition face image generated by the pre-trained super-resolution inpainting network, into the pre-trained face recognition network for face recognition; The pre-training steps of the face recognition network are as follows: Several high-resolution face images are acquired and normalized. Then, the high-resolution face images are reduced in size and then enlarged, Gaussian blurred, and motion blurred. The processed images are then subjected to blur judgment. Images that are judged to be blurry are input into the trained super-resolution restoration model for restoration, and a simulated image is output. Acquire feature images, which include simulated images and high-definition face images, and add generated image and actual image labels to the simulated images and high-definition face images respectively to form a training sample set; A face recognition network is established, which is a convolutional neural network. The simulated image in the feature image is taken as input, and the output is a high-definition face image corresponding to the identity of the simulated image. Iterative training is performed with the goal of minimizing the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image. A loss function is constructed to measure the difference between the simulated image and the high-definition face image corresponding to the identity of the simulated image, and the optimal face recognition model is output. The expression for the above loss function is as follows: In the formula, N is the batch size, s is the scale factor, θ is the angle between the feature x and the weight W; m is the margin coefficient, and n is the number of classification categories.
6. The low-resolution face recognition system of claim 5, wherein the pre-trained super-resolution inpainting network is a GAN network, including a generator network and a discriminator network. The specific pre-training steps of the super-resolution inpainting network are as follows: Several high-definition face images are randomly collected, and the high-definition faces are blurred to obtain the corresponding blurred face images; Several sets of feature images are obtained, each set of feature images includes a high-resolution face image and a corresponding blurred face image, and high-resolution and blurred labels are added to the high-resolution face image and the corresponding blurred face image respectively to form a training sample set; The corresponding blurred face image from the feature image is input into the generator network, which outputs a restored high-resolution face image. The restored high-resolution face image is then input into the discriminator network and compared with the original high-resolution face image to determine the authenticity of the image. Based on the judgment result, the parameters of the generator network are adjusted through backpropagation until the discriminator network judges the restored high-resolution face image to be true. Then, the parameters of the discriminator network are adjusted to tune its accuracy. The goal is to minimize the difference between the restored high-resolution face image and the high-resolution face image in the feature image. Iterative training is performed, and the discriminator network and generator network are trained repeatedly. When the iteration ends, the optimal super-resolution restoration network is output.
7. The low-resolution face recognition system of claim 6, wherein the step of blurring a high-definition face to obtain a corresponding blurred face image specifically includes: Simulate the enlargement of a small image: first shrink the acquired high-definition face image, and then enlarge it using a quadratic interpolation method; Simulate a defocus blur scene: Apply Gaussian blur to the acquired high-resolution face image; Simulate a motion blur scene: Perform two-dimensional filtering on the acquired high-definition face image using a motion blur kernel.
8. The low-resolution face recognition system of claim 6, wherein the loss function of the GAN network includes the reconstruction loss function of the generator network and the adversarial loss function of the discriminator network: The reconstruction loss function is the average loss error between the high-resolution face image restored by the generator and the high-resolution face image in the feature image. The specific formula is as follows: In the formula, I^SR is the high-resolution face image in the feature image, I^LR is the corresponding blurred face image, W is the image width, H is the image height, and G(·) is the high-resolution face image restored by the generator; The adversarial loss function is the error between the ground truth labeling of the high-resolution face image restored by the generator and the high-resolution face image in the feature image, where the high-resolution face image in the feature image is labeled as "1". The specific formula is as follows: In the formula, N is the number of training samples in a single batch; I^SR is the high-resolution face image in the feature image; I^LR is the corresponding blurred face image; G is the generator model, and D is the discriminator model. The reconstruction loss and the adversarial loss together constitute the overall loss function of the model, as shown in the following formula: In the formula, the terms are, respectively, the reconstruction loss coefficient, the reconstruction loss function, the adversarial loss coefficient, and the adversarial loss function.