Face illumination processing method based on Retinex decomposition and generative adversarial network

A processing method and network technology, applied in the field of image processing and pattern recognition, which addresses the problems of poor local shadow handling, distortion of processed face images, and failure to recognize face images.

Active Publication Date: 2020-12-25
SOUTHEAST UNIV


Problems solved by technology

[0004] (1) Performance is poor under harsh night-time lighting conditions, local shadows are handled badly, and the effect of the lighting cannot be removed completely
[0005] (2) The processed face image is prone to distortion and dist...

Abstract

The invention discloses a face illumination processing method based on Retinex decomposition and a generative adversarial network. The framework comprises an illumination decomposition module, a face reconstruction module, a discriminator module and a face verification module. The illumination decomposition module is a convolutional neural network that takes a pair of face images as input and decomposes them into reflection components and illumination components through unsupervised learning. The face reconstruction module is an encoder-decoder convolutional neural network whose input comprises the reflection component and illumination component of a low-illumination face image together with a target illumination-level label, and it adjusts the illumination component of the low-illumination image to the target illumination level. The discriminator module judges the authenticity of the input face image through adversarial learning and classifies its illumination level. The face verification module comprises a pre-trained face classifier that ensures the generated face image and the target face image carry the same identity information. The method is highly robust, reconstructs faces well, and is suitable for face illumination processing under dark night-time illumination.

Application Domain

Character and pattern recognition; Neural architectures (+1)

Technology Topic

Computer vision; Generative adversarial network (+5)



Example Embodiment

[0079] The present invention is described in further detail below in conjunction with the accompanying drawings and specific embodiments:
[0080] The invention provides a face illumination processing method based on Retinex decomposition and a generative adversarial network. The model proposed by the method includes an illumination decomposition module, a face reconstruction module, a discriminator module and a face verification module. The illumination decomposition module extracts the reflection component and illumination component of the face image; the face reconstruction module adjusts the illumination level of the input face image; the discriminator module uses generative adversarial learning to ensure the authenticity of the synthesized face image; and the face verification module preserves the identity information of the synthesized face image.
[0081] The specific process is shown in Figure 1. This embodiment provides a face illumination processing method based on Retinex decomposition and a generative adversarial network; the specific implementation steps are as follows:
[0082] Step 1: Create a face illumination processing dataset using the CAS-PEAL dataset published by the Institute of Computing Technology, Chinese Academy of Sciences. As shown in Figure 2, the dataset contains a total of 1666 face pictures under 10 lighting conditions, of which 1498 face pictures (180 face instances) are training samples and the remaining 198 face pictures (20 face instances) are test samples.
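The patent does not specify how input/target image pairs are assembled from the dataset. The following is a minimal PyTorch Dataset sketch of one possible pairing strategy; the list of (path, identity, illumination level) tuples, the 128-pixel image size and the random same-identity pairing are all illustrative assumptions.

```python
import random
from PIL import Image
import torch
from torch.utils.data import Dataset
from torchvision import transforms

class FacePairDataset(Dataset):
    """Hypothetical pairing of CAS-PEAL-style images: for each input image,
    a target image of the same identity (possibly another illumination level)
    is drawn at random."""
    def __init__(self, samples, num_levels=10, size=128):
        # samples: list of (path, identity, illumination_level) tuples
        self.samples = samples
        self.by_identity = {}
        for path, ident, level in samples:
            self.by_identity.setdefault(ident, []).append((path, level))
        self.num_levels = num_levels
        self.tf = transforms.Compose([
            transforms.Resize((size, size)),
            transforms.ToTensor(),          # pixel values in [0, 1]
        ])

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, ident, level = self.samples[idx]
        tar_path, tar_level = random.choice(self.by_identity[ident])
        s_in = self.tf(Image.open(path).convert("RGB"))
        s_tar = self.tf(Image.open(tar_path).convert("RGB"))
        l_in = torch.nn.functional.one_hot(torch.tensor(level), self.num_levels).float()
        l_tar = torch.nn.functional.one_hot(torch.tensor(tar_level), self.num_levels).float()
        return s_in, s_tar, l_in, l_tar
```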
[0083] Step 2: Build the illumination decomposition module. This module consists of a convolutional neural network that performs Retinex illumination decomposition of face images and outputs the reflection component and illumination component of the face image;
[0084] Step 201: For a given input face image S_in and target face image S_tar, the input of the illumination decomposition module is the image pair {S_in, S_tar}. The module consists of 6 convolutional layers: the first convolutional layer uses a 9×9 kernel to learn the global information of the face image, and the remaining convolutional layers use 3×3 kernels with ReLU activation functions. Finally, a Sigmoid activation function normalizes the pixel values of the reflection component R and the illumination component I output by the network to the interval [0, 1]. The network contains no pooling layers and all convolutions use stride 1, so that the sizes of R and I match the size of the input image. The operation of the illumination decomposition module can be expressed as:
[0085] $R_{in}, I_{in}, R_{tar}, I_{tar} = Dec(S_{in}, S_{tar})$    (1)
[0086] where Dec(·) denotes the illumination decomposition module, and R_in, I_in, R_tar, I_tar are the Retinex illumination decomposition results of the input face image S_in and the target face image S_tar.
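A minimal PyTorch sketch of the decomposition network of Step 201 follows. The six convolutional layers, the 9×9 first kernel, the 3×3 kernels with ReLU, stride 1 without pooling and the Sigmoid normalization follow the text; the channel width of 64 and the 3-channel reflection / 1-channel illumination split are assumptions.

```python
import torch
import torch.nn as nn

class IlluminationDecomposition(nn.Module):
    """Sketch of the decomposition CNN: 6 conv layers, 9x9 kernel first,
    3x3 kernels with ReLU afterwards, stride 1, no pooling, Sigmoid outputs."""
    def __init__(self, ch=64):
        super().__init__()
        layers = [nn.Conv2d(3, ch, kernel_size=9, stride=1, padding=4), nn.ReLU(inplace=True)]
        for _ in range(4):
            layers += [nn.Conv2d(ch, ch, kernel_size=3, stride=1, padding=1), nn.ReLU(inplace=True)]
        # final conv: 3 reflectance channels + 1 illumination channel (assumed split)
        layers += [nn.Conv2d(ch, 4, kernel_size=3, stride=1, padding=1)]
        self.net = nn.Sequential(*layers)

    def decompose(self, s):
        out = torch.sigmoid(self.net(s))        # normalise R and I to [0, 1]
        return out[:, :3], out[:, 3:]           # R, I

    def forward(self, s_in, s_tar):
        # Shared weights applied to both images of the pair, as in Eq. (1).
        r_in, i_in = self.decompose(s_in)
        r_tar, i_tar = self.decompose(s_tar)
        return r_in, i_in, r_tar, i_tar
```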
[0087] Step 202: The illumination decomposition module performs unsupervised learning through the intrinsic constraints of face image pairs, and its objective function consists of the following parts.
[0088] (1) Reflection component consistency loss: according to Retinex theory, the reflection components of the input image and the target image are approximately identical, and the main difference lies in the illumination component. The reflection consistency loss constrains the distance between the reflection component R_in of the input image and the reflection component R_tar of the target image; its loss function can be expressed as:
[0089] $\mathcal{L}_{rc} = \left\| R_{in} - R_{tar} \right\|_1$    (2)
[0090] where $\mathcal{L}_{rc}$ denotes the reflection component consistency loss of the illumination decomposition module, and the L1-norm distance measures the degree of similarity between R_in and R_tar.
[0091] (2) Pixel regression loss: the reflection components {R_in, R_tar} and the illumination components {I_in, I_tar} can be multiplied element-wise to reconstruct the input face image and the target face image; the loss function can be defined as:
[0092] $\mathcal{L}_{pix}^{Dec} = \sum_{i \in \{in, tar\}} \sum_{j \in \{in, tar\}} \alpha_{ij} \left\| R_i \odot I_j - S_j \right\|_1$    (3)
[0093] where $\mathcal{L}_{pix}^{Dec}$ denotes the pixel regression loss of the illumination decomposition module, covering the input image, the target image and the cross-reconstructed images, and $\alpha_{ij}$ weights the pixel regression loss of the different images.
[0094] (3) Smoothing loss: a total-variation model is used to smooth the illumination components {I_in, I_tar} output by the illumination decomposition module and to filter noise; the loss can be expressed as:
[0095] $\mathcal{L}_{smooth}^{Dec} = \lambda_g \left( TV(I_{in}) + TV(I_{tar}) \right)$    (4)
[0096] where $\mathcal{L}_{smooth}^{Dec}$ denotes the smoothing loss of the illumination decomposition module, applied to the illumination components of the input and target face images, $TV(\cdot)$ denotes the total variation of an image, and $\lambda_g$ is a weight parameter that adjusts the smoothness of the image.
[0097] The objective function of the illumination decomposition module is a weighted combination of the losses of the different learning tasks; the final loss function can be expressed as:
[0098] $\mathcal{L}^{Dec} = \mathcal{L}_{rc} + \lambda_{pix} \mathcal{L}_{pix}^{Dec} + \lambda_{smooth} \mathcal{L}_{smooth}^{Dec}$    (5)
[0099] where $\lambda_{pix}$ and $\lambda_{smooth}$ denote the weight parameters of the different losses in the illumination decomposition module.
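For illustration, a PyTorch sketch of the decomposition objective of Eqs. (2)-(5) is given below; the gradient-based total-variation measure, the cross-reconstruction weights alpha and the combination weights are assumed values, not specified by the patent.

```python
import torch
import torch.nn.functional as F

def total_variation(img):
    """Mean absolute gradient of the image (a simple total-variation measure)."""
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return dh + dw

def decomposition_loss(r_in, i_in, r_tar, i_tar, s_in, s_tar,
                       alpha=None, lambda_g=1.0, w_pix=1.0, w_smooth=0.1):
    """Sketch of Eqs. (2)-(5); alpha, lambda_g, w_pix, w_smooth are assumed."""
    alpha = alpha or {("in", "in"): 1.0, ("in", "tar"): 0.5,
                      ("tar", "in"): 0.5, ("tar", "tar"): 1.0}
    comps = {"in": (r_in, i_in, s_in), "tar": (r_tar, i_tar, s_tar)}

    # (2) reflectance consistency: L1 distance between the two reflectance maps
    l_rc = F.l1_loss(r_in, r_tar)

    # (3) pixel regression over all (R_i, I_j) cross-reconstructions
    l_pix = 0.0
    for i, (r_i, _, _) in comps.items():
        for j, (_, i_j, s_j) in comps.items():
            l_pix = l_pix + alpha[(i, j)] * F.l1_loss(r_i * i_j, s_j)

    # (4) total-variation smoothing of the illumination maps
    l_smooth = lambda_g * (total_variation(i_in) + total_variation(i_tar))

    # (5) weighted combination
    return l_rc + w_pix * l_pix + w_smooth * l_smooth
```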
[0100] Step 3: Build the face reconstruction module. This module consists of an encoder-decoder convolutional neural network that reconstructs the illumination component of the face image and adjusts the illumination component of a low-light face image to the target illumination level.
[0101] Step 301: The face reconstruction module uses a U-NET encoder-decoder network to reconstruct the face illumination component. Its input has three parts: the reflection component R_in of the face image, the illumination component I_in of the face image and the target illumination label l_tar, where the illumination label uses one-hot encoding. The output of the face reconstruction module is the adjusted face illumination component I_rec. The encoding network uses 3×3 convolution kernels to extract illumination-invariant information from the face image, and the decoding network uses deconvolution to upsample the feature maps. A skip-connection strategy between the encoding and decoding networks preserves facial detail information. The operation of the face reconstruction module can be expressed as:
[0102] $I_{rec} = Rec(R_{in}, I_{in} \mid l_{tar})$    (6)
[0103] where Rec(·) denotes the face reconstruction module, R_in, I_in and l_tar denote the decomposed reflection component, the illumination component and the target illumination label respectively, and I_rec is the reconstructed face illumination component.
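A compact PyTorch sketch of such an encoder-decoder with skip connections follows; the network depth, channel widths, 4×4 deconvolution kernels and the spatial tiling of the one-hot label are assumptions, since Step 301 only fixes the 3×3 encoding convolutions, deconvolution upsampling and skip connections.

```python
import torch
import torch.nn as nn

class FaceReconstruction(nn.Module):
    """Sketch of the U-NET-style reconstruction module Rec(R_in, I_in | l_tar)."""
    def __init__(self, num_levels=10, ch=64):
        super().__init__()
        in_ch = 3 + 1 + num_levels          # R_in (3) + I_in (1) + tiled one-hot label
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, ch, 3, 2, 1), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 3, 2, 1), nn.ReLU(inplace=True))
        self.enc3 = nn.Sequential(nn.Conv2d(ch * 2, ch * 4, 3, 2, 1), nn.ReLU(inplace=True))
        self.dec3 = nn.Sequential(nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1), nn.ReLU(inplace=True))
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(ch * 4, ch, 4, 2, 1), nn.ReLU(inplace=True))
        self.dec1 = nn.ConvTranspose2d(ch * 2, 1, 4, 2, 1)

    def forward(self, r_in, i_in, l_tar):
        # Tile the one-hot illumination label over the spatial dimensions.
        label = l_tar[:, :, None, None].expand(-1, -1, r_in.size(2), r_in.size(3))
        x = torch.cat([r_in, i_in, label], dim=1)
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        d3 = self.dec3(e3)
        d2 = self.dec2(torch.cat([d3, e2], dim=1))   # skip connection
        i_rec = torch.sigmoid(self.dec1(torch.cat([d2, e1], dim=1)))
        return i_rec
```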
[0104] Step 302: The face reconstruction module combines pixel regression learning and generative adversarial learning to reconstruct face illumination components, and its objective function consists of the following parts.
[0105] (1) Pixel regression loss: the face illumination component I_rec output by the face reconstruction module is multiplied element-wise with the reflection component R_in to obtain the reconstructed face image S_rec. The L1-norm distance between the reconstructed face image and the target image S_tar is the pixel regression loss of the face reconstruction module, which can be defined as:
[0106] $\mathcal{L}_{pix}^{Rec} = \left\| R_{in} \odot I_{rec} - S_{tar} \right\|_1$    (7)
[0107] where $\mathcal{L}_{pix}^{Rec}$ denotes the pixel regression loss of the face reconstruction module, and R_in, I_rec and S_tar denote the reflection component of the input image, the illumination component of the reconstructed image and the target image, respectively.
[0108] (2) Cycle consistency loss: the face reconstruction module uses a closed-loop structure to retain the content information of the reconstructed image. Specifically, the adjusted face illumination component I_rec is fed back into the face reconstruction module and, guided by the illumination label l_in, is adjusted back to the illumination level of the input image; the loss function can be expressed as:
[0109] $\mathcal{L}_{cyc} = \left\| Rec(R_{in}, I_{rec} \mid l_{in}) - I_{in} \right\|_1$    (8)
[0110] where $\mathcal{L}_{cyc}$ denotes the cycle consistency loss of the face reconstruction module, I_in, R_in and I_rec denote the illumination component of the input image, the reflection component and the illumination component of the reconstructed image respectively, and l_in is the illumination label of the input image.
[0111] (3) Smoothing loss: the total-variation model is also used to smooth the illumination component output by the face reconstruction module; the loss function can be expressed as:
[0112] $\mathcal{L}_{smooth}^{Rec} = \lambda_g \, TV(I_{rec})$    (9)
[0113] where $\mathcal{L}_{smooth}^{Rec}$ denotes the smoothing loss of the face reconstruction module, $TV(\cdot)$ denotes the total variation of an image, and $\lambda_g$ is a weight parameter that adjusts the smoothness of the image.
[0114] (4) Adversarial loss: the face reconstruction module uses generative adversarial learning to synthesize a face image S_rec such that the discriminator module cannot judge the authenticity of S_rec; the loss function can be expressed as:
[0115] $\mathcal{L}_{adv}^{Rec} = \mathbb{E}\left[ \left( D_{src}(S_{rec}) - 1 \right)^2 \right]$    (10)
[0116] where $\mathcal{L}_{adv}^{Rec}$ denotes the adversarial loss of the face reconstruction module, S_rec denotes the synthesized face image, D(·) denotes the discriminator module and D_src(·) outputs the probability that S_rec is judged to be a real face image; the least-squares distance is used to improve the stability of generative adversarial learning.
[0117] (5) Label classification loss: the face reconstruction module uses the target illumination label as guidance to synthesize a face image with the specified illumination level, so that the discriminator module can classify the illumination level correctly; the loss function can be defined as:
[0118] $\mathcal{L}_{cls}^{Rec} = \mathbb{E}\left[ -\log D_{cls}(l_{tar} \mid S_{rec}) \right]$    (11)
[0119] where $\mathcal{L}_{cls}^{Rec}$ denotes the label classification loss of the face reconstruction module, l_tar denotes the target illumination level and $D_{cls}(l_{tar} \mid S_{rec})$ denotes the probability that the synthesized face image S_rec is correctly classified as the target illumination level.
[0120] (6) Perceptual loss: the synthesized face image S_rec output by the face reconstruction module is sent to the face verification module to ensure that S_rec carries the same face identity information as the input image S_in and the target image S_tar; the loss function can be defined as:
[0121] $\mathcal{L}_{per} = \left\| \varphi(S_{rec}) - \varphi(S_{in}) \right\|_2^2 + \left\| \varphi(S_{rec}) - \varphi(S_{tar}) \right\|_2^2$    (12)
[0122] where $\mathcal{L}_{per}$ denotes the perceptual loss of the face reconstruction module, $\varphi(\cdot)$ denotes the identity feature vector output by the face verification module, and the L2-norm distance measures the similarity of $\varphi(S_{rec})$, $\varphi(S_{in})$ and $\varphi(S_{tar})$.
[0123] The objective function of the face reconstruction module is a weighted combination of the losses of the different learning tasks; the final loss function can be expressed as:
[0124] $\mathcal{L}^{Rec} = \mathcal{L}_{pix}^{Rec} + \mu_1 \mathcal{L}_{cyc} + \mu_2 \mathcal{L}_{smooth}^{Rec} + \mu_3 \mathcal{L}_{adv}^{Rec} + \mu_4 \mathcal{L}_{cls}^{Rec} + \mu_5 \mathcal{L}_{per}$    (13)
[0125] where $\mu_1, \ldots, \mu_5$ denote the weight parameters of the different losses in the face reconstruction module.
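A sketch of the combined objective of Eqs. (7)-(13) is given below. It reuses the total_variation helper from the decomposition-loss sketch; the discriminator is assumed to return a (real/fake score, illumination-class logits) pair, the verifier an identity embedding (see the verification-module sketch in Step 5), and the loss weights are assumed hyper-parameters.

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(rec, disc, verifier, r_in, i_in, i_rec, l_in, l_tar,
                        s_in, s_tar, lambda_g=1.0, weights=None):
    """Sketch of Eqs. (7)-(13) for the face reconstruction module."""
    w = weights or {"cyc": 1.0, "smooth": 0.1, "adv": 1.0, "cls": 1.0, "per": 1.0}

    s_rec = r_in * i_rec                                      # synthesised face image
    l_pix = F.l1_loss(s_rec, s_tar)                           # (7) pixel regression

    i_cyc = rec(r_in, i_rec, l_in)                            # (8) cycle consistency
    l_cyc = F.l1_loss(i_cyc, i_in)

    l_smooth = lambda_g * total_variation(i_rec)              # (9) smoothing

    src_score, cls_logits = disc(s_rec)
    l_adv = ((src_score - 1) ** 2).mean()                     # (10) least-squares adversarial
    l_cls = F.cross_entropy(cls_logits, l_tar.argmax(dim=1))  # (11) target-level classification

    f_rec, f_in, f_tar = verifier(s_rec), verifier(s_in), verifier(s_tar)
    l_per = F.mse_loss(f_rec, f_in) + F.mse_loss(f_rec, f_tar)  # (12) perceptual

    # (13) weighted combination
    return (l_pix + w["cyc"] * l_cyc + w["smooth"] * l_smooth
            + w["adv"] * l_adv + w["cls"] * l_cls + w["per"] * l_per)
```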
[0126] Step 4: Build the discriminator module. The discriminator module learns to distinguish target face images from synthetic face images through generative adversarial learning, and classifies the illumination level of face images. The module consists of a convolutional neural network whose objective function consists of the following components.
[0127] (1) Adversarial loss: the discriminator module takes the target face image and the synthesized face image as input and judges the authenticity of this pair of images; the loss function can be defined as:
[0128] $\mathcal{L}_{adv}^{D} = \mathbb{E}\left[ \left( D_{src}(S_{tar}) - 1 \right)^2 \right] + \mathbb{E}\left[ D_{src}(S_{rec})^2 \right]$    (14)
[0129] where $\mathcal{L}_{adv}^{D}$ denotes the adversarial loss of the discriminator module, S_rec denotes the synthesized face image and S_tar denotes the target face image.
[0130] (2) Label classification loss: the discriminator module takes the target face image as input and classifies its illumination level; the loss function can be defined as:
[0131] $\mathcal{L}_{cls}^{D} = \mathbb{E}\left[ -\log D_{cls}(l_{tar} \mid S_{tar}) \right]$    (15)
[0132] where $\mathcal{L}_{cls}^{D}$ denotes the label classification loss of the discriminator module, l_tar denotes the target illumination level and $D_{cls}(l_{tar} \mid S_{tar})$ denotes the probability that the target face image S_tar is correctly classified as the target illumination level.
[0133] The objective function of the discriminator is a weighted combination of the losses of the different learning tasks; the final loss function can be expressed as:
[0134] $\mathcal{L}^{D} = \mathcal{L}_{adv}^{D} + \nu \, \mathcal{L}_{cls}^{D}$    (16)
[0135] where $\nu$ denotes the weight parameter of the label classification loss in the discriminator module.
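A sketch of a discriminator with a shared convolutional trunk, a real/fake head (D_src) and an illumination-level head (D_cls), together with the loss of Eqs. (14)-(16), is shown below; the depth, channel widths and 128×128 input size are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """Sketch of the discriminator: shared trunk, real/fake head and
    illumination-level classification head."""
    def __init__(self, num_levels=10, ch=64, img_size=128):
        super().__init__()
        layers, c_in = [], 3
        for i in range(4):
            layers += [nn.Conv2d(c_in, ch * 2 ** i, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True)]
            c_in = ch * 2 ** i
        self.trunk = nn.Sequential(*layers)
        self.src_head = nn.Conv2d(c_in, 1, 3, 1, 1)                  # real/fake score map
        self.cls_head = nn.Conv2d(c_in, num_levels, img_size // 16)  # illumination-level logits

    def forward(self, x):
        h = self.trunk(x)
        return self.src_head(h), self.cls_head(h).flatten(1)

def discriminator_loss(disc, s_tar, s_rec, l_tar, w_cls=1.0):
    """Sketch of Eqs. (14)-(16): least-squares adversarial loss plus
    illumination classification on the real (target) image."""
    real_score, real_cls = disc(s_tar)
    fake_score, _ = disc(s_rec.detach())
    l_adv = ((real_score - 1) ** 2).mean() + (fake_score ** 2).mean()   # (14)
    l_cls = F.cross_entropy(real_cls, l_tar.argmax(dim=1))              # (15)
    return l_adv + w_cls * l_cls                                        # (16)
```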
[0136] Step 5: Build the face verification module, which consists of a pre-trained VGGFace network and guarantees that the synthesized face image S_rec carries the same face identity information as the input image S_in and the target image S_tar. This module is used only to extract face identity features and to propagate the perceptual loss; its parameters are not updated.
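A sketch of the frozen verification module follows. torchvision does not ship VGGFace weights, so this stand-in builds a VGG16 backbone and assumes that pre-trained face-verification weights are loaded separately; the pooled feature vector plays the role of φ(·).

```python
import torch
import torch.nn as nn
from torchvision import models

class FaceVerifier(nn.Module):
    """Frozen identity-feature extractor (stand-in for the pre-trained VGGFace)."""
    def __init__(self, state_dict=None):
        super().__init__()
        vgg = models.vgg16()
        if state_dict is not None:
            vgg.load_state_dict(state_dict)      # hypothetical VGGFace weights
        self.features = vgg.features
        self.pool = nn.AdaptiveAvgPool2d(1)
        for p in self.parameters():
            p.requires_grad = False              # no parameter update

    def forward(self, x):
        return self.pool(self.features(x)).flatten(1)   # identity feature vector phi(x)
```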
[0137] Step 6: Model training. The integrated deep neural network model is built with the PyTorch open-source library, and the face illumination processing model is trained on an NVIDIA TITAN X GPU under the Ubuntu 18.04 operating system;
[0138] Step 601: Train the illumination decomposition module separately, so that the module can decompose the input face image into a reflection component and an illumination component, as shown in Figure 3.
[0139] Step 602: Train the entire face illumination processing framework, including the illumination decomposition module, the face reconstruction module, the discriminator module and the face verification module. The overall schematic diagram of the framework is shown in Figure 4.
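The two training stages of Steps 601 and 602 can be sketched as follows, reusing the modules and loss functions from the earlier sketches. The optimizer choice, learning rates and the joint fine-tuning of the decomposition module in the second stage are illustrative assumptions; `loader` is assumed to be a DataLoader over the FacePairDataset sketch.

```python
import torch

# Hypothetical two-stage training sketch for Step 6.
dec, rec = IlluminationDecomposition(), FaceReconstruction()
disc, verifier = Discriminator(), FaceVerifier()
opt_dec = torch.optim.Adam(dec.parameters(), lr=1e-4)
opt_rec = torch.optim.Adam(rec.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

# Stage 1 (Step 601): pre-train the decomposition module alone.
for s_in, s_tar, l_in, l_tar in loader:
    r_in, i_in, r_tar, i_tar = dec(s_in, s_tar)
    loss = decomposition_loss(r_in, i_in, r_tar, i_tar, s_in, s_tar)
    opt_dec.zero_grad(); loss.backward(); opt_dec.step()

# Stage 2 (Step 602): adversarial training of the full framework.
for s_in, s_tar, l_in, l_tar in loader:
    r_in, i_in, r_tar, i_tar = dec(s_in, s_tar)
    i_rec = rec(r_in, i_in, l_tar)
    s_rec = r_in * i_rec

    # Discriminator update (synthesized image detached inside the loss).
    d_loss = discriminator_loss(disc, s_tar, s_rec, l_tar)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator-side update of the reconstruction and decomposition modules.
    g_loss = reconstruction_loss(rec, disc, verifier, r_in, i_in, i_rec,
                                 l_in, l_tar, s_in, s_tar)
    opt_rec.zero_grad(); opt_dec.zero_grad()
    g_loss.backward()
    opt_rec.step(); opt_dec.step()
```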
[0140] Step 7: Use the trained model to test the illumination processing results. Given an input face image and a target illumination-level label, the model outputs a synthesized face image after illumination processing, as shown in Figure 5.
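For illustration, inference with the trained modules might look as follows, assuming a single normalized image tensor of shape (1, 3, H, W) and the module sketches above:

```python
import torch

@torch.no_grad()
def relight(dec, rec, s_in, target_level, num_levels=10):
    """Decompose the input face, adjust its illumination component to
    `target_level`, and recombine into the relit output image."""
    r_in, i_in = dec.decompose(s_in)                 # Retinex decomposition of the input
    l_tar = torch.nn.functional.one_hot(
        torch.tensor([target_level]), num_levels).float().to(s_in.device)
    i_rec = rec(r_in, i_in, l_tar)                   # illumination adjusted to the target level
    return r_in * i_rec                              # synthesised face image S_rec
```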
[0141] The above is only a preferred embodiment of the present invention and does not limit the present invention in any other form; any modification or equivalent change made according to the technical essence of the present invention still falls within the scope of protection claimed by the present invention.
