Facial contour automatic smoothing method and system based on CNN, and storage medium
By employing a CNN-based automatic facial contour smoothing method, which utilizes facial contour mask maps and optical flow maps, the problem of poor automatic contour adjustment in existing technologies is solved. This method achieves facial contour smoothing in different scenarios, improving the robustness of the model and the clarity of the output.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- XIAMEN MEITUZHIJIA TECH
- Filing Date
- 2023-04-20
- Publication Date
- 2026-06-26
Figure CN116452453B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of image processing technology, and in particular to a CNN-based automatic facial contour smoothing method, as well as an image processing apparatus, device, and computer-readable storage medium for applying the method.
Background Technology
[0002] As the saying goes, "True beauty lies in the bone structure, not just the skin," meaning that beauty is not about individual features, but rather the skeletal structure of the face. A visually appealing facial contour is often determined by its contour lines. The outer contour of the face is what we commonly refer to as the face shape, and everyone's face shape is different; there is no single, universally accepted standard for a beautiful face shape. Facial contour lines provide a direct visual impression; smooth curves give the face a sense of enclosure, making it appear firm, full, youthful, and vibrant. The visual perception of the facial contour is primarily influenced by the forehead, temples, brow ridge, cheekbone, jaw angle, and chin. The outer contour determines the overall smoothness of the face's shape; if the overall flow is natural, a sophisticated look naturally emerges.
[0003] From a general aesthetic perspective, regardless of whether it's an oval, heart-shaped, round, or square face, smooth facial contours are a fundamental requirement for enhancing beauty. However, in reality, not everyone has smooth facial contours. Many people seeking cosmetic enhancement face various facial issues, such as prominent cheekbones or facial hollowing, and urgently need to improve the aesthetics of their facial contours. Therefore, when taking photos, users also prefer smooth, flowing facial lines to highlight a youthful and full-bodied appearance.
[0004] However, most beauty apps currently on the market use traditional image processing algorithms that rely on facial point detection. They analyze facial contour defects based on the locations of the detected points and then adjust different parts of the face accordingly. Consequently, errors in detecting certain facial points can leave the facial contour uneven, or even less smooth than before, so the user must adjust the contour manually to obtain a smooth result. In extreme cases in particular (such as when the face is occluded, or when the face is distorted as seen through glasses), adjusting the contour based on facial points can cause even more severe distortion in those areas, requiring further manual adjustment by the user.
[0005] Therefore, existing facial contour smoothing methods cannot meet users' needs for automated contour adjustment, and the adjustment effect is unsatisfactory.
Summary of the Invention
[0006] The main objective of this invention is to provide a CNN-based automatic facial contour smoothing method, system, and storage medium, aiming to solve the technical problem that existing facial contour smoothing methods cannot meet users' needs for automated contour adjustment and have poor adjustment effects.
[0007] To achieve the above objectives, this invention provides a CNN-based automatic facial contour smoothing method, comprising the following steps: acquiring a second face image and a facial contour mask of the image to be smoothed, and performing a multiplication operation on the two to calculate the region of interest (ROI) of the facial contour; inputting the ROI of the facial contour as an input image into a pre-trained facial contour smoothing generation network to generate a facial contour smoothing optical flow map; magnifying the facial contour smoothing optical flow map to obtain a magnified optical flow map; applying the magnified optical flow map to a first face image to obtain a pre-smoothed facial contour result; and performing an affine transformation on the smoothed facial contour result to restore it to the same size as the image to be smoothed, thereby obtaining the smoothed facial contour result.
[0008] Optionally, the second face image and facial contour mask of the image to be smoothed are obtained, specifically including the following steps: obtaining the image to be smoothed, performing face detection and face alignment processing on it to obtain the first face point set P of the image to be smoothed, calculating its bounding rectangle, and then expanding it outward to obtain the cropping rectangle of the face; performing affine transformation processing on the cropping rectangle of the face to obtain the affine transformation matrix of the cropping rectangle of the face, and cropping out the face image to obtain the first face image F. At the same time, transforming the first face point set P into the coordinates of the first face image F to obtain the second face point set FP; cropping out the face image based on the second face point set FP, and scaling it to a preset size to obtain the second face image fi and the third face point set fp; obtaining the facial contour mask fm based on the facial contour points in the third face point set fp.
[0009] Optionally, the training input image and target output image for the facial contour smoothing generation network training phase are obtained as follows: the original dataset is collected, and the second face image fi and the third face point set fp of the original dataset are obtained; the positions of some points in the third face point set fp are offset, and a third face image fi' with an uneven facial contour is obtained through a triangular mesh-based remapping method; the third face image fi' is used as the training input image for the facial contour smoothing generation network; based on the third face image fi' and the second face image fi, the target optical flow map $f_{target}$ is calculated using an optical flow algorithm and is then used as the target output image of the facial contour smoothing generation network.
[0010] Optionally, occlusion data augmentation can be added during the training phase of the facial contour smoothing generation network, specifically by adding occlusions to the training input image and the target output image.
[0011] Optionally, the facial contour smoothing optical flow map can be magnified, specifically by magnifying the facial contour smoothing optical flow map to its original resolution, which is consistent with the resolution of the image to be smoothed.
[0012] Optionally, the facial contour smoothing generation network adopts an encoder-decoder network structure. This network uses a feature map merging connection method to connect the feature maps of the encoder and the corresponding feature maps of the decoder to reuse the feature maps of the encoder and extract low-level feature information.
[0013] Optionally, the facial contour smoothing generation network adopts a learning method based on predicting optical flow maps. The network finally outputs a 3-channel optical flow map, where the third channel is assigned a value of 255 throughout, and the first and second channels represent the x-direction offset value and y-direction offset value of the pixel at this position, respectively.
[0014] Optionally, the total loss during the training phase of the facial contour smoothing generation network is $L_{Total} = \alpha L_1 + \beta L_{Perc}$;
[0015] where $\alpha$ and $\beta$ are the weights corresponding to the L1 loss and the $L_{Perc}$ loss, respectively. The L1 loss and the $L_{Perc}$ loss are calculated as follows:
[0016]
[0017] where $W$ is the width of the preset dimension and $H$ is the height of the preset dimension; $fm^{*}_{x,y}$ represents the weighting factor of the loss map at each location; $f_{target}$ is the target output image; and $G(fi_{input} * fm)_{x,y}$ represents the output image of the facial contour smoothing generation network.
[0018]
[0019] where $\phi_j$ represents the feature map output by the last convolutional layer of the $j$-th module in the VGG16 network.
[0020] Corresponding to the aforementioned CNN-based automatic facial contour smoothing method, this invention provides a CNN-based automatic facial contour smoothing system, comprising: a calculation module for acquiring a second face image and a facial contour mask of the image to be smoothed, and performing a multiplication operation on the two to calculate the region of interest (ROI) of the facial contour; a facial contour smoothing generation network module for acquiring the ROI of the facial contour as an input image and inputting it into a pre-trained facial contour smoothing generation network to generate a facial contour smoothing optical flow map; an optical flow map restoration module for magnifying the facial contour smoothing optical flow map to obtain a magnified optical flow map; a facial contour pre-smoothing module for applying the magnified optical flow map to a first face image to obtain a facial contour pre-smoothing result; and an affine transformation module for performing an affine transformation on the facial contour smoothing result to restore it to the same size as the image to be smoothed, thereby obtaining a facial contour smoothing result.
[0021] In addition, to achieve the above objectives, the present invention also provides a computer-readable storage medium storing a CNN-based automatic facial contour smoothing program, which, when executed by a processor, implements the steps of the CNN-based automatic facial contour smoothing method described above.
[0022] The beneficial effects of this invention are:
[0023] (1) Compared with existing technologies, traditional image algorithms need to predict facial defects by accurately identifying the location of facial points before performing beautification operations. This invention does not rely on the location of facial points. It only needs to input the calculated region of interest of the facial contour into the pre-trained facial contour smoothing generation network to intelligently and accurately smooth facial contours of different scenes and defects, thus meeting the user's needs for automated contour adjustment.
[0024] (2) The present invention adds occlusions to the training input image and the target output image to improve the robustness of the model. In some extreme cases, such as when the face is occluded or the face is deformed under the view of glasses, a natural and accurate facial contour smoothing effect can be achieved.
[0025] (3) By learning optical flow maps, this invention can directly apply low-resolution optical flow prediction results to high-resolution images, reducing the amount of computation while ensuring the clarity of the output facial contour smoothing results.
Attached Figure Description
[0026] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this invention, illustrate exemplary embodiments of the invention and are used to explain the invention, but do not constitute an undue limitation of the invention. In the drawings:
[0027] Figure 1 is a simplified flowchart of the CNN-based automatic facial contour smoothing method of the present invention.
Detailed Implementation
[0028] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0029] As shown in Figure 1, the present invention provides a CNN-based automatic facial contour smoothing method, which includes the following steps: acquiring a second face image and a facial contour mask of the image to be smoothed, and performing a multiplication operation on the two to calculate the region of interest (ROI) of the facial contour; inputting the ROI of the facial contour as an input image into a pre-trained facial contour smoothing generation network to generate a facial contour smoothing optical flow map; magnifying the facial contour smoothing optical flow map to obtain a magnified optical flow map; applying the magnified optical flow map to a first face image to obtain a facial contour pre-smoothing result; and performing an affine transformation on the facial contour smoothing result to restore it to the same size as the image to be smoothed, thereby obtaining the facial contour smoothing result.
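The first step, multiplying the second face image by the facial contour mask to obtain the ROI, can be sketched as follows. This is a minimal illustration only: the function name `contour_roi`, the array shapes, and the assumption that the mask holds values in [0, 1] are hypothetical and not taken from the patent.

```python
import numpy as np

def contour_roi(face_image: np.ndarray, contour_mask: np.ndarray) -> np.ndarray:
    """Multiply an H x W x 3 face image by an H x W mask (assumed to be in [0, 1])."""
    if face_image.shape[:2] != contour_mask.shape:
        raise ValueError("image and mask must share height and width")
    # Broadcast the single-channel mask across the colour channels.
    return face_image.astype(np.float32) * contour_mask[..., None].astype(np.float32)

fi = np.full((4, 4, 3), 200, dtype=np.uint8)  # stand-in for the second face image fi
fm = np.zeros((4, 4), dtype=np.float32)       # stand-in for the contour mask fm
fm[:, 0] = 1.0                                # pretend the contour runs down column 0
roi = contour_roi(fi, fm)
```

Pixels outside the mask become zero, so the network only sees the contour region.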
[0030] Compared to existing technologies, traditional image processing algorithms require precise prediction of facial defects based on the location of facial points before performing beautification operations. This invention does not rely on the location of facial points; it only requires inputting the calculated region of interest of the facial contour into a pre-trained facial contour smoothing generation network to intelligently and accurately smooth facial contours in different scenarios and with different defects, thus meeting users' needs for automated contour adjustment.
[0031] In this embodiment, obtaining the second face image and facial contour mask of the image to be smoothed specifically includes the following steps: obtaining the image to be smoothed, performing face detection and face alignment processing on it to obtain the first face point set P of the image to be smoothed, calculating its bounding rectangle, and then expanding it outward to obtain the cropping rectangle of the face; performing affine transformation processing on the cropping rectangle of the face to obtain the affine transformation matrix of the cropping rectangle of the face, and cropping out the face image to obtain the first face image F. At the same time, transforming the first face point set P into the coordinates of the first face image F to obtain the second face point set FP; cropping out the face image based on the second face point set FP and scaling it to a preset size to obtain the second face image fi and the third face point set fp; obtaining the facial contour mask fm based on the facial contour points in the third face point set fp.
[0032] In this embodiment, outward expansion means that the circumscribed rectangle can be expanded outward to a certain extent in all directions (top, bottom, left, and right) to avoid direct cropping that results in edges that are too close to the image.
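A minimal sketch of computing the bounding rectangle of the face point set and expanding it outward. The 25% margin and the clamping to the image borders are assumptions chosen for illustration; the patent only states that the rectangle is expanded "to a certain extent".

```python
import numpy as np

def expanded_crop_rect(points, image_w, image_h, margin=0.25):
    """Return (left, top, right, bottom) of the bounding box grown by `margin` per side."""
    pts = np.asarray(points, dtype=np.float32)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    pad_x = (x1 - x0) * margin  # horizontal padding (hypothetical ratio)
    pad_y = (y1 - y0) * margin  # vertical padding (hypothetical ratio)
    # Clamp to the image so the crop never leaves the picture (an assumption).
    left = max(0.0, float(x0 - pad_x))
    top = max(0.0, float(y0 - pad_y))
    right = min(float(image_w), float(x1 + pad_x))
    bottom = min(float(image_h), float(y1 + pad_y))
    return left, top, right, bottom

P = [(100, 120), (200, 120), (150, 260)]  # toy stand-in for the first face point set P
rect = expanded_crop_rect(P, image_w=640, image_h=480, margin=0.25)
```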
[0033] In this embodiment, the magnified optical flow map is directly applied to the unscaled first face image, ensuring image clarity.
[0034] In this embodiment, the affine transformation of the result is based on the affine transformation matrix of the face cropping rectangle: the inverse of that matrix is calculated and applied to the facial contour pre-smoothing result, restoring it to the same size as the image to be smoothed and thus obtaining the facial contour smoothing result.
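The relationship between the cropping transform and its inverse can be sketched on point coordinates. As a simplifying assumption, the affine matrix here is a pure scale plus translation that maps the cropping rectangle onto an output of a chosen size; the helper names `crop_affine` and `transform_points` are hypothetical.

```python
import numpy as np

def crop_affine(rect, out_w, out_h):
    """3 x 3 matrix mapping the crop rectangle onto an out_w x out_h image."""
    left, top, right, bottom = rect
    sx = out_w / (right - left)
    sy = out_h / (bottom - top)
    return np.array([[sx, 0.0, -left * sx],
                     [0.0, sy, -top * sy],
                     [0.0, 0.0, 1.0]])

def transform_points(matrix, points):
    """Apply a 3 x 3 affine matrix to an N x 2 array of (x, y) points."""
    pts = np.asarray(points, dtype=np.float64)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    return (homogeneous @ matrix.T)[:, :2]

M = crop_affine((75.0, 85.0, 225.0, 295.0), out_w=300, out_h=420)
P = np.array([[100.0, 120.0], [200.0, 120.0], [150.0, 260.0]])
FP = transform_points(M, P)                      # points in face-crop coordinates
P_back = transform_points(np.linalg.inv(M), FP)  # inverse restores the original frame
```

The same inverse matrix would be used to map the warped crop back into the frame of the image to be smoothed.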
[0035] In this embodiment, the training input image and target output image for the facial contour smoothing generation network are obtained as follows: the original dataset is collected, and the second face image fi and the third face point set fp of the original dataset are obtained; the positions of some points in the third face point set fp are offset, and a third face image fi' with an uneven facial contour is obtained through a triangular mesh-based remapping method; the third face image fi' is used as the training input image for the facial contour smoothing generation network; based on the third face image fi' and the second face image fi, the target optical flow map $f_{target}$ is calculated using an optical flow algorithm and is then used as the target output image of the facial contour smoothing generation network.
[0036] The positions of certain points in the third face point set fp are offset, preferably by widening the mandible and making the cheekbone more prominent.
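The point-offset step might look like the sketch below, which pushes selected contour points away from the centroid of the point set to imitate a wider jaw or more prominent cheekbones. The function name, the centroid-based direction, and the `strength` value are all assumptions; the triangular mesh remapping that produces fi' from the shifted points is not shown.

```python
import numpy as np

def offset_contour_points(points, indices, strength=0.08):
    """Move the selected points outward from the centroid by a fraction of their distance."""
    pts = np.asarray(points, dtype=np.float32).copy()
    centre = pts.mean(axis=0)
    # Only the chosen (unique) indices are displaced; all other points stay fixed.
    pts[indices] += (pts[indices] - centre) * strength
    return pts

fp = np.array([[10, 50], [90, 50], [50, 10], [50, 90]], dtype=np.float32)
fp_shifted = offset_contour_points(fp, indices=[0, 1], strength=0.1)
```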
[0037] In this embodiment, occlusion data augmentation is added during the training phase of the facial contour smoothing generation network. Specifically, occlusions are added to the training input image and the target output image. The preferred method for adding occlusions is to overlay glasses textures on the face, which avoids environmental interference.
[0038] This invention adds occlusions to the training input image and the target output image to improve the robustness of the model. In some extreme cases, such as when the face is occluded or when there is facial distortion under the perspective of glasses, a natural and accurate facial contour smoothing effect can still be achieved.
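A minimal sketch of occlusion augmentation applied to a training pair. The patent does not specify how the occluder is rendered or what value the target takes under it, so the rectangular box and the fill values below are purely illustrative assumptions.

```python
import numpy as np

def add_occlusion(train_input, target_output, box, input_fill=0.0, target_fill=0.0):
    """Overwrite the same rectangle (top, left, height, width) in both arrays."""
    top, left, height, width = box
    occluded_input = train_input.copy()
    occluded_target = target_output.copy()
    occluded_input[top:top + height, left:left + width] = input_fill
    occluded_target[top:top + height, left:left + width] = target_fill
    return occluded_input, occluded_target

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3)).astype(np.float32) + 1.0  # toy training input
flow = rng.random((8, 8, 2)).astype(np.float32) + 1.0   # toy target flow map
occ_image, occ_flow = add_occlusion(image, flow, box=(2, 3, 2, 4))
```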
[0039] In this embodiment, the facial contour smoothing optical flow map is magnified, specifically by magnifying the facial contour smoothing optical flow map to its original resolution, which is consistent with the resolution of the image to be smoothed.
[0040] In this embodiment, the facial contour smoothing generation network adopts an encoder-decoder network structure. This network uses a feature map merging connection method to connect the feature maps of the encoder and the corresponding size feature maps of the decoder, so as to reuse the feature maps of the encoder and extract the low-level feature information.
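The feature map merging connection can be illustrated at the level of tensor shapes. This sketch omits all convolutions and learned weights and only shows how encoder feature maps are concatenated with decoder feature maps of matching size; the pooling and upsampling operators are assumptions.

```python
import numpy as np

def downsample(x):
    """2 x 2 average pooling on a C x H x W array (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(x):
    """Nearest-neighbour 2x upsampling on a C x H x W array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def merge(decoder_feat, encoder_feat):
    """Concatenate along the channel axis so encoder features are reused."""
    return np.concatenate([decoder_feat, encoder_feat], axis=0)

enc1 = np.ones((3, 16, 16), dtype=np.float32)  # encoder level 1
enc2 = downsample(enc1)                        # encoder level 2, 3 x 8 x 8
bottleneck = downsample(enc2)                  # 3 x 4 x 4
dec2 = merge(upsample(bottleneck), enc2)       # 6 x 8 x 8
dec1 = merge(upsample(dec2), enc1)             # 9 x 16 x 16
```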
[0041] In this embodiment, the facial contour smoothing generation network adopts a learning method based on predicting optical flow maps. The network finally outputs a 3-channel optical flow map, in which the third channel is assigned a value of 255, and the first and second channels represent the x-direction offset value and y-direction offset value of the pixel at this position, respectively.
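The 3-channel layout described here can be sketched as a pair of helpers. The names `pack_flow` and `unpack_flow` are hypothetical, and storing the offsets as raw floating-point values is an assumption; the patent does not state how offsets are scaled or quantized.

```python
import numpy as np

def pack_flow(dx, dy):
    """Stack x offsets, y offsets, and a constant 255 channel into H x W x 3."""
    filler = np.full_like(dx, 255.0)
    return np.stack([dx, dy, filler], axis=-1)

def unpack_flow(flow_map):
    """Recover the x and y offset planes; the third channel carries no motion."""
    return flow_map[..., 0], flow_map[..., 1]

dx = np.array([[1.0, -2.0], [0.0, 0.5]], dtype=np.float32)
dy = np.array([[0.0, 3.0], [-1.0, 0.0]], dtype=np.float32)
flow_map = pack_flow(dx, dy)
dx_out, dy_out = unpack_flow(flow_map)
```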
[0042] This invention uses optical flow maps to directly apply low-resolution optical flow prediction results to high-resolution images, reducing computational load while ensuring the clarity of the output facial contour smoothing results.
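Applying a low-resolution flow prediction to a higher-resolution image can be sketched as below. Several choices are assumptions not stated in the patent: offsets are treated as pixel units and therefore rescaled by the magnification ratio, the warp is a backward lookup, and sampling is nearest-neighbour for brevity.

```python
import numpy as np

def resize_bilinear(arr, out_h, out_w):
    """Bilinearly resize an H x W x C array (corner-aligned sampling)."""
    in_h, in_w = arr.shape[:2]
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    top = arr[y0][:, x0] * (1 - wx) + arr[y0][:, x1] * wx
    bottom = arr[y1][:, x0] * (1 - wx) + arr[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def upscale_flow(flow, out_h, out_w):
    """Magnify an H x W x 2 flow map and rescale its pixel offsets (assumption)."""
    in_h, in_w = flow.shape[:2]
    big = resize_bilinear(flow.astype(np.float64), out_h, out_w)
    big[..., 0] *= out_w / in_w
    big[..., 1] *= out_h / in_h
    return big

def warp(image, flow):
    """Backward warp: output[y, x] = image[y + dy, x + dx], clamped at the borders."""
    h, w = image.shape[:2]
    grid_y, grid_x = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(grid_x + flow[..., 0]), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(grid_y + flow[..., 1]), 0, h - 1).astype(int)
    return image[src_y, src_x]

F = np.tile(np.arange(4), (4, 1))[..., None]  # 4 x 4 x 1 image, value = column index
low_flow = np.zeros((2, 2, 2))
low_flow[..., 0] = 0.5                        # half a pixel to the right at low resolution
result = warp(F, upscale_flow(low_flow, 4, 4))
```

Because only the flow is resized, the pixels of the full-resolution image are never passed through the low-resolution network, which is why the output can stay sharp.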
[0043] In this embodiment, the total loss during the training phase of the facial contour smoothing generation network is $L_{Total} = \alpha L_1 + \beta L_{Perc}$;
[0044] where $\alpha$ and $\beta$ are the weights corresponding to the L1 loss and the $L_{Perc}$ loss, respectively.
[0045] In this embodiment, to ensure focused supervision of the temples, zygomatic arch, and mandibular angle regions, region-dependent weights are added to the supervision. Specifically, using facial contour points from the third face point set fp, a facial contour mask fm is obtained. Based on the location of the third face point set fp, the weights of the contour areas near the temples, zygomatic arch, mandibular angle, and chin are set to 1.0, while the weights of other contour areas are set to 0.5. This weighted facial contour mask is denoted as fm*. The calculation formula for L1 is as follows:
[0046]
[0047] where $W$ is the width of the preset dimension and $H$ is the height of the preset dimension; $fm^{*}_{x,y}$ represents the weighting factor of the loss map at each location; $f_{target}$ is the target output image; $G$ represents the facial contour smoothing generation network; and $G(fi_{input} * fm)_{x,y}$ represents the output image of the facial contour smoothing generation network. In this embodiment, $L_{Perc} = L_{Perc}^{j}$, and the specific calculation formula is as follows:
[0048]
[0049] where $\phi_j$ represents the feature map output by the last convolutional layer of the $j$-th module in the VGG16 network. The VGG16 network is commonly used for calculating perceptual loss; the feature maps it outputs are used in the loss calculation.
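One reading of the two loss terms that is consistent with the symbol definitions in this section is written out below as a hedged reconstruction. The $1/(WH)$ normalization, the squared L2 norm in the perceptual term, and the feature map dimensions $C_j$, $H_j$, $W_j$ are assumptions borrowed from common practice, not values confirmed by the text.

```latex
L_{Total} = \alpha L_1 + \beta L_{Perc}

L_1 = \frac{1}{W H} \sum_{x=1}^{W} \sum_{y=1}^{H}
      fm^{*}_{x,y} \, \left| f_{target,\,x,y} - G(fi_{input} * fm)_{x,y} \right|

L_{Perc} = \frac{1}{C_j H_j W_j}
           \left\| \phi_j(f_{target}) - \phi_j\big(G(fi_{input} * fm)\big) \right\|_2^2
```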
[0050] Preferably, the present invention uses the Adam optimizer, with an initial learning rate of 0.0002 and 200K iterations. The weights are set to α = 1 and β = 0.5, values determined through actual training.
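A minimal numerical sketch of the weighted L1 term and the total loss with α = 1 and β = 0.5. Averaging over all pixels is an assumption, and the perceptual term is passed in as a plain number because computing it would require a VGG16 feature extractor that is outside this sketch.

```python
import numpy as np

def weighted_l1(pred, target, weight_mask):
    """Mean absolute error with a per-pixel weight (1.0 in key regions, 0.5 elsewhere)."""
    diff = np.abs(target - pred)
    if diff.ndim == 3:
        weight_mask = weight_mask[..., None]  # broadcast weights over channels
    return float((weight_mask * diff).mean())

def total_loss(l1, perc, alpha=1.0, beta=0.5):
    """Combine the two terms as alpha * L1 + beta * L_Perc."""
    return alpha * l1 + beta * perc

target = np.ones((2, 2), dtype=np.float32)
pred = np.zeros((2, 2), dtype=np.float32)
fm_star = np.array([[1.0, 0.5], [0.5, 1.0]], dtype=np.float32)  # toy weighted mask fm*
l1_value = weighted_l1(pred, target, fm_star)
loss_value = total_loss(l1_value, perc=0.2)  # 0.2 is a placeholder perceptual loss
```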
[0051] This invention also provides a CNN-based automatic facial contour smoothing system, comprising: a calculation module for acquiring a second face image and a facial contour mask of the image to be smoothed, and performing a multiplication operation on the two to calculate a region of interest (ROI) for the facial contour; a facial contour smoothing generation network module for acquiring the ROI as an input image and inputting it into a pre-trained facial contour smoothing generation network to generate a facial contour smoothing optical flow map; an optical flow map restoration module for magnifying the facial contour smoothing optical flow map to obtain a magnified optical flow map; a facial contour pre-smoothing module for applying the magnified optical flow map to a first face image to obtain a facial contour pre-smoothing result; and an affine transformation module for performing an affine transformation on the facial contour smoothing result to restore it to the same size as the image to be smoothed, thereby obtaining a facial contour smoothing result.
[0052] This invention also provides a computer-readable storage medium storing at least one instruction, which is loaded and executed by a processor to implement the CNN-based automatic facial contour smoothing method illustrated in Figure 1. The computer-readable storage medium may be a read-only memory, a magnetic disk, an optical disc, etc.
[0053] It should be noted that the various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. For the device embodiments, equipment embodiments, and storage medium embodiments, since they are basically similar to the method embodiments, the descriptions are relatively simple, and relevant parts can be referred to the descriptions of the method embodiments.
[0054] Furthermore, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0055] The foregoing description illustrates and describes preferred embodiments of the present invention. It should be understood that the present invention is not limited to the forms disclosed herein and should not be construed as excluding other embodiments. It can be used in various other combinations, modifications, and environments, and can be altered within the scope of the inventive concept by means of the foregoing teachings or techniques or knowledge in related fields. Any modifications and variations made by those skilled in the art that do not depart from the spirit and scope of the present invention should be within the protection scope of the appended claims.
Claims
1. A CNN-based automatic facial contour smoothing method, characterized in that it includes the following steps:
obtaining the second face image and facial contour mask of the image to be smoothed, and performing a multiplication operation on the two to calculate the region of interest of the facial contour;
using the region of interest of the facial contour as the input image and feeding it into a pre-trained facial contour smoothing generation network to generate a facial contour smoothing optical flow map;
magnifying the facial contour smoothing optical flow map to obtain the magnified optical flow map;
applying the magnified optical flow map to the first face image to obtain the pre-smoothed facial contour result;
performing affine transformation on the smoothed facial contour result to restore it to the same size as the image to be smoothed, thus obtaining the smoothed facial contour result;
wherein obtaining the second face image and facial contour mask of the image to be smoothed involves the following steps:
obtaining the image to be smoothed, and performing face detection and face alignment on it to obtain the first face point set P of the image to be smoothed; calculating the bounding rectangle and then expanding it outward to obtain the cropping rectangle of the face;
performing affine transformation on the cropping rectangle of the face to obtain the affine transformation matrix of the cropping rectangle of the face, and cropping the face image to obtain the first face image F; at the same time, transforming the first face point set P into the coordinates of the first face image F to obtain the second face point set FP;
cropping the face image based on the second face point set FP and scaling it to a preset size to obtain the second face image fi and the third face point set fp;
obtaining the facial contour mask fm based on the facial contour points in the third face point set fp;
wherein the training input image and target output image for the facial contour smoothing generation network are obtained as follows:
collecting the original dataset and obtaining the second face image fi and the third face point set fp from the original dataset;
offsetting the positions of certain points in the third face point set fp, and obtaining a third face image fi' with an uneven facial contour by a remapping method based on triangular mesh;
using the third face image fi' as the training input image for the facial contour smoothing generation network;
based on the third face image fi' and the second face image fi, calculating the target optical flow map $f_{target}$ using an optical flow algorithm, and using it as the target output image of the facial contour smoothing generation network;
wherein the facial contour smoothing generation network adopts an encoder-decoder network structure; this network uses a feature map merging connection method to connect the feature maps of the encoder and the corresponding feature maps of the decoder, so as to reuse the feature maps of the encoder and extract the low-level feature information.
2. The CNN-based automatic facial contour smoothing method according to claim 1, characterized in that: occlusion data augmentation is added during the training phase of the facial contour smoothing generation network, specifically by adding occlusions to the training input image and the target output image.
3. The CNN-based automatic facial contour smoothing method according to claim 1, characterized in that: The facial contour smoothing optical flow map is magnified, specifically by magnifying the facial contour smoothing optical flow map to its original resolution, which is consistent with the resolution of the image to be smoothed.
4. The CNN-based automatic facial contour smoothing method according to claim 1, characterized in that: the facial contour smoothing generation network adopts a learning method based on predicting optical flow maps. The network finally outputs a 3-channel optical flow map, where the third channel is assigned a value of 255 throughout. The first channel represents the x-direction offset value of the pixel at this position, and the second channel represents the y-direction offset value of the pixel at this position.
5. The CNN-based automatic facial contour smoothing method according to claim 1, characterized in that: the total loss during the training phase of the facial contour smoothing generation network is $L_{Total} = \alpha L_1 + \beta L_{Perc}$; where $\alpha$ is the weight corresponding to the L1 loss and $\beta$ is the weight corresponding to the $L_{Perc}$ loss; the L1 loss and the $L_{Perc}$ loss are calculated as follows; where $W$ is the width of the preset dimension and $H$ is the height of the preset dimension; $fm^{*}_{x,y}$ represents the weighting factor of the loss map at each location; $f_{target}$ is the target output image; $G(fi_{input} * fm)_{x,y}$ represents the output image of the facial contour smoothing generation network; and $\phi_j$ represents the feature map output by the last convolutional layer of the $j$-th module in the VGG16 network.
6. A CNN-based automatic facial contour smoothing system, using the CNN-based automatic facial contour smoothing method according to any one of claims 1-5, characterized in that it includes:
a calculation module, used to acquire the second face image and the facial contour mask of the image to be smoothed, and to perform a multiplication operation on the two to calculate the region of interest of the facial contour;
a facial contour smoothing generation network module, used to obtain the region of interest of the facial contour as the input image and input it into the pre-trained facial contour smoothing generation network to generate a facial contour smoothing optical flow map;
an optical flow map restoration module, used to magnify the facial contour smoothing optical flow map to obtain a magnified optical flow map;
a facial contour pre-smoothing module, used to apply the magnified optical flow map to the first face image to obtain the facial contour pre-smoothing result;
an affine transformation module, used to perform affine transformation processing on the facial contour smoothing result, restoring it to the same size as the image to be smoothed, thus obtaining the facial contour smoothing result.
7. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a CNN-based automatic facial contour smoothing program, which, when executed by a processor, implements the steps of the CNN-based automatic facial contour smoothing method as described in any one of claims 1 to 5.