Face replacement method, face replacement device, electronic equipment and storage medium

By segmenting and transforming the driving image and the specific image, the facial pose in the generated target image is naturally matched, which solves the problem of poor face replacement effect in the existing technology and achieves a natural effect that is difficult to distinguish from the real one.

CN115713458B · Active · Publication Date: 2026-06-19 · CHINA MOBILE HANGZHOU INFORMATION TECH CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHINA MOBILE HANGZHOU INFORMATION TECH CO LTD
Filing Date
2021-08-20
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In the existing technology, the visual effect of the image after face replacement is poor: the result does not look natural and is easy to tell apart from a real image.

Method used

The driving image and the specific image are segmented by a preset image segmentation model to obtain the mask image and pose transformation parameters. The second face is transformed based on these parameters, and the target image is generated by combining the mask image to ensure that the face pose is consistent with the original driving image.

Benefits of technology

The generated target image has a natural, realistic facial pose that is difficult to distinguish from a real image, and the method is fast enough to meet real-time requirements.



Abstract

This application discloses a face replacement method, face replacement device, electronic device, and storage medium. The method includes: performing image segmentation on an acquired driving image using a preset image segmentation model to obtain a first mask image; performing image segmentation on an acquired specific image using the preset image segmentation model to obtain a second mask image; acquiring pose transformation parameters of a first face in the driving image and a second face in the specific image; performing transformation processing on the second face based on the pose transformation parameters to obtain a transformed second face, wherein the pose of the face in the transformed second face is the same as the pose of the face in the first face; and generating a target image with face replacement corresponding to the driving image based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image.

Description

Technical Field

[0001] This application relates to the field of information processing, and in particular to a face replacement method, face replacement device, electronic device, and storage medium.

Background Technology

[0002] With the continuous development of internet technology and the increasing maturity of image processing and artificial intelligence technologies, there is a growing number of entertainment and social applications. People can now use these applications to swap faces, replacing the face of a specified person with the face of another person, thus achieving the purpose of face-swapping for entertainment.

[0003] In related technologies, face replacement involves using simple face detection and region swapping to directly replace the face of a specified object with the face of another specific object using image cutout, thus achieving the purpose of face swapping. However, this method suffers from poor visual quality after face swapping.

Summary of the Invention

[0004] This application provides a face replacement method, face replacement device, electronic device, and storage medium to solve the problem of poor visual effect of the image after face replacement in related technologies.

[0005] The technical solution of this application is implemented as follows:

[0006] This application provides a face replacement method, the method comprising:

[0007] The acquired driving image is segmented using a preset image segmentation model to obtain the first mask image;

[0008] The acquired specific image is segmented using the preset image segmentation model to obtain a second mask image;

[0009] Obtain the pose transformation parameters of the first face in the driving image and the second face in the specific image;

[0010] Based on the pose transformation parameters, the second face is transformed to obtain the transformed second face, wherein the pose of the face in the transformed second face is the same as the pose of the face in the first face;

[0011] Based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image, a target image with face replacement corresponding to the driving image is generated.

[0012] This application provides a face replacement device, the device comprising:

[0013] The first processing module is used to perform image segmentation on the acquired driving image using a preset image segmentation model to obtain the first mask image;

[0014] The first processing module is further configured to perform image segmentation on the acquired specific image using the preset image segmentation model to obtain a second mask image;

[0015] The acquisition module is used to acquire the pose transformation parameters of the first face in the driving image and the second face in the specific image;

[0016] The second processing module is used to transform the second face based on the pose transformation parameters to obtain the transformed second face, wherein the face pose in the transformed second face is the same as the face pose in the first face.

[0017] The second processing module is further configured to generate a target image with face replacement corresponding to the driving image based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image.

[0018] This application provides an electronic device, the electronic device comprising: a memory for storing executable instructions; and a processor for executing the executable instructions stored in the memory to implement the face replacement method described above.

[0019] This application provides a computer storage medium storing one or more programs, which can be executed by one or more processors to implement the face replacement method described above.

[0020] This application provides a face replacement method, face replacement device, electronic device, and storage medium. The method involves segmenting an acquired driving image using a preset image segmentation model to obtain a first mask image; segmenting an acquired specific image using the same preset image segmentation model to obtain a second mask image; acquiring pose transformation parameters for a first face in the driving image and a second face in the specific image; transforming the second face based on these pose transformation parameters to obtain a transformed second face, wherein the pose of the transformed second face is identical to that of the first face; and generating a target image with face replacement corresponding to the driving image based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image. Thus, by transforming the second face based on the pose transformation parameters between the first face in the driving image and the second face in the specific image, the face pose in the generated target image closely matches the pose in the original driving image, so that the target image looks natural, is difficult to distinguish from a real image, and yields a good face replacement effect.

Attached Figure Description

[0021] Figure 1 is a flowchart illustrating an optional face replacement method provided in an embodiment of this application;

[0022] Figure 2 is a schematic diagram of the model structure of an optional face replacement method provided in an embodiment of this application;

[0023] Figure 3 is a flowchart illustrating an optional face replacement method provided in an embodiment of this application;

[0024] Figure 4 is a flowchart illustrating an optional face replacement method provided in an embodiment of this application;

[0025] Figure 5 is a flowchart illustrating an optional face replacement method provided in an embodiment of this application;

[0026] Figure 6 is a flowchart illustrating an optional face replacement method provided in an embodiment of this application;

[0027] Figure 7 is a schematic diagram of the structure of a face replacement device provided in an embodiment of this application;

[0028] Figure 8 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0029] The technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Here, "another" or "yet another" mentioned in the description of the drawings does not refer to a specific embodiment. The various embodiments of this application can be combined with each other without conflict.

[0030] It should be understood that the phrases "embodiments of this application" or "the foregoing embodiments" throughout the specification mean that a specific feature, structure, or characteristic related to an embodiment is included in at least one embodiment of this application. Therefore, "embodiments of this application" or "in the foregoing embodiments" appearing throughout the specification do not necessarily refer to the same embodiment. Furthermore, these specific features, structures, or characteristics can be combined in any suitable manner in one or more embodiments. In the various embodiments of this application, the sequence numbers of the above-described processes do not imply a sequential order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application. The sequence numbers of the above-described embodiments are merely descriptive and do not represent the superiority or inferiority of the embodiments.

[0031] Figure 1 is a flowchart illustrating an optional face replacement method provided in an embodiment of this application. The face replacement method is applied to an electronic device and includes the following steps:

[0032] Step 101: Perform image segmentation on the acquired driving image using a preset image segmentation model to obtain the first mask image.

[0033] Step 102: Perform image segmentation on the acquired specific image using a preset image segmentation model to obtain the second mask image.

[0034] In this embodiment of the application, the driving image is an image in which a predetermined region needs to be processed. Here, the predetermined region includes at least a background region, a clothing region, a skin region, a face region, and a hair region.

[0035] In this embodiment of the application, the specific image is an image in which the head region needs to be processed. The predetermined region includes at least the background region, clothing region, skin region, face region, and hair region, and the head region in the specific image is used to replace the head region in the driving image.

[0036] In this embodiment, the mask image can be understood as an image composed of the outer contours of each region obtained by segmenting a predetermined region within the input image. Here, the input image includes a driving image and a specific image.

[0037] In this embodiment, a preset image segmentation model is used to segment images within predetermined regions of an input image. Here, the predetermined regions include background regions, clothing regions, skin regions, face regions, and hair regions in the image. In one feasible application scenario, the preset image segmentation model first identifies each predetermined region in the input image, and then segments each predetermined region based on the identification results to obtain a mask image corresponding to the input image. For example, the electronic device segments the identified background regions, clothing regions, skin regions, face regions, and hair regions in the input image and outputs a mask image corresponding to the input image.
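As a rough illustration of how such a mask image might be handled in code, the sketch below assumes that the segmentation model returns an integer label map with one label per predetermined region. The label ids and the `region_mask` helper are hypothetical and are not specified in this application.

```python
import numpy as np

# Hypothetical label ids for the five predetermined regions (assumed for illustration).
LABELS = {"background": 0, "clothing": 1, "skin": 2, "face": 3, "hair": 4}

def region_mask(label_map, *names):
    """Return a boolean mask that is True where the label map matches any named region."""
    ids = [LABELS[name] for name in names]
    return np.isin(label_map, ids)

# Toy 4x4 label map standing in for the output of a segmentation model.
label_map = np.array([
    [0, 4, 4, 0],
    [0, 3, 3, 0],
    [2, 3, 3, 2],
    [1, 1, 1, 1],
])

face = region_mask(label_map, "face")
head = region_mask(label_map, "face", "hair")  # head region taken as face plus hair
```

Keeping one label per region makes it straightforward to select the face alone or the whole head (face plus hair) from the same segmentation output.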

[0038] Here, the preset image segmentation model includes, but is not limited to, the U-net network model and the fully convolutional network model. For example, in this embodiment, the preset image segmentation model is the U-net network model. The U-net network model consists of convolutional layers and upsampling layers, which is equivalent to an encoder and a decoder. It extracts features through convolution and concatenates and fuses the features obtained from each convolutional layer with the features of the corresponding upsampling layer to achieve a better segmentation result.

[0039] In this embodiment, the electronic device acquires a driving image and a specific image, performs image segmentation on the driving image using a preset image segmentation model to obtain a first mask image corresponding to the driving image, and performs image segmentation on the specific image using the preset image segmentation model to obtain a second mask image corresponding to the specific image. In this way, by performing image segmentation on the driving image and the specific image, the final face-swapping result is made more refined, while maintaining the overall face shape.

[0040] In practical applications, electronic devices may include, but are not limited to, mobile terminal devices such as smartphones, tablets, laptops, smart TVs, personal digital assistants (PDAs), cameras, and wearable devices, as well as fixed terminal devices such as desktop computers.

[0041] Step 103: Obtain the pose transformation parameters of the first face in the driving image and the second face in the specific image.

[0042] In this embodiment, the pose transformation parameter is the change in the relative position parameters between the same feature points of the first face in the driving image and the second face in the specific image.

[0043] Here, the pose transformation parameters between the first and second faces can be implemented through a face pose estimation module. Face pose estimation involves analyzing face images to obtain the angular information of the face's orientation. Pose estimation is a crucial step in multi-pose problems. It can generally be represented using rotation matrices, rotation vectors, quaternions, or Euler angles. Face pose changes typically include pitch, yaw, and in-plane angular rotation. In a feasible scenario, the electronic device obtains the rotation vector, i.e., the pose transformation parameters, using OpenCV's `solvePnP` function.
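The sketch below illustrates one way two rotation vectors could be combined into a relative pose. It assumes that a rotation vector (axis times angle, the form returned by `solvePnP`) is already available for each face; how the application combines the two poses is not stated, so the `relative_rotation` helper and the example angles are assumptions for illustration only.

```python
import numpy as np

def rotvec_to_matrix(rvec):
    """Convert a rotation vector (axis * angle, in radians) to a 3x3 rotation matrix."""
    rvec = np.asarray(rvec, dtype=float)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    # Skew-symmetric cross-product matrix of the unit axis.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    # Rodrigues' rotation formula.
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def relative_rotation(rvec_first, rvec_second):
    """Rotation that takes the second face's orientation to the first face's orientation."""
    R1 = rotvec_to_matrix(rvec_first)
    R2 = rotvec_to_matrix(rvec_second)
    return R1 @ R2.T

# Example: first face yawed 30 degrees, second face yawed 10 degrees (about the y axis).
rvec_first = np.array([0.0, np.deg2rad(30.0), 0.0])
rvec_second = np.array([0.0, np.deg2rad(10.0), 0.0])
R_rel = relative_rotation(rvec_first, rvec_second)
yaw_rel_deg = np.rad2deg(np.arctan2(R_rel[0, 2], R_rel[0, 0]))
```

In this example the relative rotation is a 20 degree yaw, which is the amount the second face would need to turn to match the first.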

[0044] Step 104: Based on the pose transformation parameters, transform the second face to obtain the transformed second face.

[0045] The facial pose in the transformed second face is the same as that in the first face.

[0046] In this embodiment, after the electronic device obtains the pose transformation parameters of the first face in the driving image and the second face in the specific image, it performs transformation processing on the second face based on the pose transformation parameters to obtain the transformed second face with the same pose as the face in the first face.

[0047] Step 105: Based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image, generate the target image with the replaced face corresponding to the driving image.

[0048] In this embodiment of the application, the target image can be the image after replacing the face corresponding to the driving image, or the target image can be the image after replacing the face and head corresponding to the driving image.

[0049] In this embodiment of the application, the electronic device transforms the second face based on the pose transformation parameters to obtain the transformed second face. Then, the pose transformation parameters, the transformed second face, the first mask image, and the second mask image are input into the generation model to obtain the target image with face replacement corresponding to the driving image output by the generation model.

[0050] Here, the generative model includes, but is not limited to, the U-net network model and the fully convolutional network model. For example, in this embodiment, the generative model uses the U-net network model, which includes an encoder and a decoder module. It can concatenate and fuse the features obtained from the convolutional layers into the deconvolutional layers, making the image generated by the decoder more natural and realistic. Here, the electronic device inputs a first mask image, ensuring that the regions in the driving image other than the first face are not deformed, maintaining their original state; and replaces the first face with a transformed second face, thereby obtaining the target image with the replaced face corresponding to the driving image output by the generative model.

[0051] In other embodiments of this application, as shown in Figure 2, which is a schematic diagram of the model structure of an optional face replacement method provided in an embodiment of this application, the generation model is adjusted using a loss function during training. Here, the loss function is...

[0052]

[0053] Here, let F be the target image, and N be the original image corresponding to the target image. i (.) represents the feature of the nth channel extracted from the base layer fixed after convolution by Visual Geometry Group-19 (VGG-19), where N is the number of features in that layer. Here, the base layer can be understood as the base layer corresponding to feature maps of the target image and the original image at different scales, such as 256×256, 128×128, 64×64, and 32×32. The number of features in each layer is determined by the dimension of the output of the convolutional layer. For example, if the output is 256×256×64, then 64 is the number of channels, which is determined by the number of convolution kernels in the current layer. Here, the electronic device calculates the loss value of each base layer and uses the loss values to improve the performance of the model, making the details of the generated image more refined, avoiding blurring, improving the blending of the swapped face, and minimizing any sense of incongruity.
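For orientation, a commonly used feature-matching (perceptual) loss over several VGG-19 layers has the general form below. This is only a sketch: the symbols $\hat{y}$ (generated target image), $y$ (original image), $\phi_i$ (features taken from the i-th selected VGG-19 layer), $N_i$ (number of features in that layer), and $L$ (number of selected layers) are chosen here for illustration and are assumptions; the loss function of this application may differ.

```latex
\mathcal{L}_{\text{perc}}(\hat{y}, y) \;=\; \sum_{i=1}^{L} \frac{1}{N_i} \,\bigl\lVert \phi_i(\hat{y}) - \phi_i(y) \bigr\rVert_1
```

Each term compares the two images at one feature scale, so summing over layers penalizes differences in both coarse structure and fine detail.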

[0054] This application provides a face replacement method. It involves segmenting an acquired driving image using a preset image segmentation model to obtain a first mask image; segmenting an acquired specific image using the same preset image segmentation model to obtain a second mask image; acquiring pose transformation parameters for a first face in the driving image and a second face in the specific image; transforming the second face based on these pose transformation parameters to obtain a transformed second face, wherein the pose of the transformed second face is identical to that of the first face; and generating a target image with face replacement corresponding to the driving image based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image. Thus, by transforming the second face based on the pose transformation parameters between the first face in the driving image and the second face in the specific image, the face pose in the generated target image closely matches the pose in the original driving image, so that the target image looks natural, is difficult to distinguish from a real image, and yields a good face replacement effect.

[0055] Figure 3 is a flowchart illustrating an optional face replacement method provided in an embodiment of this application. The face replacement method is applied to an electronic device and includes the following steps:

[0056] Step 201: Perform image segmentation on the acquired driving image using a preset image segmentation model to obtain the first mask image.

[0057] Step 202: Perform image segmentation on the acquired specific image using a preset image segmentation model to obtain the second mask image.

[0058] Step 203: Using a preset facial feature point model, determine the first position parameter of the i-th facial feature point among multiple facial feature points in the first face, and the second position parameter of the i-th facial feature point among multiple facial feature points in the second face.

[0059] In this embodiment, multiple facial feature points include, but are not limited to, feature points of the face contour, feature points of the eyebrow contour, feature points of the nose contour, feature points of the eye contour, and feature points of the mouth contour.

[0060] In this embodiment of the application, the position parameter of a facial feature point represents the position information of the i-th facial feature point among the multiple facial feature points, where i is a positive integer greater than or equal to 1 and less than or equal to I, and I is the total number of facial feature points.

[0061] In this embodiment, the preset facial feature point model can be understood as a model that determines multiple feature points from the facial contours, eyebrow contours, nose contours, and mouth contours of a facial image. During face recognition and detection, the Active Shape Model (ASM), the Active Appearance Model (AAM), or DLIB methods can be used to extract multiple facial feature points from the first face and multiple facial feature points from the second face based on the preset facial feature point model. DLIB is a C++ library for machine learning that contains many commonly used machine learning algorithms.

[0062] In this embodiment of the application, the electronic device determines multiple facial feature points in the first face and multiple facial feature points in the second face by using a preset facial feature point model. Then, it obtains the first position parameter of the i-th facial feature point in the first face and the second position parameter of the i-th facial feature point in the second face, so that the electronic device can perform pose estimation processing based on the first and second position parameters of the multiple facial feature points.

[0063] Step 204: Using a pre-established face pose estimation model, perform pose estimation processing on the first and second position parameters of multiple face feature points to obtain the pose transformation parameters between the face feature points of the first face and the second face.

[0064] In this embodiment, the pose transformation parameter is the change in the relative position parameters between the same feature points of the first face in the driving image and the second face in the specific image.

[0065] Here, the idea behind face pose estimation in the face pose estimation model is to rotate a standard 3D model by a certain angle until the "2D projection" of the "3D feature points" on the model coincides as much as possible with multiple face feature points, thereby determining the face orientation information. The pose estimation process involves three coordinate systems: the world coordinate system, the camera coordinate system, and the image coordinate system. A 3D point (U, V, W) in the world coordinate system is mapped to the camera coordinate system (X, Y, Z) through a rotation matrix R and a translation vector t. A 3D point (X, Y, Z) in the camera coordinate system is mapped to the image coordinate system (x, y) through the camera's intrinsic parameter matrix. After obtaining the first position parameters of the i-th face feature point among multiple face feature points of the first face, the first angle information of the face orientation corresponding to the first face is determined. After obtaining the second position parameters of the i-th face feature point among multiple face feature points of the second face, the second angle information of the face orientation corresponding to the second face is determined. Finally, the electronic device determines the pose transformation parameters between the face feature points of the first and second faces based on the first angle information corresponding to the first face and the second angle information corresponding to the second face.
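The chain of coordinate systems described above can be sketched as a pinhole projection. The intrinsic matrix, the pose, and the two model points below are made-up values for illustration, and `project_points` is a hypothetical helper rather than part of this application.

```python
import numpy as np

def project_points(points_world, R, t, K):
    """Project Nx3 world points to Nx2 image points with a pinhole camera model."""
    points_world = np.asarray(points_world, dtype=float)
    points_cam = points_world @ R.T + t   # world -> camera: (X, Y, Z) = R (U, V, W) + t
    uvw = points_cam @ K.T                # camera -> homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]       # perspective divide gives (x, y)

# Assumed intrinsics: focal length 800 px, principal point at (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                        # no rotation
t = np.array([0.0, 0.0, 1000.0])     # model placed 1000 units in front of the camera

# Two hypothetical 3D feature points on a standard face model.
model_points = np.array([[0.0, 0.0, 0.0],
                         [50.0, -25.0, 0.0]])
image_points = project_points(model_points, R, t, K)
```

Pose estimation runs this mapping in reverse: given the detected 2D feature points and the 3D model points, it searches for the R and t whose projection best matches the detections.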

[0066] In this embodiment of the application, the electronic device determines the first position parameter of the i-th face feature point among multiple face feature points in the first face and the second position parameter of the i-th face feature point among multiple face feature points in the second face through a preset face feature point model. Then, it performs pose estimation processing on the first and second position parameters of multiple face feature points through a pre-established face pose estimation model to obtain the pose transformation parameters between the face feature points of the first face and the second face.

[0067] Step 205: Based on the pose transformation parameters, transform the second face to obtain the transformed second face.

[0068] In this case, the facial pose in the transformed second face is the same as that in the first face.

[0069] In the embodiments of this application, as shown in Figure 4, step 205, in which the second face is transformed based on the pose transformation parameters to obtain the transformed second face, can be achieved through the following steps:

[0070] Step 2051: Based on the pose transformation parameters, perform pose transformation processing on the second face to obtain the initial transformed face.

[0071] In this embodiment of the application, the initial transformed face is the face obtained by the second face after the pose transformation is performed based on the pose transformation parameters.

[0072] Step 2052: Using an interpolation function, fit the two-dimensional transformation between the current positions of multiple feature points in the initial transformed face and the preset target positions corresponding to those feature points, to obtain the deformation of the second face.

[0073] In this embodiment, the interpolation function includes, but is not limited to, the Thin-Plate Spline (TPS) function, the Regularized Spline function, and the Thin-Plate Tension Spline interpolation function. Thin-Plate Spline interpolation establishes a surface passing through the control points and minimizes the slope variation at all points; that is, the thin-plate spline function fits the control points with a surface of minimum curvature.
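A minimal sketch of fitting a two-dimensional thin-plate spline between control points is given below, using the standard radial basis U(r) = r² log r plus an affine term. The function names and the five control points are assumptions for illustration; the application does not specify its implementation.

```python
import numpy as np

def tps_kernel(r2):
    """TPS radial basis U(r) = r^2 log(r), written in terms of r^2, with U(0) = 0."""
    out = np.zeros_like(r2)
    nz = r2 > 0
    out[nz] = 0.5 * r2[nz] * np.log(r2[nz])
    return out

def fit_tps(src, dst):
    """Solve for the TPS weights and affine terms that map src control points to dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    P = np.hstack([np.ones((n, 1)), src])
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = tps_kernel(d2)
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return src, np.linalg.solve(A, b)

def apply_tps(model, pts):
    """Warp arbitrary 2D points with a fitted TPS model."""
    src, params = model
    pts = np.asarray(pts, dtype=float)
    n = src.shape[0]
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return tps_kernel(d2) @ params[:n] + P @ params[n:]

# Four fixed corners and one center point that moves slightly to the right.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
dst = src.copy()
dst[4] = [0.6, 0.5]
model = fit_tps(src, dst)
warped = apply_tps(model, src)
```

The fitted warp passes exactly through the control points, while points in between move smoothly, which is the property the paragraph above relies on. The control points must be distinct and not all collinear for the linear system to be solvable.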

[0074] In this embodiment, the electronic device performs pose transformation processing on the second face based on pose transformation parameters to obtain an initial transformed face. Then, it uses an interpolation function to fit the two-dimensional transformation between the current position of multiple feature points in the initial transformed face and the preset target position corresponding to the multiple feature points to obtain the deformation of the second face.

[0075] Step 2053: Based on the deformation, perform interpolation transformation on the initial transformed face using an image interpolation algorithm to obtain the transformed second face.

[0076] In this embodiment, the image interpolation algorithm includes a bilinear interpolation sampling method. Here, the electronic device first obtains the pixel value of each pixel coordinate on the initial transformed face using the bilinear interpolation sampling method, determines the preset target position corresponding to the pixel value based on the deformation, and transmits the pixel value to that position to obtain the transformed second face.
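The sampling step can be sketched as follows. This shows only how a pixel value is read at a fractional coordinate by bilinear interpolation; the `bilinear_sample` name, the edge clamping, and the 2×2 test image are assumptions for illustration.

```python
import numpy as np

def bilinear_sample(image, x, y):
    """Sample a 2D image at a fractional (x, y) position using bilinear interpolation."""
    h, w = image.shape[:2]
    # Clamp the coordinate to the valid image area.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighboring rows, then along y.
    top = (1.0 - fx) * image[y0, x0] + fx * image[y0, x1]
    bottom = (1.0 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1.0 - fy) * top + fy * bottom

image = np.array([[0.0, 10.0],
                  [20.0, 30.0]])
center = bilinear_sample(image, 0.5, 0.5)  # average of the four pixels
```

In a full warp, a function like this would be called once per output pixel, with the coordinate supplied by the fitted deformation.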

[0077] It should be noted that a human face is not a flat surface but has concave and convex regions, whereas an image carries only two-dimensional planar information. Thin-plate spline interpolation of pixel values is therefore performed in the second face region so that the points in the second face are transformed smoothly and do not appear abrupt or unnatural. Moreover, this method is simple and computationally efficient, which reduces computation time.

[0078] Step 206: Based on the pose transformation parameters, perform pose transformation processing on the second head image in the second mask image to obtain the transformed second head image.

[0079] The head pose in the transformed second head image is the same as the head pose in the first head image in the first mask image.

[0080] In this embodiment of the application, the second head image includes a face region image and a hair region image.

[0081] In this embodiment of the application, the electronic device performs pose estimation processing on the first position parameters and the second position parameters of multiple facial feature points through a pre-established facial pose estimation model. After obtaining the pose transformation parameters between the facial feature points of the first face and the second face, it can also perform pose transformation processing on the second head image in the second mask image based on the pose transformation parameters to obtain a transformed second head image with the same head pose as the first head image in the first mask image.

[0082] Step 207: Generate the target image based on the transformed second face, the transformed second head image, and the first mask image.

[0083] In the embodiments of this application, as shown in Figure 5, step 207, in which the target image is generated based on the transformed second face, the transformed second head image, and the first mask image, can be achieved through the following steps:

[0084] Step 2071: Replace the first head image in the first mask image with the transformed second head image to obtain the replaced first mask image.

[0085] Step 2072: Stitch the transformed second face onto the second head image of the replaced first mask image to obtain the stitched image.

[0086] Step 2073: Smooth the stitching position of the stitched image to obtain the target image.

[0087] In this embodiment of the application, the target image is the image after replacing the face and head corresponding to the driving image.

[0088] In this embodiment, the electronic device replaces the first head image in the first mask image with a transformed second head image to obtain a replaced first mask image. Simultaneously, the transformed second face is stitched onto the replaced second head image in the first mask image to obtain a stitched image. At this point, the second face and second head image in the stitched image, when combined with other areas in the first mask image, still have imperfections. Therefore, the electronic device smooths the stitching position of the stitched image to ensure a smoother interface, more natural lighting and edge color transitions, and a more realistic target image.

[0089] Here, smoothing the stitching position of the stitched image can be achieved in the following ways: Method 1, applying neighborhood median filtering to the stitching position; Method 2, applying mean filtering to the stitching position; Method 3, applying Gaussian filtering to the stitching position. This application does not impose specific limitations on these methods.
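As an illustration of Method 2, the sketch below applies a 3×3 mean filter only to pixels next to the stitching boundary. It assumes a single-channel float image and a boolean mask of the pasted region; the helper names and the band width of one pixel on each side are assumptions, not details from this application.

```python
import numpy as np

def box_blur3(image):
    """3x3 mean filter with edge replication, for a 2D float image."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    acc = np.zeros_like(image, dtype=float)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

def seam_band(mask):
    """Pixels whose 3x3 neighborhood contains both pasted and original pixels."""
    local_mean = box_blur3(mask.astype(float))
    return (local_mean > 0.0) & (local_mean < 1.0)

def smooth_seam(image, mask):
    """Apply the mean filter only along the stitching boundary of the pasted region."""
    image = image.astype(float)
    band = seam_band(mask)
    out = image.copy()
    out[band] = box_blur3(image)[band]
    return out

# Toy stitched image: original content (0) on the left, pasted content (90) on the right.
image = np.zeros((6, 6))
image[:, 3:] = 90.0
mask = np.zeros((6, 6), dtype=bool)
mask[:, 3:] = True
out = smooth_seam(image, mask)
```

Restricting the filter to the seam band softens the hard edge (here the step from 0 to 90 becomes 0, 30, 60, 90) while leaving the rest of the image untouched. A median or Gaussian filter could be substituted for `box_blur3` in the same structure.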

[0090] Step 208: Process the resolution of the target image using a super-resolution model to obtain a high-resolution target image, and output the high-resolution target image.

[0091] In this embodiment, after obtaining the target image, the electronic device inputs the target image into the super-resolution model to obtain a high-resolution target image output by the super-resolution model, and displays the high-resolution target image. Here, the super-resolution model can be a Super-Resolution Generative Adversarial Network (SR-GAN) model. The SR-GAN model can output a high-resolution target image in real time in a Graphics Processing Unit (GPU) environment, which can meet the needs of real-time audio and video calls.

[0092] As described above, in this embodiment, firstly, the electronic device acquires a first mask image corresponding to the driving image and a second mask image corresponding to the specific image. Secondly, based on the first position parameters of multiple facial feature points of the first face in the driving image and the second position parameters of multiple facial feature points of the second face in the specific image, the electronic device obtains pose transformation parameters between the facial feature points of the first and second faces. Thirdly, the electronic device performs pose transformation processing on the second face based on the pose transformation parameters to obtain the transformed second face; and performs pose transformation processing on the second head image in the second mask image based on the pose transformation parameters to obtain the transformed second head image. Finally, the electronic device replaces the first head image in the first mask image with the transformed second head image, stitches the transformed second face onto the replaced second head image of the first mask image, and smooths the stitching position of the stitched image to obtain the target image. Thus, there is no need to construct a 3D facial image; the method is simple, efficient, and fast, meeting real-time requirements. Simultaneously, the generated face-swapped image is sufficiently natural, difficult to distinguish from the real face when the pose transformation is small, and exhibits excellent results. Furthermore, it can simultaneously change face shape and hairstyle, making the face-swapped image more complete. The thin-plate spline interpolation method ensures that local facial distortion does not affect the overall aesthetics. Moreover, the image output by the super-resolution model has sufficiently high resolution to meet the needs of audio and video call applications.

[0093] It should be noted that the descriptions of the same steps and contents as in other embodiments in this embodiment can be found in the descriptions in other embodiments, and will not be repeated here.

[0094] Referring to Figure 2 and Figure 6, Figure 6 is a flowchart illustrating an optional face replacement method provided in an embodiment of this application. The face replacement method is applied to an electronic device and includes the following steps:

[0095] Step 301: Input the driving image and the target image.

[0096] Step 302: Using a preset image segmentation model, perform image segmentation of predetermined regions on the driving image to obtain the first mask image corresponding to the driving image; using the preset image segmentation model, perform image segmentation of predetermined regions on the target image to obtain the second mask image corresponding to the target image.

[0097] The designated areas include the background area, clothing area, skin area, face area, and hair area.
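The five designated areas can be represented as integer labels in a per-pixel label map, from which a head mask (face area plus hair area, as the head image is defined in claim 1) is derived. This is a minimal sketch; the numeric codes, the `Region` enum, and the `head_mask` helper are assumptions made for illustration, not values defined by this application.

```python
from enum import IntEnum


class Region(IntEnum):
    # Hypothetical label codes; the application does not specify numeric values.
    BACKGROUND = 0
    CLOTHING = 1
    SKIN = 2
    FACE = 3
    HAIR = 4


# The head image is taken to consist of the face area and the hair area.
HEAD_REGIONS = frozenset({Region.FACE, Region.HAIR})


def head_mask(label_map):
    """Return a boolean grid that is True where a pixel belongs to the head."""
    # Region(label) raises ValueError for codes outside the five designated areas.
    return [[Region(label) in HEAD_REGIONS for label in row] for row in label_map]


label_map = [
    [0, 4, 4, 0],  # hair on top
    [0, 3, 3, 0],  # face below the hair
    [2, 1, 1, 2],  # skin (for example, arms) and clothing
]
mask = head_mask(label_map)
```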

[0098] Step 303: Using a cascaded pose regression framework, face alignment and corresponding feature point extraction are performed on the driving image and the target image respectively, to obtain multiple face feature points of the first face in the driving image and multiple face feature points of the second face in the target image.

[0099] In this embodiment, steps 302 and 303 can be executed in any order. For example, step 302 can be executed before or after step 303. Steps 302 and 303 can also be executed simultaneously. This application does not impose specific limitations on this.
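Because steps 302 and 303 do not depend on each other, they can be submitted to a worker pool together. The sketch below uses Python's `concurrent.futures`; the functions `segment` and `extract_feature_points` are hypothetical placeholders for the segmentation model and the feature point extraction, and they return dummy values.

```python
from concurrent.futures import ThreadPoolExecutor


def segment(image):
    """Placeholder for step 302 (image segmentation); returns a dummy mask record."""
    return {"mask_of": image}


def extract_feature_points(image):
    """Placeholder for step 303 (feature point extraction); returns a dummy record."""
    return {"points_of": image}


def run_steps_302_and_303(driving_image, other_image):
    """Run segmentation and feature point extraction for both images concurrently."""
    images = (driving_image, other_image)
    with ThreadPoolExecutor(max_workers=4) as pool:
        mask_futures = [pool.submit(segment, img) for img in images]
        point_futures = [pool.submit(extract_feature_points, img) for img in images]
        # result() blocks until the corresponding task has finished.
        masks = [f.result() for f in mask_futures]
        points = [f.result() for f in point_futures]
    return masks, points


masks, points = run_steps_302_and_303("driving", "target")
```

In CPython, threads give a real speed-up only when the underlying work releases the global interpreter lock, so whether this helps depends on how the two models are implemented.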

[0100] Step 304: Using a pre-established face pose estimation model, perform pose estimation processing on the first and second position parameters of multiple face feature points to obtain the pose transformation parameters between the face feature points of the first and second faces.
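The application does not disclose the internals of the face pose estimation model. As a stand-in, the following sketch fits a least-squares 2D similarity transform (rotation, uniform scale, translation) that maps the feature points of the second face onto the matching feature points of the first face. It is a swapped-in technique for illustration only; the names `fit_similarity` and `apply_similarity` and the sample coordinates are hypothetical.

```python
import cmath


def fit_similarity(src_points, dst_points):
    """Least-squares fit of z -> a*z + b, with points treated as complex numbers.

    abs(a) is the scale, cmath.phase(a) the rotation angle, b the translation.
    """
    if len(src_points) != len(dst_points) or len(src_points) < 2:
        raise ValueError("need at least two matching point pairs")
    src = [complex(x, y) for x, y in src_points]
    dst = [complex(x, y) for x, y in dst_points]
    n = len(src)
    src_mean = sum(src) / n
    dst_mean = sum(dst) / n
    denom = sum(abs(p - src_mean) ** 2 for p in src)
    if denom == 0:
        raise ValueError("source points must not all coincide")
    a = sum((p - src_mean).conjugate() * (q - dst_mean) for p, q in zip(src, dst)) / denom
    b = dst_mean - a * src_mean
    return a, b


def apply_similarity(a, b, point):
    """Apply the fitted transform to one (x, y) point."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Second-face feature points, and the same points as they appear in the first face
# (here: rotated by 90 degrees, scaled by 2, shifted by (5, 5)).
second_face_points = [(0, 0), (2, 0), (1, 2)]
first_face_points = [(5, 5), (5, 9), (1, 7)]

a, b = fit_similarity(second_face_points, first_face_points)
scale, angle = abs(a), cmath.phase(a)
```

A similarity transform captures only in-plane rotation, scale, and shift; it is a simplified model of the pose transformation parameters and may not match what the pre-established model produces.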

[0101] Step 305: Based on the pose transformation parameters, transform the second face to obtain the transformed second face.

[0102] Step 306: Based on the pose transformation parameters, perform pose transformation processing on the second head image in the second mask image to obtain the transformed second head image.

[0103] Step 307: Input the transformed second face, the transformed second head image, and the first mask image into the generation model to obtain the target image output by the generation model, in which the face and head corresponding to the driving image have been replaced.

[0104] Step 308: Process the resolution of the target image using a super-resolution model to obtain a high-resolution target image, and output the high-resolution target image.
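The data flow of steps 301 to 308 can be summarized as a small orchestration function. This is a sketch under stated assumptions: every stage is passed in as a callable, all parameter names are hypothetical, and `source_image` stands for the second input of step 301. The stubs in the usage example only build strings so that the wiring is visible.

```python
def replace_face(driving_image, source_image, *, segment, extract_points,
                 estimate_pose, warp_face, warp_head, generate, super_resolve):
    """Wire the stages of steps 301-308 together; each stage is a caller-supplied callable."""
    first_mask = segment(driving_image)                  # step 302
    second_mask = segment(source_image)
    first_points = extract_points(driving_image)         # step 303
    second_points = extract_points(source_image)
    pose = estimate_pose(first_points, second_points)    # step 304
    face = warp_face(source_image, pose)                 # step 305
    head = warp_head(second_mask, pose)                  # step 306
    target = generate(face, head, first_mask)            # step 307
    return super_resolve(target)                         # step 308


# Usage with string-building stubs that make the data flow visible.
result = replace_face(
    "drv", "src",
    segment=lambda img: f"mask({img})",
    extract_points=lambda img: f"pts({img})",
    estimate_pose=lambda p1, p2: f"pose({p1},{p2})",
    warp_face=lambda img, pose: f"face({img})",
    warp_head=lambda mask, pose: f"head({mask})",
    generate=lambda face, head, mask: f"target({face},{head},{mask})",
    super_resolve=lambda target: f"hr({target})",
)
```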

[0105] As described above, in this embodiment, the electronic device does not need to construct a 3D facial image. The method is simple, efficient, and fast, meeting real-time requirements. Simultaneously, the generated face-swapped image is sufficiently natural, difficult to distinguish from a real face when pose changes are minor, and exhibits excellent results. Furthermore, it can simultaneously change face shape and hairstyle, making the overall face-swapped image more complete. The thin-plate spline interpolation method ensures that local facial distortion does not affect the overall aesthetics. Moreover, the image output by the super-resolution model has sufficiently high resolution to meet the needs of audio and video call applications.
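The thin-plate spline interpolation mentioned above fits a smooth two-dimensional mapping that moves each feature point from its current position to its target position. The following is a minimal pure-Python sketch of the standard thin-plate spline fit with kernel U(r) = r² log r; the function names and sample points are hypothetical, and resampling the image pixels through the fitted mapping (the image interpolation step) is not shown.

```python
import math


def _solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: points may be collinear or repeated")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def _kernel(r):
    """Thin-plate spline radial basis U(r) = r^2 * log(r), with U(0) = 0."""
    return 0.0 if r == 0 else r * r * math.log(r)


def fit_tps(current, target):
    """Fit a 2D thin-plate spline that maps each current point onto its target point."""
    n = len(current)
    size = n + 3
    system = [[0.0] * size for _ in range(size)]
    for i, (xi, yi) in enumerate(current):
        for j, (xj, yj) in enumerate(current):
            system[i][j] = _kernel(math.hypot(xi - xj, yi - yj))
        for k, value in enumerate((1.0, xi, yi)):  # affine part and its constraints
            system[i][n + k] = value
            system[n + k][i] = value
    # One linear solve per output coordinate (x, then y).
    coeffs = [
        _solve(system, [t[axis] for t in target] + [0.0, 0.0, 0.0])
        for axis in (0, 1)
    ]

    def warp(x, y):
        basis = [_kernel(math.hypot(x - px, y - py)) for px, py in current] + [1.0, x, y]
        return tuple(sum(c * b for c, b in zip(coeffs[axis], basis)) for axis in (0, 1))

    return warp


# Four fixed corner points and one interior point that is moved to the right.
current = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2)]
target = [(0, 0), (4, 0), (0, 4), (4, 4), (2.5, 2.0)]
warp = fit_tps(current, target)
```

The fitted mapping passes exactly through the control points and bends smoothly in between, which is why a local change to one feature point does not tear the rest of the face.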

[0106] Based on the foregoing embodiments, this application provides a face replacement device that can be applied to a face replacement method provided in the embodiments corresponding to Figure 1 and Figures 3-5. As shown in Figure 7, the face replacement device 7 includes:

[0107] The first processing module 71 is used to perform image segmentation on the acquired driving image using a preset image segmentation model to obtain the first mask image;

[0108] The first processing module 71 is also used to perform image segmentation on the acquired specific image using a preset image segmentation model to obtain a second mask image;

[0109] The acquisition module 72 is used to acquire the pose transformation parameters of the first face in the driving image and the second face in a specific image;

[0110] The second processing module 73 is used to perform transformation processing on the second face based on the pose transformation parameters to obtain the transformed second face, wherein the face pose in the transformed second face is the same as the face pose in the first face.

[0111] The second processing module 73 is further configured to generate a target image with face replacement corresponding to the driving image based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image.

[0112] In other embodiments of this application, the second processing module 73 is further configured to determine the first position parameter of the i-th face feature point among multiple face feature points in the first face and the second position parameter of the i-th face feature point among multiple face feature points in the second face through a preset face feature point model; and to perform pose estimation processing on the first position parameter and the second position parameter of multiple face feature points through a pre-established face pose estimation model to obtain the pose transformation parameters between the face feature points of the first face and the second face.

[0113] In other embodiments of this application, the second processing module 73 is further configured to perform pose transformation processing on the second face based on the pose transformation parameters to obtain an initial transformed face; fit, through an interpolation function, the two-dimensional transformation between the current positions of multiple feature points in the initial transformed face and the preset target positions corresponding to the multiple feature points to obtain the deformation of the second face; and perform interpolation transformation processing on the initial transformed face through an image interpolation algorithm based on the deformation to obtain the transformed second face.

[0114] In other embodiments of this application, the second processing module 73 is further configured to perform pose transformation processing on the second head image in the second mask image based on pose transformation parameters to obtain a transformed second head image, wherein the head pose in the transformed second head image is the same as the head pose of the first head image in the first mask image; and generate a target image based on the transformed second face, the transformed second head image and the first mask image.

[0115] In other embodiments of this application, the second processing module 73 is further configured to replace the first head image in the first mask image with the transformed second head image to obtain the replaced first mask image; and to fuse the transformed second face onto the second head image of the replaced first mask image to obtain the target image.

[0116] In other embodiments of this application, the second processing module 73 is further configured to stitch the transformed second face onto the second head image of the replaced first mask image to obtain a stitched image; and to smooth the stitching position of the stitched image to obtain a target image.

[0117] In other embodiments of this application, the second processing module 73 is further configured to process the resolution of the target image through a super-resolution model to obtain a high-resolution target image, and the output module is configured to output the high-resolution target image.

[0118] Based on the foregoing embodiments, this application provides an electronic device that can be applied to a face replacement method provided in the embodiments corresponding to Figure 1 and Figures 3-5. As shown in Figure 8, the electronic device 8 (the electronic device 8 in Figure 8 corresponds to the face replacement device 7 in Figure 7) includes: a memory 81 and a processor 82, wherein the processor 82 is used to execute the face replacement program stored in the memory 81, and the electronic device 8 implements the following steps through the processor 82:

[0119] The acquired driving image is segmented using a preset image segmentation model to obtain the first mask image;

[0120] A second mask image is obtained by segmenting a specific image using a preset image segmentation model.

[0121] Obtain the pose transformation parameters of the first face in the driving image and the second face in a specific image;

[0122] Based on the pose transformation parameters, the second face is transformed to obtain the transformed second face, wherein the pose of the face in the transformed second face is the same as the pose of the face in the first face.

[0123] Based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image, a target image with face replacement corresponding to the driving image is generated.

[0124] In other embodiments of this application, the processor 82 is used to execute a face replacement program stored in the memory 81 to perform the following steps:

[0125] Using a pre-defined facial feature point model, the first position parameter of the i-th facial feature point among multiple facial feature points in the first face and the second position parameter of the i-th facial feature point among multiple facial feature points in the second face are determined. Using a pre-established facial pose estimation model, the first and second position parameters of multiple facial feature points are processed for pose estimation to obtain the pose transformation parameters between the facial feature points of the first and second faces.

[0126] In other embodiments of this application, the processor 82 is used to execute a face replacement program stored in the memory 81 to perform the following steps:

[0127] Based on the pose transformation parameters, the pose transformation process is performed on the second face to obtain the initial transformed face; the two-dimensional transformation between the current position of multiple feature points in the initial transformed face and the preset target position corresponding to the multiple feature points is fitted by the interpolation function to obtain the deformation of the second face; based on the deformation, the initial transformed face is subjected to interpolation transformation processing by the image interpolation algorithm to obtain the transformed second face.

[0128] In other embodiments of this application, the processor 82 is used to execute a face replacement program stored in the memory 81 to perform the following steps:

[0129] Based on the pose transformation parameters, the pose transformation process is performed on the second head image in the second mask image to obtain the transformed second head image, wherein the head pose in the transformed second head image is the same as the head pose in the first head image in the first mask image; based on the transformed second face, the transformed second head image and the first mask image, the target image is generated.

[0130] In other embodiments of this application, the processor 82 is used to execute a face replacement program stored in the memory 81 to perform the following steps:

[0131] Replace the first head image in the first mask image with the transformed second head image to obtain the replaced first mask image; then fuse the transformed second face onto the second head image of the replaced first mask image to obtain the target image.

[0132] In other embodiments of this application, the processor 82 is used to execute a face replacement program stored in the memory 81 to perform the following steps:

[0133] The transformed second face is stitched onto the second head image of the replaced first mask image to obtain a stitched image; the stitching position of the stitched image is smoothed to obtain the target image.

[0134] In other embodiments of this application, the processor 82 is used to execute a face replacement program stored in the memory 81 to perform the following steps:

[0135] The resolution of the target image is processed by a super-resolution model to obtain a high-resolution target image, and then the high-resolution target image is output.

[0136] This application provides a computer-readable storage medium storing one or more programs that can be executed by one or more processors to perform the following steps:

[0137] The acquired driving image is segmented using a preset image segmentation model to obtain the first mask image;

[0138] A second mask image is obtained by segmenting a specific image using a preset image segmentation model.

[0139] Obtain the pose transformation parameters of the first face in the driving image and the second face in a specific image;

[0140] Based on the pose transformation parameters, the second face is transformed to obtain the transformed second face, wherein the pose of the face in the transformed second face is the same as the pose of the face in the first face.

[0141] Based on the pose transformation parameters, the transformed second face, the first mask image, and the second mask image, a target image with face replacement corresponding to the driving image is generated.

[0142] In other embodiments of this application, the one or more programs may be executed by one or more processors, and may also perform the following steps:

[0143] Using a pre-defined facial feature point model, the first position parameter of the i-th facial feature point among multiple facial feature points in the first face and the second position parameter of the i-th facial feature point among multiple facial feature points in the second face are determined. Using a pre-established facial pose estimation model, the first and second position parameters of multiple facial feature points are processed for pose estimation to obtain the pose transformation parameters between the facial feature points of the first and second faces.

[0144] In other embodiments of this application, the one or more programs may be executed by one or more processors, and may also perform the following steps:

[0145] Based on the pose transformation parameters, the pose transformation process is performed on the second face to obtain the initial transformed face; the two-dimensional transformation between the current position of multiple feature points in the initial transformed face and the preset target position corresponding to the multiple feature points is fitted by the interpolation function to obtain the deformation of the second face; based on the deformation, the initial transformed face is subjected to interpolation transformation processing by the image interpolation algorithm to obtain the transformed second face.

[0146] In other embodiments of this application, the one or more programs may be executed by one or more processors, and may also perform the following steps:

[0147] Based on pose transformation parameters, pose transformation processing is performed on the second head image in the second mask image to obtain a transformed second head image, wherein the head pose in the transformed second head image is the same as the head pose in the first head image in the first mask image. Based on the transformed second face, the transformed second head image, and the first mask image, a target image is generated.

[0148] In other embodiments of this application, the one or more programs may be executed by one or more processors, and may also perform the following steps:

[0149] Replace the first head image in the first mask image with the transformed second head image to obtain the replaced first mask image; then fuse the transformed second face onto the second head image of the replaced first mask image to obtain the target image.

[0150] In other embodiments of this application, the one or more programs may be executed by one or more processors, and may also perform the following steps:

[0151] The transformed second face is stitched onto the second head image of the replaced first mask image to obtain a stitched image; the stitching position of the stitched image is smoothed to obtain the target image.

[0152] In other embodiments of this application, the one or more programs may be executed by one or more processors, and may also perform the following steps:

[0153] The resolution of the target image is processed by a super-resolution model to obtain a high-resolution target image, and then the high-resolution target image is output.

[0154] It should be noted that the aforementioned computer storage medium or memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), ferromagnetic random access memory (FRAM), flash memory, magnetic surface memory, optical disc, or compact disc read-only memory (CD-ROM), etc.; it can also be any of various terminals that include one or any combination of the above-mentioned memories, such as mobile phones, computers, tablet devices, personal digital assistants, etc.

[0155] In the several embodiments provided in this application, it should be understood that the disclosed devices and methods can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of units is only a logical functional division, and in actual implementation, there may be other division methods, such as: multiple units or components can be combined, or integrated into another system, or some features can be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the various components shown or discussed can be through some interfaces, and the indirect coupling or communication connection between devices or units can be electrical, mechanical, or other forms.

[0156] The units described above as separate components may or may not be physically separate. The components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected to achieve the purpose of this embodiment according to actual needs.

[0157] Furthermore, in the various embodiments of this application, all functional units can be integrated into one processing module, or each unit can be a separate unit, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or in a combination of hardware and software functional units. Those skilled in the art will understand that all or part of the steps of the above method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When the program is executed, it performs the steps of the above method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as mobile storage devices, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0158] The methods disclosed in the several method embodiments provided in this application can be arbitrarily combined without conflict to obtain new method embodiments.

[0159] The features disclosed in the several product embodiments provided in this application can be arbitrarily combined without conflict to obtain new product embodiments.

[0160] The features disclosed in the several method or device embodiments provided in this application can be arbitrarily combined without conflict to obtain new method or device embodiments.

[0161] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A face replacement method, characterized in that the method includes:
The acquired driving image is segmented using a preset image segmentation model to obtain the first mask image;
The acquired specific image is segmented using the preset image segmentation model to obtain a second mask image; wherein the mask image is an image composed of the outer contours of each region obtained after segmenting multiple different predetermined regions within the input image, and the input image includes the driving image and the specific image;
The pose transformation parameters of a first face in the driving image and a second face in the specific image are obtained; wherein the pose transformation parameters are determined based on the relative position parameters between the same facial feature points between the first face and the second face;
Based on the pose transformation parameters, the second face is transformed to obtain the transformed second face, wherein the pose of the face in the transformed second face is the same as the pose of the face in the first face;
Based on the pose transformation parameters, the pose transformation processing is performed on the second head image in the second mask image to obtain the transformed second head image, wherein the head pose in the transformed second head image is the same as the head pose of the first head image in the first mask image; the second head image includes a face region image and a hair region image;
The first head image in the first mask image is replaced with the transformed second head image to obtain the replaced first mask image;
The transformed second face is fused onto the second head image of the replaced first mask image to generate the target image with the replaced face corresponding to the driving image.

2. The method according to claim 1, characterized in that the step of obtaining the pose transformation parameters of the first face in the driving image and the second face in the specific image includes:
By using a preset facial feature point model, the first position parameter of the i-th facial feature point among multiple facial feature points in the first face and the second position parameter of the i-th facial feature point among multiple facial feature points in the second face are determined;
By using a pre-established face pose estimation model, pose estimation processing is performed on the first position parameters and the second position parameters of the multiple face feature points to obtain the pose transformation parameters between the face feature points of the first face and the second face.

3. The method according to claim 1, characterized in that the step of transforming the second face based on the pose transformation parameters to obtain the transformed second face includes:
Based on the pose transformation parameters, the second face is subjected to pose transformation processing to obtain an initial transformed face;
The deformation of the second face is obtained by fitting the two-dimensional transformation between the current position of multiple feature points in the initial transformed face and the preset target position corresponding to the multiple feature points using an interpolation function;
Based on the aforementioned deformation, the initial transformed face is subjected to interpolation transformation processing using an image interpolation algorithm to obtain the transformed second face.

4. The method according to claim 1, characterized in that the step of fusing the transformed second face onto the second head image of the replaced first mask image to obtain the target image includes:
The transformed second face is stitched onto the second head image of the replaced first mask image to obtain a stitched image;
The stitching position of the stitched image is smoothed to obtain the target image.

5. The method according to any one of claims 1 to 4, characterized in that the method includes:
The resolution of the target image is processed by a super-resolution model to obtain a high-resolution target image, and the high-resolution target image is then output.

6. A face replacement device, characterized in that the device includes:
The first processing module is used to perform image segmentation on the acquired driving image using a preset image segmentation model to obtain the first mask image;
The first processing module is further configured to perform image segmentation on the acquired specific image using the preset image segmentation model to obtain a second mask image; wherein the mask image is an image composed of the outer contours of each region obtained after segmenting multiple different predetermined regions within the input image, and the input image includes the driving image and the specific image;
The acquisition module is used to acquire the pose transformation parameters of a first face in the driving image and a second face in the specific image; wherein the pose transformation parameters are determined based on the relative position parameters between the same facial feature points between the first face and the second face;
The second processing module is used to transform the second face based on the pose transformation parameters to obtain the transformed second face, wherein the face pose in the transformed second face is the same as the face pose in the first face;
The second processing module is further configured to perform pose transformation processing on the second head image in the second mask image based on the pose transformation parameters to obtain a transformed second head image, wherein the head pose in the transformed second head image is the same as the head pose of the first head image in the first mask image; the second head image includes a face region image and a hair region image; replace the first head image in the first mask image with the transformed second head image to obtain a replaced first mask image; and fuse the transformed second face onto the second head image of the replaced first mask image to generate the target image with the replaced face corresponding to the driving image.

7. An electronic device, characterized in that the electronic device includes:
A memory, used to store executable instructions;
A processor, configured to execute the executable instructions stored in the memory to implement the face replacement method as described in any one of claims 1 to 5.

8. A computer storage medium, characterized in that the computer storage medium stores one or more programs, which can be executed by one or more processors to implement the face replacement method as described in any one of claims 1 to 5.