Video synthesis method based on lip synchronization and expression adaptation effect enhancement

A video synthesis and lip synchronization technique, applied in neural learning methods, image enhancement, speech analysis, etc. It addresses the problem that existing methods do not process the details of the mouth and expression, so that changes in the portrait's mouth shape and expression are not obvious, and it achieves the effect of avoiding face drift.

Pending Publication Date: 2020-10-16

SHANDONG SYNTHESIS ELECTRONICS TECH



AI Technical Summary

Problems solved by technology

The patent "Virtual anchor implementation method and device" (application number 201811320949.4) uses a speech synthesis model and a model of the overall state of the face to synthesize speech sequences and image sequences, but it does not process the details of the mouth shape and expression. As a result, changes in the mouth shape and expression of the portrait are not obvious.

The patent "A Method of Audio and Video Synthesis" (application number 201910912787.1) uses a variational autoencoder network to achieve end-to-end audio and video synthesis and further considers the relationship between consecutive frames, but details such as the mouth shape still do not change significantly.



Embodiment 1

[0032] This embodiment discloses a video synthesis method based on lip synchronization and facial expression adaptation effect enhancement. Figure 1 shows the main structure and data flow of the video synthesis model of the present invention. The model consists of three parts or processes: an input part, an encoding-decoding part, and an adversarial training part. Only the first two parts are needed once training is complete and the model is providing services.

[0033] The first component of the input part is librosa, a third-party Python library, which processes the original audio file into a sequence of feature segments, one per video frame. These features and the original portrait image are the initial inputs to the whole model. The face image serves as the basic element: the same reference image is input at every step of the frame-by-frame synthesis loop. Input in batches when training, and only a sing...
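The alignment of audio features to video frames described in [0033] could look roughly like the sketch below. It uses only the Python standard library, so that the grouping logic is visible on its own. In practice the feature vectors would come from librosa. The function name `segments_per_video_frame`, its parameters, and the rates in the usage note are assumptions for illustration and are not taken from the patent text, which does not state the feature type or the frame rate.

```python
from typing import List, Sequence


def segments_per_video_frame(
    features: Sequence[Sequence[float]],
    feature_rate: float,
    video_fps: float,
) -> List[List[Sequence[float]]]:
    """Group audio feature vectors into one segment per video frame.

    features: audio feature vectors in time order (hypothetical input,
        for example one vector per analysis hop).
    feature_rate: feature vectors per second of audio.
    video_fps: frame rate of the video to be synthesized.
    """
    if feature_rate <= 0 or video_fps <= 0:
        raise ValueError("feature_rate and video_fps must be positive")

    duration_seconds = len(features) / feature_rate
    n_frames = int(duration_seconds * video_fps)

    segments = []
    for i in range(n_frames):
        # Feature indices covering the time span of video frame i.
        start = int(round(i * feature_rate / video_fps))
        end = int(round((i + 1) * feature_rate / video_fps))
        segments.append(list(features[start:end]))
    return segments
```

For example, with 100 feature vectors per second and an assumed 25 frames per second, each video frame receives a segment of 4 consecutive feature vectors.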


Abstract

The invention discloses a video synthesis method based on lip synchronization and expression adaptation effect enhancement. In this method, the portrait and the audio stream to be synthesized are encoded jointly and directly. A recurrent decoder network that retains the original face information decodes the resulting abstract features into an image sequence, and five discriminator networks are then used to train adversarially on the synthesized image sequence against a real image sequence, so that the total reconstruction error is minimized. Compared with existing video synthesis methods, the method guarantees the continuity of face changes between consecutive frames and improves the clarity of the face in each frame. In addition, under the action of the lip synchronization discriminator and the expression adaptation discriminator, the synthesized video appears more natural, and the realism of the visual effect is greatly enhanced. The method has high practical value for improving the user experience of virtual live broadcast and human-machine interaction.
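The inference-time data flow in the abstract (joint encoding of portrait and audio, then recurrent decoding with the original face carried along) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `encode`, `decode_step`, and `initial_state` are hypothetical placeholders for trained networks and their recurrent state, and the five discriminators are omitted because they are used only during training.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Image = TypeVar("Image")
Audio = TypeVar("Audio")
Code = TypeVar("Code")
State = TypeVar("State")


def synthesize_frames(
    reference_image: Image,
    audio_segments: Sequence[Audio],
    encode: Callable[[Image, Audio], Code],
    decode_step: Callable[[Code, Image, State], Tuple[Image, State]],
    initial_state: State,
) -> List[Image]:
    """Produce one output frame per audio segment.

    The same reference image is passed in at every step, and the
    decoder state is carried from frame to frame so that consecutive
    frames can stay consistent with each other.
    """
    frames = []
    state = initial_state
    for segment in audio_segments:
        # Joint encoding of the portrait and the current audio segment.
        code = encode(reference_image, segment)
        # The decoder also receives the reference image, so the
        # original face information remains available at every step.
        frame, state = decode_step(code, reference_image, state)
        frames.append(frame)
    return frames
```

Passing the networks in as callables keeps the sketch independent of any deep learning framework. A real system would replace them with trained encoder and decoder models.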

Description

Technical field

[0001] The present invention relates to the field of artificial intelligence, in particular to virtual audio and video synthesis, and specifically to a video synthesis method based on lip synchronization and expression adaptation effect enhancement.

Background technique

[0002] With the continuous improvement in the quality of video captured by cameras and the rise of online video platforms, the pressure on online video data storage grows day by day. In addition, as the number of online video viewers increases, skilled web anchors are increasingly in short supply. To address these two difficulties, products that synthesize portrait video from text and audio data have emerged in the industry. With such a product, video data is compressed into text and audio data for storage, and a synthesized portrait replaces a real person in live webcasting. However, the current virtual video synthesis method...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06K9/00, G06N3/04, G06N3/08, G06T5/50, G10L25/24, G10L25/57

CPC: G06N3/084, G06T5/50, G10L25/24, G10L25/57, G06V40/165, G06V40/171, G06V40/19, G06N3/045, Y02T10/40

Inventor: 王太浩, 张传锋, 朱锦雷

Owner: SHANDONG SYNTHESIS ELECTRONICS TECH
