High-quality voice-driven face generation method based on a neural radiance field

A voice-driven method using neural radiance field technology, applied in the field of face image processing, that addresses unrealistic results and improves accuracy.

Active Publication Date: 2021-06-01
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Although this method still uses a generative adversarial network as the intermediate mapping, the editable 3D face model makes its results relatively stable. However, it falls short of a realistic effect in preserving details of the original target face, such as lighting, wrinkles, and background blending.

Embodiment Construction

[0037] To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

[0038] In the field of voice-driven face video generation, traditional manual modeling works well, but it relies on professional skills, takes a long time, and its final quality depends on the skill of the individual modeling engineer. Generative adversarial network models based on two-dimensional images require large-scale paired datasets, are difficult to train, and produce results of unstable quality.

[0039] For this reason, the present invention discloses a high-quality voice-driven face generation method based on a neural radiance field. Given a short talking-face video (three to five minutes), the method separately controls the face and the upper-body torso in the video. Two diffe...

Abstract

The invention provides a high-quality voice-driven face generation method based on a neural radiance field. The method comprises the following steps: extracting features from the video-synchronized speech with a text-based speech recognition model to obtain speech features; segmenting an initial set of talking-face videos frame by frame; estimating the pose of the face in each frame with a pre-trained three-dimensional face reconstruction model; learning a neural radiance field model of the target picture with a multi-layer perceptron; and, taking the speech features as condition information, generating a picture under the current viewing angle and speech condition by neural rendering. A talking-face model trained on the neural radiance field can implicitly represent three-dimensional face displacement, including rigid and non-rigid motion. Neural rendering supports sampling with different ray angles and different densities, so the generated talking-face video is high in quality and stable.
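The last two steps of the abstract (a multi-layer perceptron that represents the radiance field, queried with speech features as a condition and rendered along camera rays) can be illustrated with a minimal NumPy sketch. This is not the patent's implementation: the layer sizes, the number of encoding frequencies, the 16-dimensional speech feature, the softplus density activation, and the random untrained weights are all assumptions made for illustration, and the function names (`positional_encoding`, `init_mlp`, `radiance_field`, `render_ray`) are hypothetical.

```python
import numpy as np

POS_FREQS, DIR_FREQS, AUDIO_DIM = 4, 2, 16   # illustrative sizes (assumptions)

def positional_encoding(x, num_freqs):
    """Map each coordinate to sin/cos features at doubling frequencies."""
    freqs = 2.0 ** np.arange(num_freqs)
    angles = x[..., None] * freqs
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)

def init_mlp(rng, sizes):
    """Random (untrained) weights; a real model would be fitted to video frames."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def radiance_field(params, points, view_dir, audio_feat):
    """Conditioned field: (position, view direction, speech feature) -> (rgb, density)."""
    n = points.shape[0]
    d_enc = positional_encoding(view_dir, DIR_FREQS)
    x = np.concatenate([
        positional_encoding(points, POS_FREQS),
        np.broadcast_to(d_enc, (n, d_enc.shape[0])),
        np.broadcast_to(audio_feat, (n, audio_feat.shape[0])),
    ], axis=-1)
    for w, b in params[:-1]:
        x = np.maximum(x @ w + b, 0.0)            # ReLU hidden layers
    w, b = params[-1]
    out = x @ w + b
    rgb = 1.0 / (1.0 + np.exp(-out[:, :3]))       # colours in (0, 1)
    sigma = np.logaddexp(0.0, out[:, 3])          # softplus density (assumption)
    return rgb, sigma

def render_ray(params, origin, direction, audio_feat,
               near=0.5, far=1.5, num_samples=32):
    """Standard volume-rendering quadrature along one camera ray."""
    t = np.linspace(near, far, num_samples)
    points = origin + t[:, None] * direction
    rgb, sigma = radiance_field(params, points, direction, audio_feat)
    delta = np.append(np.diff(t), 1e10)           # last interval treated as unbounded
    alpha = 1.0 - np.exp(-sigma * delta)          # opacity of each interval
    trans = np.cumprod(np.append(1.0, 1.0 - alpha[:-1] + 1e-10))
    weights = alpha * trans                       # contribution of each sample
    return (weights[:, None] * rgb).sum(axis=0), weights

# Usage: one pixel colour for one ray under one (random stand-in) speech feature.
rng = np.random.default_rng(0)
in_dim = 3 * 2 * POS_FREQS + 3 * 2 * DIR_FREQS + AUDIO_DIM
params = init_mlp(rng, [in_dim, 64, 64, 4])
pixel, _ = render_ray(params, np.zeros(3), np.array([0.0, 0.0, 1.0]),
                      rng.normal(size=AUDIO_DIM))
```

Changing `num_samples` or the ray direction corresponds to the "different densities" and "different ray angles" that the abstract says neural rendering supports; changing the speech feature changes the rendered colour without touching the camera pose.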

Description

Technical field

[0001] The invention relates to the technical field of face image processing, and in particular to a high-quality voice-driven face generation method based on a neural radiance field.

Background

[0002] With recent advances in image processing, artificial-intelligence-based digital humans are in great demand in applications such as remote video conferencing, virtual character generation, and animated video creation. How to construct realistic, high-quality virtual characters has become a widely studied problem. Among these applications, using an arbitrary input speech signal to drive a target face and generate a natural talking video sequence is a core one.

[0003] In the past, there were mainly three methods for high-quality voice-driven face generation: manual modeling, by pre-modeling a series of mouth shapes of the target face, and then manually decomposing the speech signal into corresponding action sequences, so as t...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04N13/275, H04N13/296, G06K9/00, G06K9/34, G06N3/04, G06N3/08, G10L15/02, G10L15/06, G10L15/16, G10L15/25, G10L15/26, H04N5/272
CPC: H04N13/275, H04N13/296, H04N5/272, G10L15/02, G10L15/063, G10L15/16, G10L15/25, G10L15/26, G06N3/04, G06N3/084, H04N2005/2726, G06V40/161, G06V40/172, G06V40/20, G06V10/267
Inventor: 张举勇, 郭玉东, 陈柯宇
Owner: UNIV OF SCI & TECH OF CHINA