Deep-learning-based method and system for speech-driven 3D virtual human expressions with synchronized sound and picture
Patent Information
- Authority / Receiving Office
- CN · China
- Current Assignee / Owner
- 超维视界(北京)传媒科技有限公司
- Publication Date
- 2020-11-27
Abstract
Description
Technical Field
[0001] The invention relates to the fields of computer graphics, computer vision, speech recognition, and speech synthesis. Specifically, it relates to a method and system that use a deep neural network to fit the relationship between speech and the Blend Shape values of a 3D model, so that the expressions of a speech-driven 3D virtual human stay synchronized in sound and picture.
Background
[0002] At present, speech-driven methods for generating virtual human facial animation fall into several types:
[0003] (1) A neural network maps speech to the vertex coordinates of a 3D model with a fixed topology, and these vertex coordinates can be displayed as facial animation on the DI4D PRO system.
[0004] (2) Speech drives the avatar through a generative adversarial network to produce different 2D images, which are views of a 3D model from different angles.
[0005] (3) Speech is segmented into phonemes, and each phoneme corresponds to an animation cl...