Model training method, method for synthesizing speaking expression and related device

A model training and speaking-expression technology, applied in the field of data processing, which addresses the problems that a virtual object's expression changes are limited in style, appear unnatural, and give a poor sense of realism.

Active Publication Date: 2019-03-08
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0004] At present, which expression such a virtual object makes is determined mainly according to the pronunciation element currently being played. As a result, when the virtual voice is played, the virtual object's expression changes are limited in style and appear unnatural, the sense of realism is poor, and it is difficult to improve the user's immersion.




Embodiment Construction

[0048] In order to enable those skilled in the art to better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below in conjunction with the accompanying drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of this application, rather than all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative work shall fall within the protection scope of this application.

[0049] At present, when a virtual object interacts with a user, the speaking expression that the virtual object makes is determined mainly according to the pronunciation element currently being played. For example, a correspondence between pronunciation elements and expressions is established; in general, one pronunciation element corresponds to one speak...
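
To make the limitation described in [0049] concrete, the following is a minimal sketch of such a fixed pronunciation-element-to-expression lookup. The phoneme labels, expression labels, and function names are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of the prior-art approach outlined in [0049]:
# a fixed table maps each pronunciation element (here, a phoneme) to a
# single speaking expression, ignoring how long the element is pronounced.

PHONEME_TO_EXPRESSION = {
    "a": "mouth_open_wide",
    "o": "mouth_rounded",
    "m": "lips_closed",
}

def prior_art_expression_sequence(phonemes):
    """Return one expression label per played pronunciation element.

    Because the lookup is independent of duration and context, the same
    element always produces the same expression, so the expression changes
    are limited in style and can look unnatural.
    """
    return [PHONEME_TO_EXPRESSION.get(p, "neutral") for p in phonemes]

# Example: "a" yields the same expression no matter how long it is held.
print(prior_art_expression_sequence(["m", "a", "o", "a"]))
```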



Abstract

The embodiments of the invention disclose a model training method for synthesizing speaking expressions. Expression features, acoustic features, and text features are obtained from videos that contain a speaker's facial expressions while speaking, together with the corresponding speech. Because the acoustic features and the text features are obtained from the same video, the time interval and the duration of each pronunciation element identified by the text features can be determined from the acoustic features. A first correspondence is determined from the expression features together with the time intervals and durations of the pronunciation elements identified by the text features, and an expression model is trained according to this first correspondence. The expression model can determine different sub-expression features for the same pronunciation element when it appears with different durations in the text features, which enriches the variation patterns of the speaking expressions. Speaking expressions generated from the target expression features determined by the expression model therefore vary differently for the same pronunciation element, which alleviates, to a certain degree, the problem of unnaturally abrupt changes in the speaking expressions.
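
As an illustration of how the first correspondence described above could be assembled, the sketch below pairs each pronunciation element and its duration (obtained from time-aligned acoustic features) with the expression-feature frames that fall inside its time interval. The data structures, names, and fixed frame rate are assumptions for illustration only, not the patent's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AlignedElement:
    """A pronunciation element identified by the text features, with the
    time interval determined from the acoustic features of the same video."""
    phoneme: str
    start: float  # interval start in seconds
    end: float    # interval end in seconds

    @property
    def duration(self) -> float:
        return self.end - self.start

def build_first_correspondence(
    elements: List[AlignedElement],
    expression_frames: List[List[float]],
    frame_rate: float = 30.0,
) -> List[Tuple[Tuple[str, float], List[List[float]]]]:
    """Pair (pronunciation element, duration) with the expression-feature
    frames inside the element's time interval.

    The same phoneme with different durations is paired with different
    sub-expression feature sequences, which is the training signal that lets
    the expression model produce duration-dependent expressions.
    """
    correspondence = []
    for elem in elements:
        first = int(elem.start * frame_rate)
        last = int(elem.end * frame_rate)
        sub_expression = expression_frames[first:last]
        correspondence.append(((elem.phoneme, elem.duration), sub_expression))
    return correspondence
```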

Description

Technical field

[0001] This application relates to the field of data processing, and in particular to a model training method for synthesizing speaking expressions, a method for synthesizing speaking expressions, and related devices.

Background technique

[0002] With the development of computer technology, human-computer interaction has become more common, but it is mostly pure voice interaction. For example, an interactive device can determine reply content according to text or voice input by the user and play a virtual voice synthesized from that reply content.

[0003] The user immersion brought by this type of human-computer interaction can hardly meet current interaction needs. To improve user immersion, virtual objects capable of changing expressions, such as changing mouth shapes, have emerged as objects that interact with the user. Such virtual objects can be avatars such as cartoon characters or virtual humans. When interacting with users, in...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06N3/00
CPC: G06N3/006
Inventor: Li Guangzhi, Tuo Deyi, Kang Shiyin
Owner: TENCENT TECH (SHENZHEN) CO LTD