Voice synthesis method, model training method and device, and computer equipment

A speech synthesis and model training technology, applied to speech synthesis methods, model training methods, devices, and computer equipment. It addresses problems that degrade synthesized speech, with the stated effects of improving quality, reducing complexity, and improving accuracy.

Active Publication Date: 2018-12-18

TENCENT TECH (SHENZHEN) CO LTD

3 Cites, 49 Cited by


Problems solved by technology

[0004] Based on this, it is necessary to provide a speech synthesis method, a model training method, devices, and computer equipment to address the technical problem that semantic features in the logarithmic mel spectrum affect the quality of synthesized speech.



Embodiment Construction

[0053] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0054] Figure 1 is an application environment diagram of the speech synthesis method and the model training method in an embodiment. Referring to Figure 1, the speech synthesis method and the model training method are applied to a speech synthesis system. The speech synthesis system includes a first encoder, a first decoder, a second encoder, a second decoder, a stacker, a residual model, a projection layer, and the like. The internal relationships and signal flow among the components of the speech synthesis system are shown in Figure 1. Wherein, the fir...


Abstract

The invention relates to a voice synthesis method, a model training method and device, and computer equipment. The method comprises the following steps: acquiring linguistic data to be processed; encoding the linguistic data to obtain linguistic coded data; acquiring an embedded vector used for voice feature conversion, wherein the embedded vector is generated from the residual between the reference synthesized voice data and the reference voice data corresponding to the same reference linguistic data; and decoding the linguistic coded data according to the embedded vector to obtain target synthesized voice data that has undergone voice feature conversion. The scheme provided by the invention avoids the problem of semantic features in the logarithmic mel spectrum affecting the quality of the synthesized voice.
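The steps in the abstract can be illustrated with a toy sketch. This is not the patented implementation: every function name here is hypothetical, the "models" are simple arithmetic stand-ins for the encoders, decoders, and residual model, and the residual direction (reference voice minus reference synthesized voice) is an assumption, since the abstract does not state it.

```python
from typing import List, Sequence

Frame = List[float]  # one feature frame, e.g. one spectrum slice


def residual(reference_synth: Sequence[Frame],
             reference_real: Sequence[Frame]) -> List[Frame]:
    """Frame-wise difference between real and synthesized reference features.

    Assumption: residual = real - synthesized.
    """
    if len(reference_synth) != len(reference_real):
        raise ValueError("reference sequences must have the same number of frames")
    return [[r - s for r, s in zip(real, synth)]
            for real, synth in zip(reference_real, reference_synth)]


def embed(residual_frames: Sequence[Frame]) -> Frame:
    """Toy stand-in for the residual model: mean-pool over time into one vector."""
    n = len(residual_frames)
    dim = len(residual_frames[0])
    return [sum(f[d] for f in residual_frames) / n for d in range(dim)]


def encode(linguistic_data: str) -> List[Frame]:
    """Toy encoder: one 2-dimensional frame per character (hypothetical)."""
    return [[ord(c) / 128.0, 1.0] for c in linguistic_data]


def decode(encoded: Sequence[Frame], embedding: Frame) -> List[Frame]:
    """Toy decoder: condition every encoded frame on the embedding by addition."""
    return [[e + v for e, v in zip(frame, embedding)] for frame in encoded]


# Reference pair for the same reference linguistic data (made-up numbers).
ref_synth = [[0.0, 0.0], [1.0, 1.0]]
ref_real = [[0.5, 0.0], [1.5, 2.0]]

embedding = embed(residual(ref_synth, ref_real))
target = decode(encode("hi"), embedding)
```

The point of the sketch is the data flow: the embedding is derived only from the difference between two renderings of the same reference linguistic data, so content shared by both renderings cancels out before the embedding is formed.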

Description

Technical field

[0001] The present application relates to the technical field of speech synthesis, and in particular to a speech synthesis method, a model training method, a device, and computer equipment.

Background

[0002] With the continuous development of speech synthesis technology and computer technology, the application scenarios of voice interaction are becoming increasingly broad. Users can easily obtain various voice-related services through digital products, such as voice navigation through electronic maps on mobile phones or listening to audio novels through reading software.

[0003] If synthesized speech has human voice characteristics, it will undoubtedly improve the user experience. To give synthesized speech human voice characteristics, the usual method is to use the logarithmic mel spectrum obtained by processing speech data as the input variable of a feature model to obtain the speech characteristics of the speaker, and then...
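For readers unfamiliar with the logarithmic mel spectrum mentioned in [0003], the following is a generic sketch of how one frame of such a feature can be computed. It is an illustration only, not a procedure taken from the application: the HTK-style mel formula, the naive DFT, the choice of 8 mel bands, and all function names are assumptions made for this example.

```python
import cmath
import math
from typing import List


def hz_to_mel(hz: float) -> float:
    """HTK-style mel scale (an assumed convention)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame: List[float]) -> List[float]:
    """Naive DFT power spectrum for bins 0..N/2 (adequate for a short demo frame)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_mels: int, n_fft: int, sample_rate: int) -> List[List[float]]:
    """Triangular filters whose centers are equally spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies in Hz: lower edge, centers, upper edge.
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def log_mel_frame(frame: List[float], sample_rate: int,
                  n_mels: int = 8, floor: float = 1e-10) -> List[float]:
    """Log of mel-band energies for one frame; `floor` avoids log(0)."""
    power = power_spectrum(frame)
    bank = mel_filterbank(n_mels, len(frame), sample_rate)
    return [math.log(max(floor, sum(w * p for w, p in zip(row, power))))
            for row in bank]


# A 64-sample frame of a 1 kHz sine sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
features = log_mel_frame(frame, sample_rate)
```

Because such a spectrum is computed directly from the waveform, it carries information about what is being said as well as how it is said, which is the concern the application raises about using it as the input to a feature model.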


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L13/04, G10L13/047, G10L19/04, G10L19/16

CPC: G10L13/04, G10L13/047, G10L19/04, G10L19/16, G10L13/033, G06N3/08, G06N3/044, G06N3/045, G10L13/00, G10L19/02

Inventor: 吴锡欣, 王木, 康世胤, 苏丹, 俞栋

Owner: TENCENT TECH (SHENZHEN) CO LTD
