Speech synthesis method and device, equipment and storage medium

A speech synthesis technique, applicable to speech synthesis and speech analysis, that addresses two problems: the monotonous emotional expression of synthesized speech and the inability to control speaking style independently.

Pending Publication Date: 2021-05-11

PING AN TECH (SHENZHEN) CO LTD

0 Cites, 13 Cited by

Problems solved by technology

[0003] The main purpose of this application is to provide a speech synthesis method, device, computer equipment and computer-readable storage medium that address two problems in the existing technology: speaking style cannot be controlled separately, and the emotional expression of synthesized speech is monotonous.

Embodiment Construction

[0026] The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments that persons of ordinary skill in the art obtain from these embodiments without creative effort fall within the scope of protection of this application.

[0027] The flow charts shown in the drawings are illustrative only. They need not include all contents and operations/steps, and the steps need not be performed in the order described. For example, some operations/steps can be decomposed, combined or partly combined, so the actual order of execution may vary with the situation.

[0028] Embodiments of the present application provide a speech synthesis method, device, computer equipment, and co...

Abstract

The invention relates to the technical field of artificial intelligence and discloses a speech synthesis method and device, computer equipment and a computer-readable storage medium. The method comprises the following steps: obtaining a to-be-processed text and a style audio for the speech to be synthesized, and inputting both into a preset speech synthesis model; encoding the style audio with the multi-reference encoder to obtain style embedding vector information; encoding the to-be-processed text with the text encoder to obtain text encoding vector information; splicing the style embedding vector information and the text encoding vector information through the fully connected layer to generate a Mel spectrogram; and performing feature extraction on the Mel spectrogram through the output layer to output a target audio for the to-be-processed text. The speaking style of the synthesized voice can thereby be controlled, and speech with richer emotional expression can be synthesized.
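The data flow described in the abstract can be sketched as follows. This is a minimal toy illustration, not the patent's implementation: the dimensions, the random weights standing in for trained layers, the summary-statistic style encoder, the bit-pattern text encoder, and every function name (`style_encoder`, `text_encoder`, `synthesize`) are assumptions chosen only to make the four stages visible.

```python
import math
import random

random.seed(0)  # deterministic toy weights

# Hypothetical sizes, chosen only for illustration.
STYLE_DIM, TEXT_DIM, N_MELS, HOP = 4, 8, 6, 3


def linear(in_dim, out_dim):
    """Random weight matrix (out_dim x in_dim) standing in for a trained layer."""
    return [[random.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)]


def apply(weights, vec):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


W_STYLE = linear(2, STYLE_DIM)
W_FC = linear(TEXT_DIM + STYLE_DIM, N_MELS)
W_OUT = linear(N_MELS, HOP)


def style_encoder(reference_audio):
    """Toy stand-in for the multi-reference encoder: waveform -> style embedding."""
    n = len(reference_audio)
    mean = sum(reference_audio) / n
    energy = math.sqrt(sum(x * x for x in reference_audio) / n)
    return apply(W_STYLE, [mean, energy])


def text_encoder(text):
    """Toy stand-in for the text encoder: one bit-pattern vector per character."""
    return [[float((ord(ch) >> bit) & 1) for bit in range(TEXT_DIM)]
            for ch in text]


def synthesize(text, reference_audio):
    style = style_encoder(reference_audio)
    text_vectors = text_encoder(text)
    # Splice the style embedding onto each text vector, then map the
    # result to one Mel-spectrogram frame through the fully connected layer.
    mel = [apply(W_FC, tv + style) for tv in text_vectors]
    # Output layer: each Mel frame becomes HOP waveform samples.
    audio = [s for frame in mel for s in apply(W_OUT, frame)]
    return mel, audio


# Same text, two different style references.
calm = [0.01 * math.sin(0.1 * n) for n in range(200)]
excited = [0.8 * math.sin(0.5 * n) for n in range(200)]
mel_a, audio_a = synthesize("hello", calm)
mel_b, audio_b = synthesize("hello", excited)
```

In this sketch the text is identical in both calls, so any difference between `mel_a` and `mel_b` comes only from the style reference, which is the kind of style control the abstract describes.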

Description

Technical field

[0001] The present application relates to the technical field of speech semantics, and in particular to a speech synthesis method, device, computer equipment and computer-readable storage medium.

Background

[0002] Speech synthesis must account not only for the clarity and fluency of the synthesized speech but also for its prosodic information, so that the synthesized speech carries rich emotional expression. Beyond producing smooth sentences, synthesis should also be able to vary the emotional state of the speaker, with the model learning style information from a reference audio so that the result approaches the level of a human voice. In current prosody modeling, the common approach is to fold all speaking styles into a single representation. Because the speaking styles cannot be separated, they cannot be controlled separately, and the emotional expressi...
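To make the idea of separable speaking styles concrete, the sketch below keeps one sub-embedding per style dimension and concatenates them, so that swapping the reference for one dimension leaves the others untouched. It is an illustration under stated assumptions, not the encoder disclosed in the application: the style dimension names (`emotion`, `speaker`), the amplitude-based `sub_encoder`, and the function `multi_reference_embedding` are all hypothetical.

```python
import math


def sub_encoder(reference_audio, dim):
    """Toy encoder: fold a waveform into `dim` mean absolute amplitudes."""
    chunk = max(1, len(reference_audio) // dim)
    return [
        sum(abs(x) for x in reference_audio[i * chunk:(i + 1) * chunk]) / chunk
        for i in range(dim)
    ]


def multi_reference_embedding(references, dim=3):
    """Concatenate one sub-embedding per style dimension.

    `references` maps a (hypothetical) style dimension name to its
    reference audio; dimensions are laid out in sorted name order.
    """
    embedding = []
    for name in sorted(references):
        embedding.extend(sub_encoder(references[name], dim))
    return embedding


# Hypothetical reference waveforms.
neutral = [0.1 * math.sin(0.2 * n) for n in range(60)]
happy = [0.9 * math.sin(0.6 * n) for n in range(60)]
speaker = [0.4 * math.sin(0.3 * n) for n in range(60)]

base = multi_reference_embedding({"emotion": neutral, "speaker": speaker})
swapped = multi_reference_embedding({"emotion": happy, "speaker": speaker})
```

Here `base` and `swapped` differ only in their first three components (the `emotion` slot), while the `speaker` slot is unchanged. A single entangled style vector would not offer that kind of per-dimension control.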

Application Information

IPC(8): G10L13/047, G10L13/10, G10L25/18, G10L25/30

CPC: G10L13/10, G10L13/047, G10L25/30, G10L25/18

Inventor: 孙奥兰, 王健宗, 程宁

Owner: PING AN TECH (SHENZHEN) CO LTD
