Speech synthesis method and device, electronic equipment and storage medium

A speech synthesis technology, applied in the field of artificial intelligence, that addresses the memory attenuation of RNN language models, their inability to learn long-distance dependencies between words, and the resulting failure to express the accurate and true semantics of the input text, thereby improving accuracy.

Active Publication Date: 2020-02-11

TENCENT TECH (SHENZHEN) CO LTD

5 Cites, 39 Cited by

Problems solved by technology

[0004] However, the RNN language model suffers from memory attenuation and cannot learn the long-distance dependencies between words in a long input text. As a result, the vector sequence generated by the RNN language model cannot represent the accurate and true semantics of the input text, which reduces the accuracy of subsequent prosodic structure prediction and pronunciation prediction and degrades the final synthesized speech.
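The contrast behind paragraph [0004] can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the patent's model: `rnn_influence` assumes a hypothetical scalar linear recurrence h_t = w * h_{t-1} + x_t, and `attention_weights` is a plain scaled dot-product softmax with no learned parameters or positional encoding. Both function names and all numbers are illustrative.

```python
import math


def rnn_influence(distance, recurrent_weight=0.9):
    """Influence of an input `distance` steps back on the current state of a
    toy scalar linear RNN h_t = w * h_{t-1} + x_t (hypothetical model).
    With |w| < 1 the influence shrinks geometrically with distance."""
    return recurrent_weight ** distance


def attention_weights(query, keys):
    """Softmax over scaled dot products between one query and every key.
    The weight of a key depends on its content, not on how far away it is."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Influence of an earlier word on the RNN state fades as the gap grows.
for distance in (1, 10, 50):
    print(distance, rnn_influence(distance))

# One relevant key among 50, placed far from or next to the querying position.
relevant, filler = [1.0, 0.0], [0.0, 1.0]
far = attention_weights([4.0, 0.0], [relevant] + [filler] * 49)
near = attention_weights([4.0, 0.0], [filler] * 49 + [relevant])
print(far[0], near[-1])  # the relevant key gets the same weight either way
```

In this sketch the attention weight of the relevant word is unchanged by its distance from the query, whereas the recurrent influence at a gap of 50 steps is a small fraction of its value at a gap of 1. Real attention-based models add positional information, so this only shows that distance does not inherently attenuate the weight.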

Embodiment Construction

[0052] In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the drawings in the embodiments of the present application.

[0053] For ease of understanding, the terms used in the embodiments of the present application are explained below:

[0054] Artificial Intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the nature of intelligence and produce a new kind of intelligent machine that can respond in a similar way to hum...

Abstract

The invention discloses a speech synthesis method and device, electronic equipment, and a storage medium. It relates to artificial intelligence technology and uses machine learning within artificial intelligence to carry out speech synthesis. The method comprises the following steps: obtaining a word segmentation sequence corresponding to an input text; determining a vector sequence corresponding to the word segmentation sequence by using a trained attention-mechanism-based language representation model; performing prosodic structure prediction on the vector sequence by using a prosodic structure model to determine prosodic structure information, which comprises the pause duration and pronunciation intensity, in the synthesized speech, of the word segmentation fragment corresponding to each feature vector in the vector sequence; and synthesizing speech corresponding to the input text according to the prosodic structure information and the pronunciation corresponding to each word segmentation fragment in the word segmentation sequence. Because the vector sequence obtained by the language representation model can express the accurate and true semantics of the input text, the synthesized speech is more natural.
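The steps listed in the abstract can be sketched as a small pipeline. This is an illustrative sketch under loud assumptions, not the patented implementation: every name here (`segment`, `embed`, `self_attention`, `predict_prosody`, `plan_speech`, `Prosody`) is hypothetical, whitespace splitting stands in for word segmentation, a parameter-free single-head self-attention stands in for the trained language representation model, a fixed rule stands in for the trained prosodic structure model, and lowercasing stands in for pronunciation prediction. It returns a per-segment plan rather than audio.

```python
import math
from dataclasses import dataclass


@dataclass
class Prosody:
    pause_ms: int     # pause after the segment, in milliseconds
    intensity: float  # relative pronunciation intensity in [0, 1]


def segment(text):
    """Stand-in for word segmentation: split on whitespace."""
    return text.split()


def embed(token, dim=4):
    """Deterministic toy embedding with components in [-1, 1]."""
    seed = sum(ord(ch) for ch in token)
    return [math.sin(seed * (i + 1)) for i in range(dim)]


def self_attention(vectors):
    """Single-head dot-product self-attention without learned weights.
    Each output vector is a softmax-weighted mix of all input vectors."""
    scale = math.sqrt(len(vectors[0]))
    outputs = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([sum(w * vec[d] for w, vec in zip(weights, vectors))
                        for d in range(len(query))])
    return outputs


def predict_prosody(vector):
    """Toy rule standing in for a trained prosodic structure model."""
    return Prosody(pause_ms=int(200 * abs(vector[0])),
                   intensity=abs(vector[1]))


def plan_speech(text):
    """Return (segment, placeholder pronunciation, prosody) per segment."""
    segments = segment(text)
    if not segments:
        return []
    contextual = self_attention([embed(s) for s in segments])
    return [(s, s.lower(), predict_prosody(v))
            for s, v in zip(segments, contextual)]


for item in plan_speech("Speech synthesis should sound natural"):
    print(item)
```

A real system would replace each stand-in with a trained component and pass the resulting plan to an acoustic model or vocoder; the sketch only shows how the four stages hand data to one another.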

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, and in particular to a speech synthesis method, device, electronic equipment, and storage medium.

Background

[0002] Speech synthesis technology (Text-to-Speech, TTS) converts text input into speech output. It draws on multiple fields, including acoustics, linguistics, digital signal processing, and multimedia technology, and is widely used in human-computer interaction.

[0003] At present, a vector sequence that represents the contextual semantics of the input text is generally generated with an RNN (Recurrent Neural Network) language model. Prosodic structure prediction and pronunciation prediction are then performed on the vector sequence, and speech is synthesized from the prosodic structure prediction result and the pronunciation prediction result.

[0004] However, the RNN langua...

Application Information

IPC(8): G10L13/08, G10L13/10, G10L15/18, G06F40/284, G06F40/58

CPC: G10L13/08, G10L13/10, G10L15/1807, G10L15/1822, Y02T10/40

Inventor: 杨兵, 陈凌辉, 钟佳琪

Owner: TENCENT TECH (SHENZHEN) CO LTD
