Speech synthesis method and device, electronic equipment and storage medium

A speech synthesis technology in the field of artificial intelligence. It addresses memory attenuation, the inability to learn long-distance dependencies between words, and the failure to express the accurate and true semantics of the input text, and achieves the effect of improving accuracy.

Active Publication Date: 2020-02-11
TENCENT TECH (SHENZHEN) CO LTD
5 Cites, 39 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, the RNN language model suffers from memory attenuation and cannot learn the long-distance dependencies between words in a long input text, with the result that the vector sequence generated based on



Examples


Embodiment Construction

[0052] To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions are described clearly and completely below with reference to the drawings in the embodiments of the present application.

[0053] For ease of understanding, the terms used in the embodiments of the present application are explained below:

[0054] Artificial Intelligence (AI) is the theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the nature of intelligence and produce a new kind of intelligent machine that can respond in a similar way to hum...


Abstract

The invention discloses a speech synthesis method and device, electronic equipment, and a storage medium. It relates to artificial intelligence technology and uses machine learning to carry out speech synthesis. The method comprises the following steps: obtaining the word segmentation sequence corresponding to an input text; determining the vector sequence corresponding to the word segmentation sequence using a trained attention-based language representation model; performing prosodic structure prediction on the vector sequence using a prosodic structure model to determine prosodic structure information, which comprises, for the word segment corresponding to each feature vector in the vector sequence, its pause duration and pronunciation intensity in the synthesized speech; and synthesizing the speech corresponding to the input text according to the prosodic structure information and the pronunciation of each word segment in the word segmentation sequence. Because the vector sequence obtained by the language representation model can express the accurate and true semantics of the input text, the synthesized speech is more natural.
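The data flow of the four steps in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the whitespace segmenter, the toy `embed` function, the single unparameterized self-attention layer standing in for the trained language representation model, and the hand-written mapping from feature vectors to pause duration and intensity standing in for the prosodic structure model.

```python
import math
from dataclasses import dataclass


def segment(text):
    # Assumption: whitespace splitting stands in for real word segmentation.
    return text.split()


def embed(token, dim=4):
    # Toy deterministic embedding built from character codes (illustrative only).
    code = sum(map(ord, token))
    return [math.sin((i + 1) * code) for i in range(dim)]


def self_attention(vectors):
    # Scaled dot-product self-attention with identity projections:
    # every output vector is a softmax-weighted mix of ALL input vectors.
    if not vectors:
        return []
    scale = math.sqrt(len(vectors[0]))
    outputs = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in vectors]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([
            sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(len(query))
        ])
    return outputs


@dataclass
class Prosody:
    word: str
    pause_ms: float   # pause duration associated with this word segment
    intensity: float  # relative pronunciation intensity


def predict_prosody(words, vectors):
    # Hypothetical stand-in for a trained prosodic structure model:
    # maps each feature vector to a pause duration and an intensity.
    result = []
    for word, vec in zip(words, vectors):
        mean = sum(vec) / len(vec)               # lies in [-1, 1] for this toy embedding
        pause_ms = 50.0 + 100.0 * (mean + 1.0) / 2.0
        intensity = 0.5 + 0.5 * abs(mean)
        result.append(Prosody(word, pause_ms, intensity))
    return result


def plan_speech(text):
    # Step 1: word segmentation; step 2: vector sequence;
    # step 3: prosodic structure information per word segment.
    words = segment(text)
    vectors = self_attention([embed(w) for w in words])
    return predict_prosody(words, vectors)


if __name__ == "__main__":
    for item in plan_speech("speech synthesis sounds natural"):
        print(item)
```

The returned list pairs each word segment with its prosodic information; a real system would pass this, together with the predicted pronunciation of each segment, to an acoustic back end that produces the waveform, which this sketch leaves out.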

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, in particular to a speech synthesis method, device, electronic equipment, and storage medium.

Background

[0002] Speech synthesis (Text-to-Speech, TTS) is a technology that converts text input into speech output. It involves multiple fields, such as acoustics, linguistics, digital signal processing, and multimedia technology, and has been widely used in human-computer interaction.

[0003] At present, an RNN (Recurrent Neural Network) language model is generally used to generate a vector sequence that represents the contextual semantics of the input text. Prosodic structure prediction and pronunciation prediction are then performed on the vector sequence, and speech is synthesized from the prosodic structure prediction result and the pronunciation prediction result.

[0004] However, the RNN langua...
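The contrast between memory attenuation in a recurrent model and an attention mechanism can be made concrete with a small Python sketch. It rests on simplifying assumptions that are not taken from the patent: the recurrent model is reduced to the linear recurrence h_t = decay * h_{t-1} + x_t, the `decay` value of 0.9 is arbitrary, and the attention uses plain scaled dot-product scores without the positional encodings that real attention-based models add.

```python
import math


def recurrent_influence(distance, decay=0.9):
    # In h_t = decay * h_{t-1} + x_t, the token `distance` steps back
    # contributes to the current state with weight decay ** distance.
    return decay ** distance


def attention_weights(query, keys):
    # Scaled dot-product attention weights: softmax over query-key scores.
    # The score of a key depends on its content, not on its position.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    query = [1.0, 0.0]
    match = [4.0, 0.0]            # a word strongly related to the query
    fillers = [[0.0, 1.0]] * 49   # unrelated words in between

    far = attention_weights(query, [match] + fillers)[0]     # related word 49 positions away
    near = attention_weights(query, fillers + [match])[-1]   # related word adjacent
    print("recurrent weight at distance 1: ", recurrent_influence(1))
    print("recurrent weight at distance 49:", recurrent_influence(49))
    print("attention weight, far vs near:  ", far, near)
```

In this toy setting the recurrent contribution of a word shrinks geometrically with distance, while the attention weight on the related word is the same whether it is adjacent or 49 positions away.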


Application Information

IPC(8): G10L13/08, G10L13/10, G10L15/18, G06F40/284, G06F40/58
CPC: G10L13/08, G10L13/10, G10L15/1807, G10L15/1822, Y02T10/40
Inventor: 杨兵, 陈凌辉, 钟佳琪
Owner: TENCENT TECH (SHENZHEN) CO LTD