Speech synthesis method and system

A speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc., that addresses the difficulty of matching acoustic feature sequences to text and the long time delays of existing approaches, shortening synthesis time and improving efficiency.

Active Publication Date: 2019-06-07

BEIJING GUANGNIAN WUXIAN SCI & TECH

7 Cites, 25 Cited by

Problems solved by technology

The training text (or phonetic symbol sequence) contains no boundary information for the phonetic symbols, which makes it very difficult to determine the sequence correspondence between the acoustic features and the text (or phonetic symbols).

Moreover, the current acoustic training model can only generate acoustic features one time step at a time and cannot make full use of parallel computing, so its time delay is very long.
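
The contrast between step-by-step and parallel generation can be illustrated with a minimal sketch. This excerpt does not disclose how the patent's completely parallel attention mechanism works, so generic scaled dot-product attention is used here as a stand-in. The function names, the toy vectors, and the assumption that parallel queries are known up front (for example, derived from positions) are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def generate_sequential(first_query, keys, values, n_frames):
    """Step-by-step generation: each query is the previous output frame,
    so frame t cannot be computed before frame t-1 exists."""
    frames, query = [], first_query
    for _ in range(n_frames):
        frame = attend(query, keys, values)
        frames.append(frame)
        query = frame  # dependency that forces a serial loop
    return frames


def generate_parallel(queries, keys, values):
    """Parallel-friendly generation (assumption: all queries are known in
    advance). No frame depends on another, so the frames could be computed
    in any order or batched on parallel hardware."""
    return [attend(q, keys, values) for q in queries]


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # toy text encodings
    values = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]    # toy feature vectors
    queries = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # toy per-frame queries
    print(generate_sequential(queries[0], keys, values, 3))
    print(generate_parallel(queries, keys, values))
```

In the sequential version the loop body needs the previous frame, which is the source of the delay described above. In the parallel version every frame is an independent call, which is what allows the work to be spread across parallel hardware.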

Embodiment Construction

[0051] In order to make the object, technical solution and advantages of the present invention clearer, the embodiments of the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0052] For clarity, the following explanation is given before the embodiments are described:

[0053] The smart device mentioned in the present invention supports multi-modal human-computer interaction and has AI capabilities such as natural language understanding, visual perception, language and voice output, and emotional expression and action output. It can be configured with social attributes, personality attributes, character skills, and the like, so that users enjoy an intelligent, personalized, and smooth experience. In a specific embodiment, the smart device may be a story machine, a tablet, a watch, a picture book robot, a humanoid intelligent robot, or the like.

[0054] The smart device obtains the user's multi-modal data, and the server pe...

Abstract

The invention provides a speech synthesis method. The method comprises: acquiring and parsing interactive instructions and, when the interactive instructions include a speech broadcasting instruction, responding to that instruction; calling an acoustic model generated with a completely parallel attention mechanism and inputting the to-be-synthesized text data or phonetic symbol data into the acoustic model in real time to obtain the to-be-synthesized acoustic features; and inputting those acoustic features into an acoustic synthesizer, which produces the synthesized speech data for output. The invention provides an end-to-end acoustic model and a training mode that can fully exploit the advantages of parallel computing. By using the completely parallel attention mechanism together with a convolution structure, the time required to generate the acoustic features is greatly shortened compared with the prior art while their quality is maintained. Synthesis time is therefore shortened and the efficiency of speech synthesis improved without sacrificing the quality of the synthesized speech.
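
The three steps in the abstract (parse the instruction, run the acoustic model, run the acoustic synthesizer) can be sketched as a small pipeline. This is only an illustration of the control flow under stated assumptions: the `say:` prefix, the `Instruction` type, and the toy model and synthesizer are hypothetical stand-ins, not the patent's components.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Frame = List[float]  # one acoustic feature vector (toy representation)


@dataclass
class Instruction:
    kind: str        # hypothetical label, e.g. "speech_broadcast"
    text: str = ""   # to-be-synthesized text, if any


def parse_instruction(raw: str) -> Instruction:
    """Hypothetical parser: 'say: <text>' is treated as a speech
    broadcasting instruction; anything else is some other instruction."""
    prefix = "say:"
    stripped = raw.strip()
    if stripped.lower().startswith(prefix):
        return Instruction("speech_broadcast", stripped[len(prefix):].strip())
    return Instruction("other")


def toy_acoustic_model(text: str) -> List[Frame]:
    """Stand-in acoustic model: one 1-dimensional frame per character."""
    return [[ord(ch) / 255.0] for ch in text]


def toy_synthesizer(frames: List[Frame]) -> List[float]:
    """Stand-in acoustic synthesizer: repeats each frame as 4 samples."""
    return [frame[0] for frame in frames for _ in range(4)]


def handle(
    raw: str,
    acoustic_model: Callable[[str], List[Frame]] = toy_acoustic_model,
    synthesizer: Callable[[List[Frame]], List[float]] = toy_synthesizer,
) -> Optional[List[float]]:
    """Respond to a speech broadcasting instruction; ignore others."""
    instruction = parse_instruction(raw)
    if instruction.kind != "speech_broadcast":
        return None
    features = acoustic_model(instruction.text)  # text -> acoustic features
    return synthesizer(features)                 # features -> speech samples


if __name__ == "__main__":
    print(handle("say: hi"))
    print(handle("play music"))
```

Passing the model and synthesizer as parameters keeps the control flow separate from the components, so a real acoustic model and synthesizer could replace the toy stand-ins without changing `handle`.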

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, and in particular to a speech synthesis method and system.

Background

[0002] Speech synthesis, also known as text-to-speech (TTS) technology, can convert any text information into standard, fluent speech in real time. It involves many disciplines and technologies, such as acoustics, linguistics, digital signal processing, and computer science. The main problem to be solved is how to convert text information into audible sound information.

[0003] Currently, the data used for speech synthesis training generally consists of plain text (or phonetic symbols) and the corresponding audio. The text (or phonetic symbol sequence) contains no boundary information for the phonetic symbols, which makes it very difficult to determine the sequence correspondence between acoustic features and text (or phonetic symbols). Moreover, the current acoustic training model can only generate acoustic features ...

Application Information

IPC(8): G10L13/02, G10L13/08, G10L19/00, G10L25/30

Inventor 马达标, 陆羽皓

Owner BEIJING GUANGNIAN WUXIAN SCI & TECH
