Audio data generating method and system for voice synthesis

A technology for generating audio data for speech synthesis, applied in speech synthesis, speech analysis, instruments, etc. It addresses the problem of TTS delay compressing the time available to other processing steps: it reduces computing delay and improves synthesis speed while ensuring the accuracy of acoustic feature prediction.

Active Publication Date: 2018-12-18

BEIJING GUANGNIAN WUXIAN SCI & TECH

6 Cites, 16 Cited by


AI Technical Summary

Problems solved by technology

However, at present, the delay of a large number of high-naturalness TTS interfaces is above 150ms, which severely compresses the processing time available to the other two steps (ASR, NLP) and thus limits the complexity and accuracy of information processing in those steps. To improve the human-computer interaction experience, it is necessary to increase the speed of TTS (speech synthesis).


Embodiment Construction

[0035] In order to make the object, technical solution and advantages of the present invention clearer, the embodiments of the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0036] Figure 1 shows a flowchart of a method for generating audio data for speech synthesis according to an embodiment of the present invention.

[0037] As shown in Figure 1, in step S101, text features in the text data are extracted to obtain text feature data. In one embodiment of the present invention, the text features include one or a combination of: phonetic symbols, intonation, sentence segmentation or prosodic marking, syntactic dependency tree, word segmentation marking, part-of-speech tagging, semantic weight, and word vector.

[0038] In addition, the text feature data may be obtained using a natural language processing (NLP) algorithm. Natural language processing algorithms can perform word segm...
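As a rough illustration of what the text feature data of step S101 might look like, the following Python sketch bundles a few of the listed feature types per token. It is not the patent's implementation: the `TokenFeatures` container, the `extract_text_features` function, the `TOY_LEXICON` table, and the choice of break levels are all hypothetical names and values introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TokenFeatures:
    """Per-token text features (a hypothetical subset of those in [0037])."""
    token: str
    phonetic_symbols: List[str]
    part_of_speech: str
    prosodic_break: int  # assumed scale: 0 = no break, 2 = sentence-final break
    word_vector: List[float] = field(default_factory=list)


# Hypothetical toy lexicon; a real system would use an NLP front end instead.
TOY_LEXICON: Dict[str, Dict] = {
    "hello": {"phones": ["HH", "AH", "L", "OW"], "pos": "INTJ"},
    "world": {"phones": ["W", "ER", "L", "D"], "pos": "NOUN"},
}


def extract_text_features(text: str) -> List[TokenFeatures]:
    """Split on whitespace and look up each token in the toy lexicon."""
    tokens = text.lower().split()
    features = []
    for i, tok in enumerate(tokens):
        entry = TOY_LEXICON.get(tok, {"phones": [], "pos": "UNK"})
        is_last = i == len(tokens) - 1
        features.append(
            TokenFeatures(
                token=tok,
                phonetic_symbols=list(entry["phones"]),
                part_of_speech=entry["pos"],
                prosodic_break=2 if is_last else 0,
            )
        )
    return features


feats = extract_text_features("Hello world")
for f in feats:
    print(f.token, f.phonetic_symbols, f.part_of_speech, f.prosodic_break)
```

Whitespace splitting stands in for word segmentation here only to keep the sketch short; the source attributes that task to an NLP algorithm.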


Abstract

The invention provides an audio data generating method for voice synthesis. The method comprises the following steps: extracting text features from text data to obtain text feature data; performing accelerated conversion processing on the text feature data through a neural network structure, converting the text feature data into acoustic feature data; and performing sound synthesis or selective splicing according to the acoustic feature data to obtain audio data. By adopting a special deconvolution structure, the invention can achieve a good voice synthesis effect without any autoregressive structure and with few parameters. The neural network structure reduces calculation delay while guaranteeing the prediction precision of the acoustic features, reduces the requirement for computing resources, increases concurrency, increases voice synthesis speed, and contributes to improving the human-computer interaction experience.
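The abstract does not disclose the network's layers, so the following is only a minimal sketch of the general operation that a deconvolution (literally "anti-convolution") structure usually denotes in deep learning, namely a 1-D transposed convolution that upsamples a short input sequence into a longer output sequence. The function name `conv_transpose1d` and the example kernel and stride are assumptions for illustration. Every output frame depends only on the inputs, never on previously generated outputs, which is what makes the operation non-autoregressive.

```python
from typing import List


def conv_transpose1d(x: List[float], kernel: List[float], stride: int) -> List[float]:
    """1-D transposed convolution without padding.

    Each input value x[i] scatters a scaled copy of the kernel into the
    output starting at position i * stride. Output length is
    (len(x) - 1) * stride + len(kernel).
    """
    if not x or not kernel:
        return []
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            # No term here reads from `out`, so all frames could be
            # computed in parallel: there is no autoregressive dependency.
            out[i * stride + j] += xi * kj
    return out


# Two input frames upsampled by a factor of 2 with a box kernel.
print(conv_transpose1d([1.0, 2.0], [1.0, 1.0], stride=2))
```

Because no output feeds back into the computation, the whole sequence can be produced in one pass, which is the property the abstract relies on to reduce calculation delay.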

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, in particular to a method and system for generating audio data for speech synthesis.

Background technique

[0002] For a voice-based real-time human-computer interaction system, the time from the end of the user's utterance to the start of the machine's voice reply is called the "response time". For most voice human-computer interaction systems, information processing passes through three steps in sequence: ASR, NLP, and TTS. To achieve the optimal human-computer interaction experience, the total time of these three steps should be around 600ms. However, at present, the delay of a large number of high-naturalness TTS interfaces is above 150ms, which severely compresses the processing time of the other two steps (ASR, NLP), and thus limits the complexity and accuracy of information processing ...
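The latency argument in paragraph [0002] is simple arithmetic, sketched below in Python. The 600ms target and the 150ms TTS figure come from the text; the function name `remaining_budget_ms` and the 50ms comparison value are hypothetical and used only to show how a faster TTS step frees time for ASR and NLP.

```python
RESPONSE_BUDGET_MS = 600.0  # target total response time stated in [0002]


def remaining_budget_ms(tts_latency_ms: float,
                        total_budget_ms: float = RESPONSE_BUDGET_MS) -> float:
    """Time left for ASR and NLP combined once TTS latency is subtracted."""
    return max(0.0, total_budget_ms - tts_latency_ms)


# With a 150 ms TTS delay, ASR and NLP share at most 450 ms.
print(remaining_budget_ms(150.0))
# A hypothetical 50 ms TTS delay would leave them 550 ms.
print(remaining_budget_ms(50.0))
```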


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L13/02

CPC: G10L13/02, G10L2013/021

Inventor: 陆羽皓, 马达标

Owner: BEIJING GUANGNIAN WUXIAN SCI & TECH
