Voice database structure compression used for embedded voice synthesis system and use method thereof

A speech synthesis technology for embedded systems, applicable to speech synthesis, speech analysis, speech recognition, and related fields. It addresses the high cost of existing systems, their difficulty in meeting user needs, and the degradation of naturalness and sound quality in synthesized speech.

Inactive Publication Date: 2011-09-28

北京宇音天下科技有限公司

6 Cites, 4 Cited by


AI Technical Summary

Problems solved by technology

With tighter control over clustering, the naturalness and quality of the synthesized speech degrade significantly.

The above system is still too expensive for devices with limited resources and has difficulty meeting users' needs.



Embodiment Construction

[0044] As shown in Figure 1, in one embodiment of the present invention the speech synthesis system is deployed on an embedded operating system. The embedded speech synthesis system comprises a model training part (11), a text input module (103), a speech synthesis part (102), and a voice signal output module (104).

[0045] The model training part (11) is used only offline, and only to generate the compressed model library (12) that the speech synthesis system needs in order to work. The training speech corpus (7) contains the original recorded speech.

[0046] The offline process of generating the compressed model library (12) from the training speech corpus (7) includes an HMM model training step (8), a model structured compression step (9), and a model secondary compression step (90).
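The three offline stages named in [0046] can be pictured with a toy Python sketch. The excerpt does not say how structured compression or secondary compression actually work, so the two techniques used here (storing each distinct parameter vector once and referring to it by index, then uniform 8-bit quantization) are stand-ins chosen purely for illustration, not the patent's method. HMM training is replaced by a stub that returns made-up numbers, and every name (`train_models`, `structured_compress`, `secondary_compress`, `dequantize`) is hypothetical.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def train_models() -> Dict[str, List[Vector]]:
    """Stub for step (8): made-up per-syllable state vectors, not real HMMs."""
    return {
        "ma": [(0.10, 0.50), (0.30, 0.70)],
        "na": [(0.10, 0.50), (0.90, 0.20)],  # first state equals that of "ma"
    }


def structured_compress(models: Dict[str, List[Vector]]):
    """Assumed stand-in for step (9): keep one copy of each distinct vector."""
    codebook: List[Vector] = []
    index_of: Dict[Vector, int] = {}
    layout: Dict[str, List[int]] = {}
    for syllable, states in models.items():
        ids = []
        for vec in states:
            if vec not in index_of:
                index_of[vec] = len(codebook)
                codebook.append(vec)
            ids.append(index_of[vec])
        layout[syllable] = ids
    return codebook, layout


def secondary_compress(codebook: List[Vector], lo: float = 0.0, hi: float = 1.0) -> List[bytes]:
    """Assumed stand-in for step (90): uniform 8-bit quantization over [lo, hi]."""
    scale = 255.0 / (hi - lo)
    return [
        bytes(min(255, max(0, round((x - lo) * scale))) for x in vec)
        for vec in codebook
    ]


def dequantize(packed: bytes, lo: float = 0.0, hi: float = 1.0) -> Vector:
    """Recover approximate float values from one quantized vector."""
    step = (hi - lo) / 255.0
    return tuple(lo + b * step for b in packed)


models = train_models()
codebook, layout = structured_compress(models)
packed = secondary_compress(codebook)
```

In this toy data the two syllables share one state vector, so the codebook holds three entries instead of four, and each entry shrinks to one byte per value after quantization. A real system would trade that size reduction against the loss of precision introduced by `dequantize`.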

[0047] In step (8), the speech recognition toolkit HTK is first used to automatically segment the recorded or...


Abstract

The invention discloses a structured compression of a voice database for an embedded speech synthesis system, together with a method of using it. It runs on an embedded system and can convert any received text into the corresponding speech and output it. Chinese syllables serve as the basic units of the synthesis system and of the voice model database. An original voice model database is first built from the syllables and is then structurally compressed to obtain the final compressed model database. The method reduces the storage that the synthesis system occupies on the embedded platform and speeds up synthesis, while largely preserving the naturalness and sound quality of the synthesized speech.

Description

Technical field

[0001] The present invention generally relates to a method for compressing and using a structured sound library for an embedded speech synthesis system, especially for terminal equipment with limited storage and computing resources.

Background

[0002] The purpose of speech synthesis technology is to let machines reproduce natural human speech. Embedded devices are widely used, terminal embedded devices interact with users frequently, and speech is the most natural means of interaction. A general speech synthesis system can be divided into three main functional modules: a text analysis module, a prosody generation module, and an acoustic synthesis module. The concatenative synthesis method based on a large-scale corpus is widely used because the technique is simple and the synthesized sound is of high quality. However, this method requires a large sound bank, and although the space can be reduced by clustering, encoding, and compression, the sound quality is dama...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L13/02, G10L13/08, G10L15/14

Inventor: 那兴宇, 谢湘, 何娅玲, 何宇新

Owner: 北京宇音天下科技有限公司
