Voice database structure compression used for embedded voice synthesis system and use method thereof

The technology concerns speech synthesis on embedded devices and is applicable to speech synthesis, speech analysis, speech recognition, and related fields. It addresses three problems: difficulty in meeting user needs, high cost, and degraded naturalness and sound quality of synthesized speech.

Inactive Publication Date: 2011-09-28
北京宇音天下科技有限公司
6 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

Tighter control over clustering significantly degrades the naturalness and quality of the synthesized speech.
Existing systems remain expensive for devices with limited resources and struggle to meet user needs.

Examples

Embodiment Construction

[0044] As shown in Figure 1, in one embodiment the speech synthesis system of the present invention is deployed on an embedded operating system. The embedded speech synthesis system comprises a model training part (11), a text input module (103), a speech synthesis part (102), and a voice signal output module (104).
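
The run-time data flow among the modules named in [0044] can be sketched as follows. This is a minimal illustration rather than the patented implementation: the class and function names, the dictionary standing in for the compressed model library (12), the whitespace-separated syllable input, and the placeholder parameter values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical stand-in for the compressed model library (12):
# maps a syllable to a short list of acoustic parameters.
ModelLibrary = Dict[str, List[float]]


def text_input_module(raw: str) -> List[str]:
    """Module (103): accept text and split it into syllable tokens.

    Assumption: syllables arrive whitespace-separated (e.g. as pinyin).
    """
    return raw.split()


@dataclass
class SpeechSynthesisPart:
    """Part (102): look up each syllable's model and emit its parameters."""

    library: ModelLibrary

    def synthesize(self, syllables: List[str]) -> List[float]:
        signal: List[float] = []
        for syllable in syllables:
            # Syllables missing from the library are skipped in this sketch.
            signal.extend(self.library.get(syllable, []))
        return signal


def voice_signal_output_module(signal: List[float]) -> int:
    """Module (104): hand the signal to an audio device.

    Here it only reports how many values would be played.
    """
    return len(signal)


# Placeholder library contents, invented for the example.
library: ModelLibrary = {"ni3": [0.1, 0.2], "hao3": [0.3, 0.4, 0.5]}
synth = SpeechSynthesisPart(library)
signal = synth.synthesize(text_input_module("ni3 hao3"))
played = voice_signal_output_module(signal)
```

The model training part (11) does not appear here because, per the description, it runs only offline; the run-time path needs just the library it produces.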

[0045] The speech synthesis model training part (11) is used only offline, and only to generate the compressed model library (12) that the speech synthesis system needs in order to work. The training speech corpus (7) contains the original recorded speech.

[0046] The offline process of generating the compressed model library (12) from the training speech corpus (7) comprises an HMM model training step (8), a model structured compression step (9), and a model secondary compression step (90).
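
The three offline stages listed in [0046] can be pictured as a simple function pipeline. The sketch below is illustrative only: this excerpt does not say how steps (8), (9), and (90) work internally, so each stage is filled with a stand-in technique chosen for the example (per-syllable frame averaging instead of real HMM training, a shared codebook of identical parameter vectors for the structured compression, and zlib over a JSON serialization for the secondary compression). All function names and the sample corpus are hypothetical.

```python
import json
import zlib
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def train_models(corpus: Dict[str, List[List[float]]]) -> Dict[str, Vector]:
    """Stand-in for step (8): average each syllable's recorded frames.

    Real HMM training (for example with HTK) is far more involved.
    """
    models: Dict[str, Vector] = {}
    for syllable, frames in corpus.items():
        dims = len(frames[0])
        models[syllable] = tuple(
            round(sum(frame[d] for frame in frames) / len(frames), 3)
            for d in range(dims)
        )
    return models


def structured_compress(
    models: Dict[str, Vector],
) -> Tuple[List[Vector], Dict[str, int]]:
    """Stand-in for step (9): store each distinct vector once in a codebook."""
    codebook: List[Vector] = []
    positions: Dict[Vector, int] = {}
    index: Dict[str, int] = {}
    for syllable, vector in models.items():
        if vector not in positions:
            positions[vector] = len(codebook)
            codebook.append(vector)
        index[syllable] = positions[vector]
    return codebook, index


def secondary_compress(codebook: List[Vector], index: Dict[str, int]) -> bytes:
    """Stand-in for step (90): serialize to JSON and deflate with zlib."""
    payload = json.dumps({"codebook": codebook, "index": index})
    return zlib.compress(payload.encode("utf-8"))


def load_library(blob: bytes) -> Dict[str, List[float]]:
    """Run-time side: restore the syllable-to-parameters mapping."""
    data = json.loads(zlib.decompress(blob).decode("utf-8"))
    return {s: data["codebook"][i] for s, i in data["index"].items()}


# Invented training data: two syllables happen to share one averaged vector.
corpus = {
    "ma1": [[1.0, 2.0], [3.0, 4.0]],
    "ma2": [[2.0, 3.0]],
    "ba1": [[0.5, 0.5]],
}
codebook, index = structured_compress(train_models(corpus))
blob = secondary_compress(codebook, index)
restored = load_library(blob)
```

In this toy corpus "ma1" and "ma2" end up pointing at the same codebook entry, which is the kind of sharing that lets a structured representation shrink before any byte-level compression is applied.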

[0047] In step (8), the speech recognition toolkit HTK is first used to automatically segment the recorded or...

Abstract

The invention discloses a structured compression of a voice database for an embedded speech synthesis system, together with a method of using it. The approach is intended for embedded systems and can convert any received text into the corresponding speech and output it. Chinese syllables serve as the basic units of both the synthesis system and the voice model database: an original voice model database is first built from the syllables and is then structurally compressed to obtain the final compressed model database. The method reduces the storage that the synthesis system occupies on an embedded platform, speeds up synthesis, and largely preserves the naturalness and sound quality of the synthesized speech.

Description

Technical field

[0001] The present invention generally relates to a method for structurally compressing a sound library for an embedded speech synthesis system and for using the compressed library, especially on terminal equipment with limited storage and computing resources.

Background art

[0002] The purpose of speech synthesis technology is to allow machines to reproduce natural human speech. Embedded devices are widely used, terminal embedded devices interact with users frequently, and speech is the most natural means of interaction. A general speech synthesis system can be divided into three main functional modules: a text analysis module, a prosody generation module, and an acoustic synthesis module. Concatenative synthesis based on a large-scale corpus is widely used because the technique is simple and the synthesized sound is of high quality. However, this method requires a large sound bank, and although the space can be reduced by clustering, encoding, and compression, the sound quality is dama...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/02, G10L13/08, G10L15/14
Inventor: 那兴宇, 谢湘, 何娅玲, 何宇新
Owner 北京宇音天下科技有限公司