Mobile speech synthesis method

A speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc., that addresses the complexity of primitive search algorithms and the limited storage space and computing power of mobile terminals, which general-purpose methods cannot fully accommodate.

Inactive Publication Date: 2006-02-08
TSINGHUA UNIV
Cites: 2; Cited by: 50

AI Technical Summary

Problems solved by technology

[0004] Mobile terminals have limited storage space and computing power. General-purpose speech synthesis methods based on large-scale speech corpora require a large sound library and a relatively complex primitive search algorithm, so they cannot fully meet the needs of mobile terminals.



Examples


Embodiment approach

[0198] Step 1: Prepare a large-scale corpus of more than 5,000 Chinese sentences taken from the People's Daily. Each sentence includes its text, pinyin, prosodic level annotation, Mandarin recording data at a 16 kHz sampling rate and 16-bit precision, syllable segmentation annotation, and fundamental frequency annotation.
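
As an illustration of Step 1, one corpus entry could be held in a record like the one below. This is a minimal sketch, not the data format used by the invention: the class name `CorpusSentence`, its field names, the integer encoding of prosodic levels, and the assumption of mono audio are all hypothetical. Only the 16 kHz sampling rate and 16-bit precision come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

SAMPLE_RATE_HZ = 16_000  # from the text: 16 kHz sampling rate
BYTES_PER_SAMPLE = 2     # from the text: 16-bit precision


@dataclass
class CorpusSentence:
    """One annotated sentence of the corpus (hypothetical layout)."""
    text: str
    pinyin: List[str]                        # tonal pinyin per syllable, e.g. "shi4"
    prosody_levels: List[int]                # assumed: one boundary level per syllable
    pcm: bytes                               # assumed: raw mono 16-bit PCM
    syllable_bounds: List[Tuple[int, int]]   # (start_sample, end_sample) per syllable
    f0_hz: List[float] = field(default_factory=list)  # fundamental frequency annotation

    def duration_seconds(self) -> float:
        """Length of the recording in seconds."""
        return len(self.pcm) / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)

    def syllable_pcm(self, index: int) -> bytes:
        """Waveform bytes of one syllable, cut at its segmentation marks."""
        start, end = self.syllable_bounds[index]
        return self.pcm[start * BYTES_PER_SAMPLE:end * BYTES_PER_SAMPLE]


# Toy entry: one second of silence holding a single syllable.
sentence = CorpusSentence(
    text="是",
    pinyin=["shi4"],
    prosody_levels=[3],
    pcm=bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE),
    syllable_bounds=[(0, 8000)],
)
```

At these settings one second of mono audio occupies 32,000 bytes, which is one way to see why a corpus of thousands of sentences is too large for a mobile terminal.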

[0199] Step 2: Extract feature values for each syllable: nine context features (PosInWord, PosInPhrase, PosInSentence, PreTone, PostTone, LeftPhone, RightPhone, LeftPhoneType, RightPhoneType), as well as duration, energy, fundamental frequency curve, and waveform data.
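
The nine context features of Step 2 can be pictured as one record per syllable. The sketch below is an assumption-laden illustration: the class `SyllableContext`, the helpers `tone_of` and `neighbor_tones`, the use of 0 for "no neighbor", and the reading of PreTone and PostTone as the tones of the preceding and following syllables are not taken from the source. The phone values in the usage lines are placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SyllableContext:
    """Nine context features named in Step 2 (field types are assumed)."""
    pos_in_word: int
    pos_in_phrase: int
    pos_in_sentence: int
    pre_tone: int
    post_tone: int
    left_phone: str
    right_phone: str
    left_phone_type: str
    right_phone_type: str


def tone_of(pinyin: str) -> int:
    """Tone digit at the end of a tonal pinyin syllable, or 0 if absent."""
    return int(pinyin[-1]) if pinyin and pinyin[-1] in "12345" else 0


def neighbor_tones(pinyin_seq: List[str], index: int) -> Tuple[int, int]:
    """Tones of the syllables before and after `index` (0 at a sentence edge)."""
    pre = tone_of(pinyin_seq[index - 1]) if index > 0 else 0
    post = tone_of(pinyin_seq[index + 1]) if index + 1 < len(pinyin_seq) else 0
    return pre, post


# Context of "shi4" in a toy three-syllable sequence.
seq = ["zhe4", "shi4", "yi1"]
pre, post = neighbor_tones(seq, 1)
ctx = SyllableContext(
    pos_in_word=0, pos_in_phrase=1, pos_in_sentence=1,
    pre_tone=pre, post_tone=post,
    left_phone="e", right_phone="i",                   # placeholder values
    left_phone_type="vowel", right_phone_type="vowel"  # placeholder values
)
```

Keeping the record immutable (`frozen=True`) makes it safe to use as a key when grouping samples by context.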

[0200] The following two steps use the primitive "shi4" as an example to illustrate sound library construction and prosodic model training.

[0201] Step 3: "shi4" has a total of 1166 samples in the corpus. Using the feature vector composed of the duration D, energy U, and fundamental frequency vector P of each sample, the Mahalanobis distance betwe...
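
The source text is cut off here, so how the distance is used is not shown. The sketch below only illustrates how a Mahalanobis distance between two sample feature vectors could be computed. It assumes each sample is flattened into one list of numbers (duration, energy, then the fundamental frequency values) and that the covariance is estimated from the samples themselves; the function names are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def covariance(samples: List[Vector]) -> List[List[float]]:
    """Unbiased sample covariance matrix of equal-length feature vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for s in samples:
        diff = [s[j] - mean[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += diff[a] * diff[b]
    return [[c / (n - 1) for c in row] for row in cov]


def invert(m: List[List[float]]) -> List[List[float]]:
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    d = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(d)]
           for i, row in enumerate(m)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [rv - f * cv for rv, cv in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


def mahalanobis(x: Vector, y: Vector, inv_cov: List[List[float]]) -> float:
    """sqrt((x - y)^T * inv_cov * (x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    q = sum(diff[i] * inv_cov[i][j] * diff[j]
            for i in range(d) for j in range(d))
    return max(q, 0.0) ** 0.5


# Toy samples: [duration, energy]; real vectors would also carry F0 values.
toy_samples = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
inv_cov = invert(covariance(toy_samples))
distance = mahalanobis(toy_samples[0], toy_samples[1], inv_cov)
```

Unlike plain Euclidean distance, this measure rescales each dimension by its variance and accounts for correlation between dimensions, so duration, energy, and fundamental frequency values on different scales can be compared together.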


Abstract

The invention relates to a mobile speech synthesis method in the technical field of information conversion and processing between Chinese text and speech. It is characterized by performing text-to-speech conversion on a mobile terminal device, especially a smart phone. It comprises sound library construction for the speech synthesis system, prosodic model training, a synthesis method, and so on. It uses the CART (Classification and Regression Trees) method to select primitive samples from a large-scale speech corpus and quickly build a primitive sound library suited to mobile terminals. The prosodic model training method, based on the large-scale speech corpus, extracts fundamental frequency curves from natural speech to generate prosodic templates. A method and system for converting text to speech on a mobile terminal is thereby achieved.

Description

Technical field

[0001] The mobile speech synthesis method belongs to the technical field of information conversion and processing between text and speech in communication. The invention relates to a technology for text-to-speech conversion on a mobile terminal device, especially a smart phone.

Background

[0002] Text-to-Speech (TTS) is a technology that converts text into sound, and is often called speech synthesis. At present, mainstream text-to-speech systems mostly use waveform splicing synthesis methods based on large-scale speech corpora. To obtain high-quality synthesized speech, this type of synthesis system often requires a large-scale speech database. A sound library can easily occupy hundreds of MB, and sound libraries reaching the GB level are already very common; at the same time, with the development of information technology, a variety of mobile terminal devices such as mobile phones, personal digital assistants (PDAs), vehicle-mounted device...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/00, G10L13/02, G10L13/08
Inventors: 蔡莲红, 叶振兴, 倪昕, 黄德智
Owner: TSINGHUA UNIV