Mobile speech synthesis method

A speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses the complexity of the primitive search algorithm and the limited storage space and computing power of mobile terminals, which existing methods cannot fully accommodate.

Status: Inactive. Publication Date: 2006-02-08

TSINGHUA UNIV

2 Cites, 50 Cited by

AI Technical Summary

Problems solved by technology

[0004] Mobile terminals have limited storage space and computing power. General-purpose speech synthesis methods based on large-scale speech corpora require a large sound library and a relatively complicated primitive search algorithm, so they cannot fully meet the needs of mobile terminals.

Examples

Embodiment approach

[0198] Step 1: Prepare a large-scale corpus of more than 5,000 Chinese sentences taken from the People's Daily. Each sentence includes text, pinyin, prosodic level annotation, Mandarin recording data with a 16 kHz sampling rate and 16-bit precision, syllable segmentation annotation, and fundamental frequency annotation.
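
To make the recording format in Step 1 concrete, the following minimal sketch computes the storage needed for uncompressed PCM audio at a 16 kHz sampling rate and 16-bit precision. The function name `pcm_bytes` and the single-channel (mono) default are assumptions for illustration; the excerpt does not state the channel count or the total recording length.

```python
def pcm_bytes(seconds: float, sample_rate_hz: int = 16_000,
              bits_per_sample: int = 16, channels: int = 1) -> int:
    """Size in bytes of uncompressed PCM audio (mono is an assumption)."""
    bytes_per_sample = bits_per_sample // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


one_second = pcm_bytes(1.0)    # 32000 bytes
one_hour = pcm_bytes(3600.0)   # 115200000 bytes
```

Under these assumptions, one hour of recordings occupies about 115 MB before any annotation data is added.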

[0199] Step 2: Extract feature values for each syllable. These comprise nine context feature values (PosInWord, PosInPhrase, PosInSentence, PreTone, PostTone, LeftPhone, RightPhone, LeftPhoneType, RightPhoneType) as well as the duration, energy, fundamental frequency curve, and waveform data.
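
One possible way to hold the per-syllable data from Step 2 is sketched below. The nine context field names mirror the feature names in the text; the class names, the field types (integers for positions and tones, strings for phones and phone types), and the units are assumptions for illustration, since the excerpt does not specify how the values are encoded.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SyllableContext:
    """The nine context feature values named in Step 2 (types are assumed)."""
    pos_in_word: int
    pos_in_phrase: int
    pos_in_sentence: int
    pre_tone: int
    post_tone: int
    left_phone: str
    right_phone: str
    left_phone_type: str
    right_phone_type: str


@dataclass
class SyllableSample:
    """One syllable occurrence in the corpus with its acoustic data."""
    pinyin: str              # e.g. "shi4"
    context: SyllableContext
    duration: float          # assumed to be in seconds
    energy: float            # unit not given in the excerpt
    f0_curve: List[float]    # fundamental frequency values over time
    waveform: List[int]      # 16-bit PCM samples


# Sanity check: exactly nine context features, as the text states.
NUM_CONTEXT_FEATURES = len(fields(SyllableContext))
```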

[0200] The following two steps take the primitive "shi4" as an example to illustrate the process of sound bank construction and prosodic model training.

[0201] Step 3: "shi4" has a total of 1166 samples in the corpus. Based on the feature vector composed of the duration D, energy U, and fundamental frequency vector P of each sample, the Mahalanobis distance betwe...
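
The Mahalanobis distance named in Step 3 can be sketched as follows. This is a generic, standard-library implementation of the textbook definition, not the patent's own procedure: the flattening of (D, U, P) into a single list of floats, the function names, and the sample values are all assumptions, and the excerpt is cut off before it says how the distances are used.

```python
from typing import List, Sequence

Vector = Sequence[float]


def covariance_matrix(samples: Sequence[Vector]) -> List[List[float]]:
    """Unbiased sample covariance of equally sized feature vectors."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for s in samples:
        dev = [s[i] - mean[i] for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += dev[i] * dev[j] / (n - 1)
    return cov


def solve(a: Sequence[Vector], b: Vector) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis(x: Vector, y: Vector, cov: Sequence[Vector]) -> float:
    """sqrt((x - y)^T cov^-1 (x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    z = solve(cov, diff)  # z = cov^-1 (x - y)
    return max(0.0, sum(d * zi for d, zi in zip(diff, z))) ** 0.5


# Hypothetical samples: [duration D, energy U, one F0 value standing in for P].
samples = [
    [0.20, 60.0, 210.0],
    [0.25, 65.0, 190.0],
    [0.18, 58.0, 220.0],
    [0.30, 62.0, 200.0],
    [0.22, 70.0, 205.0],
]
cov = covariance_matrix(samples)
distance = mahalanobis(samples[0], samples[1], cov)
```

Unlike a plain Euclidean distance, this measure rescales each dimension by its variance and accounts for correlation, so duration (fractions of a second) and fundamental frequency (hundreds of Hz) contribute on comparable terms.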

Abstract

The invention relates to a mobile speech synthesis method in the technical field of information conversion and processing between Chinese text and speech. It concerns text-to-speech conversion on a mobile terminal device, especially a smartphone. It comprises sound library construction for the speech synthesis system, prosodic model training, and a synthesis method. A CART (Classification and Regression Trees) method is used to select primitive samples from a large-scale speech corpus and quickly build a primitive sound library suited to mobile terminals. The prosodic model training method, also based on the large-scale speech corpus, extracts fundamental frequency curves from natural speech to generate prosodic templates. Together these realize a method and system for text-to-speech conversion on a mobile terminal.
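
As background for the CART method mentioned in the abstract, the sketch below shows the core step of growing a regression tree: choosing the yes/no question about a context feature that most reduces the squared error of a target value. It is a generic illustration, not the patent's procedure. The function names, the question form `feature == value`, the choice of duration as the target, and the sample values are all assumptions; the excerpt does not describe the question set, the stopping rule, or how tree leaves map to selected samples.

```python
from typing import Dict, List, Optional, Tuple

# (context features, target value such as duration); layout is assumed.
Sample = Tuple[Dict[str, str], float]


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(samples: List[Sample]) -> Optional[Tuple[str, str, float]]:
    """Return (feature, value, cost) of the question 'feature == value'
    that minimizes the summed squared error of the two child nodes."""
    best: Optional[Tuple[str, str, float]] = None
    questions = sorted({(f, v) for ctx, _ in samples for f, v in ctx.items()})
    for feature, value in questions:
        yes = [t for ctx, t in samples if ctx.get(feature) == value]
        no = [t for ctx, t in samples if ctx.get(feature) != value]
        if not yes or not no:
            continue  # a question that does not separate anything is useless
        cost = sse(yes) + sse(no)
        if best is None or cost < best[2]:
            best = (feature, value, cost)
    return best


# Hypothetical data: duration depends on PreTone, not on PosInWord.
toy = [
    ({"PreTone": "4", "PosInWord": "0"}, 0.20),
    ({"PreTone": "4", "PosInWord": "1"}, 0.22),
    ({"PreTone": "1", "PosInWord": "0"}, 0.30),
    ({"PreTone": "1", "PosInWord": "1"}, 0.32),
]
chosen = best_split(toy)  # picks a question about PreTone
```

Applying the same search recursively to each child node yields a full tree whose leaves group samples with similar context and similar prosody.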

Description

Technical field

[0001] The mobile speech synthesis method belongs to the technical field of information conversion and processing between text and speech in communication. The invention relates to a technology for text-to-speech conversion on a mobile terminal device, especially a smartphone.

Background art

[0002] Text-to-Speech (TTS) is a technology that converts text into sound, and is often called speech synthesis. At present, mainstream text-to-speech systems mostly use waveform splicing synthesis methods based on large-scale speech corpora. To obtain high-quality synthesized speech, this type of synthesis system often requires a large-scale speech database. A sound library can easily occupy hundreds of MB, and sound libraries reaching the GB level are already very common; at the same time, with the development of information technology, a variety of mobile terminal devices such as mobile phones, personal digital assistants (PDAs), vehicle-mounted device...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L13/00, G10L13/02, G10L13/08

Inventor: 蔡莲红, 叶振兴, 倪昕, 黄德智

Owner: TSINGHUA UNIV
