Method and system for automatically generating voice with stressed syllables

A technology for automatically generating speech with stressed syllables, applied in speech synthesis, speech analysis, and speech recognition. It addresses the inefficiency of existing methods and the difficulty of ensuring that the generated speech sounds natural.

Inactive Publication Date: 2012-05-02

AISPEECH CO LTD

7 Cites, 24 Cited by


AI Technical Summary

Problems solved by technology

The existing approach is very inefficient, and it is difficult to guarantee the naturalness of the generated speech.

Embodiment Construction

[0044] The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments:

[0045] Figure 1 gives a schematic diagram of the module structure of the system disclosed in the present invention.

[0046] The system for automatically generating speech with stressed syllables consists of 6 modules, which can be divided into two parts: the transformation parameter estimation part (training stage) and the automatic stressed-syllable speech generation part.

[0047] Module 100 is the phoneme localization module. Its function is to obtain the accurate time boundary of each phoneme in the input speech, and from these the time boundaries of each word, syllable and phoneme. An acoustic model based on a Hidden Markov Model (HMM) is trained in advance. If the text of the input speech is known, the HMM model is used with forced alignment to obtain the time boundary of each phoneme; if the input speech text is unknown, use the HM...
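The known-text case in [0047] can be illustrated with a small sketch. This is not the patent's implementation: it assumes a simplified one-state-per-phoneme model with no transition probabilities, and it assumes the per-frame log-likelihoods have already been computed by some acoustic model. The function name `forced_align` and the `loglik` layout are hypothetical. The unknown-text case is omitted because the source paragraph is truncated.

```python
from typing import List, Tuple


def forced_align(loglik: List[List[float]]) -> List[Tuple[int, int]]:
    """Viterbi forced alignment over a known phoneme sequence.

    loglik[t][k] is the (assumed, precomputed) log-likelihood of frame t
    under the k-th phoneme of the transcript. Returns one
    (start_frame, end_frame) pair per phoneme, end exclusive.
    """
    n_frames, n_phones = len(loglik), len(loglik[0])
    if n_frames < n_phones:
        raise ValueError("need at least one frame per phoneme")

    neg_inf = float("-inf")
    score = [[neg_inf] * n_phones for _ in range(n_frames)]
    # back[t][k] == 1 means frame t is the first frame of phoneme k.
    back = [[0] * n_phones for _ in range(n_frames)]
    score[0][0] = loglik[0][0]

    for t in range(1, n_frames):
        for k in range(n_phones):
            stay = score[t - 1][k]
            advance = score[t - 1][k - 1] if k > 0 else neg_inf
            if advance > stay:
                score[t][k] = advance + loglik[t][k]
                back[t][k] = 1
            else:
                score[t][k] = stay + loglik[t][k]

    # Trace back from the last phoneme at the last frame.
    starts = [0] * n_phones
    k = n_phones - 1
    for t in range(n_frames - 1, 0, -1):
        if back[t][k] == 1:
            starts[k] = t
            k -= 1

    ends = starts[1:] + [n_frames]
    return list(zip(starts, ends))


# Toy input: 6 frames, 3 phonemes; each frame clearly favors one phoneme.
toy = [
    [0, -5, -5], [0, -5, -5],
    [-5, 0, -5], [-5, 0, -5], [-5, 0, -5],
    [-5, -5, 0],
]
boundaries = forced_align(toy)
```

Multiplying the frame indices by the frame shift of the feature extractor (for example 10 ms, an assumed value) gives time boundaries in seconds. Word and syllable boundaries then follow by grouping consecutive phonemes.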

Abstract

The invention relates to a system for automatically generating voice with stressed syllables. The system comprises a phoneme positioning module, an acoustic characteristic extracting module, an acoustic characteristic parameter correcting module, a voice reconstructing module and a characteristic transformation parameter library. The phoneme positioning module determines the position of each phoneme in a received voice signal so as to obtain the time boundary of each word, syllable and phoneme. The acoustic characteristic extracting module extracts the stress-related characteristics and the spectrum characteristics from the voice signal. The acoustic characteristic parameter correcting module adjusts the characteristic parameters of the input voice into the corresponding characteristic parameters that represent stress on the given syllables, and smooths the adjusted characteristic parameters. The voice reconstructing module resynthesizes the voice from the corrected acoustic characteristic parameters through a source-filter model. The characteristic transformation parameter library stores the transformation matrix parameters of the statistics of the acoustic characteristics of each phoneme from non-stress to stress.
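The correction step described in the abstract can be pictured with a minimal sketch. Everything here is an assumption made for illustration: a single feature track (for example F0 per frame), a per-phoneme `(scale, offset)` pair standing in for the transformation matrix parameters, and a moving average standing in for the unspecified smoothing. The names `apply_stress`, `smooth` and `transform_lib` are hypothetical.

```python
from typing import Dict, List, Tuple

# (phoneme, start_frame, end_frame_exclusive, syllable_index)
Segment = Tuple[str, int, int, int]


def apply_stress(
    track: List[float],
    segments: List[Segment],
    stressed_syllable: int,
    transform_lib: Dict[str, Tuple[float, float]],
) -> List[float]:
    """Transform the feature values of every phoneme in the stressed syllable."""
    out = list(track)
    for phone, start, end, syllable in segments:
        if syllable == stressed_syllable and phone in transform_lib:
            scale, offset = transform_lib[phone]
            for t in range(start, end):
                out[t] = scale * track[t] + offset
    return out


def smooth(values: List[float], width: int = 3) -> List[float]:
    """Centered moving average; the window is truncated at the edges."""
    half = width // 2
    result = []
    for t in range(len(values)):
        lo, hi = max(0, t - half), min(len(values), t + half + 1)
        result.append(sum(values[lo:hi]) / (hi - lo))
    return result


# Toy example: flat 100 Hz track, second syllable stressed.
f0 = [100.0] * 6
segments = [("a", 0, 3, 0), ("b", 3, 6, 1)]
transform_lib = {"b": (1.5, 0.0)}  # assumed values, not from the patent

stressed = apply_stress(f0, segments, stressed_syllable=1, transform_lib=transform_lib)
smoothed = smooth(stressed)
```

The smoothing pass softens the jump at the boundary between the unmodified and the modified syllable. In the system described above, the corrected parameters would then be passed to the voice reconstructing module.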

Description

Technical field

[0001] The invention relates to the field of speech signals, in particular to a system and method for automatically generating stressed syllable speech.

Background

[0002] In verbal communication, people usually need to stress some syllables in speech in order to express emphasis or attract attention. In some languages, the difference in stress position can also affect the meaning expressed by words. In addition, the intonation of language mainly depends on the control of stress position and intensity.

[0003] At present, computers have been widely used in the field of speech processing. The computer can judge the position of the stressed syllable in the voice through the acoustic characteristics of the voice, and can also generate the voice with the stressed syllable through the speech synthesis technology. Speech synthesis technology can convert a piece of text into corresponding speech through a pre-trained model. The stress position of the...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L13/00, G10L15/00, G10L15/02, G10L13/02

Inventor: 王欢良, 邹平

Owner: AISPEECH CO LTD
