Method for converting emotional speech by combining rhythm parameters with tone parameters

An emotional speech conversion technology, applicable to speech synthesis, speech analysis, speech recognition, etc., that addresses the immature application of PSOLA to emotional speech conversion and its inability to change sound quality parameters.

Inactive Publication Date: 2011-09-14

BEIHANG UNIV


Problems solved by technology

[0006] In emotional speech conversion, the application of PSOLA (pitch-synchronous overlap-add) is still immature: it can modify only the prosodic parameters of the speech signal and cannot change the sound quality parameters.
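To illustrate why PSOLA by itself acts only on prosody, the following minimal sketch shows the re-spacing of pitch marks that a time-domain PSOLA pitch change relies on. It is an illustration under stated assumptions, not the procedure from the patent: the function name `synthesis_marks`, the `pitch_scale` parameter, and the sample positions are all hypothetical. Only the spacing of the marks changes; the windowed segments centered on them would be reused as they are, so their spectral envelope (and hence the formants) is left largely untouched.

```python
from typing import List


def synthesis_marks(analysis_marks: List[int], pitch_scale: float) -> List[int]:
    """Re-space pitch marks so each local pitch period is divided by pitch_scale.

    analysis_marks: strictly increasing sample indices of pitch marks
                    (hypothetical input; a real system would detect them).
    pitch_scale:    > 1 raises the pitch, < 1 lowers it.
    The new marks cover the same time span, so duration is roughly preserved.
    """
    if pitch_scale <= 0:
        raise ValueError("pitch_scale must be positive")
    if len(analysis_marks) < 2:
        return list(analysis_marks)

    marks = [float(analysis_marks[0])]
    end = analysis_marks[-1]
    i = 0
    while True:
        # Find the analysis interval that contains the current position.
        while i < len(analysis_marks) - 2 and analysis_marks[i + 1] <= marks[-1]:
            i += 1
        period = analysis_marks[i + 1] - analysis_marks[i]
        nxt = marks[-1] + period / pitch_scale
        if nxt > end:
            break
        marks.append(nxt)
    return [round(m) for m in marks]


# Hypothetical example: a constant 100-sample period, pitch raised by a factor of 2.
example_marks = synthesis_marks([0, 100, 200, 300, 400], pitch_scale=2.0)
print(example_marks)
```

A complete PSOLA synthesizer would additionally window a two-period segment around each analysis mark and overlap-add it at the nearest new mark; that step is omitted here to keep the sketch short.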


Embodiment Construction

[0041] The invention is a new method for converting neutral speech into four kinds of emotional speech.

[0042] The main content of the present invention is as follows: extract and statistically analyze the feature parameters of selected BHUDES emotional speech samples, formulate conversion rules, and then modify the fundamental frequency curve and the formant positions of the speech according to those rules, thereby converting neutral speech into four kinds of emotional speech (sadness, anger, happiness, and surprise).
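As a rough illustration of what applying a conversion rule to prosodic parameters could look like, the sketch below rescales a fundamental frequency contour and a frame's energy using a set of conversion constants. The class `ConversionRule`, its fields, and the numeric values are hypothetical placeholders chosen for the example; the patent's actual rules and constants are derived from BHUDES statistics and are not reproduced here.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class ConversionRule:
    """Hypothetical conversion constants for one target emotion."""
    f0_mean_scale: float   # multiplies the mean of the F0 contour
    f0_range_scale: float  # multiplies deviations from that mean
    energy_scale: float    # multiplies the frame energy


def convert_f0(f0: List[float], rule: ConversionRule) -> List[float]:
    """Rescale mean and range of an F0 contour (Hz); 0.0 marks unvoiced frames."""
    voiced = [v for v in f0 if v > 0.0]
    if not voiced:
        return list(f0)
    mean = sum(voiced) / len(voiced)
    new_mean = mean * rule.f0_mean_scale
    return [
        new_mean + (v - mean) * rule.f0_range_scale if v > 0.0 else 0.0
        for v in f0
    ]


def scale_energy(frame: List[float], rule: ConversionRule) -> List[float]:
    """Scale sample amplitudes so the frame energy is multiplied by energy_scale."""
    gain = math.sqrt(rule.energy_scale)  # energy grows with amplitude squared
    return [s * gain for s in frame]


# Placeholder constants, NOT values from the patent.
example_rule = ConversionRule(f0_mean_scale=1.2, f0_range_scale=1.5, energy_scale=4.0)
neutral_f0 = [0.0, 200.0, 220.0, 180.0, 0.0]
print(convert_f0(neutral_f0, example_rule))
print(scale_energy([0.1, -0.2], example_rule))
```

Separating the mean shift from the range scaling is one possible design choice: it lets a rule raise the overall pitch and widen the pitch excursions independently.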

[0043] In order to more clearly illustrate the purpose, technical solutions and advantages of the present invention, the conversion of neutral speech into surprise speech is taken as an example for further detailed description below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0044] For the specific implementation, see the flowchart in Figure 1,...


Abstract

The invention discloses a method for converting emotional speech by combining prosodic parameters (fundamental frequency, duration and energy) with a sound quality parameter (formant). The method mainly comprises the following steps: (1) extracting and analyzing feature parameters from emotional speech samples of the Beihang University emotional speech database (BHUDES), which contains neutral speech and four types of emotional speech (sadness, anger, happiness and surprise); (2) formulating emotional speech conversion rules and defining each conversion constant according to the extracted feature parameters; (3) extracting feature parameters from the neutral speech to be converted and performing pitch-synchronous marking; (4) setting each conversion constant according to the conversion rules of step 2, modifying the fundamental frequency curve, duration and energy, and synthesizing a speech signal by pitch-synchronous overlap-add; and (5) performing linear predictive coding (LPC) analysis on the speech signal from step 4 and modifying the formants through the poles of the transfer function, finally obtaining expressive emotional speech.
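The formant modification in the last step can be pictured with the standard relation between an LPC pole and a formant: a pole at radius r and angle θ corresponds to a formant frequency of θ·fs/(2π) and a bandwidth of about −ln(r)·fs/π. The sketch below moves one pole to a scaled frequency while keeping its radius. It is a minimal illustration under assumptions: the function names, the sampling rate, and the scale factor are hypothetical, and the LPC analysis and root finding that would produce the poles are not shown.

```python
import cmath
import math
from typing import Tuple


def pole_to_formant(pole: complex, fs: float) -> Tuple[float, float]:
    """Return (frequency_hz, bandwidth_hz) for one pole of the LPC transfer function."""
    freq = abs(cmath.phase(pole)) * fs / (2.0 * math.pi)
    bandwidth = -math.log(abs(pole)) * fs / math.pi
    return freq, bandwidth


def shift_pole(pole: complex, fs: float, freq_scale: float) -> complex:
    """Move a pole to a scaled formant frequency, keeping its radius (bandwidth)."""
    freq, _ = pole_to_formant(pole, fs)
    new_freq = freq * freq_scale
    if not 0.0 < new_freq < fs / 2.0:
        raise ValueError("shifted formant must stay below the Nyquist frequency")
    return cmath.rect(abs(pole), 2.0 * math.pi * new_freq / fs)


def section_coeffs(pole: complex) -> Tuple[float, float, float]:
    """Coefficients of 1 + a1*z^-1 + a2*z^-2 for the pole and its conjugate."""
    return 1.0, -2.0 * pole.real, abs(pole) ** 2


# Hypothetical example: a 500 Hz formant at fs = 16 kHz, shifted up by 10 %.
fs = 16000.0
original = cmath.rect(0.97, 2.0 * math.pi * 500.0 / fs)
shifted = shift_pole(original, fs, freq_scale=1.1)
print(pole_to_formant(shifted, fs))
print(section_coeffs(shifted))
```

Rebuilding the second-order section from the shifted pole and its conjugate keeps the filter coefficients real, which is why the poles are handled as conjugate pairs.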

Description

Technical field

[0001] The invention relates to the fields of speech signal processing and artificial intelligence, and mainly to an emotional speech conversion method combining prosody and sound quality parameters.

Background

[0002] Speech synthesis is an important part of human-computer interaction. What people now hope to hear is no longer a dull machine voice that is merely highly intelligible, but a human-like voice that can express emotion. Existing speech synthesis is still at the stage of solving the conversion from text to speech, that is, text-to-speech (TTS), and the emotional information in speech cannot yet be expressed well.

[0003] In addition, emotional speech can be combined with other multimedia technologies, for example by matching emotional speech with corresponding facial features to express emotion and to synchronize voice and expression. This is the currently popular "visual speech" technology.

[0004] Extracting emotional features f...


Application Information


Patent Type & Authority: Applications (China)

IPC (8): G10L13/02, G10L15/02, G10L25/63

Inventor: 毛峡, 韩林

Owner: BEIHANG UNIV
