
Method for converting emotional speech by combining rhythm parameters with tone parameters

An emotional speech conversion technique, applicable to speech synthesis, speech analysis, and speech recognition. It addresses the immature application of PSOLA and its inability to change sound quality parameters by modifying both prosody and sound quality parameters.

Inactive Publication Date: 2011-09-14
BEIHANG UNIV
5 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

[0006] In emotional speech conversion, the application of PSOLA (pitch-synchronous overlap-add) is still immature: it can modify only the prosody parameters of the speech signal and cannot change the sound quality parameters.




Embodiment Construction

[0041] The invention is a new method for converting neutral speech into four kinds of emotional speech.

[0042] The main content of the present invention is as follows: feature parameters are extracted from the selected BHUDES emotional speech samples and analyzed statistically, conversion rules are formulated, and the fundamental frequency curve and the formant positions of the speech are then changed according to those rules, completing the conversion of neutral speech into four kinds of emotional speech (sadness, anger, happiness, and surprise).
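The rule-based modification of the fundamental frequency curve described in [0042] can be illustrated with a minimal sketch. The constant values, the names `EXAMPLE_RULES`, `convert_f0`, and `convert_energy`, and the mean/range scaling form are all assumptions made for illustration; the actual conversion constants are derived from BHUDES statistics and are not given in this excerpt.

```python
import numpy as np

# Hypothetical conversion constants, for illustration only. The patent derives
# its own constants from BHUDES statistics; those values are not shown here.
EXAMPLE_RULES = {
    "surprise": {"f0_mean_scale": 1.3, "f0_range_scale": 1.5, "energy_scale": 1.2},
    "sadness": {"f0_mean_scale": 0.9, "f0_range_scale": 0.7, "energy_scale": 0.8},
}


def convert_f0(f0, mean_scale, range_scale):
    """Rescale a frame-wise F0 contour (Hz).

    The voiced mean is multiplied by mean_scale and the excursion around
    that mean by range_scale. Frames with f0 == 0 are treated as unvoiced
    and stay at 0.
    """
    f0 = np.asarray(f0, dtype=float)
    voiced = f0 > 0
    out = np.zeros_like(f0)
    if not voiced.any():
        return out
    mean = f0[voiced].mean()
    out[voiced] = mean * mean_scale + (f0[voiced] - mean) * range_scale
    return out


def convert_energy(frame, energy_scale):
    """Scale the energy of a frame by energy_scale (amplitude by its square root)."""
    return np.asarray(frame, dtype=float) * np.sqrt(energy_scale)


rule = EXAMPLE_RULES["surprise"]
neutral_f0 = [0.0, 100.0, 200.0, 0.0]
surprise_f0 = convert_f0(neutral_f0, rule["f0_mean_scale"], rule["f0_range_scale"])
```

In a full system the modified contour would then drive a pitch-synchronous overlap-add resynthesis; that step is omitted from this sketch.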

[0043] In order to more clearly illustrate the purpose, technical solutions and advantages of the present invention, the conversion of neutral speech into surprise speech is taken as an example for further detailed description below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0044] For the specific implementation, see the flowchart in Figure 1,...



Abstract

The invention discloses a method for converting emotional speech by combining prosody parameters (fundamental frequency, duration, and energy) with a sound quality parameter (the formants). The method mainly comprises the following steps: (1) extracting and analyzing feature parameters from emotional speech samples of the Beihang University emotional speech database (BHUDES), which contains neutral speech and four types of emotional speech (sadness, anger, happiness, and surprise); (2) formulating emotional speech conversion rules and defining each conversion constant according to the extracted feature parameters; (3) extracting the feature parameters of the neutral speech to be converted and marking it pitch-synchronously; (4) setting each conversion constant according to the conversion rules of step 2, modifying the fundamental frequency curve, the duration, and the energy, and synthesizing a speech signal by pitch-synchronous overlap-add; and (5) performing linear predictive coding (LPC) analysis on the speech signal from step 4 and modifying the formants through the poles of the transfer function, finally obtaining expressive emotional speech.
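The formant modification in the final step of the abstract can be sketched as follows: the poles of the all-pole LPC transfer function are found, the angles of the complex poles (which set the formant frequencies) are scaled, and the denominator polynomial is rebuilt. This is only an illustrative sketch under assumptions: the function names `shift_formants` and `formant_frequencies`, the uniform scaling factor, and the choice to keep pole radii unchanged are not taken from the patent, whose exact pole manipulation is not shown in this excerpt.

```python
import numpy as np


def shift_formants(a, factor):
    """Scale the formant frequencies of an all-pole filter 1/A(z).

    a      : LPC denominator coefficients [a0, a1, ..., ap]
    factor : multiplier applied to the angle of every complex pole
             (hypothetical uniform rule; pole radii are kept, so
             bandwidths are roughly preserved)
    """
    a = np.asarray(a, dtype=float)
    poles = np.roots(a)
    is_complex = np.abs(poles.imag) > 1e-8  # real poles are left untouched
    radius = np.abs(poles)
    angle = np.angle(poles)
    limit = 0.99 * np.pi  # keep shifted poles below the Nyquist frequency
    new_angle = np.where(is_complex, np.clip(angle * factor, -limit, limit), angle)
    new_poles = radius * np.exp(1j * new_angle)
    # Conjugate pairs remain conjugate, so the rebuilt polynomial is real.
    return np.real(np.poly(new_poles)) * a[0]


def formant_frequencies(a, fs):
    """Return the frequencies (Hz) of the upper-half-plane poles of 1/A(z)."""
    poles = np.roots(a)
    poles = poles[poles.imag > 1e-8]
    return np.sort(np.angle(poles) * fs / (2 * np.pi))


# Synthetic two-formant filter at 500 Hz and 1500 Hz, sampled at 8 kHz.
fs = 8000
example_poles = []
for freq, r in [(500.0, 0.95), (1500.0, 0.90)]:
    theta = 2 * np.pi * freq / fs
    example_poles += [r * np.exp(1j * theta), r * np.exp(-1j * theta)]
a_neutral = np.real(np.poly(example_poles))
a_shifted = shift_formants(a_neutral, 1.2)
```

With real speech, `a_neutral` would come from frame-wise LPC analysis, and the shifted filter would be excited with the LPC residual of the same frame to resynthesize the signal.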

Description

Technical field

[0001] The invention relates to the fields of speech signal processing and artificial intelligence, and mainly to an emotional speech conversion method that combines prosody and sound quality parameters.

Background

[0002] Speech synthesis is an important part of human-computer interaction. What people hope to hear today is no longer a flat machine voice that is merely highly intelligible, but a human-like voice that can express emotion. Existing speech synthesis still only addresses the conversion of text to speech (TTS: Text to Speech), and the emotional information in speech cannot be expressed well.

[0003] In addition, emotional speech can be combined with other multimedia technologies, for example by matching emotional speech with the corresponding facial features so that emotion is expressed with voice and expression synchronized. This is the currently popular "visual speech" technology.

[0004] Extracting emotional features f...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/02, G10L15/02, G10L25/63
Inventors: 毛峡 (Mao Xia), 韩林 (Han Lin)
Owner: BEIHANG UNIV