Method for recognizing emotion points of Chinese pronunciation based on the vocal-tract modulation signal MFCC (Mel Frequency Cepstrum Coefficient)

A Chinese speech emotion recognition technology, applied in the information technology field, that addresses the problem of the low average recognition rate of emotion points

Inactive Publication Date: 2012-09-05
BEIHANG UNIV
Cites: 4 · Cited by: 17

Problems solved by technology

[0003] Directly extracting features from the speech data, then training, modeling, and recognizing, yields a low average recognition rate of emotion points.


Embodiment Construction

[0048] The technical scheme of the present invention will be further explained below in conjunction with the drawings.

[0049] Figure 1 is a flow chart of feature extraction, model training, and emotion point recognition using electroglottograph (EGG) signals and speech signals. The method is divided into two main parts: acquisition of Chinese speech emotion points and recognition of Chinese speech emotion points.

[0050] 1. To acquire the emotional points of Chinese speech, the method steps are as follows:

[0051] Step 1. Formulate the recording specification of the emotional speech database according to the following rules:

[0052] (1) Speakers: 10 in total (5 male, 5 female), aged 20 to 25, each with a bachelor's degree.

[0053] (2) Speaking content: 28 interjections are selected as emotional points, and each emotional point is recorded 3 times during the experiment.

[0054] (3) Emotion classification: angry, happy, sad, surprised, fearful, ...
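Taken together, the specification above fixes the corpus size. A trivial sketch of the arithmetic (the variable names are ours, not the patent's; the total is derived, not stated in the source):

```python
# Corpus size implied by the recording specification above.
speakers = 10          # 5 male + 5 female, aged 20-25
emotion_points = 28    # interjections selected as emotion points
repetitions = 3        # each emotion point is recorded 3 times
total_utterances = speakers * emotion_points * repetitions
print(total_utterances)  # -> 840
```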



Abstract

The invention provides a method for increasing the average recognition rate of emotion points. The method comprises the following steps: formulating recording specifications for an electroglottograph (EGG) and speech emotion database; collecting the EGG and speech emotion data; subjectively evaluating the collected data and selecting a subset as the study object; preprocessing the EGG and speech signals, and extracting short-time features, their corresponding statistical features, and the Mel frequency cepstrum coefficients of the speech signal (SMFCC); applying the fast Fourier transform to the EGG and speech signals and dividing the resulting spectra to obtain the vocal-tract modulation component, whose Mel frequency cepstrum coefficients (TMFCC) are then computed; and running experiments with different feature combinations to obtain the average recognition rate over the 28 emotion points in both the speaker-dependent and speaker-independent cases. The experimental results show that the TMFCC features increase the average recognition rate of emotion points.
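The TMFCC pipeline described above can be sketched as follows. This is a minimal illustration, not the patent's exact implementation: it assumes the vocal-tract transfer function is approximated by dividing the speech magnitude spectrum by the EGG magnitude spectrum frame by frame, and all function names, frame lengths, and filter counts are illustrative choices.

```python
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel-spaced filters over the positive-frequency FFT bins."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, ctr, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, ctr):
            fbank[i - 1, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fbank[i - 1, k] = (hi - k) / max(hi - ctr, 1)
    return fbank

def tmfcc(speech_frame, egg_frame, sr, n_fft=512, n_filters=26, n_ceps=12, eps=1e-8):
    """TMFCC sketch: divide the speech spectrum by the EGG (glottal) spectrum
    to approximate the vocal-tract transfer function, then apply the usual
    mel filterbank -> log -> DCT chain of MFCC extraction."""
    win = np.hamming(len(speech_frame))
    S = np.abs(np.fft.rfft(speech_frame * win, n_fft))   # speech spectrum
    G = np.abs(np.fft.rfft(egg_frame * win, n_fft))      # glottal (EGG) spectrum
    H = S / (G + eps)                                    # vocal-tract estimate
    fb = mel_filterbank(n_filters, n_fft, sr)
    mel_energy = np.log(fb @ (H ** 2) + eps)
    return dct(mel_energy, type=2, norm='ortho')[:n_ceps]
```

For a 25 ms frame at 16 kHz (400 samples), `tmfcc(speech_frame, egg_frame, 16000)` returns a 12-dimensional cepstral vector; an SMFCC-style feature would follow the same chain applied to the speech spectrum alone.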

Description

(1) Technical field: [0001] The invention relates to a method, based on the MFCC of the vocal-tract modulation signal, for improving the average recognition rate of emotion points in Chinese speech, and belongs to the field of information technology.

(2) Background technology: [0002] Speech emotion recognition extracts the emotional state of the speaker from the speech signal. According to the excitation-modulation model of speech production, a speech signal is generated by two components: glottal excitation and vocal-tract modulation. The glottal excitation determines changes in speech prosody and plays an important role in speech emotion recognition. The vocal-tract modulation mainly determines the content of speech: each vowel corresponds to a different set of formants, reflecting different vocal-tract shapes. Chinese is a tonal language, and most of its syllables are composed of an initial and a final. Syllables composed of the same initial and final can have different meanings and expres...
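The excitation-modulation view above is what motivates cepstral features: convolving the glottal excitation with the vocal-tract response in the time domain becomes multiplication in the frequency domain, so on the log-magnitude spectrum the two contributions simply add and can be separated. A minimal numerical check with synthetic signals (illustrative only; the signals have no physical meaning):

```python
import numpy as np

# Speech s[n] modeled as the convolution (circular, for this demo) of a
# glottal excitation g[n] with a vocal-tract impulse response h[n].
rng = np.random.default_rng(0)
n = 256
g = rng.standard_normal(n)                  # stand-in glottal excitation
h = np.zeros(n)
h[:4] = [1.0, 0.5, 0.25, 0.125]             # short "vocal tract" response
s = np.real(np.fft.ifft(np.fft.fft(g) * np.fft.fft(h)))  # circular convolution

# Multiplication in frequency => addition of log-magnitude spectra:
# log|S| = log|G| + log|H|.
logS = np.log(np.abs(np.fft.fft(s)))
logG = np.log(np.abs(np.fft.fft(g)))
logH = np.log(np.abs(np.fft.fft(h)))
print(np.allclose(logS, logG + logH, atol=1e-6))  # -> True
```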

Claims


Application Information

IPC(8): G10L17/00, G10L15/14, G10L25/63
Inventor: 毛峡 (Mao Xia), 魏鹏飞 (Wei Pengfei)
Owner: BEIHANG UNIV