Real-time speech emotion recognition method and device thereof

A real-time speech emotion recognition technology, applied in the field of signal processing, that addresses the large amount of calculation, the lack of real-time performance, and the low generalization ability of existing systems, and that achieves a small amount of calculation while mitigating the over-fitting problem.

Active Publication Date: 2021-11-05
CHINA UNIV OF GEOSCIENCES (WUHAN)
Cites: 12 | Cited by: 0

AI Technical Summary

Problems solved by technology

[0006] The present invention aims to solve two problems. First, traditional speech emotion recognition methods focus on optimizing feature classification, which results in a large amount of calculation and prevents real-time performance. Second, some methods infer emotion from the meaning of sentences and words in the text, which results in low generalization ability of the system.


Examples

Embodiment 1

[0071] Referring to Figure 1, an embodiment of the present invention provides a real-time speech emotion recognition method comprising the following steps:

[0072] Step 1: Extraction of Mel Spectrum

[0073] After pre-emphasizing the original speech signal, a sliding Hamming window of 25 ms (recommended value) with a step size of 15 ms is applied, and each sample frame is processed by a Fast Fourier Transform (FFT) and a Mel filter bank. The Mel frequency of each sample frame is obtained with the following conversion formula between Mel frequency and Hertz-scale frequency:

[0074]

[0075] where m is the Mel frequency and f is the Hertz scale frequency.
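The conversion formula itself is not reproduced in this text. As a minimal sketch, assuming the patent uses the widely adopted form m = 2595 · log10(1 + f / 700) (an assumption, not confirmed by the source), the conversion and its inverse could look like this. The function names `hz_to_mel` and `mel_to_hz` are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a Hertz-scale frequency f to a Mel frequency m.

    Assumes the common form m = 2595 * log10(1 + f / 700); the patent's
    own constants are not visible in the source text.
    """
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel under the same assumed formula."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


if __name__ == "__main__":
    for f in (0.0, 500.0, 1000.0, 4000.0):
        print(f, round(hz_to_mel(f), 2))
```

Under this assumed form, the Mel scale is roughly linear below about 1 kHz and logarithmic above it, which is why equally spaced Mel filters become wider on the Hertz scale at higher frequencies.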

[0076] The center frequency of the Mel filter bank can be expressed as:

[0077]

[0078] where f(l) denotes the center frequency of the Mel filter bank l on the Hertz scale, m_l is the lower bound of the Mel filter bank l on the Mel scale, m_{l+1} is the lower bound of the Mel filter b...

Embodiment 2

[0141] Referring to Figure 2, the present invention also provides a real-time speech emotion recognition device, mainly comprising the following modules:

[0142] Mel spectrum extraction module 1, formant extraction and registration module 2, syllable segmentation and statistics module 3, syllable emotion classification module 4, and sentence-level confidence aggregation module 5;

[0143] In some embodiments, the Mel spectrum extraction module 1 specifically includes:

[0144] A pre-emphasis module, used to perform pre-emphasis processing on the original speech signal to obtain a pre-emphasized signal;

[0145] A framing, windowing and Fourier transform module, used to split the pre-emphasized signal into frames, apply a window to each frame, and perform a Fourier transform to obtain a transformed signal;

[0146] A Mel filtering module, used to process the transformed signal through a Mel filter bank to obtain the Mel frequency of each sample frame;

[0147] The adjacent frame conn...
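The first two sub-modules listed above (pre-emphasis, then framing, windowing and Fourier transform) can be sketched as follows. This is an illustrative sketch only: the function names, the pre-emphasis coefficient of 0.97, and the naive DFT are assumptions made here, not details taken from the patent. The 25 ms window and 15 ms step follow the values given in Embodiment 1.

```python
import cmath
import math
from typing import List


def pre_emphasize(x: List[float], alpha: float = 0.97) -> List[float]:
    """y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is an assumed, commonly used value."""
    if not x:
        return []
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def frame_and_window(x: List[float], sample_rate: int,
                     win_ms: float = 25.0, hop_ms: float = 15.0) -> List[List[float]]:
    """Split x into overlapping frames and apply a Hamming window to each frame."""
    win = int(round(sample_rate * win_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    hamming = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (win - 1)) for n in range(win)]
    frames = []
    for start in range(0, len(x) - win + 1, hop):
        frames.append([x[start + n] * hamming[n] for n in range(win)])
    return frames


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Naive DFT magnitudes for bins 0..N/2 (a real system would use an FFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]


if __name__ == "__main__":
    sr = 8000  # hypothetical sample rate
    tone = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(800)]
    frames = frame_and_window(pre_emphasize(tone), sr)
    spectrum = magnitude_spectrum(frames[0])
    print(len(frames), spectrum.index(max(spectrum)))
```

The naive DFT keeps the sketch free of third-party dependencies; the resulting magnitude spectrum would then be passed through the Mel filter bank described in [0146].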

Abstract

The invention provides a real-time speech emotion recognition method and a device thereof. The method comprises: extracting formants from the Mel spectrum of a speech signal by comparing the local amplitude maxima of a Mel filter bank and keeping the three main formants with the largest amplitudes; filtering out other, insignificant formants through a real-time noise gate; and finally selecting the formants that match well with those of the adjacent frame. The maxima and minima of the formant amplitude are used to separate syllables, silent pauses during speech are identified from the combined energy of the first three formants in a frame, and syllable boundaries are detected accordingly. A syllable-level emotion recognition method using 15 hand-crafted features is also provided. Real-time, accurate recognition of speech emotion is thereby achieved.
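The peak-picking and pause-detection ideas in the abstract can be illustrated with a short sketch. It is not the patent's algorithm: the function names, the fixed noise-gate threshold (the abstract describes a real-time gate), and the use of summed squared amplitudes as the "composite energy" are all assumptions made for illustration. Matching formants across adjacent frames is not shown.

```python
from typing import List, Tuple


def top_formant_peaks(mel_amplitudes: List[float], noise_gate: float,
                      max_peaks: int = 3) -> List[Tuple[int, float]]:
    """Return up to max_peaks (filter_index, amplitude) pairs.

    A candidate is a local maximum of the per-filter amplitudes that exceeds
    the (assumed fixed) noise gate. The strongest candidates are kept and
    returned in ascending filter order.
    """
    peaks = []
    for i in range(1, len(mel_amplitudes) - 1):
        a = mel_amplitudes[i]
        if a > mel_amplitudes[i - 1] and a >= mel_amplitudes[i + 1] and a > noise_gate:
            peaks.append((i, a))
    peaks.sort(key=lambda p: p[1], reverse=True)  # strongest first
    return sorted(peaks[:max_peaks])              # back to filter order


def is_silent_pause(peaks: List[Tuple[int, float]], energy_threshold: float) -> bool:
    """Flag a frame as a pause when the combined energy of its kept formants is low.

    Energy is taken here as the sum of squared amplitudes (an assumption).
    """
    return sum(a * a for _, a in peaks) < energy_threshold


if __name__ == "__main__":
    frame = [0.1, 0.9, 0.2, 0.05, 0.6, 0.1, 0.3, 0.02, 0.15, 0.01]
    peaks = top_formant_peaks(frame, noise_gate=0.1)
    print(peaks, is_silent_pause(peaks, energy_threshold=0.05))
```

Keeping only three peaks per frame is what keeps the per-frame cost small, which is consistent with the low-computation goal stated in the summary.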

Description

Technical field

[0001] The invention relates to the technical field of signal processing, in particular to a real-time speech emotion recognition method and device.

Background technique

[0002] At present, the detection information applied to the research of human emotion recognition includes voice, facial expression, physiological signal, body language, etc. Speech is the fastest and most natural way to communicate between people. The research on speech emotion recognition is of great significance to promote harmonious human-computer interaction.

[0003] Speech emotion recognition technology can be applied in many fields such as medical care, education, and business assistance. In medicine, speech emotion recognition is often used to identify the mental state of patients and assist the disabled to speak; in education, speech emotion recognition can be used to analyze key segments of interest to students, and at the same time detect the emotional state of students while l...

Application Information

IPC(8): G10L25/15, G10L25/24, G10L25/30, G10L25/63
CPC: G10L25/24, G10L25/15, G10L25/63, G10L25/30
Inventor: 刘振焘, 韩梦婷, 曹卫华, 黄海, 彭志昆
Owner CHINA UNIV OF GEOSCIENCES (WUHAN)