Real-time speech emotion recognition method and device thereof

A technology for real-time speech emotion recognition, applied in the field of signal processing. It addresses the large amount of calculation, the lack of real-time performance, and the low generalization ability of existing systems, and achieves a small amount of calculation while avoiding the over-fitting problem.

Active Publication Date: 2021-11-05

CHINA UNIV OF GEOSCIENCES (WUHAN)

12 Cites, 0 Cited by

Problems solved by technology

[0006] The present invention aims to solve two problems. Traditional speech emotion recognition methods focus on optimizing feature classification, which requires a large amount of calculation and cannot run in real time. Other methods infer emotion from the meaning of sentences and words in the text, which results in low generalization ability of the system.

Examples

Embodiment 1

[0071] Referring to Figure 1, an embodiment of the present invention provides a real-time speech emotion recognition method comprising the following steps:

[0072] Step 1: Extraction of Mel Spectrum

[0073] After the original speech signal is pre-emphasized, it is processed with a sliding Hamming window of 25 ms (recommended value) and a step size of 15 ms. Each frame is passed through a Fast Fourier Transform (FFT) and a Mel filter bank, and the Mel frequency of each frame is obtained from the following conversion formula between Mel frequency and Hertz-scale frequency:

[0074]

[0075] where m is the Mel frequency and f is the Hertz scale frequency.
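The conversion formula itself is not reproduced in this extract. As a minimal sketch, the snippet below uses the widely used form m = 2595 * log10(1 + f / 700) and its inverse; this form is an assumption and may differ from the one in the patent. The function names `hz_to_mel` and `mel_to_hz` are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a Hertz-scale frequency f to a Mel frequency m.

    Assumed form: m = 2595 * log10(1 + f / 700).
    """
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel: f = 700 * (10 ** (m / 2595) - 1)."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


if __name__ == "__main__":
    for f in (0.0, 1000.0, 4000.0):
        print(f, hz_to_mel(f))
```

With this form, 0 Hz maps to 0 Mel and 1000 Hz maps to roughly 1000 Mel, which is the usual anchor point of the scale.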

[0076] The center frequency of the Mel filter bank can be expressed as:

[0077]

[0078] where f(l) denotes the center frequency of Mel filter bank l on the Hertz scale, m_l is the lower bound of Mel filter bank l on the Mel scale, m_{l+1} is the lower bound of the Mel filter b...
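The pre-emphasis and framing stage of Step 1 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the pre-emphasis coefficient 0.97 is a conventional value that the text does not give, and the names `pre_emphasize`, `hamming`, and `frame_signal` are hypothetical. The 25 ms window and 15 ms step come from paragraph [0073].

```python
import math


def pre_emphasize(signal, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is an assumed, conventional value.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def hamming(n_samples):
    """Hamming window of length n_samples."""
    if n_samples == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
        for n in range(n_samples)
    ]


def frame_signal(signal, sample_rate, win_ms=25.0, step_ms=15.0):
    """Cut the signal into overlapping frames and apply a Hamming window.

    Trailing samples that do not fill a whole window are dropped.
    """
    win = int(round(sample_rate * win_ms / 1000.0))
    step = int(round(sample_rate * step_ms / 1000.0))
    window = hamming(win)
    frames = []
    for start in range(0, len(signal) - win + 1, step):
        frames.append([signal[start + n] * window[n] for n in range(win)])
    return frames


if __name__ == "__main__":
    sr = 16000  # assumed sample rate
    tone = [math.sin(2.0 * math.pi * 440.0 * n / sr) for n in range(sr)]
    frames = frame_signal(pre_emphasize(tone), sr)
    print(len(frames), len(frames[0]))
```

At an assumed 16 kHz sample rate the window is 400 samples and the step is 240 samples, so consecutive frames overlap by 10 ms.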

Embodiment 2

[0141] Referring to Figure 2, the present invention also provides a real-time speech emotion recognition device, mainly comprising the following modules:

[0142] Mel spectrum extraction module 1, formant extraction and registration module 2, syllable segmentation and statistics module 3, syllable emotion classification module 4, and sentence-level confidence aggregation module 5.

[0143] In some embodiments, the Mel spectrum extraction module 1 specifically includes:

[0144] The pre-emphasis module is used to perform pre-emphasis processing on the original speech signal to obtain a pre-emphasis signal;

[0145] Frame-based windowing and Fourier transform module, used to perform frame-based windowing and Fourier transform processing on the pre-emphasized signal to obtain a transformed signal;

[0146] Mel filtering module, for processing the transformed signal through a Mel filter bank to obtain the Mel frequency of each sampling frame;

[0147] The adjacent frame conn...
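The Fourier transform and Mel filtering sub-modules listed above can be illustrated with a short sketch. Everything here is an assumption made for illustration: the triangular filter shape, the count of 20 filters, the 2595 * log10 Mel formula, and the function names are not taken from the patent. A naive DFT stands in for the FFT so that the example needs only the standard library.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)  # assumed Mel formula


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2 (FFT in practice)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spec.append(abs(acc) ** 2 / n)
    return spec


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters with edges equally spaced on the Mel scale."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges = [
        mel_to_hz(max_mel * i / (n_filters + 1)) for i in range(n_filters + 2)
    ]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for i in range(1, n_filters + 1):
        lo, mid, hi = edges[i - 1], edges[i], edges[i + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))  # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mel_energies(frame, sample_rate, n_filters=20):
    """Weighted sum of the power spectrum under each Mel filter."""
    spec = power_spectrum(frame)
    bank = mel_filter_bank(n_filters, len(frame), sample_rate)
    return [sum(w * p for w, p in zip(row, spec)) for row in bank]


if __name__ == "__main__":
    sr, n = 8000, 256
    tone = [math.sin(2.0 * math.pi * 1000.0 * t / sr) for t in range(n)]
    energies = mel_energies(tone, sr)
    print(energies.index(max(energies)))
```

For a pure 1000 Hz tone, the largest output comes from a filter whose triangle covers 1000 Hz, which is a quick sanity check on the filter edges.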

Abstract

The invention provides a real-time speech emotion recognition method and a device thereof. The method extracts formants from the Mel spectrum of a speech signal: candidate formants are detected by comparing the local amplitude maxima of the Mel filter bank, the three formants with the largest amplitudes are kept as the main formants, other formants with insignificant effect are filtered out by a real-time noise gate, and finally the formants that match well with those of the adjacent frame are selected. The maxima and minima of the formant amplitude are used to separate syllables, and silent pauses during speech are identified from the composite energy of the first three formants in each frame, which allows syllable boundaries to be detected. A syllable-level emotion recognition method using 15 hand-crafted features is also provided. Together, these steps achieve real-time and accurate recognition of speech emotion.
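The per-frame formant selection described in the abstract can be sketched roughly as below. This is a simplified illustration, not the patented procedure: the fixed gate threshold, the sum-of-squares definition of composite energy, and the names `local_maxima`, `top_formants`, and `is_silent` are all assumptions, and the matching of formants across adjacent frames is left out.

```python
def local_maxima(values):
    """Indices whose value is strictly greater than both neighbours."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    ]


def top_formants(mel_amplitudes, gate=0.0, n_keep=3):
    """Keep up to n_keep local peaks that pass the noise gate.

    Peaks below `gate` are discarded; the strongest n_keep survivors are
    returned in ascending filter-index order.
    """
    peaks = [i for i in local_maxima(mel_amplitudes) if mel_amplitudes[i] >= gate]
    strongest = sorted(peaks, key=lambda i: mel_amplitudes[i], reverse=True)
    return sorted(strongest[:n_keep])


def is_silent(mel_amplitudes, formant_idx, energy_threshold):
    """Flag a frame as a silent pause when the composite energy is low.

    Composite energy is assumed here to be the sum of squared amplitudes
    of the selected formants.
    """
    composite = sum(mel_amplitudes[i] ** 2 for i in formant_idx)
    return composite < energy_threshold


if __name__ == "__main__":
    amps = [0.1, 0.9, 0.2, 0.05, 0.6, 0.1, 0.3, 0.2, 0.8, 0.1, 0.15, 0.05]
    formants = top_formants(amps, gate=0.2)
    print(formants, is_silent(amps, formants, energy_threshold=0.5))
```

In the example frame, the peaks at filter indices 1, 4, and 8 survive the gate and are kept, while the weaker peaks at 6 and 10 are dropped.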

Description

Technical field

[0001] The invention relates to the technical field of signal processing, in particular to a real-time speech emotion recognition method and device.

Background technique

[0002] At present, the detection information applied to the research of human emotion recognition includes voice, facial expression, physiological signal, body language, etc. Speech is the fastest and most natural way to communicate between people. The research on speech emotion recognition is of great significance to promote harmonious human-computer interaction.

[0003] Speech emotion recognition technology can be applied in many fields such as medical care, education, and business assistance. In medicine, speech emotion recognition is often used to identify the mental state of patients and assist the disabled to speak; in education, speech emotion recognition can be used to analyze key segments of interest to students, and at the same time detect the emotional state of students while l...

Application Information

IPC(8): G10L25/15, G10L25/24, G10L25/30, G10L25/63

CPC: G10L25/24, G10L25/15, G10L25/63, G10L25/30

Inventors: 刘振焘, 韩梦婷, 曹卫华, 黄海, 彭志昆

Owner: CHINA UNIV OF GEOSCIENCES (WUHAN)
