Cross-database Speech Emotion Recognition Method and Device Based on Joint Distribution Least Squares Regression

A technology combining speech emotion recognition and least squares regression, applied in speech analysis, instruments, etc., to achieve good adaptability and accurate recognition results.

Active Publication Date: 2022-06-28

SOUTHEAST UNIV

Problems solved by technology

Cross-database speech emotion recognition therefore faces great challenges.

Embodiment Construction

[0064] This embodiment provides a cross-database speech emotion recognition method based on joint distribution least squares regression. As shown in Figure 1, the method includes the following steps:

[0065] (1) Acquire two speech databases to serve as the training database and the test database, respectively. The training database contains several speech segments and their corresponding speech emotion category labels, while the test database contains only the speech segments to be recognized.

[0066] In this embodiment, we use three speech emotion databases commonly used in speech emotion recognition: Berlin, eNTERFACE and CASIA. Because the three databases contain different emotion categories, the data are selected separately for each pairwise comparison. When comparing Berlin and eNTERFACE, we selected 375 and 1077 samples respectively, covering 5 emotion categories (anger, fear, happiness, disgust, sadness); when Berlin and CASIA were compared, we sele...
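The pairwise selection described in [0066] amounts to keeping only the samples whose label is shared by both databases. The following is a minimal sketch of that filtering step, not the patent's own procedure. The helper `select_common` is hypothetical, and the label inventories listed for Berlin and eNTERFACE are assumptions based on how these corpora are commonly described; only the five shared categories are stated in the text.

```python
# Assumed label inventories (not given in the text above).
BERLIN = {"anger", "boredom", "disgust", "fear", "happiness", "neutral", "sadness"}
ENTERFACE = {"anger", "disgust", "fear", "happiness", "sadness", "surprise"}

def select_common(samples, labels, shared):
    """Keep only the (sample, label) pairs whose label is in `shared`."""
    kept = [(s, l) for s, l in zip(samples, labels) if l in shared]
    return [s for s, _ in kept], [l for _, l in kept]

# Categories present in both databases; these form the label set
# used for the Berlin vs. eNTERFACE comparison.
shared = sorted(BERLIN & ENTERFACE)

# Toy usage with placeholder sample identifiers.
samples, labels = select_common(
    ["utt1", "utt2", "utt3"], ["anger", "boredom", "fear"], shared
)
```

Under these assumed inventories the intersection contains exactly the five categories named in [0066].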

Abstract

The invention discloses a cross-database speech emotion recognition method and device based on joint distribution least squares regression. The method includes: (1) acquiring a training database and a test database, wherein the training database contains several speech segments and their corresponding speech emotion category labels, and the test database contains only the speech segments to be recognized; (2) processing each speech segment with several acoustic low-dimensional descriptors and computing statistics over them, treating each statistic as an emotional feature and combining the emotional features into the feature vector of the corresponding speech segment; (3) establishing a least squares regression model based on the joint distribution, and training it jointly on the training database and the test database to obtain a sparse projection matrix; (4) for each speech segment to be recognized, obtaining its feature vector according to step (2) and applying the learned sparse projection matrix to obtain the corresponding speech emotion category label. The invention can adapt to different environments and achieves higher accuracy.
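Steps (2) to (4) can be sketched as follows. This is an illustrative reading only: the patent's actual objective function is not reproduced in this text, so everything concrete here is an assumption. The sketch assumes mean, standard deviation, minimum and maximum as the segment statistics; a joint-distribution term made of the marginal and class-conditional mean gaps between the two databases (with pseudo-labels on the test data); an L2,1 penalty to encourage a row-sparse projection matrix; and an iteratively reweighted least squares solver. All function names and the parameters `lam`, `mu`, `n_iter` are hypothetical.

```python
import numpy as np

def segment_features(lld):
    """Step (2), assumed form: collapse a (frames x descriptors) array of
    frame-level descriptors into one vector of per-descriptor statistics."""
    return np.concatenate([lld.mean(axis=0), lld.std(axis=0),
                           lld.min(axis=0), lld.max(axis=0)])

def mean_gaps(Xs, ys, Xt, yt, n_classes):
    """Rows: the marginal mean gap, then one class-conditional mean gap for
    every class present in both the training and pseudo-labelled test data."""
    gaps = [Xs.mean(axis=0) - Xt.mean(axis=0)]
    for c in range(n_classes):
        s, t = Xs[ys == c], Xt[yt == c]
        if len(s) > 0 and len(t) > 0:
            gaps.append(s.mean(axis=0) - t.mean(axis=0))
    return np.vstack(gaps)

def fit_projection(Xs, ys, Xt, n_classes, lam=1.0, mu=0.1, n_iter=10, eps=1e-8):
    """Step (3), assumed objective:
        ||Y - Xs U||_F^2 + lam * ||D U||_F^2 + mu * ||U||_{2,1}
    where D stacks the mean gaps. Solved by iteratively reweighted least
    squares; test-set pseudo-labels are refreshed on every pass."""
    Y = np.eye(n_classes)[ys]            # one-hot label matrix
    yt = np.full(len(Xt), -1)            # no pseudo-labels yet: marginal gap only
    g = np.ones(Xs.shape[1])             # reweighting for the L2,1 term
    for _ in range(n_iter):
        D = mean_gaps(Xs, ys, Xt, yt, n_classes)
        A = Xs.T @ Xs + lam * (D.T @ D) + mu * np.diag(g)
        U = np.linalg.solve(A, Xs.T @ Y)
        g = 1.0 / (2.0 * np.linalg.norm(U, axis=1) + eps)
        yt = (Xt @ U).argmax(axis=1)     # update pseudo-labels
    return U

def predict(X, U):
    """Step (4): project features and pick the highest-scoring class."""
    return (X @ U).argmax(axis=1)
```

Here `Xs` and `Xt` hold one feature vector per row for the training and test databases, and `ys` holds integer class indices. The test database contributes only unlabeled features, which is what allows it to take part in the joint training described in step (3).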

Description

Technical field

[0001] The present invention relates to speech emotion recognition, in particular to a cross-database speech emotion recognition method and device based on joint distribution least squares regression.

Background

[0002] The purpose of speech emotion recognition is to give machines enough intelligence to extract the speaker's emotional state (such as happiness, fear, or sadness) from the speaker's speech. It is an important part of human-computer interaction and has great research potential and development prospects. For example, detecting a driver's mental state by combining the driver's voice, facial expressions and behavior information makes it possible to remind the driver in time to concentrate and avoid dangerous driving; detecting the interlocutor's vocal emotion during human-computer interaction can make the conversation smoother and more attentive to the interlocutor's psychological state, bringing it closer to human cognition; wearable devices can give more timely an...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L25/63, G10L25/27, G10L25/24, G10L25/21, G10L25/18

CPC: G10L25/63, G10L25/27, G10L25/24, G10L25/21, G10L25/18

Inventor: 宗源, 江林, 张佳成, 郑文明, 江星洵, 刘佳腾

Owner: SOUTHEAST UNIV
