End-to-end emotion recognition method based on Chinese speech OpenSmile and bidirectional LSTM

A speech emotion recognition technology, applied in speech recognition, speech analysis, instruments, and related fields. It addresses the problems of low recognition accuracy, easily omitted information, and classification errors, and achieves high recognition accuracy.

Pending Publication Date: 2021-04-09

SHANGHAI MOTION MAGIC DIGITAL ENTERTAINMENT

AI Technical Summary

Problems solved by technology

Emotion in speech is a linguistic phenomenon with great inherent uncertainty, which makes it very difficult to identify.

In fact, studies have shown that humans themselves recognize emotions with an accuracy of only about 60%. It is even more difficult for machines to recognize emotions that humans find hard to judge.

[0004] In the prior art, Chinese patent CN109785863A discloses a speech emotion recognition method based on a deep belief network. In this method, the speech signal features are recognized and classified by a support vector machine. Although speech emotion can be recognized and classified in this way, the method tends to miss some information when processing time-related feature sequences. In addition, the support vector machine is biased towards binary classification, so the emotion analysis may produce errors, resulting in low recognition accuracy.

Embodiment Construction

[0040] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts shall fall within the protection scope of the present invention.

[0041] The process of the end-to-end emotion recognition method based on Chinese speech OpenSmile and bidirectional LSTM is shown in Figure 1 and includes:

[0042] Step 1: Obtain the Chinese speech audio to be recognized, and preprocess the audio data, specifically:

[0043] Obtain the collection of Chinese speech audio to be recognized, classify the audio according to the corresponding emotion, add the corresponding digital label, and then divide it ...
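The labelling and splitting described in Step 1 might be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's procedure: the emotion names, the 80/20 ratio, the per-class (stratified) split, and the function name `label_and_split` are all hypothetical, since the source text is truncated before it gives these details. Only the division into a training set and a test set is taken from the abstract.

```python
import random
from collections import defaultdict

# Hypothetical emotion classes and digital labels; the source does not list them.
EMOTION_LABELS = {"angry": 0, "happy": 1, "neutral": 2, "sad": 3}


def label_and_split(clips, test_fraction=0.2, seed=0):
    """Attach a digital label to each clip and split into training and test sets.

    clips: list of (audio_path, emotion_name) pairs.
    Returns (train, test), each a list of (audio_path, integer_label) pairs.
    The split is done per emotion class so every class appears in the test set
    (an assumption; the source does not say how the split is performed).
    """
    paths_by_label = defaultdict(list)
    for path, emotion in clips:
        paths_by_label[EMOTION_LABELS[emotion]].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label, paths in sorted(paths_by_label.items()):
        paths = sorted(paths)
        rng.shuffle(paths)
        n_test = max(1, round(len(paths) * test_fraction))
        test += [(p, label) for p in paths[:n_test]]
        train += [(p, label) for p in paths[n_test:]]
    return train, test
```

With five clips per emotion and the default fraction, each class contributes one clip to the test set and four to the training set. A class with a single clip would end up entirely in the test set, so a real pipeline would need enough recordings per emotion.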

Abstract

The invention relates to an end-to-end emotion recognition method based on Chinese speech OpenSmile and bidirectional LSTM. The method comprises: step 1, obtaining the Chinese speech audio to be recognized and preprocessing the audio data; step 2, extracting MFCC audio features from the speech audio of the training set and the test set using OpenSmile; step 3, training the bidirectional LSTM network on the training set; step 4, testing the trained bidirectional LSTM network on the test set, calculating the test accuracy, and judging whether it is greater than a preset threshold, proceeding to step 5 if so and otherwise returning to step 3; and step 5, performing emotion recognition on the Chinese speech audio using the bidirectional LSTM network that has reached the preset accuracy threshold. Compared with the prior art, the method offers high recognition accuracy and supports multiple speakers as well as both long and short sentences.
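The control flow of steps 3 to 5 (train, test, compare against a threshold, repeat) can be sketched as below. This is only an outline under stated assumptions: `train_until_accurate`, `train_step`, `evaluate`, and `max_rounds` are hypothetical names, and the cap on the number of rounds is a safety measure added here that the abstract does not mention. In a real system, `train_step` would train the bidirectional LSTM on the OpenSmile MFCC features of the training set and `evaluate` would return the accuracy on the test set.

```python
def train_until_accurate(model, train_step, evaluate, threshold, max_rounds=100):
    """Alternate training (step 3) and testing (step 4) until accuracy > threshold.

    model:      any object representing the network state
    train_step: callable(model) -> model, one training pass on the training set
    evaluate:   callable(model) -> float, accuracy on the test set in [0, 1]
    threshold:  preset accuracy threshold from step 4
    max_rounds: hypothetical safety cap so the loop cannot run forever
    Returns (model, accuracy, rounds) once the threshold is exceeded.
    """
    for round_index in range(1, max_rounds + 1):
        model = train_step(model)        # step 3: train on the training set
        accuracy = evaluate(model)       # step 4: compute test accuracy
        if accuracy > threshold:         # step 4: compare with preset threshold
            return model, accuracy, round_index  # ready for step 5
        # otherwise fall through and return to step 3
    raise RuntimeError(f"accuracy did not exceed {threshold} in {max_rounds} rounds")
```

For a quick check, the callables can be replaced with toy stand-ins, for example a counter as the model and an accuracy that grows with each round. The returned model is the one that would be used for recognition in step 5.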

Description

Technical field

[0001] The invention relates to the technical field of speech-based emotion recognition methods, in particular to an end-to-end emotion recognition method based on Chinese speech OpenSmile and bidirectional LSTM.

Background technique

[0002] With the development of artificial intelligence technology, computers have become close partners of human beings. They can help us retrieve knowledge, plan cities, predict financial trends, ensure production safety, and even play chess and video games with us. For such an intimate "life partner", we naturally hope that the computer can be knowledgeable, not a cold machine. In order to give computers emotions, researchers have carried out a great deal of research on images, text, and speech. So far, at least at the perceptual level, machines have been able to distinguish good words and understand good faces.

[0003] Compared with speaker recognition and language recognition, speech emotion recognit...

Application Information

IPC(8): G10L15/02, G10L15/06, G10L15/183, G10L15/26, G10L25/24, G10L25/63

CPC: G10L15/02, G10L15/063, G10L15/183, G10L15/26, G10L25/24, G10L25/63

Inventor: 吴强, 季晓枫, 施恩铭, 马俊, 郭翔

Owner: SHANGHAI MOTION MAGIC DIGITAL ENTERTAINMENT
