Method and device for recognizing emotion of voice session, server and storage medium

An emotion recognition and voice technology, applied in voice recognition, voice analysis, instruments, etc., that addresses the problems of high manpower consumption, long model training time, and no guarantee on the time needed to tune a model to a better result, and achieves the effect of improving optimization efficiency.

Active Publication Date: 2018-12-07
BEIJING BAIDU NETCOM SCI & TECH CO LTD
8 Cites, 22 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, the existing technology relies on a complete sample data set, which consumes a lot of manpower, and training the model takes a long time.
Moreover, adjusting model parameters cannot directly and effectively make the model pay special attention to a particular feature, so there is no guarantee on how long it takes to tune the model to a better effect.

Examples

Embodiment 1

[0030] Figure 1 is a flow chart of a voice conversation emotion recognition method provided by Embodiment 1 of the present invention. This embodiment is applicable to recognizing a user's voice emotion in an intelligent voice conversation scenario, and the method can be executed by a voice conversation emotion recognition device. The method specifically includes the following steps:

[0031] S110. Recognize the conversational voice by using a priori emotion recognition rule to obtain a first recognition result.

[0032] In a specific embodiment of the present invention, emotion is a general term for a series of subjective cognitive experiences, and refers to a user's psychological and physiological state comprehensively generated through various feelings, thoughts and behaviors. Furthermore, emotions reflect the user's mental state during human-computer voice interaction. Correspondingly, in order to provide users with better quality and more ...

Embodiment 2

[0050] On the basis of Embodiment 1 above, this embodiment provides a preferred implementation of the emotion recognition method for voice conversation, which can generate and select the currently available prior emotion recognition rules. Figure 2 is a flow chart of voice conversation emotion recognition based on prior emotion recognition rules provided by Embodiment 2 of the present invention. As shown in Figure 2, the method includes the following specific steps:

[0051] S210. Perform audio feature extraction on historical conversation speech associated with each preset emotional state.

[0052] In a specific embodiment of the present invention, the historical conversational voice refers to the user's voice generated during the intelligent voice interaction process between the user and the smart product or smart platform, and the historical conversational voice is determined by the emotion recognition result and the emotion recognition The result is co...
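Step S210 can be illustrated with a minimal sketch. The source text is truncated and does not name the audio features used, so the two features below (RMS energy and zero-crossing rate), the function names, and the input layout of `(emotion_label, samples)` pairs are all assumptions chosen for illustration only.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def rms_energy(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of one utterance (assumed non-empty)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose sign differs."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def features_by_emotion(
    history: Sequence[Tuple[str, Sequence[float]]]
) -> Dict[str, Dict[str, float]]:
    """Average the hypothetical features over all historical utterances
    that share the same preset emotional state."""
    grouped: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for emotion, samples in history:
        grouped[emotion].append(
            (rms_energy(samples), zero_crossing_rate(samples))
        )

    summary: Dict[str, Dict[str, float]] = {}
    for emotion, feats in grouped.items():
        n = len(feats)
        summary[emotion] = {
            "rms_energy": sum(f[0] for f in feats) / n,
            "zero_crossing_rate": sum(f[1] for f in feats) / n,
        }
    return summary
```

Per-emotion averages like these are one plausible starting point for deriving prior rules, for example by comparing a new utterance's features against each emotion's typical values.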

Embodiment 3

[0063] On the basis of Embodiment 1 above, this embodiment provides a preferred implementation of the emotion recognition method for voice conversation, which can use a neural network to perform emotion recognition on the spectrogram of the conversation speech. Figure 4 is a flow chart of voice conversation emotion recognition based on the emotion recognition model provided by Embodiment 3 of the present invention. As shown in Figure 4, the method includes the following specific steps:

[0064] S410. Generate a conversational spectrogram according to the conversational voice information.

[0065] In the specific embodiment of the present invention, in order to simplify the speech emotion recognition process and improve the accuracy of speech emotion recognition, in view of the fact that the image recognition technology is more mature than the speech recognition technology, this embodiment converts the speech recognition into image recognition, based ...
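As a rough illustration of step S410, a spectrogram can be computed by cutting the signal into overlapping frames, windowing each frame, and taking the magnitude of its discrete Fourier transform. The sketch below uses only the Python standard library; the frame size, hop length, Hann window, and the function name `spectrogram` are assumptions, since the source does not state how the spectrogram is generated.

```python
import cmath
import math
from typing import List, Sequence


def spectrogram(
    samples: Sequence[float], frame_size: int = 64, hop: int = 32
) -> List[List[float]]:
    """Return a magnitude spectrogram: one row per time frame,
    one column per frequency bin (0 .. frame_size // 2)."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
        for n in range(frame_size)
    ]
    bins = frame_size // 2 + 1
    frames: List[List[float]] = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(bins):
            # Naive DFT for bin k; fine for a sketch, too slow for production.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames
```

The resulting two-dimensional array can be treated as a grayscale image and passed to an image classifier, which is the conversion from speech recognition to image recognition that the paragraph describes. A real system would typically use an FFT library rather than this naive transform.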

Abstract

The embodiments of the invention disclose a method and a device for recognizing the emotion of a voice session, a server and a storage medium. The method includes the following steps: recognizing a session voice based on a prior emotion recognition rule to get a first recognition result; recognizing the session voice by using a pre-trained emotion recognition model to get a second recognition result; and obtaining the emotional state of the session voice according to the first recognition result and the second recognition result. According to the embodiments of the invention, prior knowledge that has been accumulated through a great deal of manual experience and practice and proved to be effective is merged into the recognition of voice emotions. As a result, the voice emotion recognition result can be quickly judged and intervened in after a simple data comparison, the effect of the emotion recognition model can be improved more quickly and clearly, and the optimization efficiency of the emotion recognition model and the speed and accuracy of voice emotion recognition can be improved.
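The three steps in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the first and second results are combined, so the sketch assumes the rule result takes precedence whenever a rule fires and the model result is used otherwise. The `Utterance` type, the example rule, and the stand-in model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Utterance:
    text: str           # transcript of the session voice (assumed available)
    mean_energy: float  # hypothetical loudness feature, normalised to [0, 1]


# A rule returns an emotion label when it fires, or None when it does not apply.
Rule = Callable[[Utterance], Optional[str]]


def angry_keyword_rule(u: Utterance) -> Optional[str]:
    """Hypothetical prior rule: a negative keyword spoken loudly suggests anger."""
    if "terrible" in u.text.lower() and u.mean_energy > 0.7:
        return "angry"
    return None


def recognize_with_rules(u: Utterance, rules: Sequence[Rule]) -> Optional[str]:
    """First recognition result: the label of the first rule that fires."""
    for rule in rules:
        label = rule(u)
        if label is not None:
            return label
    return None


def recognize_with_model(u: Utterance) -> str:
    """Second recognition result: stand-in for a pre-trained model."""
    return "neutral"


def recognize_emotion(
    u: Utterance,
    rules: Sequence[Rule],
    model: Callable[[Utterance], str] = recognize_with_model,
) -> str:
    first = recognize_with_rules(u, rules)
    second = model(u)
    # Assumed combination policy: a fired rule overrides the model output.
    return first if first is not None else second
```

With this policy, adding or editing a rule changes the output for the matching utterances immediately, without retraining the model, which is consistent with the quick intervention the abstract describes.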

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of speech processing, and in particular, to a method, device, server and storage medium for emotion recognition of speech conversations.

Background technique

[0002] With the rapid development of Internet of Things technology and the widespread promotion of intelligent hardware products, more and more users begin to use voice to communicate with intelligent products, and human-computer intelligent voice interaction has become an important interaction mode in artificial intelligence technology. Therefore, in order to provide users with more humanized services, recognition of user emotions through voice is one of the key problems to be solved by artificial intelligence.

[0003] At present, most of the existing technologies use the model training method based on machine learning or deep learning to obtain the speech emotion recognition model, and use the optimization method based on ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/26, G10L25/63
CPC: G10L15/26, G10L25/63
Inventor: 陈炳金, 林英展, 梁一川, 凌光, 周超
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD