
Voice emotion recognition method and device based on domain confrontation

A speech emotion recognition technology, applied in speech analysis, neural learning methods, character and pattern recognition, etc., to achieve high recognition accuracy and good classification.

Active Publication Date: 2020-04-10
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

Most existing methods process speech signals at one of two scales, the frame scale or the entire-sentence scale, and few methods consider combining the two.


Examples


Embodiment Construction

[0033] This embodiment provides a speech emotion recognition method based on domain confrontation, as shown in Figure 1 and Figure 2, including:

[0034] (1) Obtain a speech emotion database storing several speech signals and corresponding emotion category labels, and divide it into a source domain database and a target domain database.

[0035] The source domain database and the target domain database are divided by the Leave-One-Subject-Out Cross Validation method: the speech signals belonging to any one person in the speech emotion database, together with the corresponding emotion category labels, are used as the target domain database, and the speech signals and corresponding emotion category labels of all other people are used as the source domain database.
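The leave-one-subject-out division described in [0035] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `loso_splits` and the `(speaker_id, signal, label)` tuple layout are assumptions made for the example.

```python
def loso_splits(samples):
    """Yield one (held_out_speaker, source, target) split per speaker.

    `samples` is a list of (speaker_id, signal, label) tuples; this layout
    is a hypothetical choice for the sketch, not taken from the patent.
    """
    # Sort the speaker ids so that the split order is deterministic.
    speakers = sorted({speaker for speaker, _, _ in samples})
    for held_out in speakers:
        # Target domain: every sample of the held-out speaker.
        target = [s for s in samples if s[0] == held_out]
        # Source domain: the samples of all other speakers.
        source = [s for s in samples if s[0] != held_out]
        yield held_out, source, target


if __name__ == "__main__":
    demo = [
        ("spk1", "signal_a", "happy"),
        ("spk1", "signal_b", "sad"),
        ("spk2", "signal_c", "angry"),
    ]
    for held_out, source, target in loso_splits(demo):
        print(held_out, len(source), len(target))
```

Each speaker serves as the target domain exactly once, so a database with N speakers produces N source/target pairs.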

[0036] (2) For each speech signal in the source domain database and the target domain database, extract its IS10 feature as the global feature of the corresponding speech signal.

[0037] Among them, the IS10 fea...


Abstract

The invention discloses a voice emotion recognition method and device based on domain confrontation. The method comprises the steps: (1) obtaining a voice emotion database, and dividing the voice emotion database into a source domain database and a target domain database, (2) for each voice signal, extracting IS10 features as global features, (3) dividing the voice signal in time into a plurality of short segments, each overlapping the preceding and following segments by 50%, and extracting IS10 features of each short segment, (4) inputting the IS10 features of all the short segments into a bidirectional long short-term memory model, then into an attention mechanism model, and taking the output as local features, (5) concatenating the global features and the local features to serve as joint features, (6) establishing a neural network which comprises a domain discriminator and an emotion classifier, (7) training the neural network, wherein the total loss of the network is obtained by subtracting the loss of the domain discriminator from the loss of the emotion classifier, and (8) obtaining the joint features of the speech signals to be recognized, and inputting the joint features into the trained neural network to obtain a predicted emotion category. The method yields more accurate recognition results.
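Steps (3), (5) and (7) of the abstract can be sketched with NumPy as follows. This is a hedged illustration under stated assumptions: the helper names `split_overlapping`, `joint_feature` and `total_loss` are hypothetical, the segment length is a free parameter the excerpt does not specify, and trailing samples that do not fill a whole segment are simply dropped here. The IS10 extraction and the BiLSTM/attention model are not reproduced.

```python
import numpy as np


def split_overlapping(signal, seg_len):
    """Split a 1-D signal into segments of `seg_len` samples, 50% overlap.

    Each segment shares its first half with the previous segment and its
    second half with the next one. Trailing samples that do not fill a
    whole segment are dropped (an assumption of this sketch).
    """
    if seg_len < 2 or seg_len % 2 != 0:
        raise ValueError("seg_len must be an even number >= 2")
    hop = seg_len // 2  # 50% overlap means the hop is half a segment
    n_segments = (len(signal) - seg_len) // hop + 1
    if n_segments < 1:
        raise ValueError("signal is shorter than one segment")
    return np.stack(
        [signal[i * hop : i * hop + seg_len] for i in range(n_segments)]
    )


def joint_feature(global_feat, local_feat):
    """Concatenate the global and local feature vectors (step 5)."""
    return np.concatenate([global_feat, local_feat])


def total_loss(emotion_loss, domain_loss):
    """Total loss as stated in step 7: emotion loss minus domain loss."""
    return emotion_loss - domain_loss


if __name__ == "__main__":
    segments = split_overlapping(np.arange(10), seg_len=4)
    print(segments)
    print(joint_feature(np.zeros(3), np.ones(2)).shape)
    print(total_loss(1.5, 0.5))
```

Subtracting the domain discriminator's loss means that lowering the total loss rewards features the discriminator cannot assign to the source or target domain, while the emotion classifier's loss is still reduced directly.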

Description

Technical field

[0001] The invention relates to speech emotion recognition technology, in particular to a speech emotion recognition method and device based on domain confrontation.

Background technique

[0002] Speech emotion recognition is a hot research problem in the field of affective computing, with broad application prospects. Since speech signals have unique sequence properties, speech emotion recognition can be viewed as a dynamic or static classification problem. Most existing methods process speech signals from two perspectives: the frame scale and the entire sentence scale, and few methods consider combining the above two scales. The difficulty of speech emotion recognition lies in extracting appropriate speech emotion features and reducing the feature distribution difference between source domain database (training database) data and target domain database (test database) data.

Contents of the invention

[0003] Purpose of the invention: The present inventio...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/63, G10L25/30, G06K9/62, G06N3/08
CPC: G10L25/63, G10L25/30, G06N3/08, G06F18/24, G06F18/214
Inventor: 郑文明, 郑婉璐, 宗源, 路成
Owner: SOUTHEAST UNIV