Speech emotion recognition method integrated with semantics

A speech emotion recognition technology integrated with semantics, applied in the field of speech emotion recognition. It addresses the difficulty of accurately recognizing the emotion in speech, poor processing results, and the difficulty of obtaining satisfactory results, and it improves processing capacity.

Pending Publication Date: 2022-05-27
ZHEJIANG COLLEGE OF ZHEJIANG UNIV OF TECHOLOGY

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a speech emotion recognition method that integrates semantics. It addresses the technical problems that existing speech emotion recognition methods have difficulty accurately recognizing the emotion of speech in complex scenes, that their processing of a single frame heavily contaminated with mixed noise is relatively poor, and that satisfactory results are therefore difficult to obtain.

Examples

Embodiment 1

[0036] Example 1: When collecting discourse data, teachers were the main research subjects, and high-quality course teaching videos from a website of excellent Chinese teachers were selected as research data. The courses on the website had been carefully screened by experts, had received good reviews, and were published on the website for viewing and study, so they are representative of excellent Chinese courses. Sixteen high school course videos were selected from the website. All of the lessons were conducted in a multimedia environment and come from the first grade of junior high school, covering language, mathematics, politics, biology and chemistry.

[0037] First, the video material is transcribed into text. Then, using the teacher evaluation framework developed by Walsh, the transcript is divided into segments and coded according to the principle of the teacher's goal-oriented discourse features. To ensure coding accuracy, the segments are kept as small as possible. Accordin...

Abstract

The invention discloses a speech emotion recognition method that fuses semantics, and relates to the technical field of speech emotion recognition. The method comprises the following steps: a computer inputs a video into a model based on a hierarchical attention mechanism to learn salient facial emotion features; the computer inputs the audio signal into a convolutional network guided by a frequency-domain attention mechanism to learn salient frequency-domain emotion features; the computer inputs the text word vectors into a semantic emotion feature extraction network to learn salient emotion features of the utterance text; and the computer fuses the audio emotion features and the semantic emotion features using a multi-modal attention mechanism. Through An-Net, a CNN-based model framework that fuses semantics and voice, the method can accurately recognize the emotion in the speech and voice of a specific person in a complex scene, using a multi-modal emotion recognition model guided by a hierarchical attention mechanism.
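The fusion step in the abstract, where audio and semantic emotion features are combined by a multi-modal attention mechanism, can be illustrated with a minimal sketch. This is not the patent's An-Net: the function names `softmax` and `attention_fuse`, the scaled dot-product scoring, and the `query` vector are all assumptions made for illustration, and the feature values are made up. In a trained model the query and the features would be learned rather than hard-coded.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(modality_features, query):
    """Fuse per-modality feature vectors into one vector (hypothetical helper).

    modality_features: dict mapping a modality name to a list of floats.
    query: list of floats of the same length; stands in for a learned vector.
    Returns (weights_by_modality, fused_vector).
    """
    dim = len(query)
    names = list(modality_features)
    for name in names:
        if len(modality_features[name]) != dim:
            raise ValueError(f"feature '{name}' must have length {dim}")

    # Assumed scoring rule: scaled dot product between each modality and the query.
    scores = [
        sum(f * q for f, q in zip(modality_features[name], query)) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)

    # The fused vector is the attention-weighted sum of the modality vectors.
    fused = [
        sum(w * modality_features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return dict(zip(names, weights)), fused


if __name__ == "__main__":
    # Made-up feature vectors standing in for the audio and text branch outputs.
    audio = [0.2, 0.9, -0.1, 0.4]
    text = [0.7, 0.1, 0.3, -0.2]
    query = [1.0, 0.0, 0.5, 0.0]
    weights, fused = attention_fuse({"audio": audio, "text": text}, query)
    print(weights)
    print(fused)
```

The weights always sum to 1, so the fused vector stays on the same scale as its inputs, and a modality that aligns more closely with the query contributes more to the result.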

Description

Technical field

[0001] The invention belongs to the technical field of speech emotion recognition, and in particular relates to a speech emotion recognition method incorporating semantics.

Background technique

[0002] Speech emotion recognition is one of the research hotspots of new human-computer interaction technology, and has a wide range of application prospects in artificial intelligence. Speech emotion recognition includes the establishment of an emotional speech database, speech emotion feature extraction, and a speech emotion recognition classifier. It studies the formation and changes of the speaker's emotional state from the perspective of speech signals, so as to make the interaction between computers and humans more intelligent. [...] features, sound quality features, and fusion features of the above features.

[0003] The existing speech emotion recognition methods have difficulty accurately recognizing the emotion of speech in complex scenes, and for a s...

Application Information

IPC(8): G10L15/16, G10L25/63, G06N3/04, G06F40/30
CPC: G10L15/16, G10L25/63, G06F40/30, G06N3/044, G06N3/045
Inventor: 陈少辉, 徐晓刚, 丁述勇
Owner ZHEJIANG COLLEGE OF ZHEJIANG UNIV OF TECHOLOGY