
Recognition method for multi-mode fused song emotion based on deep study

A deep learning and emotion recognition technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the problems of difficult feature extraction, difficult training and prediction, and low accuracy, achieving good emotional discrimination.

Active Publication Date: 2016-12-14
青岛类认知人工智能有限公司
Cites: 7 · Cited by: 29

AI Technical Summary

Problems solved by technology

[0003] Current recognition methods fall into three groups. Methods that recognize lyric text alone mostly apply models such as TF-IDF to the text's emotion; they generally require text preprocessing, achieve low accuracy in multilingual and multi-category recognition, and consider only the song's text while ignoring the influence of its melody on its category. Methods that recognize the song's voice alone mostly use only prosodic features or overall spectrum-based features: prosodic features carry strong emotional value but are hard to extract and are greatly affected by noise, while spectrum-based features perform poorly on exactly the passages that express strong emotion, and melody alone is insufficient to determine a song's emotional category, which greatly limits recognition of song emotion types. In the field of combined multimodal recognition, there are few recognition methods for song emotion categories (most target song style), and among those, few use deep methods with multiple modalities. Finally, in feature-model training, general machine learning methods face difficulties in training and prediction on high-dimensional, large-scale data.
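To make the criticized lyrics-only baseline concrete, here is a minimal pure-Python sketch of TF-IDF text emotion classification of the kind paragraph [0003] describes. This is an illustration, not the patented method; the toy lyrics, the nearest-neighbour decision rule, and all numeric values are invented for the example. The labels follow the four classes defined later in the embodiment (1 = miss, 2 = abreact, 3 = happy, 4 = sad).

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one TF-IDF weight dict per tokenized document."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))  # document frequency
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({w: (c / len(doc)) * math.log((1 + n) / (1 + df[w]))
                    for w, c in tf.items()})
    return out

def cosine(a, b):
    """Cosine similarity between two sparse weight dicts."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented toy corpus; labels: 1 = miss, 2 = abreact, 3 = happy, 4 = sad.
lyrics = ["i think of you every lonely night".split(),
          "scream it out let all the anger go".split(),
          "sunshine dancing smile all day long".split(),
          "tears fall down in the cold grey rain".split()]
labels = [1, 2, 3, 4]

vecs = tfidf(lyrics)

def classify(text):
    """Label a new lyric by its most similar training lyric."""
    q = tfidf(lyrics + [text.split()])[-1]
    sims = [cosine(q, v) for v in vecs]
    return labels[sims.index(max(sims))]
```

As the background notes, such a classifier sees only the words: a happy melody with melancholy lyrics would be judged purely on its text.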

Method used



Embodiment Construction

[0029] In this embodiment, a deep learning-based multimodal fusion song emotion category recognition method includes the following steps:

[0030] Step 1: collect a lyric-text database and a song-audio database, where the lyric text of each song is paired with the number of its song audio. The collected songs are classified by emotion into four classes: miss, abreact, happy, and sad, denoted by 1, 2, 3, and 4 respectively. The comprehensive emotional features of each song can be represented by a four-tuple Y.

[0031] Y = (E, V_T, V_V1, V_V2)
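A minimal sketch of the four-tuple from step 1. The interpretation of the components as the emotion label (E), the lyric-text feature vector (V_T), and two song-voice feature vectors (V_V1, V_V2) is an assumption inferred from the abstract's "first voice feature" and "second voice feature"; the field names and vector sizes are illustrative.

```python
from dataclasses import dataclass

@dataclass
class SongSample:
    """One labeled song: Y = (E, V_T, V_V1, V_V2)."""
    emotion: int                # E: 1 = miss, 2 = abreact, 3 = happy, 4 = sad
    text_feats: list            # V_T: lyric-text features
    voice_feats_1: list         # V_V1: first voice feature (assumed prosodic)
    voice_feats_2: list         # V_V2: second voice feature (assumed spectral)

    def __post_init__(self):
        if self.emotion not in (1, 2, 3, 4):
            raise ValueError("emotion label must be 1, 2, 3 or 4")

# Example with invented feature values.
sample = SongSample(emotion=3,
                    text_feats=[0.1, 0.4],
                    voice_feats_1=[0.2, 0.7],
                    voice_feats_2=[0.5, 0.3])
```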



Abstract

The invention discloses a recognition method for multi-mode fused song emotion based on deep learning. The method comprises the following steps: 1) acquiring song lyric text data and audio voice data; 2) extracting text features from the lyric text content to obtain lyric-text information features; 3) fusing a first voice feature and a second voice feature of the song voice data to obtain song-voice information features; 4) fusing the lyric-text information features with the song-voice information features a second time to obtain a comprehensive information feature of the song; 5) training a deep classifier on the comprehensive information feature to obtain a song emotion recognition model, through which multi-modal fused song emotion recognition is realized. By comprehensively combining the song's lyric-text information and song-voice information, the method increases the accuracy of judging a song's emotional state in human-machine interaction.
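The abstract's two-stage fusion can be sketched as follows. The fusion operator is assumed to be simple feature concatenation (early fusion), which the abstract does not fix, and all feature values are invented: stage 1 fuses the two voice features, stage 2 fuses the result with the text features into one comprehensive vector for the deep classifier.

```python
def fuse(*vectors):
    """Early fusion by concatenation (assumed fusion operator)."""
    out = []
    for v in vectors:
        out.extend(v)
    return out

# Invented example features.
text_feats = [0.1, 0.4, 0.2]   # lyric-text information features (step 2)
voice_feats_1 = [0.9, 0.3]     # first voice feature (e.g. prosodic)
voice_feats_2 = [0.5, 0.7]     # second voice feature (e.g. spectral)

voice_feats = fuse(voice_feats_1, voice_feats_2)   # stage 1 (step 3)
comprehensive = fuse(text_feats, voice_feats)      # stage 2 (step 4)
# `comprehensive` is what step 5 would feed to the deep classifier.
```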

Description

Technical Field

[0001] The invention belongs to the fields of natural language processing and affective computing, and specifically relates to a method for identifying song emotion categories based on deep learning and multi-modal fusion.

Background Technique

[0002] Affective computing refers to the ability of machines to recognize and understand human emotions. Text, speech, and other forms of information that humans use to express emotion contain feature values that can represent emotions, and songs are an important way for humans to express emotion. By extracting these feature values and applying machine learning methods, a machine can learn the emotional information the feature values contain, judge the emotional type of human songs, and thereby independently perform emotion recognition on songs.

Claims


Application Information

IPC (8): G10L15/02; G10L15/06; G10L15/18; G10L15/26
CPC: G10L15/02; G10L15/063; G10L15/1807; G10L15/26; G10L2015/0631; G10L2015/0635
Inventors: 孙晓 (Sun Xiao), 陈炜亮 (Chen Weiliang), 任福继 (Ren Fuji)
Owner 青岛类认知人工智能有限公司