Chinese song emotion classification method based on multi-modal fusion

A sentiment classification and multimodal technology, applied in audio data clustering/classification, biological neural network models, special data processing applications, etc., that addresses problems such as information loss and improves classification performance.

Active Publication Date: 2020-01-10
BEIJING UNIV OF TECH
Cites: 3; Cited by: 25

AI Technical Summary

Problems solved by technology

Most traditional music emotion classification methods analyze only the lyrics or only the audio. Single-modal data captures only part of an object's characteristics, so classification based on a single modality loses some information.




Embodiment Construction

[0033] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0034] Step one, data acquisition.

[0035] The present invention requires a Chinese song dataset suitable for multimodal music emotion classification; the dataset includes the lyrics, music comments, and audio of Chinese songs. The valence-arousal (VA) model is selected as the basis for music emotion classification, and the VA space is mapped into four discrete categories, namely "+V+A", "-V+A", "-V-A" and "+V-A", as shown in Figure 1. Data are collected for these four discrete categories, and the dataset is constructed in three steps: (1) Collection and emotion labeling of Chinese songs. Related Chinese songs are searched for on major music websites according to the emotion categories to be collected. The final dataset contains 400 Chinese songs with distinct emotional ...
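The mapping from the VA space to the four discrete categories can be illustrated with a minimal sketch. It assumes valence and arousal are real-valued scores centered at zero and that a value of exactly zero counts as positive; neither the scale nor the tie rule is stated in the text above, and the function name `va_quadrant` is hypothetical.

```python
def va_quadrant(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) point to one of the four VA categories.

    Assumption: both axes are centered at 0, and a score of exactly 0
    is treated as positive. The source does not specify either choice.
    """
    v_label = "+V" if valence >= 0 else "-V"
    a_label = "+A" if arousal >= 0 else "-A"
    return v_label + a_label


if __name__ == "__main__":
    # One illustrative point per quadrant (values are made up).
    for point in [(0.6, 0.8), (-0.6, 0.8), (-0.6, -0.8), (0.6, -0.8)]:
        print(point, va_quadrant(*point))
```

Under this convention, a song with high valence and high arousal falls in "+V+A" and one with low valence and low arousal falls in "-V-A".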

Abstract

The invention discloses a Chinese song emotion classification method based on multi-modal fusion. The method comprises the following steps: first, a spectrogram is obtained from the audio signal and audio low-level descriptors (LLDs) are extracted, and audio feature learning is then carried out with an LLD-CRNN model to obtain the audio features of a Chinese song; for the lyrics and comment information, a music emotion dictionary is first constructed, and emotion vectors based on emotion intensity and part of speech are then built on the basis of the dictionary to obtain the text features of the Chinese song; finally, multi-modal fusion is performed using a decision fusion method and a feature fusion method to obtain the emotion category of the Chinese song. The method is based on an LLD-CRNN music emotion classification model, which uses a spectrogram and audio low-level descriptors as its input sequence. Each LLD is concentrated in either the time domain or the frequency domain, whereas the spectrogram is a two-dimensional time-frequency representation of the audio signal that loses less information for signals whose time and frequency characteristics change together, so the LLDs and the spectrogram complement each other.
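The two fusion strategies named in the abstract can be contrasted with a short sketch. It is an illustration under stated assumptions, not the patent's procedure: feature fusion is shown as plain concatenation of the audio and text feature vectors, and decision fusion as a weighted average of per-category probabilities from two separately trained classifiers. The function names, the weighting rule, and the parameter `audio_weight` are all hypothetical; the excerpt does not specify them.

```python
from typing import List, Sequence

# The four VA categories listed in the embodiment.
CATEGORIES = ["+V+A", "-V+A", "-V-A", "+V-A"]


def feature_fusion(audio_features: Sequence[float],
                   text_features: Sequence[float]) -> List[float]:
    """Feature-level fusion (assumed here to be simple concatenation).

    The fused vector would be passed to a single downstream classifier.
    """
    return list(audio_features) + list(text_features)


def decision_fusion(audio_probs: Sequence[float],
                    text_probs: Sequence[float],
                    audio_weight: float = 0.5) -> str:
    """Decision-level fusion (assumed here to be a weighted average).

    Each argument holds one probability per entry of CATEGORIES, produced
    by an audio-only and a text-only classifier respectively.
    """
    if len(audio_probs) != len(CATEGORIES) or len(text_probs) != len(CATEGORIES):
        raise ValueError("expected one probability per category")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    fused = [audio_weight * a + (1.0 - audio_weight) * t
             for a, t in zip(audio_probs, text_probs)]
    best_index = max(range(len(fused)), key=fused.__getitem__)
    return CATEGORIES[best_index]


if __name__ == "__main__":
    audio = [0.6, 0.2, 0.1, 0.1]   # made-up audio classifier output
    text = [0.1, 0.7, 0.1, 0.1]    # made-up text classifier output
    print(decision_fusion(audio, text))
    print(decision_fusion(audio, text, audio_weight=0.8))
    print(feature_fusion([1.0, 2.0], [3.0]))
```

The design difference is where the modalities meet: feature fusion lets one classifier learn interactions between audio and text features, while decision fusion keeps the two classifiers independent and combines only their outputs.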

Description

Technical field

[0001] The invention relates to the fields of natural language processing, audio signal processing, and deep learning, and in particular to a multimodal fusion-based emotion classification method for Chinese songs.

Background technique

[0002] With the rapid development of computer networks and multimedia technology, more and more multimedia data such as text, images, audio, and video have emerged on the Internet. Music is an important part of multimedia data. With the explosive growth in the number of musical works and the continuous increase in music types, the organization and retrieval of musical works have attracted extensive attention from experts and scholars. Music is a carrier of emotion, emotion is the most important semantic information in music, and emotion words are the words most commonly used when retrieving and describing music. Therefore, music classification based on emotion can effectively improve the efficiency of music retr...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/65, G06F16/683, G06N3/04
CPC: G06F16/65, G06F16/685, G06N3/044, G06N3/045
Inventors: 朱贝贝, 王洁
Owner: BEIJING UNIV OF TECH