Attention mechanism and convolutional neural network-based voice depression recognition method

A recognition method based on an attention mechanism and a convolutional neural network, applied to speech depression recognition. It addresses the insufficient representation of speech data and the failure to extract deeper speech signal features, with the aim of improving recognition accuracy and training speed.

Active Publication Date: 2019-04-09
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

The problem is that the feature extraction process described above yields only low-level, hand-crafted speech features and does not capture the deeper features in the speech signal, so it cannot fully represent the speech data.
However, the problem is that not all speec

Examples

Embodiment Construction

[0052] The present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments.

[0053] Figure 1 is a flow chart of the method of the present invention, which comprises five stages: preprocessing the speech data, extracting spectrograms, building a pre-trained deep convolutional neural network model to obtain segment-level features, applying an attention mechanism algorithm to obtain sentence-level features, and classifying the output with an SVM model.
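The fourth stage, which turns segment-level features into a single sentence-level feature, can be illustrated with a small sketch. The excerpt does not give the attention formula, so the code below assumes a common form: a softmax over dot-product scores between each segment feature and a context vector. The function name `attention_pool`, the `context` vector, and the toy dimensions are all hypothetical and are not taken from the patent.

```python
import numpy as np

def attention_pool(segment_feats, context):
    """Collapse (n_segments, dim) features into one (dim,) vector.

    Assumed form: softmax over dot-product scores. The patent excerpt
    does not specify the scoring function actually used.
    """
    scores = segment_feats @ context      # one relevance score per segment
    scores = scores - scores.max()        # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ segment_feats, weights

rng = np.random.default_rng(0)
segment_feats = rng.normal(size=(6, 8))   # 6 segments, 8-dim toy features
context = rng.normal(size=8)              # hypothetical learned query vector
sentence_feat, weights = attention_pool(segment_feats, context)
```

With an all-zero context vector every segment receives the same weight, so the pooled feature reduces to the plain mean of the segment features. A non-zero context vector lets some segments contribute more than others.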

[0054] 1. Preprocessing of voice data

[0055] The present invention uses AVEC 2017-DSC, the database of a speech depression recognition challenge (see: Ringeval F, Schuller B, Valstar M, et al. Summary for AVEC 2017: Real-life Depression and Affect Challenge and Workshop[C] // ACM on Multimedia Conference. ACM, 2017: 1963-1964). The database contains 189 subjects: 107 in the training set, 35 in the validation set, and 47 in the test set. The process of collecting voice dat...

Abstract

The invention relates to a speech depression recognition method based on an attention mechanism and a convolutional neural network. The method comprises the following steps. First, the speech data is preprocessed: relatively long recordings are split into segments, each long enough to contain depression-related features. Next, a Mel spectrogram is extracted from each segment and resized to the input size of the neural network model to facilitate training. A pre-trained AlexNet deep convolutional neural network is then fine-tuned and used to extract higher-level speech features from the Mel spectrograms. An attention mechanism algorithm weights the segment-level speech features to obtain sentence-level speech features. Finally, an SVM (support vector machine) classification model performs depression classification on the sentence-level speech features. By focusing on the extraction of depression-related speech features, the invention provides a novel method of speech-based depression recognition.
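The first step of the abstract, splitting a long recording into segments, can be sketched as follows. This is a minimal illustration only: the segment length, hop size, and sampling rate are not stated in this excerpt, so the values used here are placeholders, and the helper name `split_into_segments` is hypothetical.

```python
import numpy as np

def split_into_segments(wave, sr, seg_seconds, hop_seconds):
    """Cut a 1-D waveform into fixed-length, possibly overlapping segments.

    Trailing samples that do not fill a whole segment are dropped.
    """
    seg_len = int(seg_seconds * sr)
    hop = int(hop_seconds * sr)
    if seg_len <= 0 or hop <= 0:
        raise ValueError("segment and hop lengths must be positive")
    if len(wave) < seg_len:
        return np.empty((0, seg_len), dtype=wave.dtype)
    starts = range(0, len(wave) - seg_len + 1, hop)
    return np.stack([wave[s:s + seg_len] for s in starts])

# Toy input: 10 s of samples at a placeholder rate of 100 Hz.
wave = np.arange(1000, dtype=np.float32)
segments = split_into_segments(wave, sr=100, seg_seconds=4.0, hop_seconds=2.0)
```

Each row of `segments` would then be converted to a Mel spectrogram and resized before being passed to the network. Those steps are omitted here because they depend on a signal-processing library and on parameters the excerpt does not give.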

Description

Technical field

[0001] The invention relates to the fields of speech processing, machine learning and deep learning, and in particular to a speech depression recognition method based on an attention mechanism and a convolutional neural network.

Background art

[0002] Depression is one of the most common mood disorders and often manifests as low mood, a negative attitude, self-blame and other negative states. Depression not only harms the individual but also seriously affects daily life, work, and interpersonal relationships. At present, however, the diagnosis of depression still depends on the subjective judgment of doctors, with evaluation scales used as auxiliary tools. It is therefore difficult to diagnose depression accurately, which in turn makes it difficult for patients to receive basic treatment. How to let the computer automatically analyze and judge the severity of the speaker's depression through the speech signal, that is, speech depr...

Application Information

IPC(8): G10L25/66, G10L25/45, G10L25/30, G10L25/18, G10L15/02, G10L15/04, G10L15/14
CPC: G10L15/02, G10L15/04, G10L15/14, G10L25/18, G10L25/30, G10L25/45, G10L25/66
Inventor: 戴国骏, 商吉利, 沈方瑶, 胡焰焰, 张桦
Owner: HANGZHOU DIANZI UNIV