Speech emotion recognition method combining CGAN spectrogram denoising and bilateral filtering spectrogram enhancement

A technology combining speech emotion recognition and bilateral filtering, applied in speech analysis, instruments, etc. It addresses the degradation of voice quality and emotional information caused by noise, and balances the enhancement of small details against the enhancement of strong edges.

Active Publication Date: 2021-02-05
HANGZHOU DIANZI UNIV
Cites: 8; Cited by: 2

AI Technical Summary

Problems solved by technology

[0003] In practical applications, speech is often accompanied by various kinds of noise. Noise degrades the quality of the speech and the emotional information it carries, which in turn impairs speech emotion recognition.


Examples


Embodiment Construction

[0065] The technical solutions of the present invention will be further explained below through specific examples.

[0066] As shown in Figure 1, the speech emotion recognition method combining CGAN spectrogram denoising and bilateral filtering spectrogram enhancement in this embodiment of the present invention includes the following steps:

[0067] S1. Collect a speech emotion data set and preprocess it to obtain a spectrogram data set of clean speech. Also add noise to the clean speech and preprocess the result to obtain a noise-added spectrogram data set, that is, the spectrogram data set for the noisy environment;
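The noise-adding part of S1 can be sketched as below. The source does not say how noise is mixed in, so mixing at a target signal-to-noise ratio (SNR) is an assumption here, and the function name `add_noise_at_snr` and its parameters are hypothetical.

```python
import numpy as np

def add_noise_at_snr(clean, noise, snr_db):
    """Mix `noise` into `clean` so the mixture has the requested SNR in dB.

    Assumption: SNR-controlled mixing; the source only says noise is added.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Tile, then trim, the noise so it covers the whole utterance.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the scale so 10*log10(clean_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise
```

Applying the same spectrogram preprocessing to both `clean` and the returned mixture would give the paired clean and noise-added spectrograms that S1 describes.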

[0068] Specifically, each speech signal in the speech emotion data set is preprocessed by framing and windowing, and a short-time discrete Fourier transform is then applied to obtain the spectrum X(k):

[0069] $$X(k) = \sum_{n=0}^{N-1} x(n)\, w(n)\, e^{-j 2\pi k n / N}, \quad 0 \le k \le N-1$$

[0070] where N is the window length, x(n) is the speech signal, w(n) is the Hamming window func...
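The framing, Hamming windowing, and discrete Fourier transform described in [0068] to [0070] can be sketched with NumPy as follows. The frame length, hop size, and the log-magnitude scaling are assumptions chosen for illustration, and `spectrogram` is a hypothetical helper name; none of these values come from the source.

```python
import numpy as np

def spectrogram(x, frame_len=256, hop=128):
    """Log-magnitude spectrogram via framing, Hamming windowing, and DFT.

    Assumptions: frame_len (the window length N), hop, and dB scaling.
    Returns an array of shape (frame_len // 2 + 1, n_frames).
    """
    x = np.asarray(x, dtype=np.float64)
    if len(x) < frame_len:
        # Zero-pad very short signals up to one full frame.
        x = np.pad(x, (0, frame_len - len(x)))
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)  # w(n)
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    # DFT of each windowed frame; rfft keeps the non-negative frequency bins.
    spectrum = np.fft.rfft(frames * window, axis=1)
    # Small constant avoids log(0) in silent frames.
    return 20.0 * np.log10(np.abs(spectrum) + 1e-10).T
```

For a real signal the negative-frequency bins mirror the positive ones, which is why the sketch keeps only `frame_len // 2 + 1` rows.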

Abstract

The invention discloses a speech emotion recognition method combining CGAN spectrogram denoising and bilateral filtering spectrogram enhancement. The method comprises the following steps: S1, acquiring a clean speech spectrogram and a noise-added speech spectrogram; S2, inputting the clean speech spectrogram and the noise-added speech spectrogram into a conditional generative adversarial network (CGAN) based on a matrix distance for training, to obtain a denoising model; S3, denoising the noise-added speech spectrogram with the denoising model, then applying bilateral filtering at two different scales to obtain a low-scale filtered map and a high-scale filtered map, multiplying the difference between the low-scale filtered map and the high-scale filtered map by an enhancement coefficient, and adding the result to the low-scale filtered map to obtain a detail-enhanced spectrogram; S4, inputting the detail-enhanced spectrogram into a convolutional neural network model for classification training, to obtain a classification model; and S5, processing the spectrogram of the speech to be recognized as in S3, and inputting the resulting detail-enhanced spectrogram into the classification model to obtain a speech emotion classification result. The invention thereby realizes speech emotion recognition effectively.
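The detail-enhancement part of S3 can be sketched as below: two bilateral filters at different scales, then the low-scale result plus the enhancement coefficient times the difference (low minus high). This is a minimal sketch, not the patent's implementation. The plain NumPy bilateral filter, the reading of "scale" as a (radius, spatial sigma, range sigma) triple, all default parameter values, and the names `bilateral_filter` and `enhance_details` are assumptions; the input is assumed to be a 2-D spectrogram normalized to [0, 1].

```python
import numpy as np

def bilateral_filter(img, radius, sigma_space, sigma_range):
    """Brute-force bilateral filter for a 2-D float image."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    padded = np.pad(img, radius, mode="reflect")
    num = np.zeros_like(img)
    den = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbour image shifted by (dy, dx).
            shifted = padded[radius + dy : radius + dy + h,
                             radius + dx : radius + dx + w]
            # Spatial weight depends on distance, range weight on intensity gap.
            w_space = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_space ** 2))
            w_range = np.exp(-((shifted - img) ** 2) / (2.0 * sigma_range ** 2))
            weight = w_space * w_range
            num += weight * shifted
            den += weight
    return num / den  # den >= 1 because the centre pixel has weight 1

def enhance_details(spec, alpha=1.5, low=(2, 1.0, 0.1), high=(4, 3.0, 0.3)):
    """Detail-enhanced spectrogram: low + alpha * (low - high).

    `alpha` plays the role of the enhancement coefficient; `low` and `high`
    are assumed (radius, sigma_space, sigma_range) settings.
    """
    low_img = bilateral_filter(spec, *low)
    high_img = bilateral_filter(spec, *high)
    return low_img + alpha * (low_img - high_img)
```

Because the bilateral filter's range weight suppresses averaging across large intensity jumps, the difference between the two filtered maps mainly carries fine texture, which is what the coefficient then amplifies.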

Description

Technical field

[0001] The invention belongs to the field of speech recognition and mainly relates to human-computer interaction; specifically, it is a speech emotion recognition method combining CGAN spectrogram denoising and bilateral filtering spectrogram enhancement.

Background

[0002] Applying speech emotion recognition to a human-computer interaction system can, on the one hand, give a robot human-like "emotion": it can detect the other party's emotional changes by listening and can communicate and interact with people more naturally and intelligently, endowing a new type of human-computer interaction system with a humanized, natural interaction mode that integrates automation and intelligence. On the other hand, systems based on speech emotion can offer new development ideas for medical treatment, machinery, education, and services, further enrich people's daily lives, and act as helpers that assist people in solving pra...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/63, G10L21/0208, G10L21/0232, G10L25/03, G10L25/18, G10L25/30, G10L25/45
CPC: G10L25/63, G10L25/45, G10L25/30, G10L25/03, G10L21/0208, G10L21/0232, G10L25/18
Inventor: 应娜, 李怡菲, 郭春生, 杨萌, 杨鹏, 方昕, 郭凡
Owner: HANGZHOU DIANZI UNIV