Deep neural network (DNN) voice enhancement model based on MEE optimization criteria

A deep neural network speech enhancement technology, applicable to biological neural network models, speech analysis, speech recognition and related fields, that addresses the unsatisfactory handling of non-stationary noise and achieves noise reduction for noisy speech.

Inactive Publication Date: 2018-06-08
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Because the MSE criterion performs poorly on non-stationary noise, a speech enhancement model based on a deep neural network is needed in which the MEE optimization criterion replaces the traditional MSE criterion.

Examples

Embodiment Construction

[0034] Hereinafter, the preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

[0035] 4620 clean utterances from the TIMIT data set are mixed with white noise, pink noise, Volvo noise and car noise at signal-to-noise ratios of -5 dB and 5 dB to form the noisy-speech training set. Another 200 clean utterances are mixed with babble noise and factory noise at the same signal-to-noise ratios to form the test set.
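
As an illustration of how noisy speech at a target signal-to-noise ratio can be built, the following is a minimal sketch, not the procedure of the invention. The function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "utterance" and the random noise are assumptions made for the example; real TIMIT utterances and recorded noise would take their place.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean` after scaling it to the requested SNR in dB."""
    # Repeat the noise if it is shorter than the utterance, then trim to length.
    if len(noise) < len(clean):
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[:len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose gain so that clean_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise

def measured_snr_db(clean, noisy):
    """SNR in dB of `noisy` relative to the known clean signal."""
    residual = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))

# Synthetic stand-ins (assumption): a 1 s tone as "speech", Gaussian noise.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(8000)

# One noisy version per SNR condition named in the text.
noisy_by_snr = {snr: mix_at_snr(clean, noise, snr) for snr in (-5, 5)}
```

Scaling the noise rather than the speech keeps the clean target unchanged across SNR conditions, which is convenient when the clean signal is also the regression target.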

[0036] In the training stage, features are extracted from the training-set speech, with the logarithmic power spectrum chosen as the feature, and are input to both the MSE-DNN network and the MEE-DNN network proposed by the present invention for training.
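
A logarithmic power spectrum can be computed frame by frame from a short-time Fourier transform. The sketch below is one common way to do it and is only illustrative: the frame length of 512 samples, the hop of 256 samples, the Hann window and the small `eps` floor are assumptions, since the text does not state these parameters.

```python
import numpy as np

def log_power_spectrum(signal, frame_len=512, hop=256, eps=1e-10):
    """Return (log-power spectrum, phase), each of shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the analysis window.
    frames = np.stack([
        signal[i * hop:i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectrum = np.fft.rfft(frames, axis=1)
    # eps avoids log(0) in bins with no energy.
    lps = np.log(np.abs(spectrum) ** 2 + eps)
    phase = np.angle(spectrum)
    return lps, phase

# Synthetic stand-in (assumption): a 1 kHz tone sampled at 16 kHz.
tone = np.sin(2 * np.pi * 1000 * np.arange(16000) / 16000)
features, phase = log_power_spectrum(tone)
```

Each row of `features` would serve as one input (or target) vector for the network; the phase is returned because a waveform cannot be rebuilt from the log-power spectrum alone.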

[0037] After network training is complete, the logarithmic power spectrum of the test-set speech is extracted in the same way and input to the two DNN networks to obtain estimates of the logarithmic power spectrum of the clean speech, and the overlap-add method is used to reconstruct th...
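
The overlap-add step can be sketched as follows, under stated assumptions. The helper names `analyze` and `overlap_add`, the frame length, hop and Hann window are hypothetical choices, and reusing the phase of the input signal is a common practice assumed here rather than something the text specifies. In place of a DNN estimate, the example passes the analyzed spectrum straight through, so it only demonstrates that the reconstruction inverts the feature extraction.

```python
import numpy as np

def analyze(signal, frame_len=512, hop=256, eps=1e-10):
    """Windowed STFT analysis returning (log-power spectrum, phase)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop:i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectrum = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spectrum) ** 2 + eps), np.angle(spectrum)

def overlap_add(lps, phase, frame_len=512, hop=256):
    """Rebuild a waveform from a log-power spectrum and a phase estimate."""
    window = np.hanning(frame_len)
    magnitude = np.exp(0.5 * lps)  # inverse of log(|X|^2)
    frames = np.fft.irfft(magnitude * np.exp(1j * phase), n=frame_len, axis=1)
    n_frames = frames.shape[0]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        start = i * hop
        # Weighted overlap-add: window again, then accumulate.
        out[start:start + frame_len] += frames[i] * window
        norm[start:start + frame_len] += window ** 2
    # Divide by the accumulated squared window; floor avoids division by zero.
    return out / np.maximum(norm, 1e-8)

# Synthetic stand-in (assumption): a 1 kHz tone sampled at 16 kHz.
signal = np.sin(2 * np.pi * 1000 * np.arange(16000) / 16000)
lps, phase = analyze(signal)
reconstructed = overlap_add(lps, phase)
```

In an enhancement pipeline, `lps` would be replaced by the network's estimate of the clean log-power spectrum before calling `overlap_add`. Samples near the very start and end are covered by a single tapered frame and are reconstructed less accurately than the interior.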

Abstract

The invention relates to a DNN speech enhancement model based on the MEE optimization criterion and belongs to the field of artificial intelligence speech enhancement. The model comprises an input layer, a hidden layer and an output layer, and the overall procedure is divided into a training stage and an enhancement stage. In the training stage, different types of noise are added to clean speech to create noisy speech at different signal-to-noise ratios; features are extracted from the noisy speech and input to the DNN for training. In the enhancement stage, the same features are extracted from the noisy speech to be tested and input to the trained DNN for decoding; the network outputs an estimate of the clean-speech features, and waveform reconstruction is carried out to obtain a noise-reduced speech file. The model generalizes well to denoising speech contaminated by non-stationary noise in practical problems.

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence speech enhancement, and mainly relates to the application of a deep neural network in a speech acoustic model.

Background technique

[0002] In recent years, with the successful application of the Deep Neural Network (DNN) in the field of speech recognition, speech enhancement has also made considerable progress. The deep nonlinear structure of a DNN can be designed as a fine-grained noise reduction filter, and with training on large amounts of data a DNN can learn the complex nonlinear relationship between noisy speech and clean speech.

[0003] In a speech enhancement model based on deep neural networks, a cost function is needed to update the network weights. In the regression task of speech enhancement, the minimum mean square error (MSE) criterion is generally used as the optimization criterion. Its advantage is simple calculation, but it is only suitable for stationary noise such a...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/02, G10L15/06, G10L21/0208, G10L25/03, G10L25/30, G06N3/08
CPC: G06N3/084, G10L15/02, G10L15/063, G10L21/0208, G10L25/03, G10L25/30
Inventor: 周翊, 黄张翼, 舒晓峰, 孙旭光
Owner CHONGQING UNIV OF POSTS & TELECOMM