Speech enhancement algorithm based on attention mechanism

A speech enhancement technology based on an attention mechanism, applied in speech analysis, biological neural network models, instruments, etc. It addresses problems such as limited model performance and the inability to deal effectively with complex noise changes, and it achieves better speech noise-reduction quality and better handling of unseen noise.

Status: Inactive
Publication Date: 2019-10-01
UNIV OF ELECTRONICS SCI & TECH OF CHINA
Cites: 3; Cited by: 6

AI Technical Summary

Problems solved by technology

The main defect of deep-learning-based speech enhancement is that model performance degrades sharply on noise that does not appear in the training set; the model lacks the ability to adapt to real, complex noise scenes.
To address this, a method commonly used in academia is to add adjacent frames to the input as context, so that the model can denoise the speech according to the context and produce a speech signal with better coherence and fidelity. However, this approach easily introduces too much irrelevant context, and when the context window is made larger, the irrelevant information limits the performance of the model.
The document "Dynamic Noise Aware Training for Speech Enhancement Based on Deep Neural Networks" uses the first few frames of the speech segment as an estimate of the noise, but this still cannot deal effectively with the complex noise changes found in real scenes.
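The two baseline strategies described above can be illustrated with a short NumPy sketch. The function names, the edge-padding choice, and the use of a simple mean over the first frames as the noise estimate are assumptions made for illustration; they are not taken from the patent or the cited paper.

```python
import numpy as np

def splice_context(frames: np.ndarray, k: int) -> np.ndarray:
    """Append k left and k right neighbours to every frame.

    frames has shape (n, d). Edge frames are padded by repetition
    (an assumption; other padding schemes are possible).
    """
    n, _ = frames.shape
    padded = np.pad(frames, ((k, k), (0, 0)), mode="edge")
    # Row t of the result is [x_{t-k}, ..., x_t, ..., x_{t+k}] flattened.
    return np.stack([padded[t:t + 2 * k + 1].reshape(-1) for t in range(n)])

def append_noise_estimate(frames: np.ndarray, n_noise_frames: int) -> np.ndarray:
    """Append a fixed noise estimate (mean of the first frames) to every frame."""
    noise = frames[:n_noise_frames].mean(axis=0)
    return np.hstack([frames, np.tile(noise, (frames.shape[0], 1))])

frames = np.arange(12.0).reshape(4, 3)       # 4 toy frames, 3 features each
with_context = splice_context(frames, k=1)   # shape (4, 9)
with_noise = append_noise_estimate(frames, n_noise_frames=2)  # shape (4, 6)
```

The input width grows linearly with `k` regardless of whether the added frames are relevant, and the noise estimate is fixed for the whole segment, which is consistent with the limitations described above.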


Examples

Embodiment Construction

[0026] The present invention is further described below with reference to the accompanying drawings:

[0027] This example provides a speech enhancement method based on the attention mechanism. Its network structure is shown in Figure 1, and it includes the following steps:

[0028] Step 1. At each time step, compute attention between the incoming frame $x_t$ at the current moment and the frames of the entire speech segment $(x_1, x_2, \ldots, x_n)$ to obtain the feature vector representation $c_t$ at the current moment;

[0029] The attention mechanism is used to compute the feature vector of the current frame with respect to the entire speech segment. The calculation is as follows:

[0030] $e_{tj} = G(x_t, x_j)$

[0031]

[0032]

[0033] where $G(\cdot)$ denotes an MLP (multilayer perceptron) computation and $x_t$ is the current ($t$-th) frame.
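Step 1 can be sketched in NumPy as follows. Only the score $e_{tj} = G(x_t, x_j)$ is given in the text; the one-hidden-layer form of $G$, the softmax normalisation of the scores, and the weighted sum that produces $c_t$ are assumptions based on common attention formulations, and all names (`mlp_score`, `attention_context`, `W1`, `b1`, `w2`) are hypothetical.

```python
import numpy as np

def mlp_score(x_t, x_j, W1, b1, w2):
    """G(x_t, x_j): a one-hidden-layer MLP on the concatenated pair (assumed form)."""
    h = np.tanh(W1 @ np.concatenate([x_t, x_j]) + b1)
    return float(w2 @ h)

def attention_context(frames, t, W1, b1, w2):
    """Return (c_t, alpha) for frame t against all frames of the segment."""
    # e_tj = G(x_t, x_j) for every frame j in the segment.
    e = np.array([mlp_score(frames[t], x_j, W1, b1, w2) for x_j in frames])
    # Assumption: scores are normalised with a softmax (shifted for stability).
    e = e - e.max()
    alpha = np.exp(e) / np.exp(e).sum()
    # Assumption: c_t is the attention-weighted sum of the frames.
    c_t = alpha @ frames
    return c_t, alpha

rng = np.random.default_rng(0)
n, d, hidden = 6, 4, 8                      # toy sizes
frames = rng.standard_normal((n, d))
W1 = rng.standard_normal((hidden, 2 * d)) * 0.1
b1 = np.zeros(hidden)
w2 = rng.standard_normal(hidden) * 0.1
c_t, alpha = attention_context(frames, 2, W1, b1, w2)
```

With untrained random weights the attention weights are close to uniform; in a trained model they would be expected to concentrate on the frames most useful for denoising the current frame.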

[0034] Step 2. Concatenate the feature vector at the current moment with the current frame to obtain the current input, and use a standard deep recurrent neural network to encode the current inpu...
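The concatenation and recurrent encoding in Step 2 might look like the sketch below. A single tanh recurrent layer stands in for the "standard deep recurrent neural network" named in the text; the layer type, the sizes, the random placeholder inputs, and the name `encode_step` are all assumptions for illustration.

```python
import numpy as np

def encode_step(c_t, x_t, h_prev, Wx, Wh, b):
    """One recurrent step on the concatenated input [c_t; x_t]."""
    u_t = np.concatenate([c_t, x_t])         # current input
    return np.tanh(Wx @ u_t + Wh @ h_prev + b)

rng = np.random.default_rng(0)
d, hidden = 4, 8                             # toy sizes
Wx = rng.standard_normal((hidden, 2 * d)) * 0.1
Wh = rng.standard_normal((hidden, hidden)) * 0.1
b = np.zeros(hidden)

h = np.zeros(hidden)
for _ in range(5):
    x_t = rng.standard_normal(d)             # placeholder speech frame
    c_t = rng.standard_normal(d)             # placeholder attention feature vector
    h = encode_step(c_t, x_t, h, Wx, Wh, b)
```

A deep variant would stack several such layers, feeding each layer's hidden state to the next as its input.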


Abstract

The invention discloses a speech enhancement algorithm based on an attention mechanism. A neural network speech enhancement model based on the attention mechanism is constructed, comprising three components: an attention-based neural network, a standard deep recurrent neural network, and a time-frequency masking layer. At each time step, the model computes attention between the incoming frame at the current time and the speech frames of the entire segment to obtain the feature vector representation for the current time step. The model input is obtained by concatenating the current time-step feature vector with the current speech frame, and this input is encoded by the standard deep recurrent neural network to obtain a predicted time-frequency mask. The predicted time-frequency mask is multiplied with the mixed speech, step by step, to obtain an enhanced speech segment. The algorithm models the speech enhancement problem from the perspective of improving the generalization performance of the model, and can effectively handle speech enhancement in noise scenes that do not appear in training.
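The final masking step can be illustrated as below. The sketch assumes that the mask is real-valued, has the same shape as the mixture spectrogram, and is applied as an element-wise product; the abstract does not specify these details, and `apply_mask` is a hypothetical name.

```python
import numpy as np

def apply_mask(mask: np.ndarray, mixed_spec: np.ndarray) -> np.ndarray:
    """Scale each time-frequency bin of the mixture by the predicted mask.

    Multiplying a complex bin by a real mask value scales its magnitude
    and leaves its phase unchanged.
    """
    if mask.shape != mixed_spec.shape:
        raise ValueError("mask and spectrogram must have the same shape")
    return mask * mixed_spec

# Toy 2x2 complex spectrogram of the mixed (noisy) speech.
mixed_spec = np.array([[1 + 1j, 2 + 0j],
                       [0 + 3j, -4 + 0j]])
# Toy mask: values near 0 suppress a bin, values near 1 keep it.
mask = np.array([[0.5, 1.0],
                 [0.0, 0.25]])
enhanced = apply_mask(mask, mixed_spec)
```

In a full system the enhanced spectrogram would then be converted back to a waveform with an inverse short-time Fourier transform.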

Description

Technical field

[0001] The invention relates to the field of speech signal processing, and in particular to a speech enhancement algorithm based on an attention mechanism.

Background

[0002] Speech enhancement is a fundamental problem in the field of speech processing. Speech-based human-computer interaction is currently booming. Under laboratory conditions, algorithms such as speech recognition and speaker recognition already achieve high accuracy, but in real scenarios the presence of noise degrades the accuracy of these speech applications. Reducing the interference of noise with speech signals is therefore an urgent problem. Speech enhancement algorithms based on deep learning have received a great deal of attention, produced much valuable work, and attracted the interest of many researchers.

[0003] The speech enhancement algorithm based on deep learning is a data-driven method, and the performance o...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L21/0264, G10L25/30, G06N3/02
CPC: G06N3/02, G10L21/0264, G10L25/30
Inventor: 蓝天, 李萌, 惠国强, 刘峤, 吕忆蓝, 钱雨欣, 叶文政, 彭川, 李森
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA