Transfer Learning Speech Enhancement Method Based on Self-Attention Multi-kernel Maximum Mean Discrepancy

The method combines multi-kernel maximum mean discrepancy with speech enhancement and is applied in speech analysis, instruments, and related fields. It addresses problems such as model mismatch and aims to improve the robustness and performance of speech enhancement by improving feature effectiveness. The source describes the method as novel.

Active Publication Date: 2021-02-19

NANJING INST OF TECH

9 Cites, 0 Cited by

Problems solved by technology

[0006] The purpose of the present invention is to overcome the model mismatch that existing single-channel speech enhancement methods suffer when the environment changes.

Embodiment Construction

[0043] The present invention will be further described below in conjunction with the accompanying drawings.

[0044] As shown in Figure 1, the transfer learning speech enhancement method based on self-attention multi-kernel maximum mean discrepancy of the present invention comprises the following steps:

[0045] Step (A): extract gammatone frequency cepstral coefficient (GFCC) features from the original speech and use them as the input features of the deep neural network;

[0046] Step (B): use the noisy speech and the clean speech to calculate the ideal ratio mask in the Fourier transform domain and use it as the training target of the deep neural network;

[0047] Step (C): construct a speech enhancement model based on a deep neural network as the baseline model. The baseline model is a 4-layer DNN speech enhancement model in which the first two layers form the feature encoder and the last two layers form the reconstruction decoder;

[0048] Step (D), according to...
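Steps (B) and (C) can be illustrated with a minimal NumPy sketch. It is not the patent's implementation. It assumes that the mask of step (B) is the commonly used ideal ratio mask with a square-root compression, that the noise is additive, and that the frame length, hop size, layer widths, and activations (ReLU and sigmoid) are free choices. Random vectors stand in for the GFCC features of step (A), and all function names are hypothetical.

```python
import numpy as np

def stft_mag(x, frame_len=512, hop=256):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)

def ideal_ratio_mask(clean, noisy, eps=1e-8):
    """Assumed mask definition: sqrt(|S|^2 / (|S|^2 + |N|^2)) per time-frequency bin."""
    noise = noisy - clean  # assumption: additive noise
    s2 = stft_mag(clean) ** 2
    n2 = stft_mag(noise) ** 2
    return np.sqrt(s2 / (s2 + n2 + eps))

def init_dnn(rng, in_dim, hidden, out_dim):
    """Four dense layers: two encoder layers followed by two decoder layers."""
    sizes = [in_dim, hidden, hidden, hidden, out_dim]
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, feats):
    """Return the encoder output (after layer 2) and the predicted mask."""
    h = feats
    hidden_outputs = []
    for w, b in params[:3]:                      # layers 1 to 3 use ReLU (assumption)
        h = np.maximum(h @ w + b, 0.0)
        hidden_outputs.append(h)
    w, b = params[3]
    mask = 1.0 / (1.0 + np.exp(-(h @ w + b)))    # sigmoid keeps the mask in (0, 1)
    return hidden_outputs[1], mask

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440 * t)              # toy "speech" signal
noisy = clean + 0.3 * rng.standard_normal(t.size)

irm = ideal_ratio_mask(clean, noisy)             # training target
feats = rng.standard_normal((irm.shape[0], 64))  # stand-in for GFCC features
params = init_dnn(rng, in_dim=64, hidden=128, out_dim=irm.shape[1])
encoded, predicted = forward(params, feats)
mse = np.mean((predicted - irm) ** 2)            # regression loss on the mask
```

Returning the encoder output separately is a design choice for the sketch: it exposes the intermediate features on which a domain-adaptation term could later be computed, while the mask loss is computed on the decoder output.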

Abstract

The invention discloses a transfer learning speech enhancement method based on self-attention multi-kernel maximum mean discrepancy. The method comprises: extracting GFCC features from the original speech and using them as the input features of a deep neural network; using the noisy speech and the clean speech to calculate the ideal ratio mask in the Fourier transform domain and using it as the training target of the deep neural network; building a speech enhancement model based on a deep neural network; building a transfer learning speech enhancement model with self-attention multi-kernel maximum mean discrepancy; training that transfer learning model; and inputting the frame-level features of noisy speech from the target domain and reconstructing the enhanced speech waveform. The invention adds a self-attention algorithm in front of the multi-kernel maximum mean discrepancy. By minimizing the multi-kernel maximum mean discrepancy between the attended features of the source domain and the attended features of the target domain, it achieves transfer learning to an unlabeled target domain and improves speech enhancement performance. The method has good application prospects.
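The adaptation term described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented formulation: self-attention is taken to be plain scaled dot-product attention over the frames of a batch with no learned projections, the multi-kernel measure (MK-MMD) is taken to be an equally weighted average of Gaussian kernels with hypothetical bandwidths, and the biased estimator of the squared discrepancy is used. All function names are hypothetical.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (assumed form)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x

def multi_kernel(a, b, bandwidths):
    """Average of Gaussian kernels evaluated on all pairs of rows of a and b."""
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2 * a @ b.T
    sq = np.maximum(sq, 0.0)                      # guard against rounding below zero
    return sum(np.exp(-sq / (2 * s ** 2)) for s in bandwidths) / len(bandwidths)

def mk_mmd2(xs, xt, bandwidths=(1.0, 2.0, 4.0, 8.0)):
    """Biased estimate of the squared multi-kernel maximum mean discrepancy."""
    return (multi_kernel(xs, xs, bandwidths).mean()
            + multi_kernel(xt, xt, bandwidths).mean()
            - 2 * multi_kernel(xs, xt, bandwidths).mean())

rng = np.random.default_rng(0)
source = rng.standard_normal((64, 16))               # stand-in source-domain features
target_same = rng.standard_normal((64, 16))          # same distribution as the source
target_shifted = rng.standard_normal((64, 16)) + 2.0 # mismatched target domain

loss_same = mk_mmd2(self_attention(source), self_attention(target_same))
loss_shifted = mk_mmd2(self_attention(source), self_attention(target_shifted))
```

In this toy setting the discrepancy is larger for the shifted target than for the matched one. In training, such a term would typically be added to the mask regression loss on labeled source data, which requires no labels from the target domain.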

Description

Technical field

[0001] The invention relates to the technical field of speech enhancement, and in particular to a transfer learning speech enhancement method based on self-attention multi-kernel maximum mean discrepancy.

Background

[0002] Speech enhancement has important applications in many areas of speech processing. Its purpose is to improve the quality and intelligibility of speech corrupted by noise. Early research on single-channel speech enhancement focused on how to estimate the noise spectrum from noisy speech and suppress it. Typical algorithms include spectral subtraction, Wiener filtering, minimum mean square error estimation, and the minima controlled recursive averaging noise estimation algorithm and its improved variants. These algorithms mainly address additive background noise and are designed around the complex statistical properties of noise and clean speech. However, the complex statistical int...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L21/02, G10L25/30, G10L25/03, G10L25/24

CPC: G10L21/02, G10L25/03, G10L25/24, G10L25/30

Inventor: 梁瑞宇, 程佳鸣, 梁镇麟, 谢跃, 王青云, 包永强, 赵力

Owner: NANJING INST OF TECH
