Training method and device of speech enhancement model as well as speech enhancement method and device

A speech enhancement and model training technology, applied in the audio field, that addresses problems such as degraded voice communication in noise, mediocre enhancement for a specific speaker, and high computational complexity.

Active Publication Date: 2021-06-08

BEIJING DAJIA INTERNET INFORMATION TECH CO LTD

Cites: 5; Cited by: 14

AI Technical Summary

Problems solved by technology

[0002] A noisy environment degrades voice communication. Current mainstream communication software usually applies various speech enhancement algorithms to process the noisy audio during a call. Traditional methods can handle steady-state noise and have the advantage of low computational complexity. Deep learning methods are usually used to remove transient noise; they perform better than traditional methods, but their computational complexity is high.

[0003] Noisy speech usually contains background noise or the voices of other speakers. To improve communication efficiency, it is necessary to obtain the pure speech of a specific speaker. Conventional speech enhancement can remove background noise and separate the voices of the individual speakers, but it still faces the problem of ordering the speakers: it cannot tell which speaker's voice should be output, so speech enhancement for a specific speaker performs only moderately.

Embodiment Construction

[0060] To enable persons of ordinary skill in the art to better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below in conjunction with the accompanying drawings.

[0061] It should be noted that the terms "first" and "second" in the specification and claims of the present disclosure and in the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances, such that the embodiments of the disclosure described herein can be practiced in sequences other than those illustrated or described herein. The implementations described in the following examples do not represent all implementations consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with aspects...

Abstract

The invention relates to a training method and device of a speech enhancement model as well as a speech enhancement method and device. The training method comprises the steps that: the feature vectors of noisy speech samples and first pure speech samples of a plurality of speakers are obtained, wherein the noisy speech sample of each speaker is obtained by adding noise data to a second pure speech sample corresponding to the speaker; the amplitude spectra of the noisy speech samples are input into a speech enhancement network to obtain an estimated first mask ratio; the estimated first mask ratio and the feature vector are input into an attention mechanism network to obtain an estimated second mask ratio; an estimated amplitude spectrum is determined according to the estimated second mask ratio and the amplitude spectra; a loss function of the speech enhancement model is determined according to the estimated amplitude spectrum and the amplitude spectra of the second pure speech samples; and the speech enhancement model is trained by adjusting parameters of the speech enhancement network and the attention mechanism network according to the loss function.
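The data flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the two "networks" are tiny placeholder functions (a per-bin sigmoid and a speaker-dependent gate, not a real attention mechanism), the loss is assumed to be mean squared error, the amplitude spectra are random toy values rather than STFT outputs, and the parameter update uses finite-difference gradient descent in place of backpropagation. Only the order of the steps (first mask ratio, second mask ratio, estimated amplitude spectrum, loss, joint parameter update) follows the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def enhancement_network(noisy_mag, theta):
    """Placeholder network: per-bin affine map plus sigmoid -> first mask ratio."""
    w, b = theta[0], theta[1]
    return [[sigmoid(w * v + b) for v in frame] for frame in noisy_mag]


def attention_network(first_mask, speaker_vec, theta, n_bins):
    """Placeholder network: speaker-dependent per-bin gate -> second mask ratio."""
    dim = len(speaker_vec)
    gates = []
    for k in range(n_bins):
        row = theta[2 + k * dim: 2 + (k + 1) * dim]
        gates.append(sigmoid(sum(p * s for p, s in zip(row, speaker_vec))))
    return [[m * g for m, g in zip(frame, gates)] for frame in first_mask]


def loss_fn(theta, noisy_mag, speaker_vec, clean_mag):
    n_bins = len(noisy_mag[0])
    first_mask = enhancement_network(noisy_mag, theta)
    second_mask = attention_network(first_mask, speaker_vec, theta, n_bins)
    # Estimated amplitude spectrum = second mask ratio * noisy amplitude spectrum.
    est = [[m * v for m, v in zip(mf, vf)] for mf, vf in zip(second_mask, noisy_mag)]
    # Assumed loss: mean squared error against the clean amplitude spectrum.
    count = len(est) * n_bins
    return sum((e - c) ** 2 for ef, cf in zip(est, clean_mag) for e, c in zip(ef, cf)) / count


def train_step(theta, batch, lr=0.1, eps=1e-5):
    """One gradient-descent step over both networks' parameters (central differences)."""
    grad = []
    for i in range(len(theta)):
        up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
        down = theta[:i] + [theta[i] - eps] + theta[i + 1:]
        grad.append((loss_fn(up, *batch) - loss_fn(down, *batch)) / (2 * eps))
    return [t - lr * g for t, g in zip(theta, grad)]


random.seed(0)
N_FRAMES, N_BINS, DIM = 5, 4, 3
# Toy stand-in for the amplitude spectrum of the second pure speech sample.
clean_mag = [[random.uniform(0.0, 1.0) for _ in range(N_BINS)] for _ in range(N_FRAMES)]
# Toy stand-in for the noisy amplitude spectrum (clean plus non-negative noise).
noisy_mag = [[c + random.uniform(0.0, 0.5) for c in frame] for frame in clean_mag]
# Toy stand-in for the feature vector of the speaker's first pure speech sample.
speaker_vec = [random.uniform(-1.0, 1.0) for _ in range(DIM)]
# theta[0:2] belong to the enhancement network, the rest to the gate network.
theta = [random.uniform(-0.5, 0.5) for _ in range(2 + N_BINS * DIM)]

batch = (noisy_mag, speaker_vec, clean_mag)
loss_before = loss_fn(theta, *batch)
for _ in range(200):
    theta = train_step(theta, batch)
loss_after = loss_fn(theta, *batch)
print(loss_before, loss_after)
```

Keeping all parameters in one flat list mirrors the abstract's statement that the parameters of both networks are adjusted according to a single loss; a real implementation would more likely use a deep learning framework with automatic differentiation.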

Description

Technical field

[0001] The present disclosure relates to the field of audio technology, and more specifically, to a method and device for training a speech enhancement model, and a method and device for speech enhancement.

Background technique

[0002] A noisy environment degrades voice communication. Current mainstream communication software usually applies various speech enhancement algorithms to process the noisy audio during a call. Traditional methods can handle steady-state noise and have the advantage of low computational complexity. Deep learning methods are usually used to remove transient noise; they perform better than traditional methods, but their computational complexity is high.

[0003] Noisy speech usually contains background noise or the voices of other speakers. To improve communication efficiency, it is necessary to obtain the pure speech of a specific speaker. Conventional speech en...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L21/0208, G10L21/0224, G10L21/0232, G10L21/0272, G10L25/24, G10L25/30

CPC: G10L21/0208, G10L21/0224, G10L21/0232, G10L21/0272, G10L25/30, G10L25/24, G10L2021/02087

Inventor: 张新, 张旭, 郑羲光, 张晨, 郭亮

Owner: BEIJING DAJIA INTERNET INFORMATION TECH CO LTD
