Speech enhancement model training and application method, device, equipment and storage medium

A speech enhancement model training technology, applied in speech analysis, speech synthesis, instruments, etc., that addresses the problems of large information loss in audio or acoustic features and high recording cost, and achieves the effect of small distortion.

Pending Publication Date: 2021-09-24

PING AN TECH (SHENZHEN) CO LTD

Cites: 0; Cited by: 6


AI Technical Summary

Problems solved by technology

Recording high-quality speech data is costly. If the recording is made in an ordinary indoor environment, background noise, other environmental noise, and reverberation are picked up or even amplified by the recording equipment.

If current mainstream deep neural network methods are used for speech enhancement, they often cause large distortion, so the audio or acoustic features suffer large information loss before speech synthesis model training.

Embodiment Construction

[0029] The technical solutions in the embodiments of the present application are described clearly and completely below. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application, without creative effort, fall within the scope of protection of the present application.

[0030] The flowcharts shown in the drawings are merely illustrative. They need not include all content and operations/steps, nor must the operations/steps be performed in the order described. For example, some operations/steps may be decomposed, combined, or partially combined, so the order actually performed may change according to the actual situation.

[0031] It should be understood that the terms used in the specification of the present application are merely for the purpose of describing specific embodiments and are not intended to limit the present application. As us...

Abstract

The invention relates to the field of artificial intelligence speech enhancement, and particularly discloses a speech enhancement model training and application method, device, equipment, and storage medium. A speech enhancement model with small distortion and noise reduction capability is obtained through joint modeling of the speech enhancement model and a vocoder. The method comprises the following steps: adding simulated noise to clean speech to obtain noisy speech, and determining a target time-frequency mask according to the clean speech and the noisy speech; extracting noisy Mel spectrum features from the noisy speech, inputting the noisy Mel spectrum features into the speech enhancement model, outputting a predicted time-frequency mask, and determining a first loss value according to the predicted time-frequency mask and the target time-frequency mask; obtaining de-noised Mel spectrum features according to the predicted time-frequency mask and the noisy Mel spectrum features; inputting the de-noised Mel spectrum features into a vocoder to obtain synthetic speech, and determining a second loss value according to the synthetic speech and the clean speech; and optimizing parameters of the speech enhancement model and the vocoder according to the first loss value and the second loss value to obtain a trained speech enhancement model.
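The data flow of these steps can be sketched in plain Python. This is an illustrative sketch only, not the patented implementation: the abstract does not specify the mask definition, the loss functions, or how the two loss values are combined, so the ratio mask, the mean squared error losses, the `weight` parameter, and all function names below are assumptions. The speech enhancement model and the vocoder are not implemented; the functions simply take their outputs as arguments.

```python
import math

def add_noise_at_snr(clean, noise, snr_db):
    """Add simulated noise to clean speech at a chosen SNR (assumed scheme).

    `clean` and `noise` are equal-length lists of samples; `noise` must not be all zeros.
    """
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Scale the noise so that clean_power / scaled_noise_power == 10^(snr_db / 10).
    scale = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + scale * n for x, n in zip(clean, noise)]

def ratio_mask(clean_spec, noisy_spec, eps=1e-8):
    """Target time-frequency mask, assumed here to be clean / noisy clipped to [0, 1].

    Both inputs are 2D lists (frames x frequency bins) of non-negative magnitudes.
    """
    return [[min(1.0, max(0.0, c / (n + eps))) for c, n in zip(c_row, n_row)]
            for c_row, n_row in zip(clean_spec, noisy_spec)]

def apply_mask(mask, noisy_spec):
    """De-noised features: element-wise product of the mask and the noisy features."""
    return [[m * n for m, n in zip(m_row, n_row)]
            for m_row, n_row in zip(mask, noisy_spec)]

def mask_loss(pred_mask, target_mask):
    """First loss value: mean squared error between masks (assumed loss)."""
    diffs = [(p - t) ** 2
             for p_row, t_row in zip(pred_mask, target_mask)
             for p, t in zip(p_row, t_row)]
    return sum(diffs) / len(diffs)

def waveform_loss(synthetic, clean):
    """Second loss value: mean squared error between waveforms (assumed loss)."""
    return sum((s - c) ** 2 for s, c in zip(synthetic, clean)) / len(clean)

def joint_loss(pred_mask, target_mask, synthetic, clean, weight=1.0):
    """Combine both loss values as a weighted sum (the weighting is an assumption)."""
    return mask_loss(pred_mask, target_mask) + weight * waveform_loss(synthetic, clean)
```

In a real training loop, `pred_mask` would come from the speech enhancement model and `synthetic` from the vocoder fed with `apply_mask(pred_mask, noisy_spec)`, and the gradient of the combined loss would update both networks together.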

Description

Technical field

[0001] The present application relates to artificial intelligence speech enhancement, and in particular to a training method, an application method, an apparatus, a computer device, and a storage medium for a speech enhancement model.

Background art

[0002] Speech synthesis technology can already generate speech that is relatively close to natural speech, but building a high-quality speech synthesis system requires high-quality speech training data. High-quality speech data typically has to be recorded in an anechoic room equipped with high-end recording equipment and having a very low noise floor. Recording high-quality speech data is therefore costly, and if the recording is made in an ordinary indoor environment, the noise floor, other environmental noise, and reverberation are picked up by the recording equipment. If speech enhancement is performed using current mainstream deep neural network methods, large distortion is often introduced, so that the audio or acoustic features before ...

Application Information

IPC(8): G10L25/30, G10L25/18, G10L21/0232, G10L13/02

CPC: G10L25/18, G10L25/30, G10L21/0232, G10L13/02

Inventor: 孙奥兰, 王健宗

Owner: PING AN TECH (SHENZHEN) CO LTD
